IVY-FAKE: A Unified Explainable Framework and Benchmark for Image and Video AIGC Detection
By: Wayne Zhang, Changjiang Jiang, Zhonghao Zhang and more
Potential Business Impact:
Finds fake pictures and videos, explaining why.
The rapid advancement of Artificial Intelligence Generated Content (AIGC) in visual domains has resulted in highly realistic synthetic images and videos, driven by sophisticated generative frameworks such as diffusion-based architectures. While these breakthroughs open substantial opportunities, they simultaneously raise critical concerns about content authenticity and integrity. Many current AIGC detection methods operate as black-box binary classifiers, which offer limited interpretability, and no approach supports detecting both images and videos in a unified framework. This dual limitation compromises model transparency, reduces trustworthiness, and hinders practical deployment. To address these challenges, we introduce IVY-FAKE, a novel, unified, and large-scale dataset specifically designed for explainable multimodal AIGC detection. Unlike prior benchmarks, which suffer from fragmented modality coverage and sparse annotations, IVY-FAKE contains over 150,000 richly annotated training samples (images and videos) and 18,700 evaluation examples, each accompanied by detailed natural-language reasoning beyond simple binary labels. Building on this, we propose Ivy Explainable Detector (IVY-XDETECTOR), a unified architecture that jointly performs explainable AIGC detection for both image and video content. Our unified vision-language model achieves state-of-the-art performance across multiple image and video detection benchmarks, highlighting the significant advancements enabled by our dataset and modeling framework. Our data is publicly available at https://huggingface.co/datasets/AI-Safeguard/Ivy-Fake.
Similar Papers
Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation
CV and Pattern Recognition
Finds fake pictures and explains why.
Advance Fake Video Detection via Vision Transformers
CV and Pattern Recognition
Finds fake videos made by computers.
Chameleon: On the Scene Diversity and Domain Variety of AI-Generated Videos Detection
CV and Pattern Recognition
Finds fake videos made by computers.