Explainable Detection of AI-Generated Images with Artifact Localization Using Faster-Than-Lies and Vision-Language Models for Edge Devices
By: Aryan Mathur, Asaduddin Ahmed, Pushti Amit Vasoya, and more
Potential Business Impact:
Finds fake pictures and shows why.
The increasing realism of AI-generated imagery poses challenges for verifying visual authenticity. We present an explainable image authenticity detection system that combines a lightweight convolutional classifier ("Faster-Than-Lies") with a Vision-Language Model (Qwen2-VL-7B) to classify, localize, and explain artifacts in 32x32 images. Our model achieves 96.5% accuracy on the extended CiFAKE dataset augmented with adversarial perturbations and maintains an inference time of 175ms on 8-core CPUs, enabling deployment on local or edge devices. Using autoencoder-based reconstruction error maps, we generate artifact localization heatmaps, which enhance interpretability for both humans and the VLM. We further categorize 70 visual artifact types into eight semantic groups and demonstrate explainable text generation for each detected anomaly. This work highlights the feasibility of combining visual and linguistic reasoning for interpretable authenticity detection in low-resolution imagery and outlines potential cross-domain applications in forensics, industrial inspection, and social media moderation.
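The abstract describes localizing artifacts via autoencoder reconstruction error maps. As a rough illustration of that idea (not the paper's exact architecture or training setup), the sketch below builds a small convolutional autoencoder and turns its per-pixel reconstruction error on a 32x32 image into a normalized heatmap; all class and function names here are hypothetical.

```python
# Illustrative sketch of an autoencoder-based reconstruction-error heatmap
# for 32x32 images. Architecture, names, and sizes are assumptions, not the
# paper's actual "Faster-Than-Lies" pipeline.
import torch
import torch.nn as nn

class TinyAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1),   # 32x32 -> 16x16
            nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1),  # 16x16 -> 8x8
            nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1),  # 8x8 -> 16x16
            nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1),   # 16x16 -> 32x32
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def reconstruction_error_heatmap(model, image):
    """Return a [0, 1]-normalized per-pixel error map for a 3x32x32 image tensor."""
    model.eval()
    with torch.no_grad():
        recon = model(image.unsqueeze(0)).squeeze(0)
    # Squared error per pixel, averaged over the three color channels.
    error = ((image - recon) ** 2).mean(dim=0)
    error = (error - error.min()) / (error.max() - error.min() + 1e-8)
    return error  # shape (32, 32); higher values flag likely artifact regions

if __name__ == "__main__":
    model = TinyAutoencoder()           # in practice, trained on authentic images
    sample = torch.rand(3, 32, 32)      # stand-in for an input image
    heatmap = reconstruction_error_heatmap(model, sample)
    print(heatmap.shape, heatmap.max().item())
```

In a full pipeline of the kind the abstract outlines, such a heatmap could accompany the classifier's prediction and be passed, alongside the image, to a vision-language model to ground its textual explanation of the detected anomaly.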
Similar Papers
Identity-Aware Vision-Language Model for Explainable Face Forgery Detection
Multimedia
Finds fake pictures by checking if they make sense.
Spot the Fake: Large Multimodal Model-Based Synthetic Image Detection with Artifact Explanation
CV and Pattern Recognition
Finds fake pictures and explains why.
INSIGHT: An Interpretable Neural Vision-Language Framework for Reasoning of Generative Artifacts
CV and Pattern Recognition
Finds fake pictures, even tiny ones, and explains why.