Towards Explainable Fake Image Detection with Multi-Modal Large Language Models
By: Yikun Ji, Yan Hong, Jiahui Zhan, and more
Potential Business Impact:
Finds fake pictures and explains how.
Progress in image generation raises significant public security concerns. We argue that fake image detection should not operate as a "black box". Instead, an ideal approach must ensure both strong generalization and transparency. Recent progress in Multi-modal Large Language Models (MLLMs) offers new opportunities for reasoning-based AI-generated image detection. In this work, we evaluate the capabilities of MLLMs in comparison to traditional detection methods and human evaluators, highlighting their strengths and limitations. Furthermore, we design six distinct prompts and propose a framework that integrates these prompts to develop a more robust, explainable, and reasoning-driven detection system. The code is available at https://github.com/Gennadiyev/mllm-defake.
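The abstract describes a prompt-driven framework: each prompt elicits a separate real/fake judgment plus an explanation from the MLLM, and the per-prompt verdicts are combined into a final decision. Below is a minimal sketch of that pattern, assuming the OpenAI chat API as the MLLM backend, a `gpt-4o` model, placeholder prompt texts, and majority-vote aggregation; none of these specifics come from the paper, whose actual prompts and integration logic live in the mllm-defake repository.

```python
# Illustrative multi-prompt MLLM detection loop (a sketch, not the
# authors' implementation). Assumptions: OpenAI backend, gpt-4o model,
# stand-in prompt texts, and simple majority-vote aggregation.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Six stand-in prompts; the paper's actual prompt designs differ.
PROMPTS = [
    "Is this image AI-generated? Answer 'real' or 'fake', then explain.",
    "Inspect lighting and shadows for physical consistency. Real or fake?",
    "Check textures (skin, hair, text) for generation artifacts. Real or fake?",
    "Examine object boundaries and geometry for warping. Real or fake?",
    "Look for semantic implausibilities in the scene. Real or fake?",
    "Reason step by step about authenticity, then answer real or fake.",
]

def encode_image(path: str) -> str:
    """Base64-encode an image file for the chat API."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")

def query(prompt: str, image_b64: str) -> str:
    """Ask the MLLM one prompt about one image; return its reply text."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content

def detect(path: str) -> tuple[str, list[str]]:
    """Run all prompts on one image and majority-vote the verdicts."""
    image_b64 = encode_image(path)
    replies = [query(p, image_b64) for p in PROMPTS]
    fake_votes = sum("fake" in r.lower() for r in replies)
    verdict = "fake" if fake_votes > len(PROMPTS) / 2 else "real"
    return verdict, replies  # replies carry the per-prompt explanations

if __name__ == "__main__":
    verdict, explanations = detect("suspect.jpg")
    print(f"Verdict: {verdict}")
```

Keeping the per-prompt replies alongside the aggregated verdict is what makes this style of detector explainable rather than a "black box": each vote arrives with its own natural-language rationale.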
Similar Papers
ThinkFake: Reasoning in Multimodal Large Language Models for AI-Generated Image Detection
CV and Pattern Recognition
Finds fake pictures made by computers.
Interpretable and Reliable Detection of AI-Generated Images via Grounded Reasoning in MLLMs
CV and Pattern Recognition
Finds fake pictures and shows why.
Can Multi-modal (reasoning) LLMs work as deepfake detectors?
CV and Pattern Recognition
Finds fake pictures using smart computer brains.