Code-in-the-Loop Forensics: Agentic Tool Use for Image Forgery Detection
By: Fanrui Zhang, Qiang Zhang, Sizhuo Zhou and more
Existing image forgery detection (IFD) methods either exploit low-level, semantics-agnostic artifacts or rely on multimodal large language models (MLLMs) with high-level semantic knowledge. Although naturally complementary, these two information streams are highly heterogeneous in both paradigm and reasoning, making it difficult for existing methods to unify them or to model their cross-level interactions effectively. To address this gap, we propose ForenAgent, a multi-round interactive IFD framework that enables MLLMs to autonomously generate, execute, and iteratively refine Python-based low-level tools around the detection objective, thereby achieving more flexible and interpretable forgery analysis. ForenAgent follows a two-stage training pipeline that combines Cold Start and Reinforcement Fine-Tuning to progressively enhance its tool-interaction capability and reasoning adaptability. Inspired by human reasoning, we design a dynamic reasoning loop comprising global perception, local focusing, iterative probing, and holistic adjudication, and instantiate it as both a data-sampling strategy and a task-aligned process reward. For systematic training and evaluation, we construct FABench, a heterogeneous, high-quality agent-forensics dataset comprising 100k images and approximately 200k agent-interaction question-answer pairs. Experiments show that ForenAgent exhibits emergent tool-use competence and reflective reasoning on challenging IFD tasks when assisted by low-level tools, charting a promising route toward general-purpose IFD. The code will be released after the review process is completed.
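The four-stage reasoning loop named in the abstract (global perception, local focusing, iterative probing, holistic adjudication) can be illustrated with a toy sketch. This is not ForenAgent's code, which the abstract says is not yet released. Every function name, the gradient-energy statistic, the block size, and the threshold are hypothetical choices made only for illustration, and the stages are hard-coded here rather than generated by an MLLM.

```python
from statistics import median

# Hypothetical low-level tool: mean squared horizontal gradient in one block.
def residual_energy(img, top, left, size):
    total, count = 0.0, 0
    for r in range(top, top + size):
        for c in range(left, left + size - 1):
            d = img[r][c + 1] - img[r][c]
            total += d * d
            count += 1
    return total / count

def global_perception(img, size):
    """Stage 1: score every non-overlapping block of the whole image."""
    h, w = len(img), len(img[0])
    return {(t, l): residual_energy(img, t, l, size)
            for t in range(0, h - size + 1, size)
            for l in range(0, w - size + 1, size)}

def local_focusing(scores):
    """Stage 2: pick the block that deviates most from the median score."""
    baseline = median(scores.values())
    return max(scores, key=lambda k: abs(scores[k] - baseline))

def iterative_probing(img, block, size, rounds=2):
    """Stage 3: re-measure the suspicious block at successively finer scales."""
    top, left = block
    evidence = []
    for _ in range(rounds):
        size //= 2
        if size < 2:
            break
        sub = [residual_energy(img, top + dt, left + dl, size)
               for dt in (0, size) for dl in (0, size)]
        evidence.append(max(sub))
    return evidence

def holistic_adjudication(scores, block, evidence, threshold):
    """Stage 4: flag forgery only if coarse and fine evidence agree."""
    baseline = median(scores.values()) + 1e-9  # epsilon avoids division by zero
    ratio = scores[block] / baseline
    confirmed = all(e > threshold * baseline for e in evidence)
    return {"block": block, "ratio": ratio,
            "forged": ratio > threshold and confirmed}

def analyze(img, size=8, threshold=10.0):
    scores = global_perception(img, size)
    block = local_focusing(scores)
    evidence = iterative_probing(img, block, size)
    return holistic_adjudication(scores, block, evidence, threshold)

def make_demo_image():
    """16x16 smooth ramp with a high-frequency patch pasted at rows/cols 8..15."""
    img = [[float(c) for c in range(16)] for _ in range(16)]
    for r in range(8, 16):
        for c in range(8, 16):
            img[r][c] = 50.0 * (c % 2)
    return img

if __name__ == "__main__":
    print(analyze(make_demo_image()))
```

In the framework the abstract describes, the model itself would write and refine tools of this kind across rounds. The fixed pipeline above only shows how a low-level statistic can feed a coarse-to-fine decision.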
Similar Papers
From Evidence to Verdict: An Agent-Based Forensic Framework for AI-Generated Image Detection
CV and Pattern Recognition
Finds fake pictures by acting like a detective.
Unlocking the Forgery Detection Potential of Vanilla MLLMs: A Novel Training-Free Pipeline
CV and Pattern Recognition
Finds fake pictures without extra training.