
Reasoning-Driven Amodal Completion: Collaborative Agents and Perceptual Evaluation

Published: December 24, 2025 | arXiv ID: 2512.20936v1

By: Hongxing Fan, Shuyu Zhao, Jiayang Ao and more

Amodal completion, the task of inferring invisible object parts, faces significant challenges in maintaining semantic consistency and structural integrity. Prior progressive approaches are inherently limited by inference instability and error accumulation. To tackle these limitations, we present a Collaborative Multi-Agent Reasoning Framework that explicitly decouples Semantic Planning from Visual Synthesis. By employing specialized agents for upfront reasoning, our method generates a structured, explicit plan before pixel generation, enabling visually and semantically coherent single-pass synthesis. We integrate this framework with two critical mechanisms: (1) a self-correcting Verification Agent that employs Chain-of-Thought reasoning to rectify visible region segmentation and identify residual occluders strictly within the Semantic Planning phase, and (2) a Diverse Hypothesis Generator that addresses the ambiguity of invisible regions by offering diverse, plausible semantic interpretations, surpassing the limited pixel-level variations of standard random seed sampling. Furthermore, to address the limitations of traditional metrics in assessing inferred invisible content, we introduce the MAC-Score (MLLM Amodal Completion Score), a novel human-aligned evaluation metric. Validated against human judgment and ground truth, this metric establishes a robust standard for assessing structural completeness and semantic consistency with visible context. Extensive experiments demonstrate that our framework significantly outperforms state-of-the-art methods across multiple datasets. Our project is available at: https://fanhongxing.github.io/remac-page.
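The plan-then-synthesize flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Plan` schema, the function names, and the callables passed in (`detect_residual_occluders`, `propose_hypotheses`, `synthesize`) are all hypothetical stand-ins for the agents and the image generator, chosen only to show how verification and hypothesis diversification could both finish before a single synthesis pass.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass
class Plan:
    """Hypothetical structured plan, built before any pixels are generated."""
    visible_mask: List[List[int]]  # 1 = pixel belongs to the visible target object
    occluders: List[str]           # objects believed to hide part of the target
    hypothesis: str = ""           # one semantic reading of the hidden region


def verify_plan(
    plan: Plan,
    detect_residual_occluders: Callable[[Plan], List[str]],
    max_rounds: int = 3,
) -> Plan:
    """Stand-in for the verification step: add occluders the plan missed.

    Loops until the (hypothetical) detector reports nothing new or the
    round budget is spent, so correction stays inside the planning phase.
    """
    for _ in range(max_rounds):
        missed = [o for o in detect_residual_occluders(plan) if o not in plan.occluders]
        if not missed:
            break
        plan.occluders.extend(missed)
    return plan


def diversify(
    plan: Plan,
    propose_hypotheses: Callable[[Plan], List[str]],
    k: int = 3,
) -> List[Plan]:
    """Stand-in for hypothesis generation: one plan copy per semantic reading."""
    return [
        replace(plan, occluders=list(plan.occluders), hypothesis=h)
        for h in propose_hypotheses(plan)[:k]
    ]


def complete(plan: Plan, synthesize: Callable[[Plan], str]) -> str:
    """Single-pass synthesis: the generator is called exactly once per plan."""
    return synthesize(plan)
```

In this sketch, diversity comes from distinct `hypothesis` strings rather than from different random seeds, which mirrors the abstract's contrast between semantic-level and pixel-level variation.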

Analyze-Prompt-Reason: A Collaborative Agent-Based Framework for Multi-Image Vision-Language Reasoning

CV and Pattern Recognition

Enables AI to reason over multiple images

1 Aug 2025 0

88%

Multi-Agent Visual-Language Reasoning for Comprehensive Highway Scene Understanding

CV and Pattern Recognition

Helps cameras see road dangers and warn drivers.

24 Aug 2025 0

88%

Diagnosing Visual Reasoning: Challenges, Insights, and a Path Forward

CV and Pattern Recognition

Fixes AI seeing things that aren't there.

23 Oct 2025 1
