Score: 0

A Multimodal Multi-Agent Framework for Radiology Report Generation

Published: May 14, 2025 | arXiv ID: 2505.09787v1

By: Ziruo Yi, Ting Xiao, Mark V. Albert

Potential Business Impact:

Helps doctors write faster, more accurate patient reports.

Business Areas:
Augmented Reality Hardware, Software

Radiology report generation (RRG) aims to automatically produce diagnostic reports from medical images, with the potential to enhance clinical workflows and reduce radiologists' workload. While recent approaches leveraging multimodal large language models (MLLMs) and retrieval-augmented generation (RAG) have achieved strong results, they continue to face challenges such as factual inconsistency, hallucination, and cross-modal misalignment. We propose a multimodal multi-agent framework for RRG that aligns with the stepwise clinical reasoning workflow, where task-specific agents handle retrieval, draft generation, visual analysis, refinement, and synthesis. Experimental results demonstrate that our approach outperforms a strong baseline in both automatic metrics and LLM-based evaluations, producing more accurate, structured, and interpretable reports. This work highlights the potential of clinically aligned multi-agent frameworks to support explainable and trustworthy clinical AI applications.

Country of Origin
πŸ‡ΊπŸ‡Έ United States

Page Count
9 pages

Category
Computer Science:
Artificial Intelligence