Teaching AI Stepwise Diagnostic Reasoning with Report-Guided Chain-of-Thought Learning

Published: September 8, 2025 | arXiv ID: 2509.06409v1

By: Yihong Luo, Wenwu He, Zhuo-Xu Cui and more

Potential Business Impact:

Helps computers diagnose diseases from X-rays.

Business Areas:
Image Recognition Data and Analytics, Software

This study presents DiagCoT, a multi-stage framework that applies supervised fine-tuning to general-purpose vision-language models (VLMs) to emulate radiologists' stepwise diagnostic reasoning using only free-text reports. DiagCoT combines contrastive image-report tuning for domain alignment, chain-of-thought supervision to capture inferential logic, and reinforcement tuning with clinical reward signals to enhance factual accuracy and fluency. On the MIMIC-CXR benchmark, DiagCoT improved zero-shot disease classification AUC from 0.52 to 0.76 (absolute gain of 0.24), pathology grounding mIoU from 0.08 to 0.31 (absolute gain of 0.23), and report generation BLEU from 0.11 to 0.33 (absolute gain of 0.22). It outperformed state-of-the-art models including LLaVA-Med and CXR-LLaVA on long-tailed diseases and external datasets. By converting unstructured clinical narratives into structured supervision, DiagCoT offers a scalable approach for developing interpretable and diagnostically competent AI systems for radiology.
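For readers unfamiliar with the metrics quoted above, the following minimal Python sketch shows how a mean IoU over localized findings could be computed and rechecks the reported absolute gains. It is an illustration only, not the authors' evaluation code: the abstract does not say whether grounding is scored on bounding boxes or masks, so the axis-aligned box format and the names `box_iou`, `mean_iou`, `absolute_gain`, and `REPORTED` are assumptions introduced here.

```python
from typing import Dict, Sequence, Tuple

# Assumed box format (not stated in the abstract): (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mean_iou(preds: Sequence[Box], targets: Sequence[Box]) -> float:
    """Average IoU over paired predicted and reference boxes."""
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    if not preds:
        return 0.0
    return sum(box_iou(p, t) for p, t in zip(preds, targets)) / len(preds)


def absolute_gain(before: float, after: float) -> float:
    """Absolute improvement, rounded to two decimals as in the summary."""
    return round(after - before, 2)


# (before, after) pairs copied from the summary above.
REPORTED: Dict[str, Tuple[float, float]] = {
    "classification AUC": (0.52, 0.76),
    "grounding mIoU": (0.08, 0.31),
    "report BLEU": (0.11, 0.33),
}

if __name__ == "__main__":
    for name, (before, after) in REPORTED.items():
        print(f"{name}: {before} -> {after}, gain {absolute_gain(before, after)}")
```

Under these assumptions, the three recomputed gains match the 0.24, 0.23, and 0.22 figures stated in the summary.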

Page Count
32 pages

Category
Computer Science:
Artificial Intelligence