Teaching AI Stepwise Diagnostic Reasoning with Report-Guided Chain-of-Thought Learning
By: Yihong Luo , Wenwu He , Zhuo-Xu Cui and more
Potential Business Impact:
Helps computers diagnose diseases from X-rays.
This study presents DiagCoT, a multi-stage framework that applies supervised fine-tuning to general-purpose vision-language models (VLMs) to emulate radiologists' stepwise diagnostic reasoning using only free-text reports. DiagCoT combines contrastive image-report tuning for domain alignment, chain-of-thought supervision to capture inferential logic, and reinforcement tuning with clinical reward signals to enhance factual accuracy and fluency. On the MIMIC-CXR benchmark, DiagCoT improved zero-shot disease classification AUC from 0.52 to 0.76 (absolute gain of 0.24), pathology grounding mIoU from 0.08 to 0.31 (absolute gain of 0.23), and report generation BLEU from 0.11 to 0.33 (absolute gain of 0.22). It outperformed state-of-the-art models including LLaVA-Med and CXR-LLAVA on long-tailed diseases and external datasets. By converting unstructured clinical narratives into structured supervision, DiagCoT offers a scalable approach for developing interpretable and diagnostically competent AI systems for radiology.
Similar Papers
X-Ray-CoT: Interpretable Chest X-ray Diagnosis with Vision-Language Models via Chain-of-Thought Reasoning
CV and Pattern Recognition
Helps doctors understand X-rays by explaining their thoughts.
Pathology-CoT: Learning Visual Chain-of-Thought Agent from Expert Whole Slide Image Diagnosis Behavior
CV and Pattern Recognition
Helps doctors find sickness in slides faster.
Non-Iterative Symbolic-Aided Chain-of-Thought for Logical Reasoning
Artificial Intelligence
Helps computers think through problems better.