ClinicalGPT-R1: Pushing reasoning capability of generalist disease diagnosis with large language model
By: Wuyang Lan, Wenzheng Wang, Changwei Ji, and more
Potential Business Impact:
Helps doctors diagnose illnesses better using AI.
Recent advances in large language models (LLMs) have shown remarkable reasoning capabilities in domains such as mathematics and coding, yet their application to clinical diagnosis remains underexplored. Here, we introduce ClinicalGPT-R1, a reasoning-enhanced generalist large language model for disease diagnosis. Trained on a dataset of 20,000 real-world clinical records, ClinicalGPT-R1 leverages diverse training strategies to enhance diagnostic reasoning. To benchmark performance, we curated MedBench-Hard, a challenging dataset spanning seven major medical specialties and representative diseases. Experimental results demonstrate that ClinicalGPT-R1 outperforms GPT-4o in Chinese diagnostic tasks and achieves comparable performance to GPT-4 in English settings. This comparative study validates the superior performance of ClinicalGPT-R1 on disease diagnosis tasks. Resources are available at https://github.com/medfound/medfound.
Similar Papers
Disentangling Reasoning and Knowledge in Medical Large Language Models
Computation and Language
Helps AI doctors think better, not just remember.
A Specialized Large Language Model for Clinical Reasoning and Diagnosis in Rare Diseases
Computation and Language
Helps doctors find rare diseases faster.
Generalist Large Language Models Outperform Clinical Tools on Medical Benchmarks
Computation and Language
New AI helps doctors more than old AI.