ClinicalGPT-R1: Pushing reasoning capability of generalist disease diagnosis with large language model

Published: April 13, 2025 | arXiv ID: 2504.09421v2

By: Wuyang Lan, Wenzheng Wang, Changwei Ji and more

Potential Business Impact:

Helps doctors diagnose illnesses better using AI.

Business Areas:
Natural Language Processing Artificial Intelligence, Data and Analytics, Software

Recent advances in reasoning with large language models (LLMs) have shown remarkable capabilities in domains such as mathematics and coding, yet their application to clinical diagnosis remains underexplored. Here, we introduce ClinicalGPT-R1, a reasoning-enhanced generalist large language model for disease diagnosis. Trained on a dataset of 20,000 real-world clinical records, ClinicalGPT-R1 leverages diverse training strategies to enhance diagnostic reasoning. To benchmark performance, we curated MedBench-Hard, a challenging dataset spanning seven major medical specialties and representative diseases. Experimental results demonstrate that ClinicalGPT-R1 outperforms GPT-4o in Chinese diagnostic tasks and achieves performance comparable to GPT-4 in English settings. This comparative study effectively validates the superior performance of ClinicalGPT-R1 in disease diagnosis tasks. Resources are available at https://github.com/medfound/medfound.

Page Count
8 pages

Category
Computer Science:
Computation and Language