Medical Reasoning in the Era of LLMs: A Systematic Review of Enhancement Techniques and Applications
By: Wenxuan Wang, Zizhan Ma, Meidan Ding, and more
Potential Business Impact:
Boosts AI's step-by-step medical thinking
The proliferation of Large Language Models (LLMs) in medicine has enabled impressive capabilities, yet a critical gap remains in their ability to perform systematic, transparent, and verifiable reasoning, a cornerstone of clinical practice. This has catalyzed a shift from single-step answer generation to the development of LLMs explicitly designed for medical reasoning. This paper provides the first systematic review of this emerging field. We propose a taxonomy of reasoning enhancement techniques, categorized into training-time strategies (e.g., supervised fine-tuning, reinforcement learning) and test-time mechanisms (e.g., prompt engineering, multi-agent systems). We analyze how these techniques are applied across different data modalities (text, image, code) and in key clinical applications such as diagnosis, education, and treatment planning. Furthermore, we survey the evolution of evaluation benchmarks from simple accuracy metrics to sophisticated assessments of reasoning quality and visual interpretability. Based on an analysis of 60 seminal studies from 2022-2025, we conclude by identifying critical challenges, including the faithfulness-plausibility gap and the need for native multimodal reasoning, and outlining future directions toward building efficient, robust, and sociotechnically responsible medical AI.
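To make the surveyed test-time mechanisms concrete, below is a minimal sketch of one of them: chain-of-thought prompting, which elicits explicit intermediate reasoning steps before a final answer rather than single-step answer generation. This is illustrative only, not code from the paper; it assumes access to an OpenAI-compatible chat-completion endpoint, and the model name, prompt wording, and clinical vignette are hypothetical.

```python
# Illustrative sketch of a test-time reasoning enhancement: chain-of-thought
# prompting for a clinical question. Assumes an OpenAI-compatible chat API;
# the model name, prompts, and example case are hypothetical, not from the survey.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

SYSTEM_PROMPT = (
    "You are a careful clinical assistant. Reason step by step: list the "
    "salient findings, enumerate differential diagnoses, weigh the evidence "
    "for and against each, then state a final answer on its own line."
)

def answer_with_cot(question: str, model: str = "gpt-4o") -> str:
    """Elicit explicit intermediate reasoning before the final answer."""
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
        temperature=0.2,  # a low temperature favors consistent reasoning chains
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(answer_with_cot(
        "A 58-year-old presents with crushing chest pain radiating to the "
        "left arm and diaphoresis. What is the most likely diagnosis?"
    ))
```

The exposed reasoning trace is what the evaluation benchmarks discussed above assess; as the paper's faithfulness-plausibility gap suggests, a fluent chain is not guaranteed to reflect the model's actual decision process.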
Similar Papers
Reasoning LLMs in the Medical Domain: A Literature Survey
Artificial Intelligence
Helps doctors make better health choices.
Training and Evaluation of Guideline-Based Medical Reasoning in LLMs
Computation and Language
Teaches computers to explain medical decisions like doctors.
Advances in LLMs with Focus on Reasoning, Adaptability, Efficiency and Ethics
Computation and Language
Makes computers smarter, faster, and more helpful.