GMAI-VL-R1: Harnessing Reinforcement Learning for Multimodal Medical Reasoning
By: Yanzhou Su, Tianbin Li, Jiyao Liu, and more
Potential Business Impact:
Helps doctors diagnose illnesses more accurately using AI.
General medical AI has made significant strides recently, but existing models often lack the reasoning capabilities needed for complex medical decision-making. This paper presents GMAI-VL-R1, a multimodal medical reasoning model enhanced by reinforcement learning (RL) to improve its reasoning abilities. Through iterative training, GMAI-VL-R1 optimizes decision-making, significantly boosting diagnostic accuracy and clinical support. We also develop a reasoning data synthesis method that generates step-by-step reasoning data via rejection sampling, which further enhances the model's generalization. Experimental results show that after RL training, GMAI-VL-R1 excels in tasks such as medical image diagnosis and visual question answering. While supervised fine-tuning gives the model basic memorization, RL is crucial for true generalization. Our work establishes new evaluation benchmarks and paves the way for future advancements in medical reasoning models. Code, data, and the model will be released at https://github.com/uni-medical/GMAI-VL-R1.
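To make the rejection-sampling idea concrete, here is a minimal Python sketch of how step-by-step reasoning data might be synthesized: sample several candidate reasoning chains per training example and keep only those whose final answer matches the ground truth. The `model.generate_reasoning` interface, the `Answer:` marker, and all other names are hypothetical placeholders, not the paper's actual pipeline.

```python
from dataclasses import dataclass

@dataclass
class ReasoningSample:
    question: str
    chain_of_thought: str
    answer: str

def extract_answer(chain: str) -> str:
    """Take the text after the last 'Answer:' marker as the final answer (assumed format)."""
    return chain.rsplit("Answer:", 1)[-1].strip() if "Answer:" in chain else ""

def synthesize_reasoning_data(model, dataset, num_candidates=8):
    """Rejection sampling: keep only chains whose final answer matches the ground truth.

    `model.generate_reasoning(image, question)` is a hypothetical wrapper around a
    vision-language model sampled with temperature > 0; `dataset` yields
    (image, question, gold_answer) triples.
    """
    accepted = []
    for image, question, gold_answer in dataset:
        for _ in range(num_candidates):
            chain = model.generate_reasoning(image, question)
            if extract_answer(chain).lower() == gold_answer.lower():
                accepted.append(ReasoningSample(question, chain, gold_answer))
                break  # one verified chain per example is enough for fine-tuning data
    return accepted
```

The accepted samples can then serve as supervised reasoning traces before or alongside RL training; the exact filtering criteria and sampling budget in GMAI-VL-R1 may differ.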
Similar Papers
RARL: Improving Medical VLM Reasoning and Generalization with Reinforcement Learning and LoRA under Data and Hardware Constraints
CV and Pattern Recognition
Helps doctors understand medical pictures better.
Med-R1: Reinforcement Learning for Generalizable Medical Reasoning in Vision-Language Models
CV and Pattern Recognition
Helps doctors understand X-rays better and faster.
MMedAgent-RL: Optimizing Multi-Agent Collaboration for Multimodal Medical Reasoning
Machine Learning (CS)
Helps doctors diagnose illnesses better by working together.