MedGRPO: Multi-Task Reinforcement Learning for Heterogeneous Medical Video Understanding

Published: December 6, 2025 | arXiv ID: 2512.06581v1

By: Yuhao Su, Anwesa Choudhuri, Zhongpai Gao and more

Potential Business Impact:

Helps computers understand medical videos better.

Business Areas:

Image Recognition, Data and Analytics, Software

Large vision-language models struggle with medical video understanding, where spatial precision, temporal reasoning, and clinical semantics are critical. To address this, we first introduce MedVidBench, a large-scale benchmark of 531,850 video-instruction pairs across 8 medical sources spanning video, segment, and frame-level tasks, curated through a rigorous quality assurance pipeline with expert-guided prompting and dual-model validation. While supervised fine-tuning on MedVidBench yields noticeable gains, standard Reinforcement Learning (RL) fails due to imbalanced reward scales across datasets, which destabilizes optimization and leads to training collapse. To overcome this, we introduce MedGRPO, a novel RL framework for balanced multi-dataset training with two key innovations: (1) cross-dataset reward normalization that maps each dataset's median performance to a common reward value, ensuring fair optimization regardless of difficulty, and (2) a medical LLM judge that evaluates caption quality on five clinical dimensions through comparative similarity scoring. Supervised fine-tuning Qwen2.5-VL-7B on MedVidBench substantially outperforms GPT-4.1 and Gemini-2.5-Flash across all tasks, demonstrating MedVidBench's efficacy, while our MedGRPO framework further improves upon the SFT baseline across grounding and captioning tasks. Our work establishes a foundational benchmark and robust training methodology for advancing vision-language models in medical domains. Our project website is available at https://yuhaosu.github.io/MedGRPO/.
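
The abstract does not state the formula behind cross-dataset reward normalization, so the following is only a minimal sketch of one possible reading, not the authors' implementation. It assumes raw scores lie in [0, 1] and uses a piecewise-linear rescaling so that each dataset's median lands on a shared target value. The function name `normalize_rewards`, the `target` value of 0.5, and the dataset names are all hypothetical.

```python
from statistics import median


def normalize_rewards(raw_by_dataset, target=0.5):
    """Rescale raw scores per dataset so each dataset's median maps to `target`.

    Illustrative assumption: raw scores are in [0, 1]. The mapping is
    piecewise linear: [0, median] -> [0, target] and (median, 1] -> (target, 1].
    """
    normalized = {}
    for name, scores in raw_by_dataset.items():
        med = median(scores)
        rescaled = []
        for s in scores:
            if s <= med:
                # Lower half: stretch or shrink so the median hits `target`.
                # If the median is 0, every score here is 0 and sits at the median.
                r = target * s / med if med > 0 else target
            else:
                # Upper half: s > med implies med < 1, so the division is safe.
                r = target + (1 - target) * (s - med) / (1 - med)
            rescaled.append(r)
        normalized[name] = rescaled
    return normalized


if __name__ == "__main__":
    # Hypothetical raw scores: an "easy" dataset and a "hard" one.
    raw = {"easy": [0.8, 0.9, 1.0], "hard": [0.1, 0.2, 0.3]}
    for name, values in normalize_rewards(raw).items():
        print(name, [round(v, 3) for v in values])
```

Under this sketch, a median-quality response earns the same reward on either dataset, so a dataset with systematically high or low raw scores no longer dominates the policy update.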

MedGR$^2$: Breaking the Data Barrier for Medical Reasoning via Generative Reward Learning

Machine Learning (CS)

Makes AI learn medicine from generated data.

28 Aug 2025 1

91%

FairGRPO: Fair Reinforcement Learning for Equitable Clinical Reasoning

Machine Learning (CS)

Makes AI doctors treat everyone fairly.

22 Oct 2025 1

91%

Med-R1: Reinforcement Learning for Generalizable Medical Reasoning in Vision-Language Models

CV and Pattern Recognition

Helps doctors understand X-rays better and faster.

18 Mar 2025 0

Page Count

19 pages
