Score: 2

EHR-R1: A Reasoning-Enhanced Foundational Language Model for Electronic Health Record Analysis

Published: October 29, 2025 | arXiv ID: 2510.25628v1

By: Yusheng Liao, Chaoyi Wu, Junwei Liu and more

Potential Business Impact:

Helps doctors understand patient health records better.

Business Areas:

Electronic Health Record (EHR), Health Care

Electronic Health Records (EHRs) contain rich yet complex information, and their automated analysis is critical for clinical decision-making. Despite recent advances of large language models (LLMs) in clinical workflows, their ability to analyze EHRs remains limited due to narrow task coverage and a lack of EHR-oriented reasoning capabilities. This paper aims to bridge that gap. Specifically, we present EHR-Ins, a large-scale, comprehensive EHR reasoning instruction dataset comprising 300k high-quality reasoning cases and 4M non-reasoning cases across 42 distinct EHR tasks. Its core innovation is a thinking-graph-driven framework that enables the generation of high-quality reasoning data at scale. Based on it, we develop EHR-R1, a series of reasoning-enhanced LLMs with up to 72B parameters tailored for EHR analysis. Through a multi-stage training paradigm of domain adaptation, reasoning enhancement, and reinforcement learning, EHR-R1 systematically acquires domain knowledge and diverse reasoning capabilities, enabling accurate and robust EHR analysis. Lastly, we introduce EHR-Bench, a new benchmark curated from MIMIC-IV and spanning 42 tasks, to comprehensively assess reasoning and prediction across EHR scenarios. In experiments, we show that the resulting EHR-R1 consistently outperforms state-of-the-art commercial and open-source LLMs (including DeepSeek-V3 and GPT-4o), surpassing GPT-4o by over 30 points on MIMIC-Bench and achieving a 10% higher zero-shot AUROC on EHRSHOT. Collectively, EHR-Ins, EHR-R1, and EHR-Bench significantly advance the development of more reliable and clinically relevant EHR analysis.

ER-REASON: A Benchmark Dataset for LLM-Based Clinical Reasoning in the Emergency Room

Computation and Language

Tests AI doctors' emergency room smarts.

28 May 2025 3

91%

A Specialized Large Language Model for Clinical Reasoning and Diagnosis in Rare Diseases

Computation and Language

Helps doctors find rare diseases faster.

18 Nov 2025 2

91%

Training LLMs for EHR-Based Reasoning Tasks via Reinforcement Learning

Computation and Language

Helps doctors make better patient decisions.

30 May 2025 0


Country of Origin

🇨🇳 China

Repos / Data Links

github.com github.com

Page Count

48 pages
