EHR-R1: A Reasoning-Enhanced Foundational Language Model for Electronic Health Record Analysis

Yusheng Liao; Chaoyi Wu; Junwei Liu; Shuyang Jiang; Pengcheng Qiu; Haowen Wang; Yun Yue; Shuai Zhen; Jian Wang; Qianrui Fan; Jinjie Gu; Ya Zhang; Yanfeng Wang; Yu Wang; Weidi Xie

arXiv:2510.25628 · cs.CL · November 26, 2025

TL;DR

This paper introduces EHR-R1, a reasoning-enhanced language model tailored for electronic health record analysis, supported by a large reasoning dataset and a new benchmark, significantly improving clinical decision-making capabilities.

Contribution

The paper presents EHR-Ins, a large EHR reasoning dataset, and EHR-R1, a series of reasoning-enhanced LLMs, along with EHR-Bench for comprehensive evaluation, advancing EHR analysis technology.

Findings

01. EHR-R1 outperforms GPT-4o by over 30 points on MIMIC-Bench.
02. EHR-R1 achieves 10% higher zero-shot AUROC on EHRSHOT.
03. The framework enables scalable high-quality reasoning data generation.

Abstract

Electronic Health Records (EHRs) contain rich yet complex information, and their automated analysis is critical for clinical decision-making. Despite recent advances of large language models (LLMs) in clinical workflows, their ability to analyze EHRs remains limited due to narrow task coverage and a lack of EHR-oriented reasoning capabilities. This paper aims to bridge that gap. Specifically, we present EHR-Ins, a large-scale, comprehensive EHR reasoning instruction dataset comprising 300k high-quality reasoning cases and 4M non-reasoning cases across 42 distinct EHR tasks. Its core innovation is a thinking-graph-driven framework that enables the generation of high-quality reasoning data at scale. Building on this dataset, we develop EHR-R1, a series of reasoning-enhanced LLMs with up to 72B parameters tailored for EHR analysis. Through a multi-stage training paradigm, including domain adaptation, reasoning…
