EHR-RAG: Bridging Long-Horizon Structured Electronic Health Records and Large Language Models via Enhanced Retrieval-Augmented Generation
Lang Cao, Qingyu Chen, Yue Guo

TL;DR
EHR-RAG is a novel retrieval-augmented framework that enhances large language models' ability to interpret long-horizon structured electronic health records by preserving clinical structure, refining evidence retrieval, and jointly reasoning over factual and counterfactual data.
Contribution
The paper introduces EHR-RAG, a framework whose specialized retrieval and reasoning components are tailored to long-horizon clinical prediction over structured EHRs, and reports that it outperforms existing LLM baselines.
Findings
EHR-RAG achieves an average Macro-F1 improvement of 10.76% over baselines.
The framework effectively preserves temporal and clinical structure in EHR data.
EHR-RAG outperforms existing methods across four long-horizon EHR prediction tasks.
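For readers unfamiliar with the metric cited above, Macro-F1 is the unweighted mean of the per-class F1 scores, so a rare clinical outcome counts as much as a common one. The following is a minimal sketch of that standard definition; the function name and the toy labels are illustrative and are not taken from the paper.

```python
from typing import Hashable, Sequence


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 over every label seen in y_true or y_pred."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred), key=str)
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores) if f1_scores else 0.0


# Toy, imbalanced binary labels (hypothetical, for illustration only).
y_true = [1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 0]
score = macro_f1(y_true, y_pred)  # class 0 F1 = 0.75, class 1 F1 = 0.5
print(score)
```

Because each class contributes equally, a model that ignores the minority class is penalized more heavily than it would be under plain accuracy.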
Abstract
Electronic Health Records (EHRs) provide rich longitudinal clinical evidence that is central to medical decision-making, motivating the use of retrieval-augmented generation (RAG) to ground large language model (LLM) predictions. However, long-horizon EHRs often exceed LLM context limits, and existing approaches commonly rely on truncation or vanilla retrieval strategies that discard clinically relevant events and temporal dependencies. To address these challenges, we propose EHR-RAG, a retrieval-augmented framework designed for accurate interpretation of long-horizon structured EHR data. EHR-RAG introduces three components tailored to longitudinal clinical prediction tasks: Event- and Time-Aware Hybrid EHR Retrieval to preserve clinical structure and temporal dynamics, Adaptive Iterative Retrieval to progressively refine queries in order to expand broad evidence coverage, and Dual-Path…
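The abstract names an event- and time-aware hybrid retrieval component but does not give its scoring details here. As a rough illustration of the general idea only, the toy sketch below ranks structured events by a mix of lexical overlap, event-type match, and recency, then returns the selected events in chronological order so temporal structure is kept. The `Event` fields, the weights, the half-life, and the sample records are all assumptions made for this example, not the authors' method.

```python
import math
from dataclasses import dataclass
from datetime import datetime
from typing import List, Set


@dataclass(frozen=True)
class Event:
    time: datetime
    event_type: str  # hypothetical categories, e.g. "lab", "medication", "diagnosis"
    text: str


def _tokens(s: str) -> Set[str]:
    return set(s.lower().split())


def score(event: Event, query: str, query_types: Set[str],
          now: datetime, half_life_days: float = 365.0) -> float:
    """Assumed hybrid score: Jaccard overlap + type bonus + exponential recency."""
    q, e = _tokens(query), _tokens(event.text)
    lexical = len(q & e) / len(q | e) if (q | e) else 0.0
    type_bonus = 1.0 if event.event_type in query_types else 0.0
    age_days = max((now - event.time).total_seconds() / 86400.0, 0.0)
    recency = math.pow(0.5, age_days / half_life_days)
    return lexical + 0.5 * type_bonus + 0.5 * recency  # weights are arbitrary


def retrieve(events: List[Event], query: str, query_types: Set[str],
             now: datetime, k: int = 3) -> List[Event]:
    """Pick the top-k events by score, then restore chronological order."""
    ranked = sorted(events, key=lambda ev: score(ev, query, query_types, now),
                    reverse=True)[:k]
    return sorted(ranked, key=lambda ev: ev.time)


# Hypothetical patient timeline.
events = [
    Event(datetime(2015, 1, 10), "diagnosis", "type 2 diabetes mellitus"),
    Event(datetime(2019, 6, 1), "lab", "hemoglobin a1c 8.2 percent"),
    Event(datetime(2023, 11, 20), "lab", "hemoglobin a1c 9.1 percent"),
    Event(datetime(2023, 12, 1), "medication", "ibuprofen 400 mg"),
]
hits = retrieve(events, "hemoglobin a1c", {"lab"}, now=datetime(2024, 1, 1), k=2)
for ev in hits:
    print(ev.time.date(), ev.event_type, ev.text)
```

Returning the hits in time order rather than score order is one simple way to keep a trend (here, a rising lab value) readable to a downstream model; a real system would likely use learned embeddings rather than token overlap.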
Taxonomy
Topics: Machine Learning in Healthcare · Topic Modeling · Artificial Intelligence in Healthcare and Education
