Context Clues: Evaluating Long Context Models for Clinical Prediction Tasks on EHRs

Michael Wornow; Suhana Bedi; Miguel Angel Fuentes Hernandez; Ethan Steinberg; Jason Alan Fries; Christopher Re; Sanmi Koyejo; Nigam H. Shah

arXiv:2412.16178 · cs.LG · March 20, 2025 · 2 cites

TL;DR

This paper systematically evaluates the impact of long context models, specifically Mamba-based architectures, on clinical prediction tasks using EHR data, demonstrating improved performance and robustness over shorter context models.

Contribution

It is the first to analyze the effect of extended context lengths on EHR modeling and introduces a comprehensive evaluation of model robustness to EHR-specific properties.

Findings

01. Longer context models improve predictive performance on EHR tasks.

02. The Mamba-based model surpasses the prior state-of-the-art on 9 of 14 tasks in the EHRSHOT prediction benchmark.

03. Longer context models are more robust to EHR-specific data properties.

Abstract

Foundation Models (FMs) trained on Electronic Health Records (EHRs) have achieved state-of-the-art results on numerous clinical prediction tasks. However, most existing EHR FMs have context windows of <1k tokens. This prevents them from modeling full patient EHRs which can exceed 10k's of events. Recent advancements in subquadratic long-context architectures (e.g., Mamba) offer a promising solution. However, their application to EHR data has not been well-studied. We address this gap by presenting the first systematic evaluation of the effect of context length on modeling EHR data. We find that longer context models improve predictive performance -- our Mamba-based model surpasses the prior state-of-the-art on 9/14 tasks on the EHRSHOT prediction benchmark. For clinical applications, however, model performance alone is insufficient -- robustness to the unique properties of EHR is…
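
To make the context-window limitation concrete, the following is a minimal sketch of how much of a long patient timeline survives truncation at different context lengths. It is an illustration only, not the paper's pipeline: the helper names, the one-event-per-token assumption, the keep-most-recent truncation policy, the 20,000-event timeline, and the context lengths in the loop are all hypothetical choices.

```python
def truncate_to_context(events, context_length):
    """Keep only the most recent `context_length` events.

    Assumption: one event maps to one token and the model keeps the
    end of the timeline (a hypothetical policy, not taken from the paper).
    """
    if context_length <= 0:
        return []
    return events[-context_length:]


def fraction_retained(num_events, context_length):
    """Share of a patient's events that fit inside the context window."""
    if num_events == 0:
        return 1.0
    return min(num_events, context_length) / num_events


# Hypothetical patient with 20,000 events ("10k's of events" in the abstract).
timeline = [f"event_{i}" for i in range(20_000)]

# Illustrative context lengths only; not the paper's experimental grid.
for ctx in (512, 1024, 4096, 16384):
    kept = truncate_to_context(timeline, ctx)
    share = fraction_retained(len(timeline), ctx)
    print(f"context={ctx:>6}  kept={len(kept):>6}  retained={share:.1%}")
```

Under these assumptions, a window of under 1k tokens sees only a small slice of such a record, which is the gap that longer-context architectures aim to close.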

Taxonomy

Topics: Machine Learning in Healthcare · Radiomics and Machine Learning in Medical Imaging