One Loss to Rule Them All: Marked Time-to-Event for Structured EHR Foundation Models
Zilin Jing, Vincent Jeanselme, Yuta Kobayashi, Simon A. Lee, Chao Pang, Aparajita Kashyap, Yanwei Li, Xinzhuo Jiang, Shalmali Joshi

TL;DR
This paper introduces ORA, a novel pretraining objective for EHR foundation models that jointly models event timing and measurements, leading to improved generalization and downstream task performance.
Contribution
The paper proposes a new marked time-to-event pretraining objective that better captures EHR data structure compared to traditional next-token prediction.
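To make the contrast with next-token prediction concrete, here is a minimal sketch of what a marked time-to-event loss can look like for a single event. It is an illustration under stated assumptions, not the paper's formulation: the exponential waiting-time model, the Gaussian measurement model, and the names `marked_tte_nll` and `next_token_nll` are all hypothetical choices made for this example.

```python
import math

def marked_tte_nll(rate, type_probs, value_mean, value_std,
                   dt, event_type, value=None):
    """Negative log-likelihood of one event under a toy marked time-to-event model.

    Assumptions (illustrative only): exponential waiting time with the given
    rate, categorical event type, Gaussian measurement value.
    """
    # Timing term: exponential log-density, log f(dt) = log(rate) - rate * dt
    nll_time = -(math.log(rate) - rate * dt)
    # Mark term: log-probability of the observed event type
    nll_type = -math.log(type_probs[event_type])
    # Measurement term: Gaussian log-density, only when the event has a value
    nll_value = 0.0
    if value is not None:
        z = (value - value_mean) / value_std
        nll_value = 0.5 * z * z + math.log(value_std) + 0.5 * math.log(2 * math.pi)
    return nll_time + nll_type + nll_value

def next_token_nll(type_probs, event_type):
    """Plain next-token loss: scores the event type only, ignoring time and value."""
    return -math.log(type_probs[event_type])

# Hypothetical event: type 1 occurs 2.0 time units later with measurement 7.2
probs = [0.2, 0.7, 0.1]
joint = marked_tte_nll(rate=0.5, type_probs=probs, value_mean=7.0,
                       value_std=0.5, dt=2.0, event_type=1, value=7.2)
token_only = next_token_nll(probs, event_type=1)
print(f"marked time-to-event NLL: {joint:.4f}, next-token NLL: {token_only:.4f}")
```

In this sketch the next-token loss is exactly the mark term of the joint loss, so the timing and measurement terms are the extra signal a marked time-to-event objective would train on.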
Findings
Improved performance on regression and time-to-event tasks
More generalizable representations across datasets and models
Outperforms existing pretraining methods in EHR modeling
Abstract
Clinical events captured in Electronic Health Records (EHR) are irregularly sampled and may consist of a mixture of discrete events and numerical measurements, such as laboratory values or treatment dosages. The sequential nature of EHR, analogous to natural language, has motivated the use of next-token prediction to train prior EHR Foundation Models (FMs) over events. However, this training fails to capture the full structure of EHR. We propose ORA, a marked time-to-event pretraining objective that jointly models event timing and associated measurements. Across multiple datasets, downstream tasks, and model architectures, this objective consistently yields more generalizable representations than next-token prediction and pretraining losses that ignore continuous measurements. Importantly, the proposed objective yields improvements beyond traditional classification evaluation, including…
Taxonomy
Topics: Machine Learning in Healthcare · Electronic Health Records Systems · Artificial Intelligence in Healthcare and Education
