CEHR-XGPT: A Scalable Multi-Task Foundation Model for Electronic Health Records
Chao Pang, Jiheum Park, Xinzhuo Jiang, Nishanth Parameshwar Pavinkurve, Krishna S. Kalluri, Shalmali Joshi, Noémie Elhadad, Karthik Natarajan

TL;DR
CEHR-XGPT is a versatile foundation model for Electronic Health Records that unifies feature representation, zero-shot prediction, and synthetic data generation, enabling broad clinical applications with strong performance and generalizability.
Contribution
The paper introduces CEHR-XGPT, a novel multi-task foundation model for EHRs that incorporates a time-token-based framework for temporal reasoning and supports multiple clinical tasks within a single architecture.
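To make the time-token idea concrete, here is a minimal sketch of how gaps between visits could be turned into tokens and interleaved with clinical codes. It is an illustration under assumptions, not the paper's tokenizer: the bucket boundaries, the token names (`D3`, `W2`, `M1`, `LT`, `[VS]`, `[VE]`), the helper functions, and the placeholder concept codes are all hypothetical.

```python
from datetime import date


def interval_token(days: int) -> str:
    """Map a gap in days to a coarse time token (hypothetical buckets)."""
    if days < 0:
        raise ValueError("visits must be in chronological order")
    if days < 7:
        return f"D{days}"        # gap measured in days
    if days < 30:
        return f"W{days // 7}"   # gap measured in whole weeks
    if days < 365:
        return f"M{days // 30}"  # gap measured in approximate months
    return "LT"                  # long-term gap of a year or more


def build_sequence(visits):
    """Flatten (date, codes) visits into one token sequence.

    A time token is inserted between consecutive visits so that a
    sequence model sees elapsed time as part of its input.
    """
    tokens = []
    previous_date = None
    for visit_date, codes in visits:
        if previous_date is not None:
            gap_days = (visit_date - previous_date).days
            tokens.append(interval_token(gap_days))
        tokens.append("[VS]")    # hypothetical visit-start marker
        tokens.extend(codes)
        tokens.append("[VE]")    # hypothetical visit-end marker
        previous_date = visit_date
    return tokens


# Placeholder concept codes; real EHR data would use vocabulary concept IDs.
visits = [
    (date(2020, 1, 1), ["C1", "C2"]),
    (date(2020, 1, 4), ["C3"]),
    (date(2020, 3, 1), ["C4"]),
]
sequence = build_sequence(visits)
print(sequence)
```

Because the gaps are ordinary tokens in the sequence, the same next-token objective that predicts clinical codes can in principle also predict elapsed time, which is one plausible reading of how a single architecture could serve prediction and synthetic data generation.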
Findings
Strong performance across feature representation, prediction, and data generation tasks.
Effective generalization to external datasets through vocabulary expansion and fine-tuning.
Enables rapid development of clinical models without task-specific retraining.
Abstract
Electronic Health Records (EHRs) provide a rich, longitudinal view of patient health and hold significant potential for advancing clinical decision support, risk prediction, and data-driven healthcare research. However, most artificial intelligence (AI) models for EHRs are designed for narrow, single-purpose tasks, limiting their generalizability and utility in real-world settings. Here, we present CEHR-XGPT, a general-purpose foundation model for EHR data that unifies three essential capabilities - feature representation, zero-shot prediction, and synthetic data generation - within a single architecture. To support temporal reasoning over clinical sequences, CEHR-XGPT incorporates a novel time-token-based learning framework that explicitly encodes patients' dynamic timelines into the model structure. CEHR-XGPT demonstrates strong performance across all three tasks and generalizes…
Taxonomy
Topics: Machine Learning in Healthcare
