Learning temporal embeddings from electronic health records of chronic kidney disease patients
Aditya Kumar, Mario A. Cypko, Oliver Amft

TL;DR
This study demonstrates that temporal embedding models trained on electronic health records can produce meaningful, task-agnostic representations that improve clinical predictions, with architectural choices significantly influencing embedding quality.
Contribution
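The paper compares three recurrent architectures (a vanilla LSTM, an attention-augmented LSTM, and T-LSTM) for learning embeddings from EHR data, showing that T-LSTM produces more structured embeddings and that learning embeddings as an intermediate step outperforms end-to-end prediction.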
Findings
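T-LSTM yields a lower Davies-Bouldin Index (lower values indicate more compact, better-separated clusters) and higher CKD stage accuracy than the other architectures compared.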
Embedding models outperform end-to-end predictors in mortality prediction.
Learning embeddings as an intermediate step improves predictive accuracy.
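The Davies-Bouldin Index cited above can be illustrated with a minimal NumPy sketch. It follows the standard definition (mean, over clusters, of the worst-case ratio of within-cluster scatter to between-centroid distance) and is not the authors' evaluation code. The function name `davies_bouldin_index` and the assumption that clusters are given by a label per embedding (for example, CKD stage) are illustrative only.

```python
import numpy as np


def davies_bouldin_index(embeddings, labels):
    """Davies-Bouldin Index for labelled embeddings (lower is better).

    Hypothetical helper for illustration: `embeddings` is an (n, d) array
    and `labels` assigns each row to a cluster, e.g. a CKD stage.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    # Centroid of each cluster.
    centroids = np.stack(
        [embeddings[labels == c].mean(axis=0) for c in clusters]
    )
    # Within-cluster scatter: mean Euclidean distance to the centroid.
    scatter = np.array([
        np.linalg.norm(embeddings[labels == c] - centroids[k], axis=1).mean()
        for k, c in enumerate(clusters)
    ])
    # Pairwise centroid distances; the diagonal is set to infinity so that
    # a cluster is never compared with itself (its ratio becomes 0).
    separation = np.linalg.norm(
        centroids[:, None, :] - centroids[None, :, :], axis=2
    )
    np.fill_diagonal(separation, np.inf)

    # For each cluster take the worst (largest) similarity ratio, then average.
    ratios = (scatter[:, None] + scatter[None, :]) / separation
    return ratios.max(axis=1).mean()


# Toy example: two clusters with scatter 1 each and centroids 10 apart.
points = [[0, 0], [0, 2], [10, 0], [10, 2]]
stages = [0, 0, 1, 1]
dbi = davies_bouldin_index(points, stages)  # (1 + 1) / 10 = 0.2
```

Under this definition, embeddings whose clusters are tight and far apart score close to zero, which is why a lower value is read as a more structured embedding space.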
Abstract
We investigate whether temporal embedding models trained on longitudinal electronic health records can learn clinically meaningful representations without compromising predictive performance, and how architectural choices affect embedding quality. Model-guided medicine requires representations that capture disease dynamics while remaining transparent and task-agnostic, whereas most clinical prediction models are optimised for a single task. Representation learning facilitates learning embeddings that generalise across downstream tasks, and recurrent architectures are well suited for modelling temporal structure in observational clinical data. Using the MIMIC-IV dataset, we study patients with chronic kidney disease (CKD) and compare three recurrent architectures: a vanilla LSTM, an attention-augmented LSTM, and a time-aware LSTM (T-LSTM). All models are trained both as embedding models…
