Can we generate portable representations for clinical time series data using LLMs?

Zongliang Ji; Yifei Sun; Andre Amaral; Anna Goldenberg; Rahul G. Krishnan

arXiv:2603.23987·cs.LG·April 21, 2026

TL;DR

This study explores using large language models to create portable, hospital-agnostic patient embeddings from ICU time series data, enhancing model transferability and reducing deployment complexity.

Contribution

The paper demonstrates a simple method to generate portable patient representations from ICU data using LLMs, improving transferability across hospitals with minimal fine-tuning.

Findings

01. The approach is competitive with existing methods in in-distribution settings.

02. Portable embeddings show smaller performance drops when transferred to new hospitals.

03. Structured prompts reduce variance in predictive performance.

Abstract

Deploying clinical ML is slow and brittle: models that work at one hospital often degrade under distribution shift at the next. In this work, we study a simple question: can large language models (LLMs) create portable patient embeddings, i.e., representations of patients that enable a downstream predictor built at one hospital to be used elsewhere with minimal-to-no retraining or fine-tuning? To do so, we map irregular ICU time series onto concise natural language summaries using a frozen LLM, then embed each summary with a frozen text embedding model to obtain a fixed-length vector that can serve as input to a variety of downstream predictors. Across three cohorts (MIMIC-IV, HIRID, PPICU), on multiple clinically grounded forecasting and classification tasks, we find that our approach is simple, easy to use and competitive with in-distribution with grid imputation,…
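The pipeline described in the abstract (irregular time series, then a text summary, then a fixed-length vector) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `serialize`, `stub_summarizer`, `stub_embedder`, and `patient_embedding` are hypothetical names, the prompt wording is invented, and the two stubs stand in for the frozen LLM and the frozen text embedding model, neither of which is called here.

```python
import hashlib
import math
from typing import Callable, Dict, List, Tuple

# Irregular time series: variable name -> list of (hours_since_admission, value).
Series = Dict[str, List[Tuple[float, float]]]


def serialize(series: Series) -> str:
    """Render irregularly sampled measurements as plain text, one variable per line."""
    lines = []
    for name in sorted(series):
        points = sorted(series[name])  # order readings by time
        readings = ", ".join(f"{value:g} at {time:g}h" for time, value in points)
        lines.append(f"{name}: {readings}")
    return "\n".join(lines)


def stub_summarizer(prompt: str) -> str:
    """ASSUMPTION: placeholder for a frozen LLM. It only echoes the serialized data."""
    return prompt.split("DATA:\n", 1)[1]


def stub_embedder(text: str, dim: int = 16) -> List[float]:
    """ASSUMPTION: placeholder for a frozen text embedding model.

    Uses a hashed bag of words, L2-normalized, so the output length is fixed.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def patient_embedding(
    series: Series,
    summarize: Callable[[str], str] = stub_summarizer,
    embed: Callable[[str], List[float]] = stub_embedder,
) -> List[float]:
    """Time series -> text summary -> fixed-length vector."""
    # Hypothetical prompt; the paper's actual prompts are not given in this excerpt.
    prompt = "Summarize this ICU stay concisely.\nDATA:\n" + serialize(series)
    summary = summarize(prompt)
    return embed(summary)


# Two made-up patients with different variables and sampling times.
patient_a = {"heart_rate": [(0.5, 88.0), (3.0, 112.0)], "lactate": [(2.0, 3.1)]}
patient_b = {"heart_rate": [(1.0, 72.0)]}

emb_a = patient_embedding(patient_a)
emb_b = patient_embedding(patient_b)
```

Both patients end up with vectors of the same length even though they were measured on different variables at different times, which is what lets a single downstream predictor consume them. In a real setting the two stubs would be swapped for calls to actual frozen models via the `summarize` and `embed` arguments.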
