TL;DR
This study explores using large language models to create portable, hospital-agnostic patient embeddings from ICU time series data, enhancing model transferability and reducing deployment complexity.
Contribution
The paper demonstrates a simple method to generate portable patient representations from ICU data using LLMs, improving transferability across hospitals with minimal fine-tuning.
Findings
The approach is competitive with existing methods when evaluated in-distribution.
Portable embeddings show smaller performance drops when transferred to new hospitals.
Structured prompts reduce variance in predictive performance.
Abstract
Deploying clinical ML is slow and brittle: models that work at one hospital often degrade under distribution shift at the next. In this work, we study a simple question: can large language models (LLMs) create portable patient embeddings, i.e., representations of patients that enable a downstream predictor built at one hospital to be used elsewhere with minimal-to-no retraining or fine-tuning? To do so, we map irregular ICU time series onto concise natural language summaries using a frozen LLM, then embed each summary with a frozen text embedding model to obtain a fixed-length vector that can serve as input to a variety of downstream predictors. Across three cohorts (MIMIC-IV, HIRID, PPICU), on multiple clinically grounded forecasting and classification tasks, we find that our approach is simple, easy to use and competitive with in-distribution with grid imputation,…
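The pipeline described in the abstract (irregular time series, then a text summary, then a fixed-length vector) can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: `summarize` is a hypothetical template-based stand-in for the frozen LLM, `embed` is a hypothetical hashed bag-of-words stand-in for the frozen text embedding model, and the variable names and values are made up for the example.

```python
import hashlib
import math
from statistics import mean


def summarize(series):
    """Hypothetical stand-in for the frozen LLM.

    Turns irregular (hour, variable, value) observations into a short
    text summary with one clause per variable.
    """
    by_var = {}
    for _hour, name, value in sorted(series):  # chronological order
        by_var.setdefault(name, []).append(value)
    parts = []
    for name in sorted(by_var):
        vals = by_var[name]
        parts.append(
            f"{name}: mean {mean(vals):.1f}, last {vals[-1]:.1f}, n={len(vals)}"
        )
    return "; ".join(parts)


def embed(text, dim=64):
    """Hypothetical stand-in for the frozen text embedding model.

    Hashed bag-of-words, L2-normalised, so every summary maps to a
    vector of the same length regardless of how long the text is.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


# Two made-up patients with different numbers of irregular observations.
patient_a = [(0.0, "heart_rate", 88.0), (1.5, "heart_rate", 95.0), (2.0, "lactate", 2.1)]
patient_b = [(0.5, "heart_rate", 70.0)]

emb_a = embed(summarize(patient_a))
emb_b = embed(summarize(patient_b))
```

Both patients end up with vectors of the same dimension even though their records differ in length and in which variables were measured, which is the property that lets a single downstream predictor consume them. In a real setting the two stand-ins would be replaced by an actual LLM and text embedding model, both kept frozen.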