Multimodal Pretraining of Medical Time Series and Notes
Ryan King, Tianbao Yang, Bobak Mortazavi

TL;DR
This paper introduces a self-supervised pretraining method for ICU data that aligns clinical measurements with notes, significantly improving performance on mortality prediction and phenotyping with limited labeled data.
Contribution
It presents a novel self-supervised pretraining approach combining contrastive and masked token tasks for multimodal ICU data analysis.
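As a rough illustration of how a contrastive task and a masked token task might be combined into one pretraining objective, the sketch below sums a symmetric InfoNCE-style alignment loss over paired measurement and note embeddings with a cross-entropy loss over masked token positions. This summary does not give the paper's actual loss formulation, encoders, or weighting, so the function names, the temperature of 0.1, and the `mlm_weight` parameter are all assumptions made for illustration.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax over a list of floats."""
    m = max(row)
    lse = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - lse for v in row]


def normalize(vec):
    """Scale a vector to unit L2 norm so dot products are cosine similarities."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def contrastive_loss(ts_embs, note_embs, temperature=0.1):
    """Symmetric InfoNCE-style loss (an assumed formulation).

    Row i of ts_embs and row i of note_embs come from the same ICU stay and
    form the positive pair; every other row in the batch is a negative.
    """
    ts = [normalize(v) for v in ts_embs]
    notes = [normalize(v) for v in note_embs]
    n = len(ts)
    logits = [
        [sum(a * b for a, b in zip(t, m)) / temperature for m in notes]
        for t in ts
    ]
    # Measurements -> notes direction: softmax over each row.
    ts_to_note = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Notes -> measurements direction: softmax over each column.
    columns = [list(col) for col in zip(*logits)]
    note_to_ts = -sum(log_softmax(columns[i])[i] for i in range(n)) / n
    return 0.5 * (ts_to_note + note_to_ts)


def masked_token_loss(token_logits, token_ids, masked):
    """Mean cross-entropy over masked positions only."""
    losses = [
        -log_softmax(logits)[tok]
        for logits, tok, is_masked in zip(token_logits, token_ids, masked)
        if is_masked
    ]
    return sum(losses) / len(losses) if losses else 0.0


def pretraining_loss(ts_embs, note_embs, token_logits, token_ids, masked,
                     mlm_weight=1.0):
    """Weighted sum of the two tasks; mlm_weight is a hypothetical knob."""
    return (contrastive_loss(ts_embs, note_embs)
            + mlm_weight * masked_token_loss(token_logits, token_ids, masked))
```

In this sketch the contrastive term falls when each stay's measurement embedding is closest to its own note embedding, while the masked term only counts positions that were hidden from the model. A real implementation would compute both terms from encoder outputs in an autodiff framework rather than from plain lists.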
Findings
Outperforms baselines in low-label settings
Increases AUC-ROC for mortality prediction by 0.17 with 1% labels
Improves phenotyping AUC-PR by 0.1
Abstract
Within the intensive care unit (ICU), a wealth of patient data, including clinical measurements and clinical notes, is readily available. This data is a valuable resource for comprehending patient health and informing medical decisions, but it also contains many challenges in analysis. Deep learning models show promise in extracting meaningful patterns, but they require extensive labeled data, a challenge in critical care. To address this, we propose a novel approach employing self-supervised pretraining, focusing on the alignment of clinical measurements and notes. Our approach combines contrastive and masked token prediction tasks during pretraining. Semi-supervised experiments on the MIMIC-III dataset demonstrate the effectiveness of our self-supervised pretraining. In downstream tasks, including in-hospital mortality prediction and phenotyping, our pretrained model outperforms…
Taxonomy
Topics: Machine Learning in Healthcare · Time Series Analysis and Forecasting
