PiCME: Pipeline for Contrastive Modality Evaluation and Encoding in the MIMIC Dataset

Michal Golovanevsky; Pranav Mahableshwarkar; Carsten Eickhoff; Ritambhara Singh

arXiv:2507.03165 · cs.LG · July 8, 2025

TL;DR

PiCME introduces a systematic contrastive learning pipeline for multimodal clinical data in MIMIC, improving predictive performance and interpretability across various modality combinations, with a focus on fairness and scalability.

Contribution

This work is the first to scale contrastive learning across all modality combinations in MIMIC, providing insights into modality importance, training strategies, and equitable clinical prediction.

Findings

01. Contrastive models perform well in three-modality settings.

02. Modality-Gated LSTM improves performance in five-modality scenarios.

03. Contrastive importance scores align with attribution scores and enhance fairness.

Abstract

Multimodal deep learning holds promise for improving clinical prediction by integrating diverse patient data, including text, imaging, time-series, and structured demographics. Contrastive learning facilitates this integration by producing a unified representation that can be reused across tasks, reducing the need for separate models or encoders. Although contrastive learning has seen success in vision-language domains, its use in clinical settings remains largely limited to image and text pairs. We propose the Pipeline for Contrastive Modality Evaluation and Encoding (PiCME), which systematically assesses five clinical data types from MIMIC: discharge summaries, radiology reports, chest X-rays, demographics, and time-series. We pre-train contrastive models on all 26 combinations of two to five modalities and evaluate their utility on in-hospital mortality and phenotype prediction. To…
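
The count of 26 follows from choosing every subset of two or more of the five modalities: 10 pairs, 10 triples, 5 quadruples, and 1 full set. The sketch below checks that arithmetic. It is an illustration only, not code from the PiCME pipeline; the identifiers in `MODALITIES` and the helper `modality_combinations` are hypothetical names chosen here to mirror the data types listed in the abstract.

```python
from itertools import combinations

# The five MIMIC data types named in the abstract.
# The snake_case identifiers are illustrative, not taken from the paper's code.
MODALITIES = [
    "discharge_summaries",
    "radiology_reports",
    "chest_xrays",
    "demographics",
    "time_series",
]


def modality_combinations(modalities, min_size=2):
    """Return every subset of `modalities` with at least `min_size` members."""
    return [
        combo
        for size in range(min_size, len(modalities) + 1)
        for combo in combinations(modalities, size)
    ]


combos = modality_combinations(MODALITIES)

# Tally how many combinations exist at each subset size.
counts_by_size = {}
for combo in combos:
    counts_by_size[len(combo)] = counts_by_size.get(len(combo), 0) + 1

print(len(combos))      # 26
print(counts_by_size)   # {2: 10, 3: 10, 4: 5, 5: 1}
```

Each tuple in `combos` would correspond to one contrastive pre-training configuration under this reading of the abstract.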
