Benchmarking CXR Foundation Models With Publicly Available MIMIC-CXR and NIH-CXR14 Datasets

Jiho Shin; Dominic Marshall; Matthieu Komorowski

arXiv:2512.06014·cs.CV·December 9, 2025

Open Access

TL;DR

This paper benchmarks two large-scale chest X-ray foundation models on public datasets, comparing their performance, stability, and disease-specific embedding structures to establish reproducible evaluation standards.

Contribution

It provides a standardized benchmarking framework for CXR foundation models using public datasets, highlighting differences in performance and stability.

Findings

1. MedImageInsight achieved slightly higher performance across most tasks.
2. CXR-Foundation showed strong cross-dataset stability.
3. Embeddings revealed disease-specific structures.

Abstract

Recent foundation models have demonstrated strong performance in medical image representation learning, yet their comparative behaviour across datasets remains underexplored. This work benchmarks two large-scale chest X-ray (CXR) embedding models, CXR-Foundation (ELIXR v2.0) and MedImageInsight, on the public MIMIC-CXR and NIH ChestX-ray14 datasets. Each model was evaluated using a unified preprocessing pipeline and fixed downstream classifiers to ensure reproducible comparison. We extracted embeddings directly from pre-trained encoders, trained lightweight LightGBM classifiers on multiple disease labels, and reported mean AUROC and F1-score with 95% confidence intervals. MedImageInsight achieved slightly higher performance across most tasks, while CXR-Foundation exhibited strong cross-dataset stability. Unsupervised clustering of MedImageInsight embeddings further revealed a coherent…
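
The abstract reports AUROC with 95% confidence intervals but, in the portion shown here, does not say how the intervals were computed. As a minimal sketch, assuming a percentile bootstrap over test cases (an assumption, not the authors' stated method), the evaluation step for one disease label might look like the following. The function names `auroc` and `bootstrap_auroc_ci` are hypothetical, and `scores` stands in for the predicted probabilities a downstream classifier such as LightGBM would produce from the embeddings.

```python
import random


def auroc(labels, scores):
    """Rank-based AUROC (Mann-Whitney U); tied scores share an average rank."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auroc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Point AUROC plus a percentile-bootstrap (1 - alpha) interval.

    Assumption: cases are resampled with replacement; resamples that
    contain only one class are redrawn because AUROC is undefined there.
    """
    point = auroc(labels, scores)  # also validates that both classes exist
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo_idx = max(0, int((alpha / 2) * n_boot))
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot) - 1)
    return point, stats[lo_idx], stats[hi_idx]


if __name__ == "__main__":
    # Toy labels and scores for illustration only; not data from the paper.
    y = [0, 0, 1, 1, 0, 1, 0, 1]
    p = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.55, 0.6]
    point, lo, hi = bootstrap_auroc_ci(y, p, n_boot=500)
    print(f"AUROC {point:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

Repeating this per disease label and per dataset, then averaging the point estimates, would give a mean AUROC in the sense the abstract describes. A fixed seed keeps the interval reproducible across runs.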

Taxonomy

Topics: COVID-19 diagnosis using AI · AI in cancer detection · Radiomics and Machine Learning in Medical Imaging