Improving Performance, Robustness, and Fairness of Radiographic AI Models with Finely-Controllable Synthetic Data

Stefania L. Moroianu; Christian Bluethgen; Pierre Chambon; Mehdi Cherti; Jean-Benoit Delbrouck; Magdalini Paschali; Brandon Price; Judy Gichoya; Jenia Jitsev; Curtis P. Langlotz; Akshay S. Chaudhari

arXiv:2508.16783 · cs.CV · August 26, 2025

TL;DR

This paper introduces RoentGen-v2, a controllable synthetic data generator for chest radiographs. Pretraining downstream deep learning models on its synthetic images improves their performance, robustness, and fairness across diverse patient demographics.

Contribution

We develop RoentGen-v2, the first demographic-conditioned diffusion model for clinically plausible chest radiograph synthesis, and demonstrate its effectiveness in enhancing model performance and fairness.

Findings

01

Synthetic pretraining improves classification accuracy by 6.5%.

02

Synthetic pretraining reduces the fairness gap by 19.3%.

03

Synthetic data enhances model generalization across institutions.
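To make the fairness finding concrete, here is a minimal sketch of how a relative reduction in a fairness gap could be computed. The gap definition (best subgroup score minus worst subgroup score), the function names, and all numbers are assumptions for illustration only; this summary does not state the paper's exact metric, and none of the values below come from the paper.

```python
def fairness_gap(scores):
    """Gap between the best and worst subgroup score (assumed definition)."""
    return max(scores.values()) - min(scores.values())


def relative_reduction(before, after):
    """Fraction by which a positive quantity shrank, e.g. 0.2 means 20%."""
    if before <= 0:
        raise ValueError("baseline gap must be positive")
    return (before - after) / before


if __name__ == "__main__":
    # Made-up per-subgroup accuracies, NOT results from the paper.
    baseline = {"group_a": 0.80, "group_b": 0.70}
    pretrained = {"group_a": 0.82, "group_b": 0.74}

    gap_before = fairness_gap(baseline)
    gap_after = fairness_gap(pretrained)
    print(f"gap before: {gap_before:.3f}, gap after: {gap_after:.3f}")
    print(f"relative reduction: {relative_reduction(gap_before, gap_after):.1%}")
```

Under this reading, a figure such as 19.3% would be a relative change in the gap, not an absolute change in accuracy points.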

Abstract

Achieving robust performance and fairness across diverse patient populations remains a challenge in developing clinically deployable deep learning models for diagnostic imaging. Synthetic data generation has emerged as a promising strategy to address limitations in dataset scale and diversity. We introduce RoentGen-v2, a text-to-image diffusion model for chest radiographs that enables fine-grained control over both radiographic findings and patient demographic attributes, including sex, age, and race/ethnicity. RoentGen-v2 is the first model to generate clinically plausible images with demographic conditioning, facilitating the creation of a large, demographically balanced synthetic dataset comprising over 565,000 images. We use this large synthetic dataset to evaluate optimal training pipelines for downstream disease classification models. In contrast to prior work that combines real…
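The idea of a demographically balanced synthetic dataset can be sketched as enumerating every combination of conditioning attributes and requesting the same number of images per combination. The sketch below only builds text prompts; it does not call RoentGen-v2. The attribute values, the prompt template, and the `build_balanced_prompts` helper are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product

# Hypothetical attribute values, chosen only for illustration.
SEXES = ["male", "female"]
AGE_GROUPS = ["18-40", "40-60", "60-80", "80+"]
RACES = ["Asian", "Black", "Hispanic", "White"]
FINDINGS = ["no finding", "cardiomegaly", "pleural effusion"]


def build_balanced_prompts(per_cell):
    """Return prompt records with `per_cell` entries for every attribute combination."""
    records = []
    for sex, age, race, finding in product(SEXES, AGE_GROUPS, RACES, FINDINGS):
        # Assumed prompt template; the real model may expect a different format.
        text = f"{race} {sex} patient, age group {age}. Finding: {finding}."
        for _ in range(per_cell):
            records.append(
                {"sex": sex, "age_group": age, "race": race,
                 "finding": finding, "prompt": text}
            )
    return records


if __name__ == "__main__":
    prompts = build_balanced_prompts(per_cell=2)
    # Every demographic cell should receive the same number of prompts.
    cell_counts = Counter((p["sex"], p["age_group"], p["race"]) for p in prompts)
    print(len(prompts), "prompts,", len(cell_counts), "demographic cells")
    print(prompts[0]["prompt"])
```

Each record could then be passed to a text-to-image generator, which keeps the resulting dataset balanced by construction rather than by post hoc resampling.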

Code & Models

Models

stanfordmimi/RoentGen-v2
model · 217 downloads · 4 likes

Datasets

stanfordmimi/RoentGen-v2-synthetic-dataset
dataset · 44 downloads
