CheXGenBench: A Unified Benchmark For Fidelity, Privacy and Utility of Synthetic Chest Radiographs

Raman Dutt; Pedro Sanchez; Yongchen Yao; Steven McDonagh; Sotirios A. Tsaftaris; Timothy Hospedales

arXiv:2505.10496·cs.CV·March 30, 2026

2 repositories · 1 model · 1 dataset

TL;DR

CheXGenBench is a comprehensive evaluation framework for synthetic chest radiograph generation, assessing fidelity, privacy, and clinical utility across multiple models, and providing a new benchmark dataset.

Contribution

It introduces a standardized, multifaceted benchmark for evaluating medical image synthesis models, addressing previous inconsistencies and enabling fair comparisons.

Findings

01. Existing evaluation protocols are inefficient and inconsistent.

02. The framework reveals critical gaps in current generative model assessments.

03. The SynthCheX-75K dataset supports further research in medical image synthesis.

Abstract

We introduce CheXGenBench, a rigorous and multifaceted evaluation framework for synthetic chest radiograph generation that simultaneously assesses fidelity, privacy risks, and clinical utility across state-of-the-art text-to-image generative models. Despite rapid advancements in generative AI for real-world imagery, medical domain evaluations have been hindered by methodological inconsistencies, outdated architectural comparisons, and disconnected assessment criteria that rarely address the practical clinical value of synthetic samples. CheXGenBench overcomes these limitations through standardised data partitioning and a unified evaluation protocol comprising over 20 quantitative metrics that systematically analyse generation quality, potential privacy vulnerabilities, and downstream clinical applicability across 11 leading text-to-image architectures. Our results reveal critical…

Code & Models

Models

raman07/CheXGenBench-Models-Sana-e20 (Hugging Face model · 11 downloads)

Datasets

raman07/SynthCheX-75K-v2 (dataset · 930 downloads)
