Fairness-Aware Data Augmentation for Cardiac MRI using Text-Conditioned Diffusion Models
Grzegorz Skorupko, Richard Osuala, Zuzanna Szafranowska, Kaisar Kushibar, Vien Ngoc Dang, Nay Aung, Steffen E Petersen, Karim Lekadir, Polyxeni Gkontra

TL;DR
This paper introduces a method using text-conditioned diffusion models to generate synthetic cardiac MRI data, aiming to balance datasets and improve fairness in disease classification, especially for underrepresented groups.
Contribution
The work presents a novel application of ControlNet diffusion models conditioned on patient metadata to generate realistic synthetic cardiac MRI images for dataset balancing.
Findings
Synthetic data effectively reduces class imbalances in the training set.
Training with the synthetic data improves classifier fairness for underrepresented groups.
The method is feasible on consumer-grade GPU hardware.
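As a rough illustration of what balancing a dataset with synthetic samples involves, the sketch below counts how many synthetic images each subgroup would need to match the largest subgroup. The function name, the subgroup labels, and the "equalise to the largest group" rule are assumptions made for this example; this summary does not specify the balancing strategy used in the paper.

```python
from collections import Counter


def synthetic_samples_needed(group_labels):
    """Return, per subgroup, how many synthetic samples would bring it
    up to the size of the largest subgroup (hypothetical balancing rule)."""
    counts = Counter(group_labels)
    target = max(counts.values())
    return {group: target - n for group, n in counts.items()}


# Hypothetical subgroup labels combining sex and health condition.
labels = (
    ["female_healthy"] * 5
    + ["male_healthy"] * 8
    + ["female_heart_failure"] * 2
)
deficits = synthetic_samples_needed(labels)
```

Here `deficits` maps each subgroup to the number of images a generator would be asked to produce for it; the largest subgroup gets zero.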
Abstract
While deep learning holds great promise for disease diagnosis and prognosis in cardiac magnetic resonance imaging, its progress is often constrained by highly imbalanced and biased training datasets. To address this issue, we propose a method to alleviate imbalances inherent in datasets through the generation of synthetic data based on sensitive attributes such as sex, age, body mass index (BMI), and health condition. We adopt ControlNet based on a denoising diffusion probabilistic model to condition on text assembled from patient metadata and cardiac geometry derived from segmentation masks. We assess our method using a large-cohort study from the UK Biobank by evaluating the realism of the generated images using established quantitative metrics. Furthermore, we conduct a downstream classification task aimed at debiasing a classifier by rectifying imbalances within underrepresented…
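The abstract mentions conditioning on text assembled from patient metadata. A minimal sketch of that step might look like the following. The `PatientMetadata` class, the `build_prompt` function, and the prompt template are all hypothetical; the exact wording of the prompts used in the paper is not given here.

```python
from dataclasses import dataclass


@dataclass
class PatientMetadata:
    # Attributes named in the abstract; field names are assumptions.
    sex: str
    age: int
    bmi: float
    condition: str


def build_prompt(meta: PatientMetadata) -> str:
    """Assemble a text prompt from patient metadata (hypothetical template)."""
    return (
        f"cardiac MRI, {meta.sex}, age {meta.age}, "
        f"BMI {meta.bmi:.1f}, {meta.condition}"
    )


example = build_prompt(PatientMetadata("female", 63, 27.34, "heart failure"))
```

A string like `example` would then be passed to the text encoder of the diffusion model, alongside the segmentation-mask input that supplies the cardiac geometry.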
Taxonomy
Topics: Radiomics and Machine Learning in Medical Imaging · Medical Imaging Techniques and Applications · Machine Learning in Healthcare
Methods: Diffusion
