Bt-GAN: Generating Fair Synthetic Healthdata via Bias-transforming Generative Adversarial Networks
Resmi Ramachandranpillai, Md Fahim Sikder, David Bergström, Fredrik Heintz

TL;DR
Bt-GAN is a novel GAN-based method that generates fair synthetic healthcare data, addressing biases and improving fairness in downstream predictions while maintaining high data quality.
Contribution
The paper introduces Bt-GAN, a GAN framework that incorporates fairness constraints and sub-group density preservation for more equitable synthetic health data generation.
Findings
Achieves state-of-the-art accuracy on MIMIC-III
Significantly improves fairness and reduces bias amplification
Provides explainability analysis supporting its effectiveness
Abstract
Synthetic data generation offers a promising solution to enhance the usefulness of Electronic Healthcare Records (EHR) by generating realistic de-identified data. However, the existing literature primarily focuses on the quality of synthetic health data, neglecting the crucial aspect of fairness in downstream predictions. Consequently, models trained on synthetic EHR have faced criticism for producing biased outcomes in target tasks. These biases can arise from either spurious correlations between features or the failure of models to accurately represent sub-groups. To address these concerns, we present Bias-transforming Generative Adversarial Networks (Bt-GAN), a GAN-based synthetic data generator specifically designed for the healthcare domain. In order to tackle spurious correlations (i), we propose an information-constrained Data Generation Process that enables the generator to…
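The abstract names two sources of bias: unfair outcomes in downstream predictions and poor representation of sub-groups. The sketch below shows one way each could be checked on a synthetic dataset. It is not the paper's evaluation protocol, which this summary does not describe. The demographic parity gap is used here as an assumed, commonly used fairness measure, and all function names and toy data are hypothetical.

```python
from collections import Counter


def positive_rate(preds, groups, group):
    """Share of positive (1) predictions among records belonging to `group`."""
    selected = [p for p, g in zip(preds, groups) if g == group]
    if not selected:
        raise ValueError(f"no records for group {group!r}")
    return sum(selected) / len(selected)


def demographic_parity_gap(preds, groups):
    """Largest difference in positive-prediction rate between any two groups.

    Assumed metric for illustration; 0.0 means all groups receive positive
    predictions at the same rate.
    """
    rates = [positive_rate(preds, groups, g) for g in sorted(set(groups))]
    return max(rates) - min(rates)


def subgroup_shares(groups):
    """Fraction of records that fall in each sub-group."""
    counts = Counter(groups)
    total = len(groups)
    return {g: c / total for g, c in counts.items()}


def max_share_shift(real_groups, synthetic_groups):
    """Largest absolute change in any real sub-group's share after synthesis.

    A large value suggests the generator under- or over-represents a sub-group.
    """
    real = subgroup_shares(real_groups)
    synth = subgroup_shares(synthetic_groups)
    return max(abs(real[g] - synth.get(g, 0.0)) for g in real)


# Toy data (hypothetical): binary predictions and a protected attribute.
preds = [1, 1, 0, 1, 0, 0, 0, 1]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap = demographic_parity_gap(preds, groups)

real_groups = ["a"] * 6 + ["b"] * 4
synthetic_groups = ["a"] * 8 + ["b"] * 2
shift = max_share_shift(real_groups, synthetic_groups)
```

In this toy example, group "a" receives positive predictions three times as often as group "b", and the synthetic sample halves the share of group "b". A fairness-aware generator would aim to keep both numbers small.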
