TL;DR
This paper investigates how GAN-based data augmentation affects bias in skin lesion classification. It shows that GANs can inherit and amplify biases from their training data, which affects the fairness and accuracy of downstream models.
Contribution
The study compares unconditional and conditional GANs in terms of bias inheritance and provides a detailed analysis of bias effects on skin lesion classification.
Findings
GANs inherit and sometimes amplify biases
Synthetic data can reinforce spurious correlations
Manual annotations and datasets are publicly available
Abstract
New medical datasets are now more open to the public, allowing for better and more extensive research. Although prepared with the utmost care, new datasets might still be a source of spurious correlations that affect the learning process. Moreover, data collections are usually not large enough and are often unbalanced. One approach to alleviate the data imbalance is using data augmentation with Generative Adversarial Networks (GANs) to extend the dataset with high-quality images. GANs are usually trained on the same biased datasets as the target data, resulting in more biased instances. This work explored unconditional and conditional GANs to compare their bias inheritance and how the synthetic data influenced the models. We provided extensive manual data annotation of possibly biasing artifacts on the well-known ISIC dataset with skin lesions. In addition, we examined classification…
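One way to make "bias inheritance" concrete is to compare how strongly an annotated artifact co-occurs with a class label in the real data versus the GAN-generated data. The following is a minimal sketch under stated assumptions, not the authors' evaluation procedure: the artifact name `"frame"`, the toy samples, and the helper names `artifact_rate_by_class` and `rate_gap` are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def artifact_rate_by_class(samples, artifact):
    """Return {label: fraction of samples with that label that show `artifact`}.

    `samples` is an iterable of (label, set_of_artifact_names) pairs.
    """
    totals = Counter()
    hits = Counter()
    for label, artifacts in samples:
        totals[label] += 1
        if artifact in artifacts:
            hits[label] += 1
    return {label: hits[label] / totals[label] for label in totals}


def rate_gap(rates, class_a, class_b):
    """Difference in artifact prevalence between two classes.

    A gap far from zero means the artifact is correlated with the label
    and could act as a spurious shortcut for a classifier.
    """
    return rates[class_a] - rates[class_b]


# Hypothetical toy annotations (NOT data from the paper).
real = [
    ("benign", {"frame"}), ("benign", set()),
    ("benign", set()), ("benign", set()),
    ("malignant", {"frame"}), ("malignant", {"frame"}),
    ("malignant", set()), ("malignant", set()),
]
synthetic = [
    ("benign", set()), ("benign", set()),
    ("benign", set()), ("benign", set()),
    ("malignant", {"frame"}), ("malignant", {"frame"}),
    ("malignant", {"frame"}), ("malignant", set()),
]

real_gap = rate_gap(artifact_rate_by_class(real, "frame"), "malignant", "benign")
synth_gap = rate_gap(artifact_rate_by_class(synthetic, "frame"), "malignant", "benign")

# The generator "inherits" the bias if the gap keeps its sign,
# and "amplifies" it if the gap grows in magnitude.
inherited = real_gap * synth_gap > 0
amplified = inherited and abs(synth_gap) > abs(real_gap)
print(f"real gap={real_gap:.2f}, synthetic gap={synth_gap:.2f}, amplified={amplified}")
```

In this toy setup the artifact is twice as common in malignant images as in benign ones in the real data, and the gap widens in the synthetic data. A real analysis would need per-image artifact annotations for both the real and generated images, plus a significance check on the difference.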
