Balanced Face Dataset: Guiding StyleGAN to Generate Labeled Synthetic Face Image Dataset for Underrepresented Group
Kidist Amde Mekonnen

TL;DR
This paper presents a method to generate a balanced synthetic face dataset using StyleGAN, aiming to improve fairness and reduce bias in machine learning models by controlling demographic representation.
Contribution
It introduces a technique to guide StyleGAN in producing labeled, demographically balanced face images for training more equitable machine learning models.
Findings
Generated a balanced synthetic face dataset across demographics.
Demonstrated improved fairness in models trained on the dataset.
Reduced bias in face recognition tasks.
Abstract
It is well understood that, for a machine learning model to generalize effectively to unseen data within a particular problem domain, the training data needs to be of sufficient size and representative of real-world scenarios. Nonetheless, real-world datasets frequently have overrepresented and underrepresented groups. One solution to mitigate bias in machine learning is to leverage a diverse and representative dataset. Training a model on a dataset that covers all demographics is crucial to reducing bias in machine learning. However, collecting and labeling large-scale datasets has been challenging, prompting the use of synthetic data generation and active labeling to decrease the costs of manual labeling. The focus of this study was to generate a robust face image dataset using the StyleGAN model. To achieve a balanced distribution of the dataset among different demographic groups, a…
Taxonomy
Topics: Face recognition and analysis · AI in cancer detection · COVID-19 diagnosis using AI
Methods: Adaptive Instance Normalization · R1 Regularization · Dense Connections · Convolution · Feedforward Network · StyleGAN · Focus
