The Impact of Racial Distribution in Training Data on Face Recognition Bias: A Closer Look
Manideep Kolla, Aravinth Savadamuthu

TL;DR
This paper investigates how the racial composition of training data influences bias in face recognition systems, revealing that uniform racial distribution alone does not eliminate bias and highlighting the importance of face quality and data characteristics.
Contribution
It provides a comprehensive analysis of racial distribution effects, introduces the racial gradation metric, and evaluates the correlation between clustering and bias in face recognition models.
Findings
Uniform racial distribution does not guarantee bias-free recognition.
Face image quality significantly impacts model bias.
Clustering metrics have limited correlation with bias.
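As a rough illustration of what "bias" can mean operationally in this setting, the sketch below computes verification accuracy per demographic group at one shared decision threshold and summarizes the disparity as the standard deviation across groups. The group names, the similarity scores, and the choice of standard deviation as the disparity measure are assumptions made for illustration; they are not the paper's data or necessarily its exact metric.

```python
from statistics import pstdev

def accuracy_at_threshold(pairs, threshold):
    """Fraction of face pairs classified correctly.

    pairs: list of (similarity_score, is_same_identity) tuples.
    A pair is predicted "same identity" when score >= threshold.
    """
    correct = sum((score >= threshold) == same for score, same in pairs)
    return correct / len(pairs)

def per_group_accuracy(pairs_by_group, threshold):
    """Apply one shared threshold to every group's pairs."""
    return {
        group: accuracy_at_threshold(pairs, threshold)
        for group, pairs in pairs_by_group.items()
    }

def disparity(acc_by_group):
    """Population standard deviation of per-group accuracies (0 = no gap)."""
    return pstdev(list(acc_by_group.values()))

# Hypothetical toy scores, not taken from the paper.
pairs_by_group = {
    "group_a": [(0.9, True), (0.8, True), (0.2, False), (0.3, False)],
    "group_b": [(0.9, True), (0.4, True), (0.6, False), (0.1, False)],
}

acc = per_group_accuracy(pairs_by_group, threshold=0.5)
bias = disparity(acc)
print(acc, bias)
```

In this toy case both groups contribute the same number of pairs, yet their accuracies differ because the scores for one group are less well separated, which mirrors the point that balanced counts alone do not imply equal performance.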
Abstract
Face recognition algorithms, when used in the real world, can be very useful, but they can also be dangerous when biased toward certain demographics. So, it is essential to understand how these algorithms are trained and what factors affect their accuracy and fairness to build better ones. In this study, we shed some light on the effect of racial distribution in the training data on the performance of face recognition models. We conduct 16 different experiments with varying racial distributions of faces in the training data. We analyze these trained models using accuracy metrics, clustering metrics, UMAP projections, face quality, and decision thresholds. We show that a uniform distribution of races in the training datasets alone does not guarantee bias-free face recognition algorithms, and that factors like face image quality play a crucial role. We also study the correlation between the…
Taxonomy
Topics: Face recognition and analysis · Face and Expression Recognition · Survey Sampling and Estimation Techniques
