A Deep Dive into Dataset Imbalance and Bias in Face Identification
Valeriia Cherepanova, Steven Reich, Samuel Dooley, Hossein Souri, Micah Goldblum, Tom Goldstein

TL;DR
This paper investigates how different types of data imbalance affect bias in face identification systems. It shows that the picture is more complex than training data imbalance alone: imbalance in the testing data and the demographic proportions of the data also matter.
Contribution
The paper analyzes how different types of imbalance affect bias in face identification and discusses additional factors that influence bias. This addresses a gap left by prior studies of data imbalance, which focused on the face verification setting.
Findings
Different imbalance types impact face identification bias in distinct ways
Testing data imbalance can significantly influence system fairness
Demographic proportions and image counts per identity affect bias levels
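As a rough illustration of how bias in identification can be measured, the sketch below computes rank-1 identification accuracy separately for each demographic group. It is not the authors' evaluation protocol, which this summary does not describe. It assumes that each probe image is matched to its nearest neighbor, by cosine similarity, in a set of enrolled "gallery" embeddings. The function name, the variable names, and the toy 2-D embeddings are all hypothetical.

```python
import numpy as np

def rank1_accuracy_by_group(gallery_emb, gallery_ids, probe_emb, probe_ids, probe_groups):
    """Rank-1 identification accuracy per group (hypothetical helper, not from the paper)."""
    # L2-normalize so that the dot product equals cosine similarity.
    g = gallery_emb / np.linalg.norm(gallery_emb, axis=1, keepdims=True)
    p = probe_emb / np.linalg.norm(probe_emb, axis=1, keepdims=True)
    # For each probe, pick the gallery entry with the highest similarity.
    nearest = np.argmax(p @ g.T, axis=1)
    correct = gallery_ids[nearest] == probe_ids
    # Average correctness within each demographic group of probes.
    return {str(grp): float(correct[probe_groups == grp].mean())
            for grp in np.unique(probe_groups)}

# Toy data (made up): three enrolled identities with 2-D embeddings.
gallery_emb = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
gallery_ids = np.array([0, 1, 2])

# Four probes; the last one belongs to identity 1 but lies closer to identity 0.
probe_emb = np.array([[0.9, 0.1], [0.1, 0.9], [-0.9, 0.2], [0.8, 0.3]])
probe_ids = np.array([0, 1, 2, 1])
probe_groups = np.array(["a", "a", "b", "b"])

acc = rank1_accuracy_by_group(gallery_emb, gallery_ids, probe_emb, probe_ids, probe_groups)
print(acc)
```

A gap between the per-group accuracies is one simple indicator of bias. Under these assumptions, changing which identities are enrolled in the gallery, or how many images each group contributes, changes the test-time data without retraining the model.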
Abstract
As the deployment of automated face recognition (FR) systems proliferates, bias in these systems is not just an academic question, but a matter of public concern. Media portrayals often center imbalance as the main source of bias, i.e., that FR models perform worse on images of non-white people or women because these demographic groups are underrepresented in training data. Recent academic research paints a more nuanced picture of this relationship. However, previous studies of data imbalance in FR have focused exclusively on the face verification setting, while the face identification setting has been largely ignored, despite being deployed in sensitive applications such as law enforcement. This is an unfortunate omission, as 'imbalance' is a more complex matter in identification; imbalance may arise in not only the training data, but also the testing data, and furthermore may affect…
Taxonomy
Topics: Face recognition and analysis · Face and Expression Recognition · Biometric Identification and Security
