TL;DR
This paper introduces a quick, annotation-free method for evaluating face recognition accuracy and bias, enabling easier and more responsible testing of FR systems using minimal data and revealing demographic disparities.
Contribution
The authors present a novel method for 1:1 face verification that uses the embeddings of the models under evaluation to benchmark FR systems rapidly without manual labels, and provide the first public benchmark of five FR cloud services, including an analysis of demographic bias.
Findings
The method reliably estimates FR accuracy and system ranking, even on smaller test datasets.
Demographic biases, especially lower accuracy for Asian women, are identified.
The approach significantly reduces the time and cost of manual labeling.
Abstract
Measuring the accuracy of face recognition (FR) systems is essential for improving performance and ensuring responsible use. Accuracy is typically estimated using large annotated datasets, which are costly and difficult to obtain. We propose a novel method for 1:1 face verification that benchmarks FR systems quickly and without manual annotation, starting from approximate labels (e.g., from web search results). Unlike previous methods for training set label cleaning, ours leverages the embedding representation of the models being evaluated, achieving high accuracy in smaller-sized test datasets. Our approach reliably estimates FR accuracy and ranking, significantly reducing the time and cost of manual labeling. We also introduce the first public benchmark of five FR cloud services, revealing demographic biases, particularly lower accuracy for Asian women. Our rapid test method can…
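To make the terms in the abstract concrete, here is a minimal sketch of generic 1:1 face verification on embeddings with per-group error rates. It is not the paper's annotation-free procedure, which is not detailed in this summary. The cosine-similarity decision rule, the 0.5 threshold, the function names, and the toy two-dimensional embeddings are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(emb_a, emb_b, threshold=0.5):
    """1:1 verification: declare a match if similarity reaches the threshold.

    The threshold value is an assumption; real systems tune it per model.
    """
    return cosine_similarity(emb_a, emb_b) >= threshold


def error_rates_by_group(pairs, threshold=0.5):
    """Compute false non-match rate (FNMR) and false match rate (FMR) per group.

    `pairs` is an iterable of (emb_a, emb_b, same_identity, group) tuples.
    A rate is None when the group has no pairs of the relevant kind.
    """
    counts = defaultdict(lambda: {"same": 0, "diff": 0, "fnm": 0, "fm": 0})
    for emb_a, emb_b, same_identity, group in pairs:
        predicted_match = verify(emb_a, emb_b, threshold)
        c = counts[group]
        if same_identity:
            c["same"] += 1
            c["fnm"] += int(not predicted_match)  # missed a genuine pair
        else:
            c["diff"] += 1
            c["fm"] += int(predicted_match)  # accepted an impostor pair
    return {
        group: {
            "fnmr": c["fnm"] / c["same"] if c["same"] else None,
            "fmr": c["fm"] / c["diff"] if c["diff"] else None,
        }
        for group, c in counts.items()
    }


# Toy data (hypothetical): 2-D "embeddings" with identity labels and a group tag.
PAIRS = [
    ([1.0, 0.0], [0.9, 0.1], True, "group_a"),
    ([1.0, 0.0], [0.0, 1.0], False, "group_a"),
    ([1.0, 0.0], [0.2, 1.0], True, "group_b"),
    ([0.0, 1.0], [0.1, 1.0], False, "group_b"),
]

if __name__ == "__main__":
    for group, rates in sorted(error_rates_by_group(PAIRS).items()):
        print(group, rates)
```

Comparing FNMR and FMR across groups at a fixed threshold is one common way to surface demographic disparities of the kind the abstract reports. In the setting the abstract describes, the identity labels would start out approximate rather than manually verified.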
Taxonomy
Methods: Sparse Evolutionary Training
