Improving Face Recognition by Clustering Unlabeled Faces in the Wild
Aruni RoyChowdhury, Xiang Yu, Kihyuk Sohn, Erik Learned-Miller, Manmohan Chandraker

TL;DR
This paper introduces a novel method for improving face recognition by effectively clustering unlabeled faces, addressing identity overlaps and clustering errors, resulting in significant accuracy gains in large-scale, real-world datasets.
Contribution
It proposes a new identity separation technique based on extreme value theory and a cosine loss modulation to handle overlapping identities and clustering noise in unlabeled face data.
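To make the identity separation idea concrete, here is a minimal sketch of an extreme-value-style out-of-distribution test. It is not the paper's procedure. It assumes each unlabeled face has a single score (for example, its highest similarity to any labeled identity) and that a small reference set of scores from faces known to be disjoint from the labeled identities is available. The choice of a Gumbel distribution, the method-of-moments fit, the 0.99 quantile, and all function names are illustrative assumptions.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel(scores):
    """Method-of-moments fit of a Gumbel (maximum) distribution.

    Assumption: the per-face maximum similarity of disjoint identities
    is roughly Gumbel-distributed. Returns (location, scale).
    """
    mean = statistics.fmean(scores)
    std = statistics.stdev(scores)
    beta = std * math.sqrt(6) / math.pi
    mu = mean - EULER_GAMMA * beta
    return mu, beta


def gumbel_quantile(mu, beta, p):
    """Inverse CDF of the Gumbel (maximum) distribution at probability p."""
    return mu - beta * math.log(-math.log(p))


def split_overlapping(unlabeled_scores, reference_scores, p=0.99):
    """Flag unlabeled faces whose score is extreme under the reference fit.

    Scores above the p-quantile are treated as likely overlapping with a
    labeled identity; the rest are treated as new (disjoint) identities.
    Returns (overlapping_indices, disjoint_indices, threshold).
    """
    mu, beta = fit_gumbel(reference_scores)
    threshold = gumbel_quantile(mu, beta, p)
    overlapping = [i for i, s in enumerate(unlabeled_scores) if s > threshold]
    disjoint = [i for i, s in enumerate(unlabeled_scores) if s <= threshold]
    return overlapping, disjoint, threshold


# Hypothetical scores, made up for illustration only.
reference_scores = [0.20, 0.25, 0.22, 0.30, 0.27, 0.24, 0.21, 0.26]
unlabeled_scores = [0.23, 0.85, 0.28, 0.91]

overlapping, disjoint, threshold = split_overlapping(
    unlabeled_scores, reference_scores
)
print(overlapping, disjoint, round(threshold, 3))
```

In this toy run the two high-scoring faces are flagged as overlapping and would be merged into existing labeled identities rather than clustered as new ones. Whether a held-out disjoint reference set exists in practice is itself an assumption of this sketch.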
Findings
11.6% improvement on IJB-A verification
Effective reduction of label noise from overlapping identities
Consistent performance gains in controlled and real settings
Abstract
While deep face recognition has benefited significantly from large-scale labeled data, current research is focused on leveraging unlabeled data to further boost performance, reducing the cost of human annotation. Prior work has mostly been in controlled settings, where the labeled and unlabeled data sets have no overlapping identities by construction. This is not realistic in large-scale face recognition, where one must contend with such overlaps, the frequency of which increases with the volume of data. Ignoring identity overlap leads to significant labeling noise, as data from the same identity is split into multiple clusters. To address this, we propose a novel identity separation method based on extreme value theory. It is formulated as an out-of-distribution detection algorithm, and greatly reduces the problems caused by overlapping-identity label noise. Considering cluster…
Taxonomy
Methods: Focal Loss
