Adversarial Reweighting for Speaker Verification Fairness
Minho Jin, Chelsea J.-T. Ju, Zeya Chen, Yi-Chieh Liu, Jasha Droppo, and Andreas Stolcke

TL;DR
This paper introduces an adversarial reweighting approach for speaker verification that enhances fairness across gender and nationality groups without needing subgroup labels during training.
Contribution
It reformulates adversarial reweighting (ARW) for metric learning in speaker verification and demonstrates its effectiveness in reducing subgroup performance disparities.
Findings
Achieved 1.08% overall EER on VoxCeleb
Reduced gender EER gap from 0.70% to 0.58%
Lowered EER standard deviation across nationalities
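The findings above are reported as equal error rate (EER), the operating point at which the false acceptance rate and the false rejection rate coincide. The sketch below is a rough illustration of how an overall EER, a subgroup gap, and a standard deviation across subgroups could be computed from trial scores. It is not the paper's evaluation code: the function names, the `groups` argument, the simple threshold sweep, and the toy trials are all assumptions made for this example.

```python
import numpy as np

def equal_error_rate(scores, labels):
    """Approximate EER by sweeping every observed score as a threshold.

    labels: 1 for same-speaker (target) trials, 0 for impostor trials.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    best_gap, eer = np.inf, 1.0
    for t in np.unique(scores):
        accept = scores >= t
        far = np.mean(accept[labels == 0])    # impostors wrongly accepted
        frr = np.mean(~accept[labels == 1])   # targets wrongly rejected
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return float(eer)

def subgroup_eers(scores, labels, groups):
    """EER per subgroup; group labels are needed only at evaluation time."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    result = {}
    for g in sorted(set(groups)):
        mask = np.array([x == g for x in groups])
        result[g] = equal_error_rate(scores[mask], labels[mask])
    return result

# Hypothetical toy trials: group "a" is perfectly separable, group "b" is not.
scores = [0.9, 0.8, 0.2, 0.1, 0.9, 0.4, 0.6, 0.1]
labels = [1, 1, 0, 0, 1, 1, 0, 0]
groups = ["a"] * 4 + ["b"] * 4

per_group = subgroup_eers(scores, labels, groups)
eer_gap = max(per_group.values()) - min(per_group.values())
eer_std = float(np.std(list(per_group.values())))
print(per_group, eer_gap, eer_std)
```

A threshold sweep over observed scores is the simplest choice here; interpolating the ROC curve would give a smoother estimate on large trial lists.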
Abstract
We address performance fairness for speaker verification using the adversarial reweighting (ARW) method. ARW is reformulated for speaker verification with metric learning, and shown to improve results across different subgroups of gender and nationality, without requiring annotation of subgroups in the training data. An adversarial network learns a weight for each training sample in the batch so that the main learner is forced to focus on poorly performing instances. Using a min-max optimization algorithm, this method improves overall speaker verification fairness. We present three different ARW formulations: accumulated pairwise similarity, pseudo-labeling, and pairwise weighting, and measure their performance in terms of equal error rate (EER) on the VoxCeleb corpus. Results show that the pairwise weighting method can achieve 1.08% overall EER, 1.25% for male and 0.67% for female…
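The abstract describes an adversary that assigns a weight to each training sample so that the learner concentrates on poorly performing instances. The sketch below illustrates only that reweighting idea on a toy batch. The weight normalization (one plus a batch-normalized sigmoid score), the function names, and the numbers are assumptions chosen for illustration; they are not taken from the paper, and none of its three formulations is reproduced here.

```python
import numpy as np

def adversarial_weights(adv_logits):
    """Turn adversary outputs into per-sample weights.

    Assumed normalization: w_i = 1 + n * s_i / sum(s), with s = sigmoid(logit).
    Every weight is at least 1 and the batch mean is 2, so no sample is ignored.
    """
    s = 1.0 / (1.0 + np.exp(-np.asarray(adv_logits, dtype=float)))
    return 1.0 + len(s) * s / s.sum()

def reweighted_loss(per_sample_loss, adv_logits):
    """Objective the learner minimizes and the adversary maximizes."""
    w = adversarial_weights(adv_logits)
    return float(np.mean(w * np.asarray(per_sample_loss, dtype=float)))

# Hypothetical batch in which the third sample is handled poorly.
losses = np.array([0.1, 0.1, 2.0, 0.1])

uniform = reweighted_loss(losses, np.zeros(4))
focused = reweighted_loss(losses, np.array([-2.0, -2.0, 2.0, -2.0]))
print(uniform, focused)
```

In this toy batch, an adversary that scores the high-loss sample highly raises the weighted objective above the uniform case, which is the pressure that pushes the learner toward the hard instances in a min-max setup.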
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
