Adversarial Reweighting for Speaker Verification Fairness

Minho Jin; Chelsea J.-T. Ju; Zeya Chen; Yi-Chieh Liu; Jasha Droppo; and Andreas Stolcke

arXiv:2207.07776 · eess.AS · February 9, 2024

TL;DR

This paper introduces an adversarial reweighting approach for speaker verification that enhances fairness across gender and nationality groups without needing subgroup labels during training.

Contribution

The paper reformulates adversarial reweighting (ARW) for metric learning in speaker verification and demonstrates that it reduces performance disparities between subgroups.

Findings

01. Achieved 1.08% overall EER on VoxCeleb

02. Reduced gender EER gap from 0.70% to 0.58%

03. Lowered EER standard deviation across nationalities
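The findings above are stated in terms of equal error rate (EER), the operating point where the false acceptance rate equals the false rejection rate. The sketch below shows one simple way to approximate it from trial scores and to compute a subgroup gap. The function name `equal_error_rate`, the threshold sweep, and the toy scores are illustrative assumptions, not the paper's evaluation code; only the 1.25% and 0.67% figures come from the abstract.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by trying every observed score as a threshold.

    Hypothetical helper for illustration; real toolkits usually
    interpolate between operating points instead.
    """
    best_gap, eer = float("inf"), None
    for thr in sorted(set(target_scores) | set(nontarget_scores)):
        # False rejection: same-speaker trials scored below the threshold.
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        # False acceptance: different-speaker trials at or above the threshold.
        far = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


# Toy scores (made up for the example).
toy_eer = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])

# Gender gap from the per-group EERs reported in the abstract (in percent).
gender_gap = round(abs(1.25 - 0.67), 2)
```

With these inputs `toy_eer` is 0.25 and `gender_gap` is 0.58, which matches the reduced gap listed in the findings.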

Abstract

We address performance fairness for speaker verification using the adversarial reweighting (ARW) method. ARW is reformulated for speaker verification with metric learning, and shown to improve results across different subgroups of gender and nationality, without requiring annotation of subgroups in the training data. An adversarial network learns a weight for each training sample in the batch so that the main learner is forced to focus on poorly performing instances. Using a min-max optimization algorithm, this method improves overall speaker verification fairness. We present three different ARW formulations: accumulated pairwise similarity, pseudo-labeling, and pairwise weighting, and measure their performance in terms of equal error rate (EER) on the VoxCeleb corpus. Results show that the pairwise weighting method can achieve 1.08% overall EER, 1.25% for male and 0.67% for female…
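The per-sample weighting and min-max objective described in the abstract can be illustrated with a minimal sketch. It assumes a generic adversarial-reweighting normalization, w_i = 1 + n · σ(a_i) / Σ_j σ(a_j), where a_i is the adversary's raw output for sample i. That formula, the function names, and the toy losses are assumptions for illustration; the paper's three formulations (accumulated pairwise similarity, pseudo-labeling, pairwise weighting) are not reproduced here.

```python
import math


def adversarial_weights(adv_logits):
    """Turn adversary outputs into per-sample weights (assumed normalization).

    Every weight is at least 1, so no sample is ignored, and the
    weights in a batch of n samples sum to 2n.
    """
    sig = [1.0 / (1.0 + math.exp(-a)) for a in adv_logits]
    total = sum(sig)
    n = len(sig)
    return [1.0 + n * s / total for s in sig]


def weighted_loss(losses, weights):
    """Batch objective: the learner minimizes it, the adversary maximizes it."""
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


# Toy per-sample losses: the third sample is the poorly performing one.
losses = [0.1, 0.1, 2.0]

# An uninformed adversary weights all samples equally.
uniform = weighted_loss(losses, adversarial_weights([0.0, 0.0, 0.0]))

# An adversary that has learned to up-weight the hard sample raises the
# objective, which pushes the learner to focus on that sample.
focused = weighted_loss(losses, adversarial_weights([0.0, 0.0, 3.0]))
```

Here `focused` is larger than `uniform`, which is the adversary's side of the min-max game. In training, the two networks would alternate gradient steps on this shared objective, with the learner descending and the adversary ascending.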

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing