TL;DR
This paper systematically investigates bias in automated speaker recognition, revealing significant disparities affecting women and non-US speakers across the development pipeline, and offers practical mitigation strategies.
Contribution
The paper provides the first systematic empirical and analytical study of bias sources in speaker verification, identifying the development stages where bias arises and proposing mitigation approaches.
Findings
Bias exists at every development stage studied in the VoxCeleb Speaker Recognition Challenge: data generation, model building, and implementation.
Female and non-US speakers experience significant performance degradation.
Practical recommendations for bias mitigation are outlined.
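To make the notion of a performance disparity concrete, the sketch below shows one common way to compare speaker verification error rates across speaker subgroups at a single shared decision threshold. It is an illustration only and not the paper's evaluation code: the function name `error_rates_by_group`, the trial tuple layout, the group labels, and the toy scores are all assumptions made for this example.

```python
from collections import defaultdict


def error_rates_by_group(trials, threshold):
    """Compute false negative and false positive rates per subgroup.

    `trials` is an iterable of (score, is_same_speaker, group) tuples.
    This layout is a hypothetical one chosen for the example.
    A trial is accepted when its score is at or above `threshold`.
    """
    counts = defaultdict(lambda: {"fr": 0, "fa": 0, "target": 0, "nontarget": 0})
    for score, is_same_speaker, group in trials:
        c = counts[group]
        accepted = score >= threshold
        if is_same_speaker:
            c["target"] += 1
            if not accepted:
                c["fr"] += 1  # genuine speaker wrongly rejected
        else:
            c["nontarget"] += 1
            if accepted:
                c["fa"] += 1  # impostor wrongly accepted

    rates = {}
    for group, c in counts.items():
        rates[group] = {
            "fnr": c["fr"] / c["target"] if c["target"] else float("nan"),
            "fpr": c["fa"] / c["nontarget"] if c["nontarget"] else float("nan"),
        }
    return rates


# Toy scores (made up for illustration, not taken from the paper).
trials = [
    (0.9, True, "group_a"), (0.8, True, "group_a"),
    (0.2, False, "group_a"), (0.6, False, "group_a"),
    (0.7, True, "group_b"), (0.4, True, "group_b"),
    (0.1, False, "group_b"), (0.3, False, "group_b"),
]
rates = error_rates_by_group(trials, threshold=0.5)
for group, r in sorted(rates.items()):
    print(group, r)
```

Using one threshold for all groups mirrors how a deployed system typically operates; a gap between groups in either rate at that threshold is one simple indicator of disparity. Other metrics, such as per-group equal error rate, could be substituted.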
Abstract
Automated speaker recognition uses data processing to identify speakers by their voice. Today, automated speaker recognition is deployed on billions of smart devices and in services such as call centres. Despite its wide-scale deployment and known sources of bias in related domains like face recognition and natural language processing, bias in automated speaker recognition has not been studied systematically. We present an in-depth empirical and analytical study of bias in the machine learning development workflow of speaker verification, a voice biometric and core task in automated speaker recognition. Drawing on an established framework for understanding sources of harm in machine learning, we show that bias exists at every development stage in the well-known VoxCeleb Speaker Recognition Challenge, including data generation, model building, and implementation. Most affected are…
