DENOASR: Debiasing ASRs through Selective Denoising
Anand Kumar Rai, Siddharth D Jaiswal, Shubham Prakash, Bendi Pragnya Sree, Animesh Mukherjee

TL;DR
This paper introduces DENOASR, a selective denoising framework that reduces gender bias in ASR systems by combining two speech denoising techniques, DEMUCS and LE. It narrows the word error rate gap between male and female speakers without sacrificing overall accuracy across multiple datasets and models.
Contribution
The paper proposes a novel selective denoising method, combining DEMUCS and LE, to mitigate gender bias in ASR systems without degrading overall performance.
Findings
Selective denoising significantly reduces the word error rate gap between male and female speakers.
The combination of DEMUCS and LE is effective on two state-of-the-art open-source ASRs, OpenAI WHISPER and NVIDIA NEMO.
Choosing when to denoise based on speech intelligibility improves fairness.
Abstract
Automatic Speech Recognition (ASR) systems have been examined and shown to exhibit biases toward particular groups of individuals, influenced by factors such as demographic traits, accents, and speech styles. Noise can disproportionately impact speakers with certain accents, dialects, or speaking styles, leading to biased error rates. In this work, we introduce a novel framework DENOASR, which is a selective denoising technique to reduce the disparity in the word error rates between the two gender groups, male and female. We find that a combination of two popular speech denoising techniques, viz. DEMUCS and LE, can be effectively used to mitigate ASR disparity without compromising their overall performance. Experiments using two state-of-the-art open-source ASRs - OpenAI WHISPER and NVIDIA NEMO - on multiple benchmark datasets, including TIE, VOX-POPULI, TEDLIUM, and FLEURS, show that…
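The disparity the abstract refers to can be illustrated with a small Python sketch that computes word error rate (WER) per group and the gap between the two groups. This is only an illustration under stated assumptions: the function names (`word_errors`, `group_wer`, `wer_gap`), the corpus-level aggregation, the use of an absolute gap, and the toy transcripts are all hypothetical and are not taken from the paper, which may define and aggregate the metric differently.

```python
from typing import Dict, List, Tuple

Sample = Tuple[str, str, str]  # (group label, reference text, ASR hypothesis)


def word_errors(reference: str, hypothesis: str) -> Tuple[int, int]:
    """Return (word-level edit distance, number of reference words)."""
    ref = reference.split()
    hyp = hypothesis.split()
    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1], len(ref)


def group_wer(samples: List[Sample]) -> Dict[str, float]:
    """Corpus-level WER per group: total errors / total reference words."""
    errors: Dict[str, int] = {}
    words: Dict[str, int] = {}
    for group, reference, hypothesis in samples:
        e, n = word_errors(reference, hypothesis)
        errors[group] = errors.get(group, 0) + e
        words[group] = words.get(group, 0) + n
    return {g: errors[g] / words[g] for g in errors if words[g] > 0}


def wer_gap(samples: List[Sample], group_a: str = "female", group_b: str = "male") -> float:
    """Absolute WER difference between two groups (assumed definition of the gap)."""
    wer = group_wer(samples)
    return abs(wer[group_a] - wer[group_b])


if __name__ == "__main__":
    # Made-up transcripts, for illustration only.
    toy = [
        ("female", "turn the lights off", "turn the lights off"),
        ("male", "turn the lights off", "turn the light"),
    ]
    print(group_wer(toy))
    print(wer_gap(toy))
```

Under this framing, a debiasing method would be judged by whether the gap shrinks after denoising while the per-group WERs themselves do not get worse.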
Taxonomy
Topics: Context-Aware Activity Recognition Systems
