Everyone deserves their voice to be heard: Analyzing Predictive Gender Bias in ASR Models Applied to Dutch Speech Data

Rik Raes; Saskia Lensink; Mykola Pechenizkiy

arXiv:2411.09431·cs.CL·November 15, 2024

TL;DR

This paper investigates gender-based biases in state-of-the-art ASR systems, specifically Whisper, when applied to Dutch speech data, revealing significant disparities in recognition accuracy across gender groups.

Contribution

It provides a detailed analysis of gender bias in Whisper ASR models on Dutch speech, using multiple metrics and fairness frameworks, highlighting disparities and their implications.

Findings

1. Substantial gender disparities in word error rate across model sizes
2. Biases are statistically significant across all tested models
3. Implications for fairness in automatic subtitling applications

Abstract

Recent research has shown that state-of-the-art (SotA) Automatic Speech Recognition (ASR) systems, such as Whisper, often exhibit predictive biases that disproportionately affect various demographic groups. This study focuses on identifying the performance disparities of Whisper models on Dutch speech data from the Common Voice dataset and the Dutch National Public Broadcasting organisation. We analyzed the word error rate, character error rate and a BERT-based semantic similarity across gender groups. We used the moral framework of Weerts et al. (2022) to assess quality of service harms and fairness, and to provide a nuanced discussion on the implications of these biases, particularly for automatic subtitling. Our findings reveal substantial disparities in word error rate (WER) among gender groups across all model sizes, with bias identified through statistical testing.
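
To make the metrics named in the abstract concrete, here is a minimal sketch of how word error rate (WER) and character error rate (CER) can be computed per gender group. It is an illustration only, not the authors' pipeline: the function names, the pooling of errors over all utterances in a group, the absence of text normalisation, and the toy Dutch sentences are all assumptions made for this example. References are assumed to be non-empty.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences
    (substitutions, insertions and deletions each cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rates_by_group(samples):
    """samples: iterable of (group, reference, hypothesis) strings.
    Returns {group: (wer, cer)}, pooling errors over each group's utterances."""
    # Per group: [word errors, reference words, char errors, reference chars]
    totals = defaultdict(lambda: [0, 0, 0, 0])
    for group, ref, hyp in samples:
        t = totals[group]
        ref_words, hyp_words = ref.split(), hyp.split()
        t[0] += edit_distance(ref_words, hyp_words)
        t[1] += len(ref_words)
        t[2] += edit_distance(ref, hyp)
        t[3] += len(ref)
    return {g: (t[0] / t[1], t[2] / t[3]) for g, t in totals.items()}


# Hypothetical toy data, invented for this sketch (not from the paper).
samples = [
    ("female", "de kat zit op de mat", "de kat zit op de mat"),
    ("female", "ik ga naar huis", "ik ga naar het huis"),
    ("male", "de kat zit op de mat", "de kat zat op mat"),
    ("male", "ik ga naar huis", "ik ga naar huis"),
]

rates = error_rates_by_group(samples)
wer_gap = rates["male"][0] - rates["female"][0]
print(rates["female"][0], rates["male"][0])  # WER: 0.1 and 0.2
```

A gap like `wer_gap` says nothing about significance on its own; the paper relies on statistical testing for that, which this sketch does not reproduce.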

Taxonomy

Topics: Speech Recognition and Synthesis