Do LLM Decoders Listen Fairly? Benchmarking How Language Model Priors Shape Bias in Speech Recognition

Srishti Ginjala; Eric Fosler-Lussier; Christopher W. Myers; Srinivasan Parthasarathy

arXiv:2604.21276·cs.CL·April 24, 2026

TL;DR

This study benchmarks nine speech recognition models from three architectural generations (CTC, encoder-decoder, and LLM-based) to test whether language model priors make recognition fairer or more biased across five demographic axes, on both clean and degraded audio.

Contribution

The paper evaluates how LLM decoders affect fairness and robustness in speech recognition, and argues that audio encoder design matters more than LLM scale.

Findings

01. LLM decoders do not necessarily increase racial bias.

02. Severe audio degradation reduces fairness gaps but can amplify specific biases.

03. Audio encoder design significantly affects recognition fairness and robustness.

Abstract

As pretrained large language models replace task-specific decoders in speech recognition, a critical question arises: do their text-derived priors make recognition fairer or more biased across demographic groups? We evaluate nine models spanning three architectural generations (CTC with no language model, encoder-decoder with an implicit LM, and LLM-based with an explicit pretrained decoder) on about 43,000 utterances across five demographic axes (ethnicity, accent, gender, age, first language) using Common Voice 24 and Meta's Fair-Speech, a controlled-prompt dataset that eliminates vocabulary confounds. On clean audio, three findings challenge assumptions: LLM decoders do not amplify racial bias (Granite-8B has the best ethnicity fairness, max/min WER = 2.28); Whisper exhibits pathological hallucination on Indian-accented speech with a non-monotonic insertion-rate spike to 9.62% at…
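
The abstract reports fairness as a max/min WER ratio across demographic groups (for example, 2.28 for Granite-8B on ethnicity). The sketch below shows one plausible way to compute such a ratio. It is an illustration under assumptions, not the authors' evaluation code: the function names, the toy transcripts, the group labels, and the choice to pool errors over all utterances in a group (rather than averaging per-utterance WER) are all hypothetical, and no text normalization is applied.

```python
from collections import defaultdict


def word_errors(reference, hypothesis):
    """Return (word-level edit distance, number of reference words)."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference word
                curr[j - 1] + 1,         # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # substitution, or match at cost 0
            ))
        prev = curr
    return prev[-1], len(ref)


def group_wer(samples):
    """Pooled WER per group from (group, reference, hypothesis) triples.

    Assumption: errors and reference words are summed within each group
    before dividing.
    """
    errors, words = defaultdict(int), defaultdict(int)
    for group, ref, hyp in samples:
        e, n = word_errors(ref, hyp)
        errors[group] += e
        words[group] += n
    return {g: errors[g] / words[g] for g in errors if words[g] > 0}


def max_min_ratio(wer_by_group):
    """Worst-group WER divided by best-group WER (1.0 means parity)."""
    values = list(wer_by_group.values())
    lowest = min(values)
    return float("inf") if lowest == 0 else max(values) / lowest


# Toy data, invented for illustration only.
samples = [
    ("A", "turn the lights on", "turn the lights on"),
    ("A", "play some music now", "play some music"),
    ("B", "turn the lights on", "turn the light on"),
    ("B", "play some music now", "play sum music"),
]
wer = group_wer(samples)
ratio = max_min_ratio(wer)
print(wer, ratio)
```

On this toy data group A has 1 error in 8 reference words and group B has 3 in 8, so the ratio is 3.0. A ratio near 1 indicates similar error rates across groups; the metric says nothing about absolute accuracy, so it is best read alongside the per-group WER values themselves.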
