Voice, Bias, and Coreference: An Interpretability Study of Gender in Speech Translation

Lina Conti; Dennis Fucci; Marco Gaido; Matteo Negri; Guillaume Wisniewski; Luisa Bentivogli

arXiv:2511.21517·cs.CL·April 29, 2026

TL;DR

This study examines how speech translation (ST) models assign gender to terms that refer to the speaker. It finds that the models draw on acoustic cues and pronouns to do so, which has consequences for bias and misgendering.

Contribution

The paper investigates the mechanisms behind gender assignment in speech translation across three language pairs (en-es/fr/it), highlighting the role of acoustic features and pronouns in model decision-making.

Findings

01. Models learn broader patterns of masculine prevalence that go beyond their training data.

02. Acoustic input can override the internal language model's (ILM) bias toward masculine forms.

03. Pronouns help models link gendered terms back to the speaker, using spectral information.

Abstract

Unlike text, speech conveys information about the speaker, such as gender, through acoustic cues like pitch. This gives rise to modality-specific bias concerns. For example, in speech translation (ST), when translating from languages with notional gender, such as English, into languages where gender-ambiguous terms referring to the speaker are assigned grammatical gender, the speaker's vocal characteristics may play a role in gender assignment. This risks misgendering speakers, whether through masculine defaults or vocal-based assumptions. Yet, how ST models make these decisions remains poorly understood. We investigate the mechanisms ST models use to assign gender to speaker-referring terms across three language pairs (en-es/fr/it). To do so, we examine how training data patterns, internal language model (ILM) biases, and acoustic information interact. We find that models do not simply…
