Different Speech Translation Models Encode and Translate Speaker Gender Differently

Dennis Fucci; Marco Gaido; Matteo Negri; Luisa Bentivogli; Andre Martins; Giuseppe Attanasio

arXiv:2506.02172 · cs.CL · June 4, 2025

TL;DR

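This paper uses probing to investigate how different speech translation (ST) models encode speaker gender. Traditional encoder-decoder models capture gender information, whereas newer architectures that connect a speech encoder to a machine translation system via adapters do not, and they show a stronger masculine default in translation.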

Contribution

The paper demonstrates that newer ST architectures encode less speaker-gender information and are more prone to a masculine default than traditional encoder-decoder models.

Findings

01. Traditional models encode speaker gender effectively.

02. Newer models with adapters encode less gender information.

03. Bias towards masculine translation is more pronounced in newer architectures.

Abstract

Recent studies on interpreting the hidden states of speech models have shown their ability to capture speaker-specific features, including gender. Does this finding also hold for speech translation (ST) models? If so, what are the implications for the speaker's gender assignment in translation? We address these questions from an interpretability perspective, using probing methods to assess gender encoding across diverse ST models. Results on three language directions (English-French/Italian/Spanish) indicate that while traditional encoder-decoder models capture gender information, newer architectures -- integrating a speech encoder with a machine translation system via adapters -- do not. We also demonstrate that low gender encoding capabilities result in systems' tendency toward a masculine default, a translation bias that is more pronounced in newer architectures.
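As an illustration of the probing idea mentioned in the abstract, the following is a minimal sketch of a linear probe, not the authors' setup. It assumes synthetic vectors standing in for pooled encoder hidden states, with a hypothetical `signal` parameter controlling how strongly a binary label is encoded; all function names are invented for this example. A probe that reaches high held-out accuracy suggests the label is linearly recoverable from the representations, while accuracy near chance suggests it is not.

```python
import math
import random


def make_synthetic_states(n, dim, signal, rng):
    """Generate stand-in 'hidden states' (assumption: not real model outputs).

    The binary label shifts the first dimension by +/- signal.
    """
    states, labels = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        vec[0] += signal if label == 1 else -signal
        states.append(vec)
        labels.append(label)
    return states, labels


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_linear_probe(states, labels, lr=0.1, epochs=200):
    """Fit a logistic-regression probe with full-batch gradient descent."""
    dim = len(states[0])
    n = len(states)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for vec, label in zip(states, labels):
            score = sum(wi * xi for wi, xi in zip(w, vec)) + b
            err = sigmoid(score) - label
            for i, xi in enumerate(vec):
                grad_w[i] += err * xi
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def probe_accuracy(states, labels, w, b):
    """Fraction of held-out examples the probe classifies correctly."""
    correct = 0
    for vec, label in zip(states, labels):
        score = sum(wi * xi for wi, xi in zip(w, vec)) + b
        pred = 1 if score > 0 else 0
        correct += int(pred == label)
    return correct / len(labels)


def run_probe(signal, seed=0):
    """Train on one synthetic split and evaluate on another."""
    rng = random.Random(seed)
    train_x, train_y = make_synthetic_states(300, 8, signal, rng)
    test_x, test_y = make_synthetic_states(300, 8, signal, rng)
    w, b = train_linear_probe(train_x, train_y)
    return probe_accuracy(test_x, test_y, w, b)
```

Calling `run_probe(3.0)` and `run_probe(0.0)` contrasts representations that encode the label strongly with representations that do not encode it at all. In an actual study, the synthetic vectors would be replaced by hidden states extracted from the ST models under comparison.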

Taxonomy

Topics: Natural Language Processing Techniques