Addressing speaker gender bias in large scale speech translation systems

Shubham Bansal; Vikas Joshi; Harveen Chadha; Rupeshkumar Mehta; Jinyu Li

arXiv:2501.05989·cs.CL·January 13, 2025

TL;DR

This paper proposes a two-step method to reduce speaker gender bias in speech translation systems. Large language models first correct the training translations according to the speaker's gender, and the speech translation model is then fine-tuned on the corrected data so that it generates gender-specific translations directly from audio cues. The authors report a 70% improvement in translations for female speakers.

Contribution

The study introduces a novel approach combining LLM-based correction and fine-tuning to address gender bias in speech translation without explicit gender input.

Findings

01. 70% improvement in translations for female speakers

02. Effective bias mitigation compared to baseline and existing systems

03. Applicable to scenarios where the speaker's gender is predefined or should not be inferred from speech cues

Abstract

This study addresses the issue of speaker gender bias in Speech Translation (ST) systems, which can lead to offensive and inaccurate translations. The masculine bias often found in large-scale ST systems is typically perpetuated through training data derived from Machine Translation (MT) systems. Our approach involves two key steps. First, we employ Large Language Models (LLMs) to rectify translations based on the speaker's gender in a cost-effective manner. Second, we fine-tune the ST model with the corrected data, enabling the model to generate gender-specific translations directly from audio cues, without the need for explicit gender input. Additionally, we propose a three-mode fine-tuned model for scenarios where the speaker's gender is either predefined or should not be inferred from speech cues. We demonstrate a 70% improvement in translations for female speakers compared to our…
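The two steps described in the abstract can be sketched as a data-preparation routine. This is only an illustrative sketch, not the authors' implementation: the `Utterance` record, the `build_finetuning_pairs` helper, and the `toy_corrector` function are hypothetical names, and the Spanish example sentence is an assumption chosen for illustration. In practice the corrector would be a call to an LLM; here it is a simple string replacement so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Utterance:
    """Hypothetical training record (field names are assumptions)."""
    audio_path: str       # source-language audio
    mt_translation: str   # MT-derived target text, possibly masculine-biased
    speaker_gender: str   # "female" or "male", known at data-preparation time


def build_finetuning_pairs(
    utterances: List[Utterance],
    correct_fn: Callable[[str, str], str],
) -> List[Tuple[str, str]]:
    """Step 1: rewrite each translation to match the speaker's gender.
    Step 2 input: (audio, corrected text) pairs for fine-tuning.

    The pairs carry no gender tag, so a model fine-tuned on them has to
    pick up the gender signal from the audio itself.
    """
    pairs = []
    for utt in utterances:
        corrected = correct_fn(utt.mt_translation, utt.speaker_gender)
        pairs.append((utt.audio_path, corrected))
    return pairs


def toy_corrector(translation: str, gender: str) -> str:
    """Stand-in for the LLM correction call (assumption, not a real API).

    Swaps a single masculine Spanish adjective for its feminine form.
    """
    if gender == "female":
        return translation.replace("cansado", "cansada")
    return translation


utterances = [
    Utterance("a.wav", "Estoy cansado", "female"),
    Utterance("b.wav", "Estoy cansado", "male"),
]
pairs = build_finetuning_pairs(utterances, toy_corrector)
for audio, text in pairs:
    print(audio, text)
```

Passing the corrector in as a function keeps the data-preparation logic independent of whichever LLM performs the correction.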

Taxonomy

Topics: Natural Language Processing Techniques · Speech and dialogue systems