Addressing speaker gender bias in large scale speech translation systems
Shubham Bansal, Vikas Joshi, Harveen Chadha, Rupeshkumar Mehta, Jinyu Li

TL;DR
This paper proposes a method to reduce speaker gender bias in speech translation systems by using large language models for correction and fine-tuning models to generate gender-specific translations directly from audio cues, significantly improving translation accuracy for female speakers.
Contribution
The study introduces a novel approach combining LLM-based correction and fine-tuning to address gender bias in speech translation without explicit gender input.
Findings
70% improvement in female speaker translation accuracy
Effective bias mitigation compared to baseline and existing systems
Applicable to scenarios with predefined or unknown speaker gender
Abstract
This study addresses the issue of speaker gender bias in Speech Translation (ST) systems, which can lead to offensive and inaccurate translations. The masculine bias often found in large-scale ST systems is typically perpetuated through training data derived from Machine Translation (MT) systems. Our approach involves two key steps. First, we employ Large Language Models (LLMs) to rectify translations based on the speaker's gender in a cost-effective manner. Second, we fine-tune the ST model with the corrected data, enabling the model to generate gender-specific translations directly from audio cues, without the need for explicit gender input. Additionally, we propose a three-mode fine-tuned model for scenarios where the speaker's gender is either predefined or should not be inferred from speech cues. We demonstrate a 70% improvement in translations for female speakers compared to our…
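The two steps and the three-mode variant described in the abstract can be illustrated with a minimal data-preparation sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the mode tags (`<auto>`, `<female>`, `<male>`), the `Utterance` and `TrainingExample` types, the reading of the three modes as "infer from audio", "forced female", and "forced male", and the `toy_correct` function, which stands in for the LLM-based correction step with a single hard-coded Spanish word swap.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Utterance:
    audio_path: str
    speaker_gender: str   # "female" or "male"; assumed known for training data
    mt_translation: str   # MT-derived target text, possibly masculine-biased


@dataclass
class TrainingExample:
    audio_path: str
    mode_tag: str         # hypothetical tag telling the model which mode to use
    target: str


def build_examples(
    utterances: List[Utterance],
    correct: Callable[[str, str], str],
) -> List[TrainingExample]:
    """Build three fine-tuning examples per utterance, one per assumed mode."""
    examples = []
    for u in utterances:
        # Step 1 (assumed shape): rewrite the target for each speaker gender.
        by_gender = {g: correct(u.mt_translation, g) for g in ("female", "male")}
        # Auto mode: the target follows the true speaker gender, so the model
        # has to pick the right form from the audio alone.
        examples.append(
            TrainingExample(u.audio_path, "<auto>", by_gender[u.speaker_gender])
        )
        # Forced modes: the tag decides the form, regardless of the audio.
        examples.append(TrainingExample(u.audio_path, "<female>", by_gender["female"]))
        examples.append(TrainingExample(u.audio_path, "<male>", by_gender["male"]))
    return examples


def toy_correct(text: str, gender: str) -> str:
    """Stand-in for the LLM correction step: one hard-coded Spanish adjective."""
    if gender == "female":
        return text.replace("cansado", "cansada")
    return text.replace("cansada", "cansado")


examples = build_examples(
    [Utterance("utt1.wav", "female", "Estoy cansado")], toy_correct
)
for ex in examples:
    print(ex.mode_tag, ex.target)
```

In this sketch, Step 2 would consist of fine-tuning the ST model on the resulting (audio, mode tag, target) triples. How the paper actually signals the mode to the model, and how it prompts the LLM, is not stated in the text above.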
Taxonomy
Topics: Natural Language Processing Techniques · Speech and dialogue systems
