How To Build Competitive Multi-gender Speech Translation Models For Controlling Speaker Gender Translation

Marco Gaido; Dennis Fucci; Matteo Negri; Luisa Bentivogli

arXiv:2310.15114 · cs.CL · October 24, 2023 · 1 citation

TL;DR

This paper proposes a multi-gender speech translation model that incorporates speaker gender metadata, outperforming gender-specific models in accuracy and offering a more maintainable solution for gender-aware translation.

Contribution

It introduces a single multi-gender speech translation model that effectively integrates speaker gender metadata, eliminating the need for separate models.
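
As a rough illustration of what "integrating speaker gender metadata" into a single model can look like, the sketch below prepends a gender tag to the target token sequence, so that one model can be steered toward feminine or masculine forms at inference time. This is only one common way to condition a sequence model on metadata and is not confirmed to be the mechanism used in the paper. The tag strings and the helper names `add_gender_tag` and `strip_gender_tag` are hypothetical.

```python
from typing import List

# Hypothetical tag tokens; a real system would add them to its vocabulary.
GENDER_TAGS = {"F": "<gender:F>", "M": "<gender:M>"}


def add_gender_tag(target_tokens: List[str], speaker_gender: str) -> List[str]:
    """Return a copy of the target tokens with the speaker's gender tag prepended.

    During training, the tagged sequence is the decoder target; at inference,
    the tag can be forced as the first decoder token to select the gender.
    """
    if speaker_gender not in GENDER_TAGS:
        raise ValueError(f"unknown speaker gender: {speaker_gender!r}")
    return [GENDER_TAGS[speaker_gender]] + list(target_tokens)


def strip_gender_tag(tokens: List[str]) -> List[str]:
    """Remove a leading gender tag, if present, before scoring or display."""
    if tokens and tokens[0] in GENDER_TAGS.values():
        return list(tokens[1:])
    return list(tokens)


# English "I was born" has no gender marking; Italian requires one.
feminine = add_gender_tag(["Sono", "nata"], "F")
masculine = add_gender_tag(["Sono", "nato"], "M")
```

With this scheme the same source utterance maps to two differently tagged targets, and the externally provided tag, rather than the speaker's voice, decides which form is generated.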

Findings

01 Multi-gender model outperforms gender-specific models in accuracy.

02 Incorporating speaker metadata improves gender assignment accuracy.

03 Fine-tuning from existing models is less effective than training from scratch.

Abstract

When translating from notional gender languages (e.g., English) into grammatical gender languages (e.g., Italian), the generated translation requires explicit gender assignments for various words, including those referring to the speaker. When the source sentence does not convey the speaker's gender, speech translation (ST) models either rely on the possibly-misleading vocal traits of the speaker or default to the masculine gender, the most frequent in existing training corpora. To avoid such biased and not inclusive behaviors, the gender assignment of speaker-related expressions should be guided by externally-provided metadata about the speaker's gender. While previous work has shown that the most effective solution is represented by separate, dedicated gender-specific models, the goal of this paper is to achieve the same results by integrating the speaker's gender metadata into a…

Code & Models

Repositories

hlt-mt/fbk-fairseq
PyTorch · Official

Taxonomy

Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Speech and dialogue systems