Building Bridges: A Dataset for Evaluating Gender-Fair Machine Translation into German
Manuel Lardelli, Giuseppe Attanasio, Anne Lauscher

TL;DR
This paper introduces a new dataset and benchmark for evaluating gender-fair translation from English to German, revealing that current systems predominantly produce masculine forms and rarely use gender-neutral language.
Contribution
The study provides the first benchmark dataset for gender-fair English-German machine translation, including resources, test instances, and an evaluation of multiple models and systems.
Findings
Most MT systems predominantly produce masculine forms.
Gender-neutral variants are rarely generated by current models.
The dataset highlights the need for improved gender-fair translation methods.
Abstract
The translation of gender-neutral person-referring terms (e.g., the students) is often non-trivial. Translating from English into German poses an interesting case -- in German, person-referring nouns are usually gender-specific, and if the gender of the referent(s) is unknown or diverse, the generic masculine (die Studenten (m.)) is commonly used. This solution, however, reduces the visibility of other genders, such as women and non-binary people. To counteract gender discrimination, a societal movement towards using gender-fair language exists (e.g., by adopting neosystems). However, gender-fair German is currently barely supported in machine translation (MT), requiring post-editing or manual translations. We address this research gap by studying gender-fair language in English-to-German MT. Concretely, we enrich a community-created gender-fair language dictionary and sample…
Taxonomy
Topics: Gender Studies in Language · Natural Language Processing Techniques · Text Readability and Simplification
