Gender Bias in MT for a Genderless Language: New Benchmarks for Basque
Amaia Murillo, Olatz Perez-de-Viñaspre, Naiara Perez

TL;DR
This paper introduces new benchmarks for evaluating gender bias in machine translation involving Basque, a genderless language, revealing persistent bias and varying translation quality across models.
Contribution
It presents two novel datasets tailored for Basque to assess gender bias in MT, addressing the lack of culturally and linguistically appropriate evaluation resources.
Findings
Models favor masculine forms when translating gender-neutral Basque occupations into gendered languages.
Translation quality into Basque is slightly higher for masculine referents.
Gender bias remains prevalent in current MT systems.
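The first two findings correspond to two simple measurements: the share of each gender among the translations of gender-neutral source sentences, and the difference in mean quality score between masculine and feminine referents. The sketch below is only an illustration of those two quantities. The function names, the label strings, and the toy values are all hypothetical and are not taken from the paper, whose exact metrics and scores are not given in this summary.

```python
from collections import Counter
from statistics import mean


def gender_share(predicted_genders):
    """Return the fraction of translations assigned to each gender label.

    `predicted_genders` is a list of labels (hypothetical strings such as
    "male" / "female") extracted from the translated occupation nouns.
    """
    counts = Counter(predicted_genders)
    total = sum(counts.values())
    return {gender: count / total for gender, count in counts.items()}


def quality_gap(scores_masculine, scores_feminine):
    """Mean quality for masculine referents minus mean for feminine ones.

    A positive value means masculine-referent sentences scored higher.
    The scores can come from any segment-level quality metric.
    """
    return mean(scores_masculine) - mean(scores_feminine)


# Toy inputs for illustration only; these are NOT results from the paper.
toy_genders = ["male", "male", "female", "male"]
toy_shares = gender_share(toy_genders)
toy_gap = quality_gap([0.80, 0.90], [0.75, 0.85])
```

A share far from an even split in the first measurement, or a consistently positive gap in the second, would point in the direction the findings describe.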
Abstract
Large language models (LLMs) and machine translation (MT) systems are increasingly used in our daily lives, but their outputs can reproduce gender bias present in the training data. Most resources for evaluating such biases are designed for English and reflect its sociocultural context, which limits their applicability to other languages. This work addresses this gap by introducing two new datasets to evaluate gender bias in translations involving Basque, a low-resource and genderless language. WinoMTeus adapts the WinoMT benchmark to examine how gender-neutral Basque occupations are translated into gendered languages such as Spanish and French. FLORES+Gender, in turn, extends the FLORES+ benchmark to assess whether translation quality varies when translating from gendered languages (Spanish and English) into Basque depending on the gender of the referent. We evaluate several…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
