Gender Disambiguation in Machine Translation: Diagnostic Evaluation in Decoder-Only Architectures
Chiara Manna, Hosein Mohebbi, Afra Alishahi, Frédéric Blain, Eva Vanmassenhove

TL;DR
This paper evaluates gender bias in decoder-only machine translation models, introducing a new bias measure and showing that post-training techniques can reduce default gender assumptions and improve bias awareness.
Contribution
The paper introduces "Prior Bias", a novel measure of a model's default gender assumptions, and applies the evaluation framework to decoder-only MT models, revealing their biases and the effects of post-training.
Findings
Decoder-only models do not generally outperform encoder-decoder models on gender metrics.
Post-training reduces masculine Prior Bias.
Instruction tuning improves contextual gender awareness.
Abstract
While Large Language Models achieve state-of-the-art results across a wide range of NLP tasks, they remain prone to systematic biases. Among these, gender bias is particularly salient in MT, due to systematic differences across languages in whether and how gender is marked. As a result, translation often requires disambiguating implicit source signals into explicit gender-marked forms. In this context, standard benchmarks may capture broad disparities but fail to reflect the full complexity of gender bias in modern MT. In this paper, we extend recent frameworks on bias evaluation by: (i) introducing a novel measure coined "Prior Bias", capturing a model's default gender assumptions, and (ii) applying the framework to decoder-only MT models. Our results show that, despite their scale and state-of-the-art status, decoder-only models do not generally outperform encoder-decoder…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Adversarial Robustness in Machine Learning
