TL;DR
This paper introduces AmbGIMT, a benchmark for evaluating gender bias in machine translation beyond binary genders, using ambiguous attitude words and an Emotional Attitude Score, revealing significant biases in current models.
Contribution
The study presents a novel benchmark and evaluation method for non-binary gender bias in machine translation, expanding beyond traditional binary-focused assessments.
Findings
Translation quality is lower, and bias is greater, in non-binary gender contexts.
Prompt constraints can reduce but not eliminate bias.
Models exhibit more negative attitudes in non-binary contexts.
Abstract
Gender bias has been a focal point in the study of bias in machine translation and language models. Existing machine translation gender bias evaluations are primarily focused on male and female genders, limiting the scope of the evaluation. To assess gender bias accurately, these studies often rely on calculating the accuracy of gender pronouns or the masculine and feminine attributes of grammatical gender via the stereotypes triggered by occupations or sentiment words (i.e., clear positive or negative attitude), which cannot extend to non-binary groups. This study presents a benchmark, AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words), which assesses gender bias beyond binary gender. Meanwhile, we propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words. In evaluating…
