Extending Challenge Sets to Uncover Gender Bias in Machine Translation: Impact of Stereotypical Verbs and Adjectives
Jonas-Dario Troles, Ute Schmid

TL;DR
This paper extends an existing challenge set to measure gender bias in machine translation by including stereotypical adjectives and verbs, revealing biases across multiple commercial MT systems.
Contribution
It introduces WiBeMT, an expanded challenge set with over 70,000 sentences, to better assess gender bias in MT systems beyond occupation translation.
Findings
All three MT systems exhibit gender bias.
Adjectives significantly influence gender bias.
Verbs have a lesser but notable impact.
Abstract
Human gender bias is reflected in language and text production. Because state-of-the-art machine translation (MT) systems are trained on large corpora of text, mostly generated by humans, gender bias can also be found in MT. For instance, when occupations are translated from a language like English, which mostly uses gender-neutral words, to a language like German, which mostly uses a feminine and a masculine version of an occupation, the MT system must make a decision. Recent research showed that MT systems are biased towards stereotypical translations of occupations. In 2019, the first, and so far only, challenge set explicitly designed to measure the extent of gender bias in MT systems was published. In this set, the measurement of gender bias is based solely on the translation of occupations. In this paper we present an extension of this challenge set, called WiBeMT, with…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Natural Language Processing Techniques · Text Readability and Simplification
