How to Measure Gender Bias in Machine Translation: Optimal Translators, Multiple Reference Points

Anna Farkas, Renáta Németh

arXiv:2011.06445·stat.ML·December 21, 2021·5 cites

Open Access

TL;DR

This study systematically measures gender bias in Google Translate by comparing translations to an optimal non-biased translator, revealing prevalent bias against women and the influence of occupation-related words.

Contribution

It introduces a fair measure for gender bias in machine translation using multiple reference points and analyzes the impact of occupation and adjectives on bias.

Findings

01

Translations biased against women are more frequent than translations biased against men.

02

Translations align more closely with how occupations are perceived than with occupational statistics.

03

Occupation words influence bias more than adjectives.

Abstract

In this paper, as a case study, we present a systematic study of gender bias in machine translation with Google Translate. We translated sentences containing names of occupations from Hungarian, a language with gender-neutral pronouns, into English. Our aim was to present a fair measure for bias by comparing the translations to an optimal non-biased translator. When assessing bias, we used the following reference points: (1) the distribution of men and women among occupations in both the source and the target language countries, as well as (2) the results of a Hungarian survey that examined if certain jobs are generally perceived as feminine or masculine. We also studied how expanding sentences with adjectives referring to occupations affects the gender of the translated pronouns. As a result, we found bias against both genders, but biased results against women are much more frequent.…
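
To make the idea of comparing translations against reference points concrete, here is a minimal sketch in Python. It is not the authors' method: it assumes, purely for illustration, that an unbiased translator picks the pronoun of the majority gender for each occupation under a given reference point. The names `Item`, `expected_pronoun`, `count_bias`, `labour_stats`, and `perception_survey` are hypothetical, and all numbers are made up rather than taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Item:
    occupation: str
    translated_pronoun: str  # "he" or "she", as produced by the MT system
    female_share: dict       # reference name -> assumed share of women, in [0, 1]


def expected_pronoun(female_share: float) -> str:
    # Assumption for this sketch: the unbiased baseline follows the majority gender.
    return "she" if female_share > 0.5 else "he"


def count_bias(items, reference):
    """Count translations that deviate from the baseline for one reference point."""
    counts = {"against_women": 0, "against_men": 0, "consistent": 0}
    for item in items:
        expected = expected_pronoun(item.female_share[reference])
        if item.translated_pronoun == expected:
            counts["consistent"] += 1
        elif expected == "she":
            # Baseline expects "she" but the system produced "he".
            counts["against_women"] += 1
        else:
            # Baseline expects "he" but the system produced "she".
            counts["against_men"] += 1
    return counts


# Made-up illustrative data, not results from the paper.
items = [
    Item("nurse", "she", {"labour_stats": 0.90, "perception_survey": 0.95}),
    Item("teacher", "he", {"labour_stats": 0.70, "perception_survey": 0.60}),
    Item("engineer", "he", {"labour_stats": 0.20, "perception_survey": 0.10}),
    Item("cook", "she", {"labour_stats": 0.40, "perception_survey": 0.55}),
]

for reference in ("labour_stats", "perception_survey"):
    print(reference, count_bias(items, reference))
```

Running the same count against several reference points shows how the verdict for a single translation can change with the baseline: in this toy data, "cook" counts as biased against men under the labour statistics but as consistent under the perception survey.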

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Text Readability and Simplification · Gender Studies in Language