Evaluating Gender Bias in Hindi-English Machine Translation

Gauri Gupta; Krithika Ramesh; Sanjay Singh

arXiv:2106.08680·cs.CL·June 17, 2021

Open Access

TL;DR

This paper assesses gender bias in Hindi-English machine translation, adapting bias measurement metrics to account for Hindi's grammatical gender variations and comparing biases across different embedding models.

Contribution

It introduces a Hindi-specific modification of the TGBI bias metric and evaluates bias in translation systems, addressing a gap in bias measurement for Indic languages.

Findings

01

Bias varies significantly across different embedding models

02

Modified TGBI effectively captures gender bias in Hindi context

03

Comparison reveals differences between pre-trained and translation model biases

Abstract

With language models being deployed increasingly in the real world, it is essential to address the issue of the fairness of their outputs. The word embedding representations of these language models often implicitly draw unwanted associations that form a social bias within the model. The nature of gendered languages like Hindi poses an additional problem to the quantification and mitigation of bias, owing to the change in the form of the words in the sentence based on the gender of the subject. Additionally, little work has been done on measuring and debiasing systems for Indic languages. In our work, we attempt to evaluate and quantify the gender bias within a Hindi-English machine translation system. We implement a modified version of the existing TGBI metric based on the grammatical considerations for Hindi. We also compare and contrast the resulting bias measurements…
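For readers unfamiliar with TGBI, the following is a minimal sketch of the original, unmodified metric as it is commonly described (Cho et al., 2019). It is not the Hindi-specific variant this paper proposes, whose details are not given on this page. For each set of source sentences, the score is sqrt(p_f * p_m + p_n), where p_f, p_m, and p_n are the fractions of translations rendered as female, male, and gender-neutral; TGBI is the mean of these scores over all sets. The function names and the string labels "female", "male", and "neutral" are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt

# Hypothetical label vocabulary for the gender of each translated output.
LABELS = {"female", "male", "neutral"}


def sentence_set_score(labels):
    """Score one sentence set: sqrt(p_f * p_m + p_n).

    `labels` holds one gender label per translated sentence.
    """
    if not labels:
        raise ValueError("sentence set must not be empty")
    unknown = set(labels) - LABELS
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    counts = Counter(labels)
    n = len(labels)
    p_f = counts["female"] / n
    p_m = counts["male"] / n
    p_n = counts["neutral"] / n
    return sqrt(p_f * p_m + p_n)


def tgbi(sentence_sets):
    """Average the per-set scores over all sentence sets."""
    if not sentence_sets:
        raise ValueError("need at least one sentence set")
    scores = [sentence_set_score(s) for s in sentence_sets]
    return sum(scores) / len(scores)
```

Under this definition a set translated entirely as neutral scores 1, a set translated entirely as one gender scores 0, and an even male/female split with no neutral outputs scores 0.5.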

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Hate Speech and Cyberbullying Detection