Mitigating Gender Stereotypes in Hindi and Marathi

Neeraja Kirtane; Tanvi Anand

arXiv:2205.05901 · cs.CL · May 13, 2022 · 1 citation

TL;DR

This paper evaluates gender stereotypes in Hindi and Marathi NLP systems, creating a dataset and applying debiasing techniques to reduce bias in word embeddings, addressing a gap in Indic language NLP research.

Contribution

It introduces a novel dataset and bias measurement methods tailored for gendered Indic languages and proposes effective debiasing techniques for Hindi and Marathi.

Findings

01. Bias reduction achieved in embeddings after applying proposed techniques

02. Gender bias significantly present in occupation and emotion words

03. Debiasing methods outperform baseline in bias metrics

Abstract

As the use of natural language processing increases in our day-to-day lives, the need to address the gender bias inherent in these systems also grows. This is because the inherent bias interferes with the semantic structure of the output of these systems while performing tasks like machine translation. While research is being done in English to quantify and mitigate bias, debiasing methods for Indic languages are either relatively nascent or absent altogether for some languages. Most Indic languages are gendered, i.e., each noun is assigned a gender according to the language's grammar rules. As a consequence, evaluation differs from what is done in English. This paper evaluates gender stereotypes in Hindi and Marathi. The methodologies differ from those used for English because some words have masculine and feminine counterparts. We…

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Natural Language Processing Techniques · Text Readability and Simplification