Mitigating Gender Bias in Contextual Word Embeddings
Navya Yarrabelly, Vinay Damodaran, Feng-Guang Su

TL;DR
This paper introduces a new method for reducing gender bias in contextual word embeddings through a novel training objective, while maintaining task performance, and proposes new bias evaluation metrics aligned with normative reasoning.
Contribution
It presents a novel objective function for masked-language modeling that mitigates gender bias in contextual embeddings and introduces new bias measurement metrics.
Findings
The proposed method effectively reduces gender bias in embeddings.
Bias in static embeddings mainly originates from stereotypical names.
The approach preserves downstream task performance.
Abstract
Word embeddings have been shown to produce remarkable results in tackling the vast majority of NLP-related tasks. Unfortunately, word embeddings also capture the stereotypical biases that are prevalent in society, affecting the predictive performance of the embeddings when used in downstream tasks. While various techniques have been proposed \cite{bolukbasi2016man, zhao2018learning} and criticized \cite{gonen2019lipstick} for static embeddings, very little work has focused on mitigating bias in contextual embeddings. In this paper, we propose a novel objective function for MLM (Masked-Language Modeling) which largely mitigates the gender bias in contextual embeddings and also preserves the performance for downstream tasks. Since previous works on measuring bias in contextual embeddings lack normative reasoning, we also propose novel evaluation metrics that are straightforward and…
Taxonomy
Topics: Text Readability and Simplification · Hate Speech and Cyberbullying Detection · Topic Modeling
