On Measuring and Mitigating Biased Inferences of Word Embeddings

Sunipa Dev; Tao Li; Jeff Phillips; Vivek Srikumar

arXiv:1908.09369·cs.CL·November 27, 2019

TL;DR

This paper uses natural language inference to measure the stereotypes carried by word embeddings and shows that bias mitigation reduces the resulting invalid inferences, both for static embeddings (GloVe) and, for gender bias, for the static components of contextualized embeddings (ELMo, BERT).

Contribution

It proposes a bias measurement mechanism using natural language inference and demonstrates bias mitigation techniques for static and contextualized embeddings.
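
A minimal sketch of how such a measurement could be set up, assuming sentence pairs that differ only in their subject, so that the correct label for every pair is neutral. The template, word lists, label order, and function names below are hypothetical illustrations, not taken from the paper, and the probabilities would come from whatever NLI model is being evaluated.

```python
from itertools import product

# Hypothetical template and word lists, chosen for illustration only.
TEMPLATE = "The {subject} bought a bagel."
OCCUPATIONS = ["doctor", "nurse"]
GENDERED = ["man", "woman"]

# Assumed label order for the model's output probabilities.
LABELS = ("entailment", "contradiction", "neutral")


def make_pairs(template, occupations, gendered):
    """Build (premise, hypothesis) pairs that differ only in the subject."""
    return [
        (template.format(subject=occ), template.format(subject=gen))
        for occ, gen in product(occupations, gendered)
    ]


def neutrality_scores(probs):
    """Summarize how often a model stays neutral on the pairs.

    probs: one (entailment, contradiction, neutral) triple per pair.
    Returns the mean neutral probability and the fraction of pairs
    whose most likely label is neutral.
    """
    neutral_idx = LABELS.index("neutral")
    mean_neutral = sum(p[neutral_idx] for p in probs) / len(probs)
    predicted_neutral = sum(
        1 for p in probs
        if max(range(len(p)), key=p.__getitem__) == neutral_idx
    )
    return mean_neutral, predicted_neutral / len(probs)
```

Because nothing in "The doctor bought a bagel." says whether the doctor is a man or a woman, a model that predicts entailment or contradiction for such a pair is making an invalid inference. Lower neutrality scores therefore indicate stronger stereotypical associations.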

Findings

01

Stereotypes in word embeddings cause invalid inferences, which natural language inference can measure

02

Bias mitigation techniques are effective on GloVe embeddings

03

For gender bias, mitigation extends to ELMo and BERT when applied only to their static components

Abstract

Word embeddings carry stereotypical connotations from the text they are trained on, which can lead to invalid inferences in downstream models that rely on them. We use this observation to design a mechanism for measuring stereotypes using the task of natural language inference. We demonstrate a reduction in invalid inferences via bias mitigation strategies on static word embeddings (GloVe). Further, we show that for gender bias, these techniques extend to contextualized embeddings when applied selectively only to the static components of contextualized embeddings (ELMo, BERT).
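
The abstract does not name the mitigation strategies. As a rough illustration, one common option for static embeddings is linear projection: estimate a bias direction and remove each word vector's component along it. The sketch below assumes that technique; the helper names and the choice of a single word pair to define the direction are hypothetical and not necessarily the authors' exact method.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def unit(v):
    """Scale v to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def bias_direction(vec_a, vec_b):
    """Unit vector along the difference of two word vectors.

    Assumption: a single pair such as ('he', 'she') defines the
    direction; other ways of estimating it are possible.
    """
    return unit([a - b for a, b in zip(vec_a, vec_b)])


def project_out(v, direction):
    """Remove the component of v that lies along the unit vector `direction`."""
    scale = dot(v, direction)
    return [x - scale * d for x, d in zip(v, direction)]
```

After projection, a word vector has no component along the estimated direction, so downstream models can no longer read that signal from it. For contextualized models, the abstract indicates that such a step is applied only to the static components rather than to the contextual layers.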
