Detecting Emergent Intersectional Biases: Contextualized Word Embeddings Contain a Distribution of Human-like Biases

Wei Guo; Aylin Caliskan

arXiv:2006.03955 · cs.CY · May 20, 2021

TL;DR

This paper introduces new methods to detect and measure social and intersectional biases in neural language models, revealing that biases at the intersection of race and gender are particularly strong.

Contribution

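It presents the Contextualized Embedding Association Test (CEAT), which measures bias without sentence templates, and develops the Intersectional Bias Detection (IBD) and Emergent Intersectional Bias Detection (EIBD) algorithms for automatically detecting intersectional biases in contextualized and static embeddings.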

Findings

01

CEAT effectively measures overall bias magnitude across contexts.

02

Intersectional biases are strongest for groups like African American females.

03

Biases at the intersection of race and gender show high effect magnitudes.

Abstract

With the starting point that implicit human biases are reflected in the statistical regularities of language, it is possible to measure biases in English static word embeddings. State-of-the-art neural language models generate dynamic word embeddings dependent on the context in which the word appears. Current methods measure pre-defined social and intersectional biases that appear in particular contexts defined by sentence templates. Dispensing with templates, we introduce the Contextualized Embedding Association Test (CEAT), that can summarize the magnitude of overall bias in neural language models by incorporating a random-effects model. Experiments on social and intersectional biases show that CEAT finds evidence of all tested biases and provides comprehensive information on the variance of effect magnitudes of the same bias in different contexts. All the models trained on English…
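The two ingredients named in the abstract, an embedding association effect size and a random-effects summary across contexts, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a WEAT-style effect size (difference of mean cosine associations divided by the pooled standard deviation) and uses the standard DerSimonian-Laird estimator as a stand-in for the random-effects model, so the paper's exact variance estimates may differ. All function names (`cosine`, `association`, `weat_effect_size`, `combined_effect_size`) are hypothetical, and the vectors are toy inputs rather than real contextualized embeddings.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, A, B):
    """Mean similarity of w to attribute set A minus its mean similarity to B."""
    return mean(cosine(w, a) for a in A) - mean(cosine(w, b) for b in B)


def weat_effect_size(X, Y, A, B):
    """WEAT-style effect size for target sets X, Y and attribute sets A, B.

    Assumption: each vector is the embedding of one word in one sampled context.
    """
    sx = [association(x, A, B) for x in X]
    sy = [association(y, A, B) for y in Y]
    return (mean(sx) - mean(sy)) / stdev(sx + sy)


def combined_effect_size(effects, variances):
    """Summarize per-sample effect sizes with a DerSimonian-Laird random-effects model.

    effects:   effect size computed on each sample of contexts
    variances: within-sample variance of each effect size
    Returns (combined effect size, its standard error, between-sample variance).
    """
    w = [1.0 / v for v in variances]                      # fixed-effect weights
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))  # heterogeneity Q
    c = sum(w) - sum(wi * wi for wi in w) / sum(w)
    df = len(effects) - 1
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0       # between-sample variance
    v = [1.0 / (vi + tau2) for vi in variances]           # random-effects weights
    ces = sum(vi * e for vi, e in zip(v, effects)) / sum(v)
    se = math.sqrt(1.0 / sum(v))
    return ces, se, tau2


if __name__ == "__main__":
    # Toy 2-D vectors standing in for embeddings from one sample of contexts.
    X = [(1.0, 0.0), (0.9, 0.1)]
    Y = [(0.0, 1.0), (0.1, 0.9)]
    A = [(1.0, 0.0)]
    B = [(0.0, 1.0)]
    print("effect size:", weat_effect_size(X, Y, A, B))

    # Hypothetical per-sample effect sizes and within-sample variances.
    print(combined_effect_size([0.0, 1.0], [0.01, 0.01]))
```

A between-sample variance above zero in the result indicates that the effect size varies across contexts more than sampling noise alone would explain, which is the kind of spread the abstract describes as the variance of effect magnitudes of the same bias in different contexts.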

