TL;DR
This paper introduces new methods to detect and measure social and intersectional biases in neural language models, revealing that biases at the intersection of race and gender are particularly strong.
Contribution
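It presents the Contextualized Embedding Association Test (CEAT), which measures bias without sentence templates, and develops the Intersectional Bias Detection (IBD) and Emergent Intersectional Bias Detection (EIBD) algorithms for automatic detection of intersectional biases in contextualized and static embeddings.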
Findings
CEAT effectively measures overall bias magnitude across contexts.
Intersectional biases are strongest for groups like African American females.
Biases at the intersection of race and gender show high effect magnitudes.
Abstract
With the starting point that implicit human biases are reflected in the statistical regularities of language, it is possible to measure biases in English static word embeddings. State-of-the-art neural language models generate dynamic word embeddings dependent on the context in which the word appears. Current methods measure pre-defined social and intersectional biases that appear in particular contexts defined by sentence templates. Dispensing with templates, we introduce the Contextualized Embedding Association Test (CEAT), which can summarize the magnitude of overall bias in neural language models by incorporating a random-effects model. Experiments on social and intersectional biases show that CEAT finds evidence of all tested biases and provides comprehensive information on the variance of effect magnitudes of the same bias in different contexts. All the models trained on English…
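The abstract says CEAT summarizes overall bias by incorporating a random-effects model. The sketch below shows one common way such a model pools per-context effect sizes into a single combined effect size. It assumes the standard DerSimonian-Laird estimator for the between-context variance; this summary does not state which estimator the paper uses. The function name `combined_effect_size` and the input values are hypothetical and chosen only for illustration.

```python
def combined_effect_size(effects, variances):
    """Pool per-context effect sizes with a random-effects model.

    Assumption: DerSimonian-Laird estimate of the between-context
    variance (tau^2); the paper's exact estimator is not given here.
    effects[i]   -- bias effect size measured in context sample i
    variances[i] -- within-sample variance of that effect size
    """
    # Fixed-effect (inverse-variance) weights and weighted mean.
    w = [1.0 / v for v in variances]
    fixed_mean = sum(wi * e for wi, e in zip(w, effects)) / sum(w)

    # Cochran's Q: weighted squared deviation from the fixed-effect mean.
    q = sum(wi * (e - fixed_mean) ** 2 for wi, e in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi * wi for wi in w) / sum(w)

    # Between-context variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-sample variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(ws * e for ws, e in zip(w_star, effects)) / sum(w_star)
    return pooled, tau2


# Hypothetical effect sizes from three sampled sets of contexts.
pooled, tau2 = combined_effect_size([0.9, 0.4, 1.2], [0.05, 0.04, 0.06])
print(f"combined effect size: {pooled:.3f}, between-context variance: {tau2:.3f}")
```

A nonzero between-context variance indicates that the same bias has different magnitudes in different contexts, which is the variance information the abstract refers to.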
