CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models
Nikita Nangia, Clara Vania, Rasika Bhalerao, Samuel R. Bowman

TL;DR
CrowS-Pairs is a benchmark dataset designed to measure social biases in masked language models by comparing stereotypical and less stereotypical sentences across multiple bias categories, revealing prevalent biases in current models.
Contribution
Introduces CrowS-Pairs, a new dataset with 1508 examples to evaluate social biases in masked language models across nine bias types, focusing on disadvantaged groups.
Findings
All three evaluated MLMs favor the more stereotyping sentences across bias categories.
CrowS-Pairs can serve as a benchmark for bias evaluation.
The dataset highlights the pervasiveness of social biases in language models.
Abstract
Pretrained language models, especially masked language models (MLMs), have seen success across many NLP tasks. However, there is ample evidence that they use the cultural biases that are undoubtedly present in the corpora they are trained on, implicitly creating harm with biased representations. To measure some forms of social bias in language models against protected demographic groups in the US, we introduce the Crowdsourced Stereotype Pairs benchmark (CrowS-Pairs). CrowS-Pairs has 1508 examples that cover stereotypes dealing with nine types of bias, like race, religion, and age. In CrowS-Pairs, a model is presented with two sentences: one that is more stereotyping and another that is less stereotyping. The data focuses on stereotypes about historically disadvantaged groups and contrasts them with advantaged groups. We find that all three of the widely-used MLMs we evaluate…
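The paired setup described in the abstract can be illustrated with a minimal sketch. It assumes that a model "prefers" whichever sentence of a pair it scores higher, and it reports the percentage of pairs where the more stereotyping sentence wins, overall and per bias type. The `Pair` type, the `preference_rates` function, the placeholder sentence labels, and the toy scores are all hypothetical names and values chosen for illustration; they are not taken from the dataset or from the authors' code. In practice, `score` could be any sentence-level score from a masked language model, such as a pseudo-log-likelihood.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, NamedTuple


class Pair(NamedTuple):
    """One benchmark item (hypothetical container, not the dataset's schema)."""
    more_stereo: str  # sentence that is more stereotyping
    less_stereo: str  # minimally different, less stereotyping sentence
    bias_type: str    # e.g. "race", "religion", "age"


def preference_rates(
    pairs: Iterable[Pair], score: Callable[[str], float]
) -> Dict[str, float]:
    """Percent of pairs where `score` ranks the more stereotyping sentence
    strictly higher, reported overall and per bias type."""
    wins: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for pair in pairs:
        preferred = score(pair.more_stereo) > score(pair.less_stereo)
        for key in ("overall", pair.bias_type):
            totals[key] += 1
            wins[key] += int(preferred)
    return {key: 100.0 * wins[key] / totals[key] for key in totals}


# Toy stand-in for a model: made-up scores keyed by placeholder labels.
toy_scores = {
    "stereo-1": -10.0, "anti-1": -12.0,
    "stereo-2": -9.0, "anti-2": -8.5,
    "stereo-3": -7.0, "anti-3": -7.5,
}
toy_pairs = [
    Pair("stereo-1", "anti-1", "age"),
    Pair("stereo-2", "anti-2", "age"),
    Pair("stereo-3", "anti-3", "religion"),
]

rates = preference_rates(toy_pairs, toy_scores.__getitem__)
for name, pct in sorted(rates.items()):
    print(f"{name}: {pct:.1f}%")
```

Under this assumption, a scorer with no systematic preference between the two sentences of a pair would land near 50%, while values well above 50% indicate that the more stereotyping sentences are favored.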
Code & Models
- 🤗 aieng-lab/bert-base-cased-gradiend-gender-debiased · model
- 🤗 aieng-lab/bert-large-cased-gradiend-gender-debiased · model · 6 dl
- 🤗 aieng-lab/distilbert-base-cased-gradiend-gender-debiased · model · 6 dl
- 🤗 aieng-lab/roberta-large-gradiend-gender-debiased · model · 4 dl
- 🤗 aieng-lab/gpt2-gradiend-gender-debiased · model · 3 dl
- 🤗 aieng-lab/Llama-3.2-3B-gradiend-gender-debiased · model · 5 dl
- 🤗 aieng-lab/Llama-3.2-3B-Instruct-gradiend-gender-debiased · model · 5 dl
