Dutch CrowS-Pairs: Adapting a Challenge Dataset for Measuring Social Biases in Language Models for Dutch

Elza Strazda; Gerasimos Spanakis

arXiv:2507.16442·cs.CL·July 23, 2025

TL;DR

This paper introduces a Dutch version of the CrowS-Pairs dataset to measure social biases in Dutch language models, revealing significant bias variability across languages and contexts.

Contribution

It adapts the CrowS-Pairs bias measurement dataset for Dutch, enabling bias evaluation in Dutch language models for the first time.

Findings

1. Dutch models show less bias than English models.

2. Bias varies significantly across languages and contexts.

3. Assigning personas to models influences bias levels.

Abstract

Warning: This paper contains explicit statements of offensive stereotypes which might be upsetting. Language models are prone to exhibiting biases, further amplifying unfair and harmful stereotypes. Given the fast-growing popularity and wide application of these models, it is necessary to ensure safe and fair language models. Recently, considerable attention has been paid to measuring bias in language models, yet the majority of studies have focused only on English. A Dutch version of the US-specific CrowS-Pairs dataset for measuring bias in Dutch language models is introduced. The resulting dataset consists of 1463 sentence pairs that cover bias in 9 categories, such as Sexual orientation, Gender and Disability. Each pair consists of two contrasting sentences, one concerning a disadvantaged group and the other an advantaged group. Using the…
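
To make the pair structure described in the abstract concrete, the following is a minimal Python sketch, not the authors' code. The `SentencePair` field names and the `preference_rate` helper are hypothetical, and the scoring rule (counting how often a model scores the sentence about the disadvantaged group higher) is an assumption loosely modeled on the metric of the original English CrowS-Pairs work; the truncated abstract does not state the exact procedure used here.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class SentencePair:
    """One contrasting pair. Field names are illustrative, not the dataset's schema."""
    disadvantaged: str  # sentence concerning the disadvantaged group
    advantaged: str     # sentence concerning the advantaged group
    category: str       # bias category, e.g. "Gender" or "Disability"


def preference_rate(pairs: Iterable[SentencePair],
                    score: Callable[[str], float]) -> float:
    """Percentage of pairs where `score` rates the disadvantaged-group
    sentence strictly higher than the advantaged-group sentence.

    `score` is a stand-in for any per-sentence model score
    (assumption: higher means the model finds the sentence more likely).
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one sentence pair is required")
    preferred = sum(
        1 for p in pairs if score(p.disadvantaged) > score(p.advantaged)
    )
    return 100.0 * preferred / len(pairs)
```

Under this assumed setup, a rate near 50 would mean the scorer shows no systematic preference between the two sentences of a pair; any real evaluation would replace `score` with a language model's sentence-level score.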

Taxonomy

Topics: Computational and Text Analysis Methods