Blind Men and the Elephant: Diverse Perspectives on Gender Stereotypes in Benchmark Datasets
Mahdi Zakizadeh, Mohammad Taher Pilehvar

TL;DR
This paper argues that measuring gender stereotypes in language models is harder than current benchmarks assume: each benchmark captures only part of the picture, and balancing benchmark data across the components of gender stereotypes improves the agreement between benchmarks.
Contribution
It shows the limitations of existing intrinsic stereotype benchmarks and applies a framework from social psychology to balance their data across the components of gender stereotypes.
Findings
Balancing data improves correlation between stereotype benchmarks.
Current benchmarks only capture partial facets of gender bias.
Even simple balancing techniques can significantly improve the consistency of bias measurement across benchmarks.
Abstract
Accurately measuring gender stereotypical bias in language models is a complex task with many hidden aspects. Current benchmarks have underestimated this multifaceted challenge and failed to capture the full extent of the problem. This paper examines the inconsistencies between intrinsic stereotype benchmarks. We propose that currently available benchmarks each capture only partial facets of gender stereotypes, and when considered in isolation, they provide just a fragmented view of the broader landscape of bias in language models. Using StereoSet and CrowS-Pairs as case studies, we investigated how data distribution affects benchmark results. By applying a framework from social psychology to balance the data of these benchmarks across various components of gender stereotypes, we demonstrated that even simple balancing techniques can significantly improve the correlation between…
Taxonomy
Topics: Gender Politics and Representation
