Datasets for Fairness in Language Models: An In-Depth Survey

Jiale Zhang; Zichong Wang; Avash Palikhe; Zhipeng Yin; Wenbin Zhang

arXiv:2506.23411 · cs.CL · September 23, 2025

TL;DR

This survey critically examines fairness datasets for language models, analyzing their limitations and proposing a unified evaluation framework to improve fairness assessment and guide future benchmark development.

Contribution

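The survey analyzes the most widely used fairness datasets in language model research, introduces a unified evaluation framework for surfacing demographic disparities, and applies that framework to sixteen popular datasets to inform how fairness benchmarks are selected and combined.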

Findings

01. Identified biases and limitations in current fairness datasets

02. Proposed a unified framework for evaluating demographic disparities

03. Highlighted the need for broader social context in future benchmarks

Abstract

Despite the growing reliance on fairness benchmarks to evaluate language models, the datasets that underpin these benchmarks remain critically underexamined. This survey addresses that overlooked foundation by offering a comprehensive analysis of the most widely used fairness datasets in language model research. To ground this analysis, we characterize each dataset across key dimensions, including provenance, demographic scope, annotation design, and intended use, revealing the assumptions and limitations baked into current evaluation practices. Building on this foundation, we propose a unified evaluation framework that surfaces consistent patterns of demographic disparities across benchmarks and scoring metrics. Applying this framework to sixteen popular datasets, we uncover overlooked biases that may distort conclusions about model fairness and offer guidance on selecting, combining,…

Code & Models

Repositories

vanbantruong/fairness-in-large-language-models
Official