BEADs: Bias Evaluation Across Domains

Shaina Raza; Mizanur Rahman; Michael R. Zhang

arXiv:2406.04220 · cs.CL · February 20, 2026 · 1 citation

Open Access · 2 Datasets

TL;DR

This paper introduces BEADs, a comprehensive dataset for evaluating biases across diverse NLP tasks, revealing persistent biases and inconsistencies in current large language models.

Contribution

The work presents a new broad-coverage bias evaluation dataset with a standardized annotation scheme for multiple NLP tasks, enabling systematic bias assessment.

Findings

01 Models show demographic biases in specific tasks

02 Safety guardrails vary across groups and models

03 Persistent biases highlight need for better evaluation methods

Abstract

Recent advances in large language models (LLMs) have substantially improved natural language processing (NLP) applications. However, these models often inherit and amplify biases present in their training data. Although several datasets exist for bias detection, most are limited to one or two NLP tasks, typically classification or evaluation, and do not provide broad coverage across diverse task settings. To address this gap, we introduce the Bias Evaluations Across Domains (BEADs) dataset, designed to support a wide range of NLP tasks, including text classification, token classification, bias quantification, and benign language generation. A key contribution of this work is a gold-standard annotation scheme that supports both evaluation and supervised training of language models. Experiments on state-of-the-art models reveal some gaps: some models exhibit…

Taxonomy

Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Text Readability and Simplification

Methods: Focus