DisaBench: A Participatory Evaluation Framework for Disability Harms in Language Models

Eugenia Kim; Ioana Tanase; Christina Mallon

arXiv:2605.12702 · cs.AI · May 14, 2026

TL;DR

DisaBench is a framework for evaluating disability-related harms in language models. It was built with community involvement and emphasizes nuanced, context-aware analysis.

Contribution

The paper introduces a taxonomy, a dataset, and an evaluation methodology, co-created with disabled communities, for evaluating subtle and intersectional harms in language models.

Findings

01 Harm rates vary sharply by disability type.

02 Terminology-driven harm is culturally and temporally bound.

03 Standard safety tests catch overt failures but miss subtle harms that domain experts can detect.

Abstract

General-purpose safety benchmarks for large language models do not adequately evaluate disability-related harms. We introduce DisaBench: a taxonomy of twelve disability harm categories co-created with people with disabilities and red teaming experts, a taxonomy-driven evaluation methodology that pairs benign and adversarial prompts across seven life domains, and a dataset of 175 prompts with human-annotated labels on 525 prompt-response pairs. Annotation by four evaluators with lived disability experience reveals three findings: harm rates vary sharply by disability type and will compound in non-text modalities, terminology-driven harm is culturally and temporally bound rather than universally assessable, and standard safety evaluation catches overt failures while missing the subtle harms that only domain expertise can recognize. Disability harm is simultaneously personal,…
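
The first finding rests on comparing harm rates across disability types over the annotated prompt-response pairs. As a rough illustration of that aggregation, here is a minimal sketch in Python. The record layout, the field names (`prompt_id`, `disability_type`, `prompt_kind`, `harmful`), and the function `harm_rate_by_type` are all hypothetical, not the paper's released schema, and the four toy records are placeholders, not DisaBench data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotatedPair:
    """One human-labelled prompt-response pair (hypothetical layout)."""
    prompt_id: str
    disability_type: str   # placeholder labels below, not the paper's categories
    prompt_kind: str       # "benign" or "adversarial"
    harmful: bool          # annotator judgement for this response


def harm_rate_by_type(pairs):
    """Return the fraction of pairs labelled harmful for each disability type."""
    totals = defaultdict(int)
    harmful = defaultdict(int)
    for pair in pairs:
        totals[pair.disability_type] += 1
        harmful[pair.disability_type] += int(pair.harmful)
    return {dtype: harmful[dtype] / totals[dtype] for dtype in totals}


# Toy records for illustration only; they do not reflect the paper's results.
toy_pairs = [
    AnnotatedPair("p1", "type_a", "benign", False),
    AnnotatedPair("p1", "type_a", "adversarial", True),
    AnnotatedPair("p2", "type_b", "benign", False),
    AnnotatedPair("p2", "type_b", "adversarial", False),
]

rates = harm_rate_by_type(toy_pairs)
for dtype, rate in sorted(rates.items()):
    print(f"{dtype}: {rate:.2f}")
```

Grouping by `prompt_kind` or by life domain instead would follow the same pattern. How the paper reconciles labels from its four evaluators is not stated in the abstract, so this sketch assumes a single resolved label per pair.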
