Trustworthy Social Bias Measurement

Rishi Bommasani; Percy Liang

arXiv:2212.11672 · cs.CL · July 19, 2024 · 5 cites

TL;DR

This paper introduces a framework for measuring social bias in NLP that defines bias explicitly using principles from social science and tests whether the resulting measures warrant trust.

Contribution

It proposes DivDist, a general bias measurement framework instantiated as 5 concrete measures, and validates those measures through a testing protocol with 8 criteria.

Findings

01 The measures predict biases in US employment.

02 The measures overcome deficiencies of prior bias measures.

03 The measures demonstrate conceptual, technical, and empirical robustness.

Abstract

How do we design measures of social bias that we trust? While prior work has introduced several measures, no measure has gained widespread trust: instead, mounting evidence argues we should distrust these measures. In this work, we design bias measures that warrant trust based on the cross-disciplinary theory of measurement modeling. To combat the frequently fuzzy treatment of social bias in NLP, we explicitly define social bias, grounded in principles drawn from social science research. We operationalize our definition by proposing a general bias measurement framework DivDist, which we use to instantiate 5 concrete bias measures. To validate our measures, we propose a rigorous testing protocol with 8 testing criteria (e.g. predictive validity: do measures predict biases in US employment?). Through our testing, we demonstrate considerable evidence to trust our measures, showing they…

Code & Models

Repositories

rishibommasani/biasmeasures (PyTorch, official)

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Experimental Behavioral Economics Studies · Social Media and Politics