Social Bias Frames: Reasoning about Social and Power Implications of Language
Maarten Sap, Saadia Gabriel, Lianhui Qin, Dan Jurafsky, Noah A. Smith, Yejin Choi

TL;DR
This paper introduces Social Bias Frames, a formalism for modeling social biases in language, supported by a large annotated corpus, and evaluates neural models' ability to detect and explain social biases in text.
Contribution
The paper proposes Social Bias Frames, a new formalism for capturing the social biases and stereotypes that language implies, along with a large annotated dataset for training and evaluating models.
Findings
Neural models reach 80% F1 on the high-level task of detecting whether a statement projects social bias.
Models struggle to generate detailed Social Bias Frame explanations.
The study highlights the need for combining pragmatic inference with commonsense reasoning.
Abstract
Warning: this paper contains content that may be offensive or upsetting. Language has the power to reinforce stereotypes and project social biases onto others. At the core of the challenge is that it is rarely what is stated explicitly, but rather the implied meanings, that frame people's judgments about others. For example, given a statement that "we shouldn't lower our standards to hire more women," most listeners will infer the implicature intended by the speaker -- that "women (candidates) are less qualified." Most semantic formalisms, to date, do not capture such pragmatic implications in which people express social biases and power differentials in language. We introduce Social Bias Frames, a new conceptual formalism that aims to model the pragmatic frames in which people project social biases and stereotypes onto others. In addition, we introduce the Social Bias Inference…
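To make the idea of a frame concrete, the hiring example from the abstract can be written as a small record that pairs a statement with the group it targets and the meaning it implies. This is only an illustrative sketch: the class name `SocialBiasFrame`, its three fields, and the `describe` helper are assumptions made here for illustration, not the paper's actual annotation schema or code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SocialBiasFrame:
    """Hypothetical, simplified record; field names are illustrative only."""
    statement: str          # what was literally said
    targeted_group: str     # the group the statement is about
    implied_statement: str  # the implicature a listener is likely to infer


def describe(frame: SocialBiasFrame) -> str:
    """Render the frame as a one-line, human-readable explanation."""
    return (
        f'"{frame.statement}" implies "{frame.implied_statement}" '
        f"(targeted group: {frame.targeted_group})"
    )


# The example given in the abstract, encoded as a frame.
frame = SocialBiasFrame(
    statement="we shouldn't lower our standards to hire more women",
    targeted_group="women",
    implied_statement="women (candidates) are less qualified",
)

print(describe(frame))
```

The point of the sketch is that the implied statement is stored separately from the literal one: detecting that a statement is biased is a classification problem, while filling in `implied_statement` requires generating an explanation, which is the part the findings report models struggling with.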
