COBRA Frames: Contextual Reasoning about Effects and Harms of Offensive Statements
Xuhui Zhou, Hao Zhu, Akhila Yerukola, Thomas Davidson, Jena D. Hwang, Swabha Swayamdipta, and Maarten Sap

TL;DR
This paper introduces COBRA frames, a formalism for understanding offensive statements within their social context, supported by a new dataset and models that demonstrate the importance of context in detecting harm and intent.
Contribution
We develop COBRA frames, the first context-aware formalism for explaining offensiveness, and create COBRACORPUS, a large dataset with context and explanations for offensive statements.
Findings
Context-aware models outperform context-agnostic ones in explaining offensiveness.
Explanations improve significantly when models incorporate social context.
Context inversion can change the perceived offensiveness of statements.
Abstract
Warning: This paper contains content that may be offensive or upsetting.
Understanding the harms and offensiveness of statements requires reasoning about the social and situational context in which statements are made. For example, the utterance "your English is very good" may implicitly signal an insult when uttered by a white man to a non-white colleague, but uttered by an ESL teacher to their student would be interpreted as a genuine compliment. Such contextual factors have been largely ignored by previous approaches to toxic language detection. We introduce COBRA frames, the first context-aware formalism for explaining the intents, reactions, and harms of offensive or biased statements grounded in their social and situational context. We create COBRACORPUS, a dataset of 33k potentially offensive statements paired with machine-generated contexts and free-text explanations of…
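To make the abstract's example concrete, here is a minimal Python sketch of how one statement might be paired with different contexts and free-text interpretations. The class name, field names, and helper function are hypothetical and chosen only for illustration; they are not the paper's schema or the COBRACORPUS format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextualizedStatement:
    """Hypothetical record: one statement in one social context."""
    statement: str
    speaker: str          # who says it (assumed field)
    listener: str         # who it is said to (assumed field)
    interpretation: str   # free-text reading of the statement in this context


def context_sensitive(records):
    """Return statements whose interpretation differs across contexts."""
    seen = {}
    for record in records:
        seen.setdefault(record.statement, set()).add(record.interpretation)
    return [stmt for stmt, readings in seen.items() if len(readings) > 1]


# The two contexts described in the abstract.
EXAMPLES = [
    ContextualizedStatement(
        statement="your English is very good",
        speaker="a white man",
        listener="a non-white colleague",
        interpretation="implicit insult",
    ),
    ContextualizedStatement(
        statement="your English is very good",
        speaker="an ESL teacher",
        listener="their student",
        interpretation="genuine compliment",
    ),
]

print(context_sensitive(EXAMPLES))
```

A context-agnostic classifier sees only the `statement` field, so it must assign both records the same label; the sketch shows why the speaker and listener have to be part of the input.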
Taxonomy
Topics: Hate Speech and Cyberbullying Detection
