TL;DR
This paper investigates whether context influences human judgments of toxicity and whether conditioning on context improves toxicity detection models. It finds that context affects perception, but finds no evidence that context improves classifier performance.
Contribution
It provides empirical evidence that context impacts toxicity perception and highlights the need for larger, context-annotated datasets for better toxicity detection.
Findings
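In one experiment, 5% of manually labeled posts received the opposite toxicity label when annotators were not shown context.
Context-aware models showed no evidence of a performance gain over context-unaware ones.
Context can either amplify or mitigate the toxicity that human annotators perceive in a post.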
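As a rough illustration of how a label-change figure like the one above could be computed, the following minimal sketch compares binary toxicity labels collected with and without context. The function name `flip_rate` and the sample labels are hypothetical and are not taken from the paper or its data.

```python
def flip_rate(labels_without_context, labels_with_context):
    """Fraction of posts whose binary toxicity label differs between settings."""
    if len(labels_without_context) != len(labels_with_context):
        raise ValueError("label lists must be the same length")
    if not labels_without_context:
        return 0.0
    # Count posts whose label changes once annotators see the context.
    flips = sum(a != b for a, b in zip(labels_without_context, labels_with_context))
    return flips / len(labels_without_context)


# Made-up labels for illustration only (1 = toxic, 0 = non-toxic).
no_ctx = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]
with_ctx = [0, 1, 1, 0, 1, 0, 0, 0, 1, 0]
print(flip_rate(no_ctx, with_ctx))
```

Here one of the ten invented posts changes label, so the sketch reports a flip rate of one in ten. The paper's own annotation protocol and aggregation may differ.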
Abstract
Moderation is crucial to promoting healthy online discussions. Although several 'toxicity' detection datasets and models have been published, most of them ignore the context of the posts, implicitly assuming that comments may be judged independently. We investigate this assumption by focusing on two questions: (a) does context affect the human judgement, and (b) does conditioning on context improve performance of toxicity detection systems? We experiment with Wikipedia conversations, limiting the notion of context to the previous post in the thread and the discussion title. We find that context can either amplify or mitigate the perceived toxicity of posts. Moreover, a small but significant subset of manually labeled posts (5% in one of our experiments) end up having the opposite toxicity labels if the annotators are not provided with context. Surprisingly, we also find no evidence that…
