A Web of Hate: Tackling Hateful Speech in Online Social Spaces
Haji Mohammad Saleem, Kelly P Dillon, Susan Benesch, Derek Ruths

TL;DR
This paper introduces a method for detecting hateful speech on online social platforms that trains on content from self-identifying hateful communities, avoiding the limitations of keyword-based detection and improving on current state-of-the-art approaches.
Contribution
The paper proposes a community-based training approach for hateful speech detection that outperforms existing keyword-based methods across multiple platforms.
Findings
Significantly better detection accuracy than keyword-based methods
Effective across various social media platforms
Reduces reliance on expensive manual annotation
Abstract
Online social platforms are beset with hateful speech - content that expresses hatred for a person or group of people. Such content can frighten, intimidate, or silence platform users, and some of it can inspire other users to commit violence. Despite widespread recognition of the problems posed by such content, reliable solutions even for detecting hateful speech are lacking. In the present work, we establish why keyword-based methods are insufficient for detection. We then propose an approach to detecting hateful speech that uses content produced by self-identifying hateful communities as training data. Our approach bypasses the expensive annotation process often required to train keyword systems and performs well across several established platforms, making substantial improvements over current state-of-the-art approaches.
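The core idea in the abstract, labeling training data by the community it came from rather than by manual annotation, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the bag-of-words Naive Bayes classifier is a stand-in model, the labels "target" and "baseline" are hypothetical names, and the posts are neutral placeholder strings rather than real data.

```python
import math
from collections import Counter


def tokenize(text):
    # Simplest possible tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def train(posts_by_label):
    """Fit a multinomial Naive Bayes model.

    posts_by_label maps a label to a list of posts. The label comes from the
    community a post was collected from, so no per-post annotation is needed.
    """
    word_counts = {label: Counter() for label in posts_by_label}
    doc_counts = {}
    vocab = set()
    for label, posts in posts_by_label.items():
        doc_counts[label] = len(posts)
        for post in posts:
            tokens = tokenize(post)
            word_counts[label].update(tokens)
            vocab.update(tokens)
    return {"word_counts": word_counts, "doc_counts": doc_counts, "vocab": vocab}


def predict(model, text):
    """Return the label with the highest log-posterior for the given text."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, float("-inf")
    for label, counts in model["word_counts"].items():
        score = math.log(model["doc_counts"][label] / total_docs)
        total_words = sum(counts.values())
        for token in tokenize(text):
            if token in model["vocab"]:  # ignore words never seen in training
                # Laplace (add-one) smoothing avoids log(0) for unseen pairs.
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical placeholder data: "target" posts stand in for content from a
# self-identifying hateful community, "baseline" posts for unrelated communities.
posts_by_label = {
    "target": [
        "hostile_term_a those people",
        "those people hostile_term_b",
        "hostile_term_a hostile_term_b again",
    ],
    "baseline": [
        "great game last night",
        "new recipe for dinner",
        "game night with friends",
    ],
}

model = train(posts_by_label)
print(predict(model, "hostile_term_a people"))
print(predict(model, "game night dinner"))
```

Unlike a keyword list, a model trained this way weighs every word in a post against both communities, so a single flagged term does not decide the outcome on its own.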
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Social Media and Politics · Bullying, Victimization, and Aggression
