Automated Hate Speech Detection and the Problem of Offensive Language
Thomas Davidson, Dana Warmsley, Michael Macy, Ingmar Weber

TL;DR
This paper addresses the challenge of distinguishing hate speech from merely offensive language on social media. It trains a multi-class classifier on crowd-sourced labels and analyzes where that distinction can be made reliably and where it breaks down.
Contribution
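The paper introduces a multi-class classification approach that uses crowd-sourced labels (hate speech, offensive language only, neither) to separate hate speech from offensive language. It targets two weaknesses of earlier work: the low precision of lexical methods and the failure of previous supervised methods to distinguish the two categories.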
Findings
Racist and homophobic tweets are more reliably classified as hate speech.
Sexist tweets tend to be classified as offensive rather than hate speech.
Tweets without explicit hate keywords are more challenging to classify.
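Findings of this kind are usually read off a confusion matrix: each row is a true class, and the off-diagonal cells show where its examples end up. The sketch below is a minimal, standard-library illustration of that analysis. The label lists are made-up toy values chosen only to show the mechanics, not results from the paper, and the function names are hypothetical.

```python
# Minimal sketch: confusion matrix and per-class recall for three classes.
# All labels below are toy values invented for illustration only.

CLASSES = ["hate", "offensive", "neither"]


def confusion_matrix(true_labels, pred_labels, classes=CLASSES):
    """Return a matrix m where m[i][j] counts items of true class i predicted as class j."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for true, pred in zip(true_labels, pred_labels):
        matrix[index[true]][index[pred]] += 1
    return matrix


def per_class_recall(matrix):
    """Recall for class i is the diagonal cell divided by the row total."""
    recalls = []
    for i, row in enumerate(matrix):
        total = sum(row)
        recalls.append(row[i] / total if total else 0.0)
    return recalls


# Toy example: half of the true "hate" items are predicted as "offensive".
true_labels = ["hate", "hate", "hate", "hate",
               "offensive", "offensive", "neither", "neither"]
pred_labels = ["hate", "hate", "offensive", "offensive",
               "offensive", "offensive", "neither", "offensive"]

matrix = confusion_matrix(true_labels, pred_labels)
recall = per_class_recall(matrix)

for name, row, r in zip(CLASSES, matrix, recall):
    print(f"{name:10s} {row} recall={r:.2f}")
```

In the toy data the "hate" row leaks into the "offensive" column, which is the pattern the second finding describes for sexist tweets.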
Abstract
A key challenge for automatic hate-speech detection on social media is the separation of hate speech from other instances of offensive language. Lexical detection methods tend to have low precision because they classify all messages containing particular terms as hate speech and previous work using supervised learning has failed to distinguish between the two categories. We used a crowd-sourced hate speech lexicon to collect tweets containing hate speech keywords. We use crowd-sourcing to label a sample of these tweets into three categories: those containing hate speech, only offensive language, and those with neither. We train a multi-class classifier to distinguish between these different categories. Close analysis of the predictions and the errors shows when we can reliably separate hate speech from other offensive language and when this differentiation is more difficult. We find…
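To make the low-precision argument concrete, here is a minimal sketch of a purely lexical detector scored against three-way labels. It is not the authors' code: the lexicon entries are neutral placeholders (`slur_a`, `slur_b`), the four messages and their labels are invented toy data, and the function names are hypothetical.

```python
# Minimal sketch of a lexical baseline: flag any message containing a lexicon term.
# The lexicon and the labeled sample are placeholders invented for illustration.

LEXICON = frozenset({"slur_a", "slur_b"})


def lexical_flag(text, lexicon=LEXICON):
    """Return True if any whitespace-separated token appears in the lexicon."""
    return any(token in lexicon for token in text.lower().split())


def precision(flags, labels, positive="hate"):
    """Fraction of flagged messages whose label is the positive class."""
    flagged = [label for flag, label in zip(flags, labels) if flag]
    if not flagged:
        return 0.0
    return sum(label == positive for label in flagged) / len(flagged)


# Toy sample using the three categories from the abstract.
SAMPLE = [
    ("you are a slur_a", "hate"),
    ("my slur_a friends and I are out tonight", "offensive"),
    ("that song says slur_b a lot", "offensive"),
    ("nice weather today", "neither"),
]

flags = [lexical_flag(text) for text, _ in SAMPLE]
lexical_precision = precision(flags, [label for _, label in SAMPLE])
print(f"flagged {sum(flags)} of {len(SAMPLE)}, precision = {lexical_precision:.2f}")
```

Every message containing a lexicon term is flagged, so the two toy messages labeled "offensive" count against precision. A classifier trained on the three labels can, in principle, learn to separate them, which is the motivation the abstract gives for the multi-class setup.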
Code & Models
- 🤗 unitary/multilingual-toxic-xlm-roberta (model) · 41k dl · ♡ 28
- 🤗 unitary/toxic-bert (model) · 292k dl · ♡ 216
- 🤗 unitary/unbiased-toxic-roberta (model) · 127k dl · ♡ 28
- 🤗 renatogm24/multilingual-toxic-xlm-roberta (model) · 7 dl
- 🤗 lordofthejars/toxic-bert (model)
- 🤗 ramsleeb/toxic-bert (model) · 33 dl
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Spam and Phishing Detection · Internet Traffic Analysis and Secure E-voting
