The Enforcement and Feasibility of Hate Speech Moderation on Twitter

Manuel Tonneau; Dylan Thurgood; Diyi Liu; Niyati Malhotra; Victor Orozco-Olvera; Ralph Schroeder; Scott A. Hale; Manoel Horta Ribeiro; Paul Röttger; Samuel P. Fraiberger

arXiv:2604.12289·cs.CY·April 15, 2026

TL;DR

This study audits Twitter's hate speech moderation, finding that most hateful tweets remain online months after posting, and analyzes the technical and institutional factors that affect enforcement.

Contribution

It provides a comprehensive global audit of hate speech enforcement on Twitter, highlighting technical limitations and institutional resource allocation issues.

Findings

01. 80% of hateful tweets remain online after five months

02. Automated detection systems struggle with false positives but aid human review

03. Reducing user exposure to hate speech is economically feasible with current moderation strategies

Abstract

Online hate speech is associated with substantial social harms, yet it remains unclear how consistently platforms enforce hate speech policies or whether enforcement is feasible at scale. We address these questions through a global audit of hate speech moderation on Twitter (now X). Using a complete 24-hour snapshot of public tweets, we construct representative samples comprising 540,000 tweets annotated for hate speech by trained annotators across eight major languages. Five months after posting, 80% of hateful tweets remain online, including explicitly violent hate speech. Such tweets are no more likely to be removed than non-hateful tweets, with neither severity nor visibility increasing the likelihood of removal. We then examine whether these enforcement gaps reflect technical limits of large-scale moderation systems. While fully automated detection systems cannot reliably identify…
