Characterizing and Detecting Hateful Users on Twitter
Manoel Horta Ribeiro, Pedro H. Calais, Yuri A. Santos, Virgílio A. F. Almeida, Wagner Meira Jr.

TL;DR
This paper introduces a user-centric approach to detecting hateful users on Twitter: it analyzes their network and activity patterns and applies a semi-supervised graph learning technique that outperforms content-based methods.
Contribution
It develops a novel methodology for annotating hateful users without relying on hate lexicons and demonstrates the effectiveness of graph-based semi-supervised learning for hate speech detection.
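To make the idea of graph-based semi-supervised learning concrete, here is a minimal sketch of label propagation on a toy graph. It is a simple stand-in chosen for illustration, not the method used in the paper; the function name `propagate_labels`, the node names, and the edge list are all hypothetical, and the graph is treated as undirected for simplicity.

```python
from collections import defaultdict


def propagate_labels(edges, seeds, iterations=20):
    """Spread seed scores over an undirected graph by neighbour averaging.

    edges: iterable of (u, v) pairs (hypothetical retweet links).
    seeds: dict mapping a few annotated users to 1.0 (hateful) or 0.0 (normal).
    Returns a dict with a score in [0, 1] for every node that has an edge.
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)

    # Unlabelled nodes start at a neutral 0.5; seeds keep their annotation.
    scores = {node: seeds.get(node, 0.5) for node in neighbours}
    for _ in range(iterations):
        updated = {}
        for node, nbrs in neighbours.items():
            if node in seeds:
                updated[node] = seeds[node]  # clamp annotated users
            else:
                updated[node] = sum(scores[n] for n in nbrs) / len(nbrs)
        scores = updated
    return scores


# Toy data (assumed): two triangles joined by the bridge c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]
seeds = {"a": 1.0, "f": 0.0}
scores = propagate_labels(edges, seeds)
```

Only two of the six users carry an annotation here, yet every user ends up with a score, and users closer to the hateful seed score higher than those closer to the normal seed. That is the sense in which a few labels plus the graph structure can stand in for labeling everyone.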
Findings
Hateful users are densely connected in Twitter's network.
Graph-based embeddings outperform content-based approaches in detection accuracy.
Hateful users show distinct activity and word usage patterns.
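The first finding, that hateful users are densely connected, can be illustrated by comparing the edge density inside a group of users with the density of the whole graph. The sketch below is an assumption-laden toy: `edge_density`, the node names, and the edges are hypothetical, and links are treated as undirected even though retweets have a direction.

```python
def edge_density(edges, nodes):
    """Fraction of possible undirected links among `nodes` that are present."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    # Keep only links with both endpoints in the group; ignore self-loops
    # and count each unordered pair once.
    present = {
        frozenset((u, v))
        for u, v in edges
        if u in nodes and v in nodes and u != v
    }
    return len(present) / (n * (n - 1) / 2)


# Toy data (assumed): users a, b, c play the role of the annotated group.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
             ("d", "e"), ("e", "f"), ("d", "f")]
all_users = {u for edge in toy_edges for u in edge}
group = {"a", "b", "c"}

group_density = edge_density(toy_edges, group)
overall_density = edge_density(toy_edges, all_users)
```

A group density well above the overall density is what "densely connected" means in this simplified setting.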
Abstract
Most current approaches to characterize and detect hate speech focus on content posted in Online Social Networks (OSNs). They face shortcomings to collect and annotate hateful speech due to the incompleteness and noisiness of OSN text and the subjectivity of hate speech. These limitations are often aided with constraints that oversimplify the problem, such as considering only tweets containing hate-related words. In this work we partially address these issues by shifting the focus towards users. We develop and employ a robust methodology to collect and annotate hateful users which does not depend directly on lexicon and where the users are annotated given their entire profile. This results in a sample of Twitter's retweet graph containing users, out of which were annotated. We also collect the users who were banned in the three months that followed the data…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Social Media and Politics · Internet Traffic Analysis and Secure E-voting
