Indicators for characterising online hate speech and its automatic detection
Erica Forzinetti, Marco L. Della Vedova, Stefano Pasta, Milena Santerini

TL;DR
This study compares expert human classification of hate speech indicators on Twitter with machine learning detection, revealing that the algorithms identify content inciting hatred and violence more reliably than some other types of hateful content.
Contribution
Introduces a novel seven-indicator scheme for hate speech characterization and evaluates its effectiveness against machine learning classifications.
Findings
Algorithms better detect tweets inciting hatred and violence.
Not all hate speech types are equally detectable by current algorithms.
Expert classification provides nuanced insights beyond automated methods.
Abstract
We examined four case studies of hate speech on Italian-language Twitter from 2019 to 2020, comparing the classification of 3,600 tweets by expert pedagogists with the automatic classification produced by machine learning algorithms. The pedagogists used a novel classification scheme based on seven indicators that characterize hate. These indicators are: the content is public, it affects a target group, it contains hate speech in explicit verbal form, it will not redeem, it has intention to harm, it can have a possible violent response, it incites hatred and violence. The case studies concern four target groups: Jews, Muslims, Roma, and immigrants. We find that not all types of hateful content are equally detectable by the machine learning algorithms that we considered. In particular, algorithms perform better in identifying tweets that incite hatred and violence, and…
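The comparison described in the abstract can be pictured with a small sketch. This is not the authors' code or evaluation procedure: the class `HateIndicators`, its field names, and the function `recall_by_indicator` are hypothetical names chosen for illustration, and the classifier is assumed to return a single hateful / not hateful flag per tweet. The sketch records the seven expert indicators per tweet and then computes, for each indicator, the share of expert-marked tweets that the classifier also flagged.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class HateIndicators:
    """Expert annotation of one tweet (field names are illustrative)."""
    public: bool
    targets_group: bool
    explicit_verbal_form: bool
    will_not_redeem: bool
    intention_to_harm: bool
    possible_violent_response: bool
    incites_hatred_and_violence: bool


def recall_by_indicator(expert_labels, machine_flags):
    """Per indicator: fraction of expert-marked tweets the classifier flagged.

    Returns None for an indicator that no expert label marks, since the
    fraction is undefined in that case.
    """
    result = {}
    for field in fields(HateIndicators):
        # Keep the classifier's flag for every tweet where experts marked this indicator.
        flagged = [
            machine
            for expert, machine in zip(expert_labels, machine_flags)
            if getattr(expert, field.name)
        ]
        result[field.name] = sum(flagged) / len(flagged) if flagged else None
    return result


# Invented placeholder labels for three tweets (not data from the study).
expert_labels = [
    HateIndicators(True, True, True, False, False, False, True),
    HateIndicators(True, True, False, False, False, False, False),
    HateIndicators(True, False, False, False, False, False, False),
]
machine_flags = [True, False, False]  # assumed binary classifier output

recall = recall_by_indicator(expert_labels, machine_flags)
for name, value in recall.items():
    print(name, value)
```

A per-indicator breakdown of this kind is one way to see whether some types of hateful content are flagged more often than others; the study itself may use different metrics.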
Taxonomy
Topics: Hate Speech and Cyberbullying Detection
