Hate Speech Detection: A Solved Problem? The Challenging Case of Long Tail on Twitter

Ziqi Zhang; Lei Luo

arXiv:1803.03662·cs.CL·October 26, 2018·41 cites

TL;DR

This paper investigates the task of identifying specific types of hate speech on Twitter, shows that the task is difficult because hate speech lacks discriminative features, and proposes deep neural network models that improve detection performance.

Contribution

The paper emphasizes the complexity of classifying hate speech targeting specific characteristics and introduces deep neural network architectures tailored for semantic feature extraction to enhance detection accuracy.

Findings

01 Deep neural networks outperform previous methods in hate speech detection.

02 The proposed models improve macro F1 by up to 5% over those methods.

03 Hate speech detection remains challenging because hateful content lacks unique, discriminative features and therefore sits in the 'long tail' of the data.
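
The findings report macro F1, which averages per-class F1 scores with equal weight, so a weak score on the rare hate class is not hidden by a strong score on the common non-hate class. The following is a minimal sketch of that computation in plain Python. The label names, the 90/10 class split, and the predictions are toy assumptions chosen for illustration; they are not data or results from the paper, and the helper functions are hypothetical, not the authors' evaluation code.

```python
def f1_per_class(y_true, y_pred):
    """Return a dict mapping each label to its F1 score."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def micro_f1(y_true, y_pred):
    """For single-label classification, micro F1 equals accuracy."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Toy, imbalanced example (assumed data): 90 non-hate tweets, 10 hate tweets.
# The hypothetical classifier gets every non-hate tweet right but finds
# only 2 of the 10 hate tweets.
y_true = ["non-hate"] * 90 + ["hate"] * 10
y_pred = ["non-hate"] * 90 + ["hate"] * 2 + ["non-hate"] * 8

per_class = f1_per_class(y_true, y_pred)
macro = macro_f1(y_true, y_pred)
micro = micro_f1(y_true, y_pred)

print("per-class F1:", {k: round(v, 3) for k, v in per_class.items()})
print("macro F1:", round(macro, 3))
print("micro F1:", round(micro, 3))
```

On this toy input the micro score stays high because non-hate tweets dominate, while the macro score drops because the hate class is weighted equally. That gap is why per-class and macro-averaged scores are the more informative view when the minority class is the one of interest.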

Abstract

In recent years, the increasing propagation of hate speech on social media and the urgent need for effective counter-measures have drawn significant investment from governments, companies, and researchers. A large number of methods have been developed for automated hate speech detection online. This aims to classify textual content into non-hate or hate speech, in which case the method may also identify the targeting characteristics (i.e., types of hate, such as race and religion) in the hate speech. However, we notice a significant difference between the performance of the two (i.e., non-hate vs. hate). In this work, we argue for a focus on the latter problem for practical reasons. We show that it is a much more challenging task, as our analysis of the language in the typical datasets shows that hate speech lacks unique, discriminative features and therefore is found in the 'long tail'…

Code & Models

Repositories

ziqizhang/chase (Official)

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Internet Traffic Analysis and Secure E-voting · Spam and Phishing Detection