Hate Speech Dataset from a White Supremacy Forum

Ona de Gibert; Naiara Perez; Aitor García-Pablos; Montse Cuadros

arXiv:1809.04444 · cs.CL · September 13, 2018

TL;DR

This paper introduces a hate speech dataset drawn from Stormfront, a white supremacist forum, and describes its creation, annotation process, and baseline classification experiments to support research on hate speech detection.

Contribution

It presents a manually labeled hate speech dataset from Stormfront, along with an annotation tool and baseline classification results, advancing resources for hate speech detection research.

Findings

01. The dataset contains thousands of labeled sentences.

02. Baseline models achieve varying levels of accuracy.

03. The dataset is publicly available for research use.

Abstract

Hate speech is commonly defined as any communication that disparages a target group of people based on some characteristic such as race, colour, ethnicity, gender, sexual orientation, nationality, religion, or other characteristic. Due to the massive rise of user-generated web content on social media, the amount of hate speech is also steadily increasing. Over the past years, interest in online hate speech detection and, particularly, the automation of this task has continuously grown, along with the societal impact of the phenomenon. This paper describes a hate speech dataset composed of thousands of sentences manually labelled as containing hate speech or not. The sentences have been extracted from Stormfront, a white supremacist forum. A custom annotation tool has been developed to carry out the manual labelling task which, among other things, allows the annotators to choose whether…
