Towards Generalizable Generic Harmful Speech Datasets for Implicit Hate Speech Detection

Saad Almohaimeed; Saleh Almohaimeed; Damla Turgut; Ladislau Bölöni

arXiv:2506.16476 · cs.CL · June 23, 2025

TL;DR

This paper introduces an approach to detecting implicit hate speech that reannotates and augments existing harmful speech datasets, improving both detection performance and generalization across diverse social media datasets.

Contribution

It presents a new method combining influential sample identification, reannotation, and augmentation with large language models to enhance implicit hate speech detection.

Findings

01. Achieved a +12.9 F1 score improvement over baseline.

02. Demonstrated better generalization across datasets.

03. Validated effectiveness of reannotation and augmentation techniques.
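For readers less familiar with the metric behind the first finding, the following is a minimal sketch of how an F1 score and a point difference between two systems are computed. It assumes F1 is reported on a 0 to 100 scale, which is common but not confirmed by this summary. The confusion counts are made up for illustration and are not results from the paper; `f1_score` is a hypothetical helper name.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1 (harmonic mean of precision and recall) on a 0-100 scale."""
    if tp == 0:
        # No true positives: precision and recall are both 0, so F1 is 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 100.0 * 2 * precision * recall / (precision + recall)


# Illustrative counts only (NOT taken from the paper).
baseline = f1_score(tp=50, fp=30, fn=40)
improved = f1_score(tp=65, fp=25, fn=25)

# The "improvement" is the difference in F1 points between the two systems.
gain = improved - baseline
print(f"baseline={baseline:.2f} improved={improved:.2f} gain={gain:+.2f}")
```

Because F1 balances precision and recall, a gain can come from fewer false positives, fewer false negatives, or both.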

Abstract

Implicit hate speech has recently emerged as a critical challenge for social media platforms. While much of the research has traditionally focused on harmful speech in general, the need for generalizable techniques to detect veiled and subtle forms of hate has become increasingly pressing. Based on lexicon analysis, we hypothesize that implicit hate speech is already present in publicly available harmful speech datasets but may not have been explicitly recognized or labeled by annotators. Additionally, crowdsourced datasets are prone to mislabeling due to the complexity of the task and are often influenced by annotators' subjective interpretations. In this paper, we propose an approach to address the detection of implicit hate speech and enhance generalizability across diverse datasets by leveraging existing harmful speech datasets. Our method comprises three key components: influential…
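The lexicon-based hypothesis in the abstract can be illustrated with a small sketch. This is an assumption about what such an analysis might look like, not the authors' procedure: samples labeled as harmful that contain no term from an explicit lexicon are flagged as candidates for implicit hate. The lexicon entries, the label names, and the helper functions are all hypothetical placeholders.

```python
import re

# Hypothetical placeholder lexicon; a real one would list explicit abusive terms.
EXPLICIT_LEXICON = {"slur_a", "slur_b"}


def contains_explicit_term(text: str, lexicon: set[str] = EXPLICIT_LEXICON) -> bool:
    """Return True if any lowercase token of `text` appears in the lexicon."""
    tokens = re.findall(r"[a-z0-9_']+", text.lower())
    return any(tok in lexicon for tok in tokens)


def candidate_implicit(samples: list[tuple[str, str]]) -> list[str]:
    """Return harmful-labeled texts that contain no explicit lexicon term."""
    return [
        text
        for text, label in samples
        if label == "harmful" and not contains_explicit_term(text)
    ]


# Toy data for illustration only (label names are assumed).
samples = [
    ("you are a slur_a", "harmful"),
    ("people like that never belong here", "harmful"),
    ("lovely weather today", "benign"),
]
candidates = candidate_implicit(samples)
print(candidates)
```

A filter like this only surfaces candidates. Deciding whether a flagged sample is actually implicit hate still requires human or model review.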

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Bullying, Victimization, and Aggression · Spam and Phishing Detection