Towards Generalizable Generic Harmful Speech Datasets for Implicit Hate Speech Detection
Saad Almohaimeed, Saleh Almohaimeed, Damla Turgut, Ladislau Bölöni

TL;DR
This paper introduces a novel approach to detect implicit hate speech by reannotating and augmenting existing datasets, significantly improving detection performance and generalizability across diverse social media harmful speech datasets.
Contribution
It presents a new method combining influential sample identification, reannotation, and augmentation with large language models to enhance implicit hate speech detection.
Findings
Achieved a 12.9-point F1 score improvement over the baseline.
Demonstrated better generalization across datasets.
Validated effectiveness of reannotation and augmentation techniques.
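For readers unfamiliar with how an F1 gain in points is read, the following is a minimal sketch, not the authors' evaluation code. It assumes binary F1 on the positive (hateful) class; this summary does not say which averaging scheme the paper uses. The function names and the toy labels are hypothetical and serve only as illustration.

```python
# Illustrative only: binary F1 on the positive class and the gain in points.
# The averaging scheme used in the paper is not stated in this summary.

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: F1 is defined as 0 here
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def f1_point_gain(y_true, baseline_pred, improved_pred):
    """Difference in F1, expressed in points on a 0-100 scale."""
    return 100 * (f1_score(y_true, improved_pred) - f1_score(y_true, baseline_pred))


# Toy labels (hypothetical, not data from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
baseline_pred = [1, 1, 0, 0, 1, 0, 0, 0]
improved_pred = [1, 1, 1, 0, 1, 0, 0, 0]

gain = f1_point_gain(y_true, baseline_pred, improved_pred)
print(f"F1 gain: {gain:.1f} points")
```

On this reading, a reported gain of 12.9 points corresponds to a difference of 0.129 on the 0-1 F1 scale.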
Abstract
Implicit hate speech has recently emerged as a critical challenge for social media platforms. While much of the research has traditionally focused on harmful speech in general, the need for generalizable techniques to detect veiled and subtle forms of hate has become increasingly pressing. Based on lexicon analysis, we hypothesize that implicit hate speech is already present in publicly available harmful speech datasets but may not have been explicitly recognized or labeled by annotators. Additionally, crowdsourced datasets are prone to mislabeling due to the complexity of the task and are often influenced by annotators' subjective interpretations. In this paper, we propose an approach to address the detection of implicit hate speech and enhance generalizability across diverse datasets by leveraging existing harmful speech datasets. Our method comprises three key components: influential…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Bullying, Victimization, and Aggression · Spam and Phishing Detection
