A Target-Aware Analysis of Data Augmentation for Hate Speech Detection
Camilla Casula, Sara Tonelli

TL;DR
This paper explores data augmentation techniques, including generative models, to improve hate speech detection, especially for underrepresented groups, demonstrating that combined methods enhance classification performance and fairness.
Contribution
It introduces a target-aware data augmentation approach using generative language models to address target imbalance in hate speech datasets.
Findings
Traditional data augmentation often outperforms generative models alone.
Combining augmentation methods yields the best classification results.
Training on augmented data improves F1 by more than 10% for categories such as origin, religion, and disability.
Abstract
Hate speech is one of the main threats posed by the widespread use of social networks, despite efforts to limit it. Although attention has been devoted to this issue, the lack of datasets and case studies centered around scarcely represented phenomena, such as ableism or ageism, can lead to hate speech detection systems that do not perform well on underrepresented identity groups. Given the unprecedented capabilities of LLMs in producing high-quality data, we investigate the possibility of augmenting existing data with generative language models, reducing target imbalance. We experiment with augmenting 1,000 posts from the Measuring Hate Speech corpus, an English dataset annotated with target identity information, adding around 30,000 synthetic examples using both simple data augmentation methods and different types of generative models, comparing autoregressive and sequence-to-sequence…
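To make the idea of reducing target imbalance with simple augmentation concrete, here is a minimal Python sketch. It is not the authors' pipeline: the `(text, target)` input format, the function names `random_swap` and `balance_targets`, and the choice of word swapping as the perturbation are all assumptions made for illustration. The sketch adds perturbed copies of existing posts until every target group has as many examples as the largest one.

```python
import random
from collections import Counter


def random_swap(text, rng):
    """Swap two randomly chosen words (an assumed, EDA-style perturbation)."""
    words = text.split()
    if len(words) < 2:
        return text
    i, j = rng.sample(range(len(words)), 2)
    words[i], words[j] = words[j], words[i]
    return " ".join(words)


def balance_targets(examples, seed=0):
    """Return the examples plus perturbed copies that even out target counts.

    `examples` is a list of (text, target) pairs; this format is hypothetical.
    """
    if not examples:
        return []
    rng = random.Random(seed)
    counts = Counter(target for _, target in examples)
    goal = max(counts.values())  # size of the best-represented target

    # Group the original texts by target so we only perturb within a group.
    by_target = {}
    for text, target in examples:
        by_target.setdefault(target, []).append(text)

    augmented = list(examples)
    for target, texts in by_target.items():
        for _ in range(goal - counts[target]):
            augmented.append((random_swap(rng.choice(texts), rng), target))
    return augmented
```

A generative model could in principle replace `random_swap` as the source of new examples, with the balancing loop left unchanged; which models and prompts to use is outside the scope of this sketch.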
Taxonomy
Topics: Hate Speech and Cyberbullying Detection
Methods: Softmax · Attention Is All You Need
