Hate Speech Detection using Large Language Models with Data Augmentation and Feature Enhancement

Brian Jing Hong Nge; Stefan Su; Thanh Thi Nguyen; Campbell Wilson; Alexandra Phelan; Naomi Pfitzner

arXiv:2603.04698·cs.CL·March 6, 2026

Open Access

TL;DR

This study compares traditional and transformer-based models for hate speech detection, demonstrating how data augmentation and feature enhancement techniques influence performance across various datasets and models.

Contribution

The paper presents a comprehensive evaluation of data augmentation and feature enhancement methods for hate speech detection, covering both traditional classifiers and transformer-based models.

Findings

01. gpt-oss-20b achieves the highest accuracy among the evaluated models.

02. Delta TF-IDF responds strongly to data augmentation, reaching 98.2% accuracy on the Stormfront dataset.

03. Implicit hate speech remains harder to detect than explicit hateful content.

Abstract

This paper evaluates data augmentation and feature enhancement techniques for hate speech detection, comparing traditional classifiers, e.g., Delta Term Frequency-Inverse Document Frequency (Delta TF-IDF), with transformer-based models (DistilBERT, RoBERTa, DeBERTa, Gemma-7B, gpt-oss-20b) across diverse datasets. It examines the impact of Synthetic Minority Over-sampling Technique (SMOTE), weighted loss determined by inverse class proportions, Part-of-Speech (POS) tagging, and text data augmentation on model performance. The open-source gpt-oss-20b consistently achieves the highest results. On the other hand, Delta TF-IDF responds strongly to data augmentation, reaching 98.2% accuracy on the Stormfront dataset. The study confirms that implicit hate speech is more difficult to detect than explicit hateful content and that enhancement effectiveness depends on dataset, model, and technique…
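One of the techniques named in the abstract, a weighted loss determined by inverse class proportions, can be illustrated with a small sketch. This is not the authors' implementation: the function names `inverse_proportion_weights` and `weighted_cross_entropy`, the normalization of the weights to a mean of 1, and the toy label counts are all assumptions made for illustration. The abstract does not specify how the weights are scaled or which loss they are applied to.

```python
import math
from collections import Counter


def inverse_proportion_weights(labels):
    """Return one weight per class, proportional to 1 / (class proportion).

    Assumption: weights are rescaled so that their mean over classes is 1.
    The source does not state which normalization, if any, was used.
    """
    counts = Counter(labels)
    total = len(labels)
    # 1 / (n_c / N) simplifies to N / n_c
    raw = {cls: total / n for cls, n in counts.items()}
    mean = sum(raw.values()) / len(raw)
    return {cls: w / mean for cls, w in raw.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Class-weighted negative log-likelihood, averaged by total weight.

    probs: list of dicts mapping class -> predicted probability.
    labels: list of true classes, aligned with probs.
    """
    num = sum(-weights[y] * math.log(p[y]) for p, y in zip(probs, labels))
    den = sum(weights[y] for y in labels)
    return num / den


# Toy imbalanced dataset (hypothetical counts, not taken from the paper).
labels = ["hate"] * 10 + ["not_hate"] * 90
weights = inverse_proportion_weights(labels)

# A classifier that always outputs 50/50 incurs a loss of log(2).
uniform = [{"hate": 0.5, "not_hate": 0.5}] * len(labels)
loss = weighted_cross_entropy(uniform, labels, weights)
```

With these toy counts the minority class receives nine times the weight of the majority class, so each misclassified minority example contributes nine times as much to the loss as a misclassified majority example.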

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Emotion and Mood Recognition · Sentiment Analysis and Opinion Mining