Evaluating Simple Debiasing Techniques in RoBERTa-based Hate Speech Detection Models

Diana Iftimie; Erik Zinn

arXiv:2501.15430·cs.CL·January 28, 2025

TL;DR

This paper evaluates simple debiasing techniques applied to RoBERTa-based hate speech detection models. It shows that their effectiveness depends on how the training dataset is constructed and that, with proper consideration of representation bias, they can reduce dialect-based disparities.

Contribution

It systematically assesses the impact of debiasing techniques on dialect bias in hate speech detection models using RoBERTa.

Findings

01. Debiasing effectiveness varies with dataset construction methods.

02. Properly accounting for representation bias improves disparity reduction.

03. Simple techniques can mitigate dialect bias with careful dataset design.

Abstract

The hate speech detection task is known to suffer from bias against African American English (AAE) dialect text, due to the annotation bias present in the underlying hate speech datasets used to train these models. This leads to a disparity where normal AAE text is more likely to be misclassified as abusive/hateful compared to non-AAE text. Simple debiasing techniques have been developed in the past to counter this sort of disparity, and in this work, we apply and evaluate these techniques in the scope of RoBERTa-based encoders. Experimental results suggest that the success of these techniques depends heavily on the methods used for training dataset construction, but with proper consideration of representation bias, they can reduce the disparity seen among dialect subgroups on the hate speech detection task.
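
The disparity described in the abstract can be illustrated as a gap in false positive rates between dialect subgroups. The following is a minimal sketch under stated assumptions, not the authors' evaluation code: the function name `false_positive_rates`, the group labels "aae" and "non_aae", and the toy data are all hypothetical, and the paper's actual disparity metric may differ.

```python
from collections import defaultdict


def false_positive_rates(labels, preds, groups):
    """Per-group false positive rate.

    labels: 1 = hateful/abusive, 0 = normal (gold annotation)
    preds:  1 = flagged by the model, 0 = not flagged
    groups: dialect subgroup of each example (hypothetical labels)
    """
    flagged = defaultdict(int)    # normal texts wrongly flagged, per group
    negatives = defaultdict(int)  # all normal texts, per group
    for y, p, g in zip(labels, preds, groups):
        if y == 0:
            negatives[g] += 1
            if p == 1:
                flagged[g] += 1
    return {g: flagged[g] / negatives[g] for g in negatives}


# Toy, made-up data for illustration only (not results from the paper).
labels = [0, 0, 0, 0, 1, 0, 0, 0, 0, 1]
preds = [1, 1, 0, 0, 1, 1, 0, 0, 0, 1]
groups = ["aae"] * 5 + ["non_aae"] * 5

fpr = false_positive_rates(labels, preds, groups)
gap = fpr["aae"] - fpr["non_aae"]  # positive gap: normal AAE text is flagged more often
print(fpr, gap)
```

Under this framing, a debiasing technique would be judged successful if it shrinks the gap without an unacceptable loss in overall detection quality.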

Taxonomy

Topics: Hate Speech and Cyberbullying Detection
