Towards Fairness Assessment of Dutch Hate Speech Detection

Julie Bauer; Rishabh Kaushal; Thales Bertaglia; Adriana Iamnitchi

arXiv:2506.12502·cs.CL·June 17, 2025

Open Access

TL;DR

This paper evaluates the fairness of Dutch hate speech detection models, using counterfactual data generation and fairness metrics to identify challenges and suggest improvements for model fairness and performance.

Contribution

It introduces a Dutch social group term list, generates counterfactual data with LLMs, and assesses transformer models' fairness, addressing a gap in Dutch hate speech detection research.

Findings

01

Models fine-tuned with counterfactual data perform better on both hate speech detection and fairness metrics.

02

Counterfactual data generation faces challenges with Dutch grammar.

03

Fairness improvements are achievable with counterfactual training.

Abstract

Numerous studies have proposed computational methods to detect hate speech online, yet most focus on the English language and emphasize model development. In this study, we evaluate the counterfactual fairness of hate speech detection models in the Dutch language, specifically examining the performance and fairness of transformer-based models. We make the following key contributions. First, we curate a list of Dutch Social Group Terms that reflect social context. Second, we generate counterfactual data for Dutch hate speech using LLMs and established strategies like Manual Group Substitution (MGS) and Sentence Log-Likelihood (SLL). Through qualitative evaluation, we highlight the challenges of generating realistic counterfactuals, particularly with Dutch grammar and contextual coherence. Third, we fine-tune baseline transformer-based models with counterfactual data and evaluate their…
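To make the idea of substitution-based counterfactuals concrete, the following is a minimal sketch, not the authors' implementation. It swaps a social group term for other terms in the same category and then measures how much a classifier's score changes. The term list, the `substitute_groups` and `counterfactual_gap` helpers, and the deliberately biased `toy_score` function are all hypothetical stand-ins. The gap used here (mean absolute score difference) is an assumption, since the truncated abstract does not name its metrics.

```python
import re
from typing import Callable, Dict, List

# Hypothetical mini term list for illustration only;
# this is NOT the curated Dutch Social Group Terms list from the paper.
GROUP_TERMS: Dict[str, List[str]] = {
    "religion": ["moslims", "christenen", "joden"],
}


def substitute_groups(sentence: str, terms: Dict[str, List[str]]) -> List[str]:
    """Build counterfactuals by swapping each matched group term for the
    other terms in the same category (whole-word, case-insensitive)."""
    counterfactuals: List[str] = []
    for category_terms in terms.values():
        for term in category_terms:
            pattern = re.compile(rf"\b{re.escape(term)}\b", flags=re.IGNORECASE)
            if not pattern.search(sentence):
                continue
            for other in category_terms:
                if other != term:
                    counterfactuals.append(pattern.sub(other, sentence))
    return counterfactuals


def counterfactual_gap(
    sentence: str,
    counterfactuals: List[str],
    score: Callable[[str], float],
) -> float:
    """Mean absolute difference between the score of the original sentence
    and the score of each counterfactual (0.0 means identical treatment)."""
    if not counterfactuals:
        return 0.0
    base = score(sentence)
    return sum(abs(base - score(c)) for c in counterfactuals) / len(counterfactuals)


def toy_score(text: str) -> float:
    """Stand-in for a hate speech classifier, deliberately biased
    toward one group term so the gap is visible."""
    return 0.9 if "moslims" in text.lower() else 0.2


original = "De moslims in onze wijk zijn aardig."
cfs = substitute_groups(original, GROUP_TERMS)
gap = counterfactual_gap(original, cfs, toy_score)

for cf in cfs:
    print(cf)
print(f"counterfactual gap: {gap:.2f}")
```

The example sentence works because every swapped term is a plural noun that takes the same article. A naive swap like this does not handle article choice (de/het), adjective inflection, or number agreement, which is the kind of grammatical difficulty the abstract flags for Dutch counterfactual generation.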

Taxonomy

Topics: Hate Speech and Cyberbullying Detection

MethodsFocus