A Modular Taxonomy for Hate Speech Definitions and Its Impact on Zero-Shot LLM Classification Performance

Matteo Melis; Gabriella Lapesa; Dennis Assenmacher

arXiv:2506.18576·cs.CL·June 24, 2025

TL;DR

This paper develops a taxonomy of hate speech definitions and investigates how prompting with different definitions affects the zero-shot classification performance of three large language models on three hate speech datasets.

Contribution

The paper introduces a taxonomy of 14 Conceptual Elements found in hate speech definitions and systematically evaluates how the definition given in the prompt affects LLM zero-shot classification performance.

Findings

01 Definition specificity affects model performance.

02 Impact varies across different LLM architectures.

03 Systematic analysis of hate speech definitions enhances understanding.

Abstract

Detecting harmful content is a crucial task in the landscape of NLP applications for Social Good, with hate speech being one of its most dangerous forms. But what do we mean by hate speech, how can we define it, and how does prompting different definitions of hate speech affect model performance? The contribution of this work is twofold. At the theoretical level, we address the ambiguity surrounding hate speech by collecting and analyzing existing definitions from the literature. We organize these definitions into a taxonomy of 14 Conceptual Elements: building blocks that capture different aspects of hate speech definitions, such as references to the target of hate (individuals or groups) or to its potential consequences. At the experimental level, we employ the collection of definitions in a systematic zero-shot evaluation of three LLMs, on three hate speech datasets representing…
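To make the experimental setup more concrete, the following is a minimal sketch of how a definition might be assembled from modular elements and placed into a zero-shot prompt. It is an illustration under stated assumptions, not the authors' implementation: the element names and wording in `ELEMENTS`, the prompt template, and the helpers `compose_definition`, `build_prompt`, and `parse_label` are all hypothetical, and only two of the 14 Conceptual Elements (target and consequences, the two the abstract mentions) are represented.

```python
from typing import Dict, List

# Hypothetical wording for two elements; the paper's actual
# 14 Conceptual Elements and their phrasing are not reproduced here.
ELEMENTS: Dict[str, str] = {
    "target": "is directed at an individual or a group",
    "consequences": "may lead to harm for the people it targets",
}


def compose_definition(element_keys: List[str]) -> str:
    """Join the selected element clauses into one definition sentence."""
    clauses = [ELEMENTS[key] for key in element_keys]
    return "Hate speech is language that " + " and ".join(clauses) + "."


def build_prompt(definition: str, text: str) -> str:
    """Build a zero-shot prompt: a definition plus the text, no labeled examples."""
    return (
        f"Definition: {definition}\n"
        f"Text: {text}\n"
        "Does the text contain hate speech according to the definition? "
        "Answer yes or no."
    )


def parse_label(response: str) -> int:
    """Map a free-text model response to a binary label (1 = hate speech)."""
    return 1 if response.strip().lower().startswith("yes") else 0
```

Varying the list passed to `compose_definition` while keeping the template fixed is one way to isolate the effect of the definition itself; the string returned by `build_prompt` would be sent to whichever LLM is under evaluation, and `parse_label` would be applied to its reply.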

Code & Models

Repositories

matteo-mls/modular-taxonomy-for-hate-speech-definitions (Official)

Videos

A Modular Taxonomy for Hate Speech Definitions and Its Impact on Zero-Shot LLM Classification Performance · underline