Explain the Flag: Contextualizing Hate Speech Beyond Censorship

Jason Liartis; Eirini Kaldeli; Lambrini Gyftokosta; Eleftherios Chelioudakis; Orfeas Menis Mastromichalakis

arXiv:2604.14970·cs.CL·April 17, 2026

TL;DR

This paper introduces a hybrid system combining Large Language Models and curated vocabularies to detect and explain hate speech across multiple languages, enhancing transparency and accountability.

Contribution

The paper presents a novel hybrid approach that integrates LLMs with curated vocabularies for multilingual hate speech detection and explanation, improving transparency over existing methods.

Findings

01. High accuracy in hate speech detection across languages

02. Generated explanations effectively clarify why content is flagged

03. Outperforms LLM-only baseline systems in human evaluations

Abstract

Hate, derogatory, and offensive speech remains a persistent challenge in online platforms and public discourse. While automated detection systems are widely used, most focus on censorship or removal, raising concerns for transparency and freedom of expression, and limiting opportunities to explain why content is harmful. To address these issues, explanatory approaches have emerged as a promising solution, aiming to make hate speech detection more transparent, accountable, and informative. In this paper, we present a hybrid approach that combines Large Language Models (LLMs) with three newly created and curated vocabularies to detect and explain hate speech in English, French, and Greek. Our system captures both inherently derogatory expressions tied to identity characteristics and direct group-targeted content through two complementary pipelines: one that detects and disambiguates…
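
To make the hybrid idea concrete, the following is a minimal sketch of how a curated vocabulary lookup might be combined with a contextual disambiguation step to produce an explanation rather than only a flag. It is not the authors' implementation: the abstract is truncated before the pipelines are fully described, so every name here (`VocabEntry`, `Flag`, `find_matches`, `explain_flags`, the `is_derogatory_in_context` callback standing in for an LLM call, and the placeholder term "blorp") is a hypothetical chosen for illustration. Only the vocabulary-based path is sketched; the pipeline for direct group-targeted content is not covered.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class VocabEntry:
    term: str            # lowercase surface form (hypothetical schema)
    characteristic: str  # identity characteristic the term is tied to
    note: str            # curated explanation of why the term is harmful


@dataclass
class Flag:
    term: str
    start: int
    end: int
    explanation: str


def find_matches(text: str, vocab: Dict[str, VocabEntry]) -> List[re.Match]:
    """Return whole-word, case-insensitive matches of vocabulary terms."""
    if not vocab:
        return []
    # Longest terms first so multi-word entries win over their substrings.
    terms = sorted(vocab, key=len, reverse=True)
    pattern = r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b"
    return list(re.finditer(pattern, text, flags=re.IGNORECASE))


def explain_flags(
    text: str,
    vocab: Dict[str, VocabEntry],
    is_derogatory_in_context: Callable[[str, VocabEntry], bool],
) -> List[Flag]:
    """Flag vocabulary terms that the disambiguation step confirms.

    `is_derogatory_in_context` is an assumed stand-in for an LLM call that
    decides whether a matched term is used in its derogatory sense.
    """
    flags = []
    for m in find_matches(text, vocab):
        entry = vocab[m.group(1).lower()]  # assumes lowercase vocabulary keys
        if is_derogatory_in_context(text, entry):
            explanation = (
                f"'{entry.term}' is flagged as derogatory with respect to "
                f"{entry.characteristic}: {entry.note}"
            )
            flags.append(Flag(entry.term, m.start(), m.end(), explanation))
    return flags


if __name__ == "__main__":
    # Placeholder vocabulary; "blorp" is an invented term, not real data.
    demo_vocab = {
        "blorp": VocabEntry("blorp", "nationality", "placeholder curated note")
    }
    # Trivial stub that accepts every match; a real system would query a model.
    for flag in explain_flags("You are a Blorp", demo_vocab, lambda t, e: True):
        print(flag.explanation)
```

Separating the lookup from the disambiguation callback keeps the curated vocabulary auditable on its own, while the context-sensitive decision can be swapped out or evaluated independently.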
