Classification is a RAG problem: A case study on hate speech detection

Richard Willats; Josh Pennington; Aravind Mohan; Bertie Vidgen

arXiv:2508.06204·cs.CL·August 11, 2025

Open Access

TL;DR

This paper introduces a Retrieval-Augmented Generation approach for hate speech detection, enabling adaptable, explainable, and policy-compliant content classification without retraining.

Contribution

The paper presents a RAG-based system, the Contextual Policy Engine (CPE), that makes content moderation more flexible and explainable and lets policies be updated without retraining.

Findings

01. Achieves classification accuracy comparable to commercial systems

02. Provides inherent explainability through retrieved policy segments

03. Enables dynamic policy updates without retraining

Abstract

Robust content moderation requires classification systems that can quickly adapt to evolving policies without costly retraining. We present classification using Retrieval-Augmented Generation (RAG), which shifts traditional classification tasks from determining the correct category in accordance with pre-trained parameters to evaluating content in relation to contextual knowledge retrieved at inference. In hate speech detection, this transforms the task from "is this hate speech?" to "does this violate the hate speech policy?" Our Contextual Policy Engine (CPE) - an agentic RAG system - demonstrates this approach and offers three key advantages: (1) robust classification accuracy comparable to leading commercial systems, (2) inherent explainability via retrieved policy segments, and (3) dynamic policy updates without model retraining. Through three experiments, we demonstrate strong…
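
The shift the abstract describes, from asking "is this hate speech?" to asking "does this violate the policy retrieved at inference?", can be illustrated with a toy sketch. This is not the authors' Contextual Policy Engine: the names `PolicySegment`, `retrieve`, `toy_judge`, and `classify` are hypothetical, retrieval is plain token overlap rather than a real retriever, and the judge is a trivial stand-in for the language model call a RAG system would make. It only shows how a decision can cite policy segments and change when the policy text changes, with no retraining step.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical, deliberately tiny stopword list for this illustration only.
STOPWORDS = {"i", "on", "of", "the", "a", "are", "their", "those", "because", "people"}


@dataclass(frozen=True)
class PolicySegment:
    """One retrievable piece of policy text (hypothetical structure)."""
    segment_id: str
    text: str


def tokenize(text: str) -> set:
    """Lowercase word tokens with stopwords removed."""
    return set(re.findall(r"[a-z']+", text.lower())) - STOPWORDS


def retrieve(content: str, policy: List[PolicySegment], k: int = 2) -> List[PolicySegment]:
    """Toy retriever: rank segments by token overlap, keep the top k with any overlap."""
    content_tokens = tokenize(content)
    scored = [(len(content_tokens & tokenize(seg.text)), seg) for seg in policy]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [seg for _, seg in scored[:k]]


def toy_judge(content: str, segments: List[PolicySegment]) -> bool:
    """Stand-in for an LLM judgment: flag content if any segment was retrieved."""
    return bool(segments)


def classify(
    content: str,
    policy: List[PolicySegment],
    judge: Callable[[str, List[PolicySegment]], bool] = toy_judge,
) -> Tuple[bool, List[str]]:
    """Return (violates_policy, ids of the policy segments that were consulted)."""
    segments = retrieve(content, policy)
    return judge(content, segments), [seg.segment_id for seg in segments]


policy_v1 = [PolicySegment("S1", "Attacks on people because of their religion are prohibited.")]
# A policy update is just a new segment in the knowledge base.
policy_v2 = policy_v1 + [
    PolicySegment("S2", "Attacks on people because of their occupation are prohibited.")
]

content = "I despise those people because of their occupation."
print(classify(content, policy_v1))  # (False, [])
print(classify(content, policy_v2))  # (True, ['S2'])
```

The same content receives a different decision under `policy_v2` only because the retrieved context changed, and the returned segment ids serve as the explanation for the decision. In a real system the judge would weigh the content against the retrieved text rather than treating any retrieval as a violation.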

Taxonomy

Topics: Network Security and Intrusion Detection · Advanced Malware Detection Techniques