Cracking the Code: Enhancing Implicit Hate Speech Detection through Coding Classification
Lu Wei, Liangzhi Li, Tong Xiang, Xiao Liu, Noa Garcia

TL;DR
This paper introduces a taxonomy of six encoding strategies, called codetypes, and uses it to improve implicit hate speech detection with large language models, reporting gains on both Chinese and English datasets.
Contribution
The paper proposes a new taxonomy of six codetypes and two methods for integrating them into LLM-based detection: prompting the LLM directly to classify sentences, and using the LLM as an encoder with codetypes embedded during encoding.
Findings
Codetypes improve detection accuracy in Chinese and English datasets.
Prompting LLMs with codetypes enhances implicit hate speech classification.
The approach carries over across the two languages tested, Chinese and English.
Abstract
The internet has become a hotspot for hate speech (HS), threatening societal harmony and individual well-being. While automatic detection methods perform well in identifying explicit hate speech (ex-HS), they struggle with more subtle forms, such as implicit hate speech (im-HS). We tackle this problem by introducing a new taxonomy for im-HS detection, defining six encoding strategies named codetypes. We present two methods for integrating codetypes into im-HS detection: 1) prompting large language models (LLMs) directly to classify sentences based on generated responses, and 2) using LLMs as encoders with codetypes embedded during the encoding process. Experiments show that the use of codetypes improves im-HS detection in both Chinese and English datasets, validating the effectiveness of our approach across different languages.
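The first integration method in the abstract, prompting an LLM with codetypes and classifying from its generated response, could look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' prompt: the abstract does not name the six codetypes, so `PLACEHOLDER_CODETYPES` holds stand-in labels, and `build_prompt` and `parse_label` are hypothetical helper names. No model is called; the sketch covers only prompt assembly and response parsing.

```python
from typing import Sequence

# Hypothetical placeholder names: the abstract says there are six codetypes
# but does not list them, so these are NOT the paper's labels.
PLACEHOLDER_CODETYPES = [f"codetype_{i}" for i in range(1, 7)]


def build_prompt(sentence: str, codetypes: Sequence[str]) -> str:
    """Assemble a classification prompt that lists the codetypes as hints."""
    hints = "\n".join(f"- {name}" for name in codetypes)
    return (
        "Implicit hate speech may be disguised with these encoding strategies:\n"
        f"{hints}\n\n"
        f"Sentence: {sentence}\n"
        "Answer with exactly one word, 'hateful' or 'benign'."
    )


def parse_label(response: str) -> bool:
    """Map a free-text model reply to a boolean label (True = hateful)."""
    words = response.strip().lower().split()
    # Only the first word is inspected, with trailing punctuation removed.
    first = words[0].strip(".,!:;'\"") if words else ""
    return first == "hateful"
```

The prompt string would be sent to whichever LLM is under evaluation and the reply passed to `parse_label`; treating any unrecognized reply as not hateful is a simplification chosen for this sketch.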
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Emotion and Mood Recognition · Sentiment Analysis and Opinion Mining
