AmpleHate: Amplifying the Attention for Versatile Implicit Hate Detection
Yejin Lee, Joonghyuk Hahn, Hyeseon Ahn, Yo-Sub Han

TL;DR
AmpleHate introduces a novel method that mimics human reasoning by identifying explicit and implicit targets and their relations to improve implicit hate speech detection, achieving state-of-the-art results and enhanced interpretability.
Contribution
It proposes AmpleHate, a new approach that captures target-context relations using attention mechanisms, outperforming existing contrastive learning methods in implicit hate detection.
Findings
Achieves 82.14% improvement over baselines.
Outperforms contrastive learning models.
Produces attention patterns aligned with human judgment.
Abstract
Implicit hate speech detection is challenging because it is subtle and relies on contextual interpretation rather than explicit offensive words. Current approaches rely on contrastive learning, which has been shown to be effective at distinguishing hate from non-hate sentences. Humans, however, detect implicit hate speech by first identifying specific targets within the text and then interpreting how these targets relate to their surrounding context. Motivated by this reasoning process, we propose AmpleHate, a novel approach designed to mirror human inference for implicit hate detection. AmpleHate identifies explicit targets using a pretrained Named Entity Recognition model and captures implicit target information via [CLS] tokens. It computes attention-based relationships between explicit targets, implicit targets, and sentence context, and then directly injects these relational vectors into…
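The attention step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes plain scaled dot-product attention, toy 2-dimensional embeddings, and a hypothetical list of token indices standing in for NER output. It only computes relation vectors; where those vectors are injected is not shown, since the abstract is cut off at that point.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def relation_vector(query, context):
    """Scaled dot-product attention of one target vector over context tokens.

    Returns the attention weights and the weighted sum of context vectors.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, key) / scale for key in context])
    dim = len(context[0])
    vec = [sum(w * tok[i] for w, tok in zip(weights, context)) for i in range(dim)]
    return weights, vec

# Toy inputs (all values are hypothetical, chosen only for illustration).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # contextual token embeddings
cls_vec = [0.5, 0.5]                           # stand-in for the [CLS] embedding
explicit_target_idx = [0]                      # indices an NER tagger might flag (assumed)

# Explicit targets attend over the sentence context.
explicit_relations = [relation_vector(tokens[i], tokens) for i in explicit_target_idx]

# The [CLS] vector plays the role of the implicit target.
implicit_weights, implicit_relation = relation_vector(cls_vec, tokens)
```

In this sketch the attention weights show which context tokens each target attends to most strongly, which is the kind of signal the Findings compare against human judgment.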
Taxonomy
Topics: Hate Speech and Cyberbullying Detection
Methods: Softmax · Attention Is All You Need · Contrastive Learning · ALIGN
