KTCR: Improving Implicit Hate Detection with Knowledge Transfer driven Concept Refinement
Samarth Garg, Vivek Hruday Kavuri, Gargi Shroff, Rahul Mishra

TL;DR
This paper introduces a Knowledge Transfer-driven Concept Refinement method that enhances implicit hate detection by refining hate-related concepts and selectively augmenting data, leading to improved performance and better generalization across datasets.
Contribution
It proposes a novel concept refinement approach using prototype alignment and concept losses to improve implicit hate detection beyond existing data augmentation methods.
Findings
Enhanced detection accuracy on implicit hate datasets
Improved cross-dataset generalization capabilities
Outperformed baseline models with refined concept techniques
Abstract
The constant shifts in social and political contexts, driven by emerging social movements and political events, lead to new forms of hate content and previously unrecognized hate patterns that machine learning models may not have captured. Some recent literature proposes data augmentation-based techniques to enrich existing hate datasets by incorporating samples that reveal new implicit hate patterns. This approach aims to improve the model's performance on out-of-domain implicit hate instances. However, it is observed that adding further samples for augmentation decreases the model's performance. In this work, we propose a Knowledge Transfer-driven Concept Refinement method that distills and refines the concepts related to implicit hate samples through novel prototype alignment and concept losses, alongside data augmentation based on concept activation vectors.…
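The abstract mentions data augmentation based on concept activation vectors (CAVs). As a rough illustration of the general CAV idea only, and not the authors' implementation, the sketch below fits a small logistic-regression separator between activations of concept examples and activations of random examples, takes the unit normal of the separating hyperplane as the CAV, and ranks candidate samples by their projection onto it. The function names (`compute_cav`, `concept_score`), the synthetic activations, and the ranking step are all assumptions made for illustration; the paper's actual selection criterion, losses, and model activations are not specified here.

```python
import math
import random


def dot(u, v):
    """Plain dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def compute_cav(concept_acts, random_acts, lr=0.1, steps=300):
    """Hypothetical helper: fit logistic regression (concept=1, random=0)
    by batch gradient descent and return the unit-norm weight vector."""
    data = [(x, 1.0) for x in concept_acts] + [(x, 0.0) for x in random_acts]
    dim = len(data[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in data:
            p = 1.0 / (1.0 + math.exp(-(dot(w, x) + b)))
            err = p - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wj - lr * gj / len(data) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(data)
    norm = math.sqrt(dot(w, w))
    return [wj / norm for wj in w]


def concept_score(act, cav):
    """Projection of one activation vector onto the CAV direction."""
    return dot(act, cav)


# Synthetic stand-in for model activations (assumption): concept examples
# are shifted along the first dimension, random examples are not.
random.seed(0)
DIM, N = 8, 100


def sample(shift):
    return [random.gauss(0.0, 1.0) + (shift if j == 0 else 0.0) for j in range(DIM)]


concept_acts = [sample(3.0) for _ in range(N)]
random_acts = [sample(0.0) for _ in range(N)]
cav = compute_cav(concept_acts, random_acts)

# Illustrative selection step: rank candidates by concept score and keep
# the top few as augmentation samples.
candidates = [sample(3.0) for _ in range(10)] + [sample(0.0) for _ in range(40)]
scores = [concept_score(c, cav) for c in candidates]
top_k = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)[:10]
```

Ranking by projection is only one possible way to use a CAV for selective augmentation. A real pipeline would replace the synthetic vectors with hidden-layer activations from the hate-detection model.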
Taxonomy
Topics: Hate Speech and Cyberbullying Detection
