Cross-Platform Hate Speech Detection with Weakly Supervised Causal Disentanglement
Paras Sheth, Tharindu Kumarage, Raha Moraffah, Aman Chadha, Huan Liu

TL;DR
HATE WATCH is a weakly supervised causal disentanglement framework that improves cross-platform hate speech detection by separating universal hate indicators from platform-specific features without requiring explicit target labels.
Contribution
It introduces a novel weakly supervised causal disentanglement method that does not require explicit target labels, enhancing adaptability across evolving social media platforms.
Findings
Outperforms existing methods on multiple platforms.
Effective in scenarios with and without target labels.
Advances scalable content moderation techniques.
Abstract
Content moderation faces a challenging task as social media's ability to spread hate speech contrasts with its role in promoting global connectivity. With rapidly evolving slang and hate speech, the adaptability of conventional deep learning to the fluid landscape of online dialogue remains limited. In response, causality-inspired disentanglement has shown promise by segregating platform-specific peculiarities from universal hate indicators. However, its dependency on available ground-truth target labels for discerning these nuances faces practical hurdles with the incessant evolution of platforms and the mutable nature of hate speech. Using confidence-based reweighting and contrastive regularization, this study presents HATE WATCH, a novel framework of weakly supervised causal disentanglement that circumvents the need for explicit target labeling and effectively disentangles input…
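The abstract names confidence-based reweighting and contrastive regularization but the excerpt does not give their formulas. The following is a minimal sketch of one common way such terms are built, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the use of normalized entropy as the confidence measure, and the margin-based pairwise penalty.

```python
import math

def confidence_weights(probs):
    """Assumed scheme: weight = 1 - normalized entropy of the predicted distribution.

    A uniform (uncertain) prediction gets weight 0, a one-hot prediction gets weight 1.
    """
    weights = []
    for p in probs:
        entropy = -sum(q * math.log(q) for q in p if q > 0.0)
        weights.append(1.0 - entropy / math.log(len(p)))
    return weights

def reweighted_cross_entropy(probs, pseudo_labels):
    """Cross-entropy against weak (pseudo) labels, averaged with confidence weights."""
    weights = confidence_weights(probs)
    losses = [-math.log(max(p[y], 1e-12)) for p, y in zip(probs, pseudo_labels)]
    total_weight = sum(weights)
    if total_weight == 0.0:
        return 0.0
    return sum(w * l for w, l in zip(weights, losses)) / total_weight

def contrastive_penalty(embeddings, pseudo_labels, margin=1.0):
    """Hypothetical pairwise regularizer on sample embeddings.

    Pairs sharing a pseudo label are pulled together (squared distance);
    pairs with different pseudo labels are pushed at least `margin` apart.
    """
    total, pairs = 0.0, 0
    n = len(embeddings)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(embeddings[i], embeddings[j])
            if pseudo_labels[i] == pseudo_labels[j]:
                total += d ** 2
            else:
                total += max(0.0, margin - d) ** 2
            pairs += 1
    return total / pairs if pairs else 0.0

if __name__ == "__main__":
    # Toy predictions over two classes (not hate, hate); values are made up.
    probs = [[0.5, 0.5], [0.9, 0.1], [0.2, 0.8]]
    pseudo = [0, 0, 1]
    print(confidence_weights(probs))
    print(reweighted_cross_entropy(probs, pseudo))
    print(contrastive_penalty([[0.0, 0.0], [0.1, 0.0], [1.5, 0.0]], pseudo))
```

In this sketch, the reweighting step lets confidently predicted samples dominate the loss when no ground-truth target labels are available, while the pairwise term keeps samples with the same weak label close in representation space. How the paper combines or parameterizes these terms is not stated in the excerpt.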
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Internet Traffic Analysis and Secure E-voting
