Disentangling Hate in Online Memes
Rui Cao, Ziqing Fan, Roy Ka-Wei Lee, Wen-Haw Chong, Jing Jiang

TL;DR
This paper introduces DisMultiHate, a framework that classifies multimodal hateful memes by disentangling their target entities. Experiments and case studies show improved accuracy and explainability over existing methods.
Contribution
DisMultiHate is the first framework to disentangle target entities in memes, enhancing both classification performance and interpretability in multimodal hateful content detection.
Findings
DisMultiHate outperforms state-of-the-art unimodal and multimodal baselines in hateful meme classification.
The framework improves explainability by disentangling target entities.
Experiments on two publicly available hateful and offensive meme datasets demonstrate its effectiveness.
Abstract
Hateful and offensive content detection has been extensively explored in a single modality such as text. However, such toxic information could also be communicated via multimodal content such as online memes. Therefore, detecting multimodal hateful content has recently garnered much attention in academic and industry research communities. This paper aims to contribute to this emerging research topic by proposing DisMultiHate, a novel framework that performs the classification of multimodal hateful content. Specifically, DisMultiHate is designed to disentangle target entities in multimodal memes to improve hateful content classification and explainability. We conduct extensive experiments on two publicly available hateful and offensive meme datasets. Our experiment results show that DisMultiHate is able to outperform state-of-the-art unimodal and multimodal baselines in the…
