MemeGuard: An LLM and VLM-based Framework for Advancing Content Moderation via Meme Intervention
Prince Jha, Raghav Jain, Konika Mandal, Aman Chadha, Sriparna Saha, Pushpak Bhattacharyya

TL;DR
MemeGuard introduces a multimodal framework utilizing LLMs and VLMs to enhance meme content moderation through interpretation and intervention, supported by a new dataset of toxic memes and interventions.
Contribution
The paper presents MemeGuard, a novel multimodal framework combining VLMs and LLMs for meme intervention, and introduces the ICMM dataset for toxic meme intervention evaluation.
Findings
MemeGuard effectively generates contextually appropriate interventions for toxic memes.
The ICMM dataset provides high-quality annotations for meme intervention research.
Experimental results show MemeGuard outperforms baseline methods in intervention relevance.
Abstract
In the digital world, memes present a unique challenge for content moderation due to their potential to spread harmful content. Although detection methods have improved, proactive solutions such as intervention are still limited, with current research focusing mostly on text-based content, neglecting the widespread influence of multimodal content like memes. Addressing this gap, we present MemeGuard, a comprehensive framework leveraging Large Language Models (LLMs) and Visual Language Models (VLMs) for meme intervention. MemeGuard harnesses a specially fine-tuned VLM, VLMeme, for meme interpretation, and a multimodal knowledge selection and ranking mechanism (MKS) for distilling relevant knowledge. This knowledge is then employed by a general-purpose LLM to generate contextually appropriate interventions. Another key contribution of this work is the…
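The three stages named in the abstract (interpretation, knowledge selection and ranking, intervention generation) can be pictured with a minimal sketch. Everything below is an illustrative assumption, not the authors' implementation: the function names are hypothetical, the VLM interpretation is represented by a plain description string, Jaccard token overlap is a simple stand-in for the MKS ranking, and the final LLM call is omitted, with only its prompt being assembled.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class KnowledgeItem:
    """A candidate knowledge sentence with its relevance score (hypothetical type)."""
    text: str
    score: float = 0.0


def tokenize(text: str) -> Set[str]:
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    tokens = (tok.strip(".,!?;:\"'").lower() for tok in text.split())
    return {tok for tok in tokens if tok}


def rank_knowledge(description: str, candidates: List[str], top_k: int = 2) -> List[KnowledgeItem]:
    """Rank candidates by Jaccard overlap with the meme description.

    Assumption: this overlap score only stands in for the paper's MKS
    mechanism, whose actual scoring is not described in the abstract.
    """
    query = tokenize(description)
    scored = []
    for text in candidates:
        cand = tokenize(text)
        union = query | cand
        score = len(query & cand) / len(union) if union else 0.0
        scored.append(KnowledgeItem(text, score))
    scored.sort(key=lambda item: item.score, reverse=True)
    return scored[:top_k]


def build_intervention_prompt(description: str, knowledge: List[KnowledgeItem]) -> str:
    """Assemble the text a general-purpose LLM would receive (the LLM call is omitted)."""
    facts = "\n".join(f"- {item.text}" for item in knowledge)
    return (
        "Meme interpretation:\n"
        f"{description}\n\n"
        "Relevant knowledge:\n"
        f"{facts}\n\n"
        "Write a short, respectful intervention that counters the harmful message."
    )


# In a real system the description would come from a VLM; here it is hard-coded.
description = "The meme mocks immigrants as criminals"
candidates = [
    "Studies show immigrants commit crimes at lower rates",
    "Cats are popular pets",
    "Stereotyping immigrants as criminals is harmful",
]
ranked = rank_knowledge(description, candidates, top_k=2)
prompt = build_intervention_prompt(description, ranked)
```

Keeping the ranking step separate from prompt assembly means a learned ranker could replace the overlap score without touching the rest of the sketch.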
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Social Media and Politics · Digital Games and Media
