See, Explain, and Intervene: A Few-Shot Multimodal Agent Framework for Hateful Meme Moderation
Naquee Rizwan, Subhankar Swain, Paramananda Bhaskar, Gagan Aryan, Shehryaar Shah Khan, Animesh Mukherjee

TL;DR
This paper introduces a novel multimodal agent framework for hateful meme moderation that integrates detection, explanation, and intervention, leveraging few-shot learning with generative AI to operate effectively with limited data.
Contribution
It presents the first unified framework combining detection, explanation, and intervention for hateful memes using multimodal models with few-shot capabilities.
Findings
Effective detection of hateful memes with limited data
Generative models provide explanations for meme content
Proposed framework demonstrates strong potential for real-world deployment
Abstract
In this work, we examine hateful memes from three complementary angles - how to detect them, how to explain their content and how to intervene them prior to being posted - by applying a range of strategies built on top of generative AI models. To the best of our knowledge, explanation and intervention have typically been studied separately from detection, which does not reflect real-world conditions. Further, since curating large annotated datasets for meme moderation is prohibitively expensive, we propose a novel framework that leverages task-specific generative multimodal agents and the few-shot adaptability of large multimodal models to cater to different types of memes. We believe this is the first work focused on generalizable hateful meme moderation under limited data conditions, and has strong potential for deployment in real-world production scenarios. Warning: Contains…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsHate Speech and Cyberbullying Detection · Misinformation and Its Impacts · Psychology of Moral and Emotional Judgment
