See, Explain, and Intervene: A Few-Shot Multimodal Agent Framework for Hateful Meme Moderation

Naquee Rizwan; Subhankar Swain; Paramananda Bhaskar; Gagan Aryan; Shehryaar Shah Khan; Animesh Mukherjee

arXiv:2601.04692·cs.CL·January 9, 2026

TL;DR

This paper introduces a novel multimodal agent framework for hateful meme moderation that integrates detection, explanation, and intervention, leveraging few-shot learning with generative AI to operate effectively with limited data.

Contribution

It presents the first unified framework combining detection, explanation, and intervention for hateful memes using multimodal models with few-shot capabilities.

Findings

1. Effective detection of hateful memes with limited data
2. Generative models provide explanations for meme content
3. Proposed framework demonstrates strong potential for real-world deployment

Abstract

In this work, we examine hateful memes from three complementary angles: how to detect them, how to explain their content, and how to intervene before they are posted. We do so by applying a range of strategies built on top of generative AI models. To the best of our knowledge, explanation and intervention have typically been studied separately from detection, which does not reflect real-world conditions. Further, since curating large annotated datasets for meme moderation is prohibitively expensive, we propose a novel framework that leverages task-specific generative multimodal agents and the few-shot adaptability of large multimodal models to cater to different types of memes. We believe this is the first work focused on generalizable hateful meme moderation under limited data conditions, and that it has strong potential for deployment in real-world production scenarios. Warning: Contains…
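The abstract describes the pipeline only at a high level, so the following is a minimal sketch of how a detect, explain, intervene flow with few-shot prompting might be wired together, not the authors' implementation. Every name here (`Meme`, `Shot`, `ModerationResult`, `build_few_shot_prompt`, `moderate`), the prompt wording, and the label strings are assumptions made for illustration. The multimodal model is passed in as a plain callable so that no real API is implied.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Meme:
    image_path: str    # hypothetical field: location of the meme image
    overlay_text: str  # hypothetical field: text extracted from the meme


@dataclass
class Shot:
    meme: Meme
    label: str  # assumed label set: "hateful" or "benign"


@dataclass
class ModerationResult:
    label: str
    explanation: Optional[str]
    intervention: Optional[str]


def build_few_shot_prompt(task: str, shots: List[Shot], query: Meme) -> str:
    """Assemble a text prompt from a handful of labeled examples plus the query."""
    lines = [f"Task: {task}"]
    for i, shot in enumerate(shots, start=1):
        lines.append(f"Example {i}: text={shot.meme.overlay_text!r} -> {shot.label}")
    lines.append(f"Query: text={query.overlay_text!r} ->")
    return "\n".join(lines)


def moderate(
    query: Meme,
    shots: List[Shot],
    model: Callable[[str, str], str],  # assumed signature: (prompt, image_path) -> text
) -> ModerationResult:
    """Run detection first; explain and intervene only if the meme is flagged."""
    detect_prompt = build_few_shot_prompt(
        "Classify the meme as hateful or benign.", shots, query
    )
    label = model(detect_prompt, query.image_path).strip().lower()
    if label != "hateful":
        return ModerationResult(label, None, None)

    explanation = model(
        f"Explain why this meme is hateful. Text: {query.overlay_text!r}",
        query.image_path,
    )
    intervention = model(
        f"Write a short message asking the user to reconsider posting. "
        f"Reason: {explanation}",
        query.image_path,
    )
    return ModerationResult(label, explanation, intervention)
```

In this sketch, explanation and intervention are chained to the detection output, which mirrors the abstract's point that the three tasks should not be treated in isolation. Swapping the `model` callable is the only change needed to try a different large multimodal model.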

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Misinformation and Its Impacts · Psychology of Moral and Emotional Judgment