Read as You See: Guiding Unimodal LLMs for Low-Resource Explainable Harmful Meme Detection
Fengjun Pan, Xiaobao Wu, Tho Quan, Anh Tuan Luu

TL;DR
This paper introduces U-CoT+, a resource-efficient framework that enables unimodal LLMs to detect harmful memes by converting visual content into text and guiding reasoning with interpretable prompts, reaching performance comparable to resource-intensive models on benchmark datasets.
Contribution
The paper proposes a novel low-resource approach that decouples meme recognition from harmfulness analysis, using a meme-to-text pipeline and zero-shot CoT prompting for explainable detection.
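The decoupling described above can be sketched as two stages: a description step that turns the meme into text, and a reasoning step that runs a zero-shot chain-of-thought prompt over that text. The sketch below is illustrative only and is not the authors' implementation. The names `Describer`, `Reasoner`, `Verdict`, `build_cot_prompt`, `parse_verdict`, and `detect` are hypothetical, the models are passed in as plain callables so that no particular library API is assumed, and the prompt wording (including the guideline list and the "Answer:" line) is an assumption rather than the paper's prompt.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-ins: any callable with these shapes will do.
Describer = Callable[[str], str]  # image path -> text description (lightweight LMM)
Reasoner = Callable[[str], str]   # prompt -> generated text (unimodal LLM)


@dataclass
class Verdict:
    harmful: bool
    rationale: str  # the full step-by-step output, kept for explainability


def build_cot_prompt(description: str, guidelines: list[str]) -> str:
    """Assemble a zero-shot chain-of-thought prompt (assumed wording)."""
    rules = "\n".join(f"- {g}" for g in guidelines)
    return (
        "Meme description:\n"
        f"{description}\n\n"
        "Guidelines:\n"
        f"{rules}\n\n"
        "Let's think step by step, then finish with a line "
        "'Answer: harmful' or 'Answer: harmless'."
    )


def parse_verdict(output: str) -> Verdict:
    """Read the last 'Answer:' line; default to harmless if none is found."""
    harmful = False
    for line in reversed(output.strip().splitlines()):
        text = line.strip().lower()
        if text.startswith("answer:"):
            harmful = text.split(":", 1)[1].strip() == "harmful"
            break
    return Verdict(harmful=harmful, rationale=output.strip())


def detect(image_path: str, describe: Describer, reason: Reasoner,
           guidelines: list[str]) -> Verdict:
    """Stage 1: meme -> text. Stage 2: text-only reasoning over the description."""
    description = describe(image_path)
    prompt = build_cot_prompt(description, guidelines)
    return parse_verdict(reason(prompt))


if __name__ == "__main__":
    # Stub models so the sketch runs without any model dependency.
    fake_describe = lambda path: "A photo of a cat with the caption 'Monday again'."
    fake_reason = lambda prompt: (
        "The caption is a harmless joke about weekdays.\nAnswer: harmless"
    )
    result = detect("meme.png", fake_describe, fake_reason,
                    ["Consider whether any group is targeted."])
    print(result.harmful)  # False
```

Because the reasoning stage only ever sees text, the describer and the reasoner can be swapped independently, and the returned rationale is the model's own step-by-step output rather than a bare label.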
Findings
Achieves comparable performance to resource-intensive models on benchmark datasets.
Provides transparent, step-by-step rationales for harmful meme detection.
Demonstrates effectiveness and scalability of the approach.
Abstract
Detecting harmful memes is crucial for safeguarding the integrity and harmony of online environments, yet existing detection methods are often resource-intensive, inflexible, and lacking in explainability, which limits their applicability in real-world web content moderation. We propose U-CoT+, a resource-efficient framework that prioritizes accessibility, flexibility and transparency in harmful meme detection by fully harnessing the capabilities of lightweight unimodal large language models (LLMs). Instead of directly prompting or fine-tuning large multimodal models (LMMs) as black-box classifiers, we avoid reasoning directly over complex visual inputs and instead decouple meme content recognition from meme harmfulness analysis through a high-fidelity meme-to-text pipeline, in which lightweight LMMs and LLMs collaborate to convert multimodal memes into natural language descriptions that…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Misinformation and Its Impacts · Spam and Phishing Detection
