Robust Adaptation of Large Multimodal Models for Retrieval Augmented Hateful Meme Detection

Jingbiao Mei; Jinghong Chen; Guangyu Yang; Weizhe Lin; Bill Byrne

arXiv:2502.13061 · cs.CL · March 3, 2026

TL;DR

This paper introduces a robust adaptation framework for large multimodal models to improve hateful meme detection, enhancing accuracy, generalization, robustness, and interpretability across multiple datasets.

Contribution

It proposes a novel adaptation method that significantly improves out-of-domain generalization and robustness of LMMs for hateful meme detection, surpassing existing fine-tuning approaches.

Findings

01. Achieves state-of-the-art performance on six datasets

02. Demonstrates improved robustness against adversarial attacks

03. Provides higher-quality rationales for model explanations

Abstract

Hateful memes have become a significant concern on the Internet, necessitating robust automated detection systems. While Large Multimodal Models (LMMs) have shown promise in hateful meme detection, they face notable challenges like sub-optimal performance and limited out-of-domain generalization capabilities. Recent studies further reveal the limitations of both supervised fine-tuning (SFT) and in-context learning when applied to LMMs in this setting. To address these issues, we propose a robust adaptation framework for hateful meme detection that enhances in-domain accuracy and cross-domain generalization while preserving the general vision-language capabilities of LMMs. Analysis reveals that our approach achieves improved robustness under adversarial attacks compared to SFT models. Experiments on six meme classification datasets show that our approach achieves state-of-the-art…

Code & Models

Repositories

JingbiaoMei/RGCL (PyTorch, official)

Videos

Robust Adaptation of Large Multimodal Models for Retrieval Augmented Hateful Meme Detection · Underline

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Sentiment Analysis and Opinion Mining · Misinformation and Its Impacts

Methods: Shrink and Fine-Tune · Contrastive Learning