Robust Adaptation of Large Multimodal Models for Retrieval Augmented Hateful Meme Detection
Jingbiao Mei, Jinghong Chen, Guangyu Yang, Weizhe Lin, Bill Byrne

TL;DR
This paper introduces a robust adaptation framework for large multimodal models to improve hateful meme detection, enhancing accuracy, generalization, robustness, and interpretability across multiple datasets.
Contribution
It proposes a novel adaptation method that significantly improves out-of-domain generalization and robustness of LMMs for hateful meme detection, surpassing existing fine-tuning approaches.
Findings
Achieves state-of-the-art performance on six meme classification datasets
Demonstrates improved robustness under adversarial attacks compared to SFT models
Provides higher-quality rationales for model explanations
Abstract
Hateful memes have become a significant concern on the Internet, necessitating robust automated detection systems. While Large Multimodal Models (LMMs) have shown promise in hateful meme detection, they face notable challenges like sub-optimal performance and limited out-of-domain generalization capabilities. Recent studies further reveal the limitations of both supervised fine-tuning (SFT) and in-context learning when applied to LMMs in this setting. To address these issues, we propose a robust adaptation framework for hateful meme detection that enhances in-domain accuracy and cross-domain generalization while preserving the general vision-language capabilities of LMMs. Analysis reveals that our approach achieves improved robustness under adversarial attacks compared to SFT models. Experiments on six meme classification datasets show that our approach achieves state-of-the-art…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Sentiment Analysis and Opinion Mining · Misinformation and Its Impacts
Methods: Shrink and Fine-Tune · Contrastive Learning
