From Native Memes to Global Moderation: Cross-Cultural Evaluation of Vision-Language Models for Hateful Meme Detection

Mo Wang; Kaixuan Ren; Pratik Jalan; Ahmed Ashraf; Tuong Vy Vu; Rahul Seetharaman; Shah Nawaz; Usman Naseem

arXiv:2602.07497·cs.CL·February 13, 2026

Open Access

TL;DR

This paper evaluates how vision-language models perform in detecting hateful memes across different cultures and languages, revealing biases and proposing strategies for more globally fair moderation.

Contribution

It introduces a framework for cross-cultural evaluation of VLMs, analyzing the impact of language and learning strategies on hate detection robustness.

Findings

01. Translate-then-detect degrades performance

02. Native-language prompting improves detection

03. Cultural biases converge towards Western norms

Abstract

Cultural context profoundly shapes how people interpret online content, yet vision-language models (VLMs) remain predominantly trained through Western or English-centric lenses. This limits their fairness and cross-cultural robustness in tasks like hateful meme detection. We introduce a systematic evaluation framework designed to diagnose and quantify the cross-cultural robustness of state-of-the-art VLMs across multilingual meme datasets, analyzing three axes: (i) learning strategy (zero-shot vs. one-shot), (ii) prompting language (native vs. English), and (iii) translation effects on meaning and detection. Results show that the common "translate-then-detect" approach deteriorates performance, while culturally aligned interventions, namely native-language prompting and one-shot learning, significantly enhance detection. Our findings reveal systematic convergence toward Western safety norms…
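
The three axes named in the abstract can be pictured as a grid of evaluation conditions. The sketch below is illustrative only and is not taken from the paper: the names `Condition`, `evaluation_grid`, and `accuracy` are hypothetical, fully crossing the three axes is an assumption, and plain accuracy stands in for whatever metric the authors actually report.

```python
from dataclasses import dataclass
from itertools import product

# Axis values paraphrased from the abstract; the string labels are hypothetical.
STRATEGIES = ("zero-shot", "one-shot")
PROMPT_LANGUAGES = ("native", "english")
MEME_TEXTS = ("original", "translated")


@dataclass(frozen=True)
class Condition:
    """One cell of the (assumed) evaluation grid."""
    strategy: str          # learning strategy: zero-shot or one-shot
    prompt_language: str   # language of the instruction prompt
    meme_text: str         # meme text kept as-is or translated first


def evaluation_grid():
    """Enumerate every combination of the three axes (assumes full crossing)."""
    return [
        Condition(s, p, m)
        for s, p, m in product(STRATEGIES, PROMPT_LANGUAGES, MEME_TEXTS)
    ]


def accuracy(predictions, labels):
    """Fraction of predictions that match the gold hateful/not-hateful labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot score an empty evaluation set")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


if __name__ == "__main__":
    for condition in evaluation_grid():
        print(condition)
```

Under these assumptions, a "translate-then-detect" run would correspond to the cells with `meme_text="translated"` and `prompt_language="english"`, and comparing per-condition scores across cells is what would expose the effects the abstract describes.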

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Misinformation and Its Impacts · Spam and Phishing Detection