Memes in the Wild: Assessing the Generalizability of the Hateful Memes Challenge Dataset

Hannah Rose Kirk; Yennie Jun; Paulius Rauba; Gal Wachtel; Ruining Li; Xingjian Bai; Noah Broestl; Martin Doff-Sotta; Aleksandar Shtedritski; Yuki M. Asano

arXiv:2107.04313·cs.CV·July 12, 2021

TL;DR

This paper tests whether hateful meme detection models pre-trained on Facebook's Hateful Memes Challenge dataset generalize to real-world memes collected from Pinterest. It identifies two obstacles: noise introduced by OCR caption extraction, and the greater diversity of meme formats found in the wild.

Contribution

The paper provides an empirical, out-of-sample assessment of how models pre-trained on the Hateful Memes Challenge dataset perform on memes in the wild, and shows where those memes differ from the benchmark's synthetic examples.

Findings

01. OCR noise reduces model performance

02. Memes in the wild are more diverse

03. Current benchmarks may not reflect real-world scenarios

Abstract

Hateful memes pose a unique challenge for current machine learning systems because their message is derived from both text and visual modalities. To this effect, Facebook released the Hateful Memes Challenge, a dataset of memes with pre-extracted text captions, but it is unclear whether these synthetic examples generalize to "memes in the wild". In this paper, we collect hateful and non-hateful memes from Pinterest to evaluate out-of-sample performance on models pre-trained on the Facebook dataset. We find that memes in the wild differ in two key aspects: 1) Captions must be extracted via OCR, injecting noise and diminishing performance of multimodal models, and 2) Memes are more diverse than "traditional memes", including screenshots of conversations or text on a plain background. This paper thus serves as a reality check for the current benchmark of hateful meme detection and its…
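The abstract attributes part of the performance drop to noise injected by OCR. One common way to put a number on that kind of noise is the character error rate: the edit distance between a clean caption and its OCR output, divided by the length of the clean caption. The sketch below is illustrative only. The function names and the example strings are invented here, and nothing in this listing indicates that the authors measured OCR noise this way.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, ocr_output: str) -> float:
    """Edit distance normalized by the reference length (hypothetical helper)."""
    if not reference:
        return 0.0 if not ocr_output else 1.0
    return edit_distance(reference, ocr_output) / len(reference)


# Invented example: a clean caption and a plausible noisy OCR reading of it.
clean_caption = "when you see it"
ocr_caption = "vvhen you 5ee it"
cer = character_error_rate(clean_caption, ocr_caption)
```

A rate of 0 means the OCR text matches the reference exactly, and higher values mean more character-level corruption reaches the text input of a multimodal model.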
