SAFE-MEME: Structured Reasoning Framework for Robust Hate Speech Detection in Memes

Palash Nandi; Shivam Sharma; Tanmoy Chakraborty

arXiv:2412.20541 · cs.CL · December 31, 2024

TL;DR

SAFE-MEME introduces a structured reasoning framework and new datasets for detecting nuanced hate speech in memes, significantly improving robustness and accuracy over existing methods.

Contribution

The paper presents SAFE-MEME, a novel multimodal reasoning framework with hierarchical categorization and new datasets for fine-grained hate speech detection in memes.
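One way to read "hierarchical categorization" is as a coarse-to-fine decision: first decide whether a meme is hateful at all, and only then assign a fine-grained category. The sketch below illustrates that control flow only. The `Meme` type, the keyword-based scorers, and the category names are hypothetical placeholders, not the models, labels, or prompts used by SAFE-MEME-H.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Meme:
    # Hypothetical container: overlaid text plus a short image description.
    text: str
    image_caption: str


def classify_hierarchically(
    meme: Meme,
    is_hateful: Callable[[Meme], bool],
    fine_grained: Callable[[Meme], str],
) -> Optional[str]:
    """Coarse-to-fine decision: run the fine-grained stage only for hateful memes."""
    if not is_hateful(meme):
        return None  # benign memes never reach the second stage
    return fine_grained(meme)


# Toy stand-ins for trained models (assumptions for illustration only).
def toy_is_hateful(meme: Meme) -> bool:
    return "slur" in meme.text.lower()


def toy_fine_grained(meme: Meme) -> str:
    return "explicit" if "slur" in meme.text.lower() else "implicit"


benign = Meme(text="Monday again", image_caption="a tired cat")
flagged = Meme(text="[slur] go home", image_caption="a crowd")

print(classify_hierarchically(benign, toy_is_hateful, toy_fine_grained))
print(classify_hierarchically(flagged, toy_is_hateful, toy_fine_grained))
```

Passing the two stages in as callables keeps the routing logic independent of whichever models fill those roles.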

Findings

01

SAFE-MEME-QA outperforms existing baselines by approximately 4-5% on average.

02

SAFE-MEME-H outperforms baselines in regular scenarios (the MHS dataset).

03

Fine-tuning adapters can outperform full fine-tuning in certain cases.
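For context on the third finding: a bottleneck adapter is commonly a small down-projection and up-projection pair added to an otherwise frozen layer, so far fewer weights are trained than in full fine-tuning. The sketch below only counts trainable parameters for one such layer. The sizes (hidden width 768, bottleneck 64) are illustrative assumptions, not values reported for SAFE-MEME.

```python
def dense_params(d_in: int, d_out: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return d_in * d_out + d_out


def adapter_params(hidden: int, bottleneck: int) -> int:
    """Bottleneck adapter: a down-projection followed by an up-projection."""
    return dense_params(hidden, bottleneck) + dense_params(bottleneck, hidden)


hidden, bottleneck = 768, 64  # assumed sizes, for illustration only
full = dense_params(hidden, hidden)          # one fully fine-tuned dense layer
adapter = adapter_params(hidden, bottleneck)  # trainable weights with an adapter

print(full, adapter, round(adapter / full, 3))
```

Under these assumed sizes the adapter trains roughly a sixth of the weights of the dense layer it sits beside.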

Abstract

Memes act as cryptic tools for sharing sensitive ideas, often requiring contextual knowledge to interpret. This makes moderating multimodal memes challenging, as existing works either lack high-quality datasets on nuanced hate categories or rely on low-quality social media visuals. Here, we curate two novel multimodal hate speech datasets, MHS and MHS-Con, that capture fine-grained hateful abstractions in regular and confounding scenarios, respectively. We benchmark these datasets against several competing baselines. Furthermore, we introduce SAFE-MEME (Structured reAsoning FramEwork), a novel multimodal Chain-of-Thought-based framework employing Q&A-style reasoning (SAFE-MEME-QA) and hierarchical categorization (SAFE-MEME-H) to enable robust hate speech detection in memes. SAFE-MEME-QA outperforms existing baselines, achieving an average improvement of approximately 5% and 4% on MHS…

Code & Models

Repositories

PalGitts/SAFE-MEME
PyTorch · Official

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Sentiment Analysis and Opinion Mining

Methods: Adapter