Harm or Humor: A Multimodal, Multilingual Benchmark for Overt and Covert Harmful Humor
Ahmed Sharshar, Hosam Elgendy, Saad El Dine Ahmed, Yasser Rohaim, Yuxia Wang

TL;DR
This paper introduces a multilingual, multimodal benchmark for detecting harmful and offensive humor, with an emphasis on cultural nuance and implicit cues, and evaluates how well current models understand such humor.
Contribution
The paper presents a manually curated dataset spanning text, images, and videos in English and Arabic, together with strict annotation guidelines that separate Safe from Harmful humor and further divide Harmful humor into Explicit (overt) and Implicit (covert) categories.
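The two-level labeling scheme described above (safe versus harmful, with harmful items further marked explicit or implicit) could be represented roughly as follows. This is a minimal sketch only: the names `Harm`, `HarmType`, and `HumorLabel` are hypothetical and are not taken from the paper or its released data format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Harm(Enum):
    """Top-level decision: is the joke safe or harmful? (hypothetical name)"""
    SAFE = "safe"
    HARMFUL = "harmful"


class HarmType(Enum):
    """Second-level decision, only meaningful for harmful items (hypothetical name)."""
    EXPLICIT = "explicit"  # overt harm
    IMPLICIT = "implicit"  # covert harm


@dataclass(frozen=True)
class HumorLabel:
    """One annotation; assumes harmful items always carry a subtype."""
    harm: Harm
    harm_type: Optional[HarmType] = None

    def __post_init__(self) -> None:
        # A safe item must not carry an explicit/implicit subtype.
        if self.harm is Harm.SAFE and self.harm_type is not None:
            raise ValueError("safe items cannot have a harm type")
        # A harmful item must say whether the harm is explicit or implicit.
        if self.harm is Harm.HARMFUL and self.harm_type is None:
            raise ValueError("harmful items need a harm type")
```

Validating the combination at construction time keeps inconsistent labels, such as a safe joke tagged as implicit, out of any downstream evaluation code.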
Findings
Closed-source models outperform open-source models.
Performance varies significantly between English and Arabic.
Deep reasoning is required for accurate harm detection.
Abstract
Dark humor often relies on subtle cultural nuances and implicit cues that require contextual reasoning to interpret, posing safety challenges that current static benchmarks fail to capture. To address this, we introduce a novel multimodal, multilingual benchmark for detecting and understanding harmful and offensive humor. Our manually curated dataset comprises 3,000 texts and 6,000 images in English and Arabic, alongside 1,200 videos that span English, Arabic, and language-independent (universal) contexts. Unlike standard toxicity datasets, we enforce a strict annotation guideline that distinguishes Safe jokes from Harmful ones, with the latter further classified into Explicit (overt) and Implicit (covert) categories to probe deep reasoning. We systematically evaluate state-of-the-art (SOTA) open-source and closed-source models across all modalities. Our findings reveal that closed-source models…
Taxonomy
Topics: Humor Studies and Applications · Hate Speech and Cyberbullying Detection · Psychology of Moral and Emotional Judgment
