YesBut: A High-Quality Annotated Multimodal Dataset for evaluating Satire Comprehension capability of Vision-Language Models
Abhilash Nandy, Yash Agarwal, Ashish Patwa, Millon Madhur Das, Aman Bansal, Ankit Raj, Pawan Goyal, Niloy Ganguly

TL;DR
This paper introduces YesBut, a high-quality multimodal dataset designed to evaluate vision-language models' ability to understand satire, revealing current models' limitations in zero-shot satire comprehension tasks.
Contribution
The paper presents a new challenging dataset and tasks for satire detection and understanding, highlighting the gap in current vision-language models' capabilities.
Findings
Models perform poorly on satire tasks in zero-shot settings
Current models struggle with understanding satire in multimodal contexts
The dataset enables better evaluation of satire comprehension in AI
Abstract
Understanding satire and humor is a challenging task for even current Vision-Language models. In this paper, we propose the challenging tasks of Satirical Image Detection (detecting whether an image is satirical), Understanding (generating the reason behind the image being satirical), and Completion (given one half of the image, selecting the other half from 2 given options, such that the complete image is satirical) and release a high-quality dataset YesBut, consisting of 2547 images, 1084 satirical and 1463 non-satirical, containing different artistic styles, to evaluate those tasks. Each satirical image in the dataset depicts a normal scenario, along with a conflicting scenario which is funny or ironic. Despite the success of current Vision-Language Models on multimodal tasks such as Visual QA and Image Captioning, our benchmarking experiments show that such models perform poorly on…
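As a rough aid for reading scores on these tasks, the sketch below works out two trivial reference points from the numbers in the abstract: the accuracy of always predicting the larger class on Satirical Image Detection, and chance accuracy on the two-option Completion task. The function names are hypothetical, and the assumption that accuracy is the metric of interest is ours; the excerpt above does not state which metrics the paper reports.

```python
# Class counts as stated in the abstract.
N_SATIRICAL = 1084
N_NON_SATIRICAL = 1463
N_TOTAL = N_SATIRICAL + N_NON_SATIRICAL  # 2547 images in total


def majority_baseline_accuracy(n_pos: int, n_neg: int) -> float:
    """Accuracy obtained by always predicting the larger class (hypothetical helper)."""
    return max(n_pos, n_neg) / (n_pos + n_neg)


def chance_accuracy(n_options: int) -> float:
    """Expected accuracy of uniform random guessing among n_options (hypothetical helper)."""
    return 1.0 / n_options


# Detection: a model that always answers "non-satirical".
detection_floor = majority_baseline_accuracy(N_SATIRICAL, N_NON_SATIRICAL)

# Completion: the abstract says the other half is chosen from 2 options.
completion_floor = chance_accuracy(2)

print(f"Total images: {N_TOTAL}")
print(f"Detection majority-class accuracy: {detection_floor:.4f}")
print(f"Completion chance accuracy: {completion_floor:.2f}")
```

Under these assumptions, a detection accuracy near 57% or a completion accuracy near 50% would indicate a model doing no better than a trivial strategy.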
Taxonomy
Topics: Language, Metaphor, and Cognition · Humor Studies and Applications · Swearing, Euphemism, Multilingualism
