Nice perfume. How long did you marinate in it? Multimodal Sarcasm Explanation
Poorav Desai, Tanmoy Chakraborty, Md Shad Akhtar

TL;DR
This paper introduces the task of Multimodal Sarcasm Explanation (MuSE), together with MORE, a dataset of sarcastic image-and-caption posts annotated with natural language explanations, and a model that generates such explanations, with the aim of making sarcasm detection more interpretable.
Contribution
It proposes the novel task of Multimodal Sarcasm Explanation (MuSE), creates MORE, the first dataset of sarcastic multimodal posts paired with explanations (3510 posts), and benchmarks a Transformer-based model on the task.
Findings
The model outperforms baselines across multiple metrics.
Human evaluation shows moderate agreement (Fleiss' Kappa 0.4).
Empirical results demonstrate the effectiveness of cross-modal attention.
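For readers unfamiliar with the agreement statistic cited above, the following is a minimal sketch of how Fleiss' Kappa is computed from a table of rating counts. The function name `fleiss_kappa` and the toy ratings are hypothetical and are not taken from the paper; the summary does not state how many raters or rating categories the authors used.

```python
def fleiss_kappa(counts):
    """Fleiss' Kappa for a table of rating counts.

    counts[i][j] is the number of raters who assigned item i to category j.
    Every item must be rated by the same number of raters (at least two).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("at least two raters per item are required")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fell into each category.
    p_j = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Per-item agreement: fraction of rater pairs that agree on the item.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_items        # mean observed agreement
    p_e = sum(p * p for p in p_j)     # agreement expected by chance
    if p_e == 1.0:
        # All ratings in a single category: the statistic is undefined.
        return float("nan")
    return (p_bar - p_e) / (1.0 - p_e)


# Toy data (hypothetical): 4 items, 3 raters, 2 categories.
toy_ratings = [[3, 0], [2, 1], [0, 3], [1, 2]]
kappa = fleiss_kappa(toy_ratings)
```

Values near 1 indicate near-perfect agreement and values near 0 indicate agreement no better than chance. Under the commonly used Landis and Koch bands, a value of 0.4 falls at the boundary between fair and moderate agreement.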
Abstract
Sarcasm is a pervading linguistic phenomenon and highly challenging to explain due to its subjectivity, lack of context and deeply-felt opinion. In the multimodal setup, sarcasm is conveyed through the incongruity between the text and visual entities. Although recent approaches deal with sarcasm as a classification problem, it is unclear why an online post is identified as sarcastic. Without proper explanation, end users may not be able to perceive the underlying sense of irony. In this paper, we propose a novel problem -- Multimodal Sarcasm Explanation (MuSE) -- given a multimodal sarcastic post containing an image and a caption, we aim to generate a natural language explanation to reveal the intended sarcasm. To this end, we develop MORE, a new dataset with explanation of 3510 sarcastic multimodal posts. Each explanation is a natural language (English) sentence describing the hidden…
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Language, Metaphor, and Cognition · Advanced Text Analysis Techniques
