Multi-source Semantic Graph-based Multimodal Sarcasm Explanation Generation
Liqiang Jing, Xuemeng Song, Kun Ouyang, Mengzhao Jia, Liqiang Nie

TL;DR
This paper introduces TEAM, a novel multimodal sarcasm explanation model that leverages multi-source semantic graphs, object metadata, and external knowledge to generate more accurate explanations for sarcastic social media posts.
Contribution
TEAM integrates object-level metadata and external knowledge into a semantic graph to improve sarcasm explanation generation, addressing limitations of previous models.
Findings
TEAM outperforms existing methods on the MORE dataset.
Semantic graph modeling enhances sarcasm reasoning.
Incorporating external knowledge improves explanation quality.
Abstract
Multimodal Sarcasm Explanation (MuSE) is a new yet challenging task, which aims to generate a natural language sentence for a multimodal social post (an image as well as its caption) to explain why it contains sarcasm. Although the existing pioneer study has achieved great success with the BART backbone, it overlooks the gap between the visual feature space and the decoder semantic space, the object-level metadata of the image, as well as the potential external knowledge. To solve these limitations, in this work, we propose a novel mulTi-source sEmantic grAph-based Multimodal sarcasm explanation scheme, named TEAM. In particular, TEAM extracts the object-level semantic meta-data instead of the traditional global visual features from the input image. Meanwhile, TEAM resorts to ConceptNet to obtain the external related knowledge concepts for the input text and the extracted object…
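The abstract names three node sources for the graph: the caption text, object-level metadata from the image, and related concepts from ConceptNet. A minimal Python sketch of how such a multi-source graph might be assembled follows. Because the abstract is truncated, the edge rules (linking adjacent caption tokens, linking a word to its related concepts), the function name `build_semantic_graph`, and the `TOY_KNOWLEDGE` table are all illustrative assumptions, not the authors' construction. The table stands in for a real ConceptNet lookup.

```python
# Hypothetical stand-in for a ConceptNet lookup; the entries are made up
# for illustration and are not real query results.
TOY_KNOWLEDGE = {
    "rain": ["wet"],
    "umbrella": ["rain"],
    "picnic": ["outdoors"],
}


def build_semantic_graph(caption_tokens, object_labels, knowledge=TOY_KNOWLEDGE):
    """Build an undirected graph over caption, object, and knowledge nodes.

    Returns (nodes, edges): nodes maps node text to its source tag, and
    edges is a set of alphabetically sorted (a, b) pairs.
    """
    nodes = {}
    edges = set()

    def add_node(text, source):
        # The first source to introduce a node keeps its tag.
        nodes.setdefault(text, source)

    def add_edge(a, b):
        if a != b:
            edges.add(tuple(sorted((a, b))))

    # Caption tokens, with adjacent tokens linked (assumed rule).
    for token in caption_tokens:
        add_node(token, "caption")
    for left, right in zip(caption_tokens, caption_tokens[1:]):
        add_edge(left, right)

    # Object-level metadata extracted from the image.
    for label in object_labels:
        add_node(label, "object")

    # External concepts, each linked to the word that retrieved it (assumed rule).
    for text in list(caption_tokens) + list(object_labels):
        for concept in knowledge.get(text, []):
            add_node(concept, "knowledge")
            add_edge(text, concept)

    return nodes, edges


nodes, edges = build_semantic_graph(
    caption_tokens=["perfect", "picnic", "weather"],
    object_labels=["umbrella", "rain"],
)
```

In this toy post, the caption praises the weather while the detected objects suggest rain. The knowledge edge between `umbrella` and `rain` is the kind of cross-source link a graph-based model could use when reasoning about the mismatch.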
Taxonomy
Topics
Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques
Methods
Attention Is All You Need · Linear Layer · Adam · Byte Pair Encoding · Multi-Head Attention · Residual Connection · Softmax · Dense Connections · Dropout
