MMSD2.0: Towards a Reliable Multi-modal Sarcasm Detection System

Libo Qin; Shijue Huang; Qiguang Chen; Chenran Cai; Yudi Zhang; Bin Liang; Wanxiang Che; Ruifeng Xu

arXiv:2307.07135 · cs.CL · July 17, 2023 · 1 citation

TL;DR

This paper introduces MMSD2.0, a corrected benchmark dataset for multi-modal sarcasm detection, and proposes a multi-view CLIP framework that leverages multi-grained cues from text and images, significantly improving detection reliability.

Contribution

The paper presents MMSD2.0, a refined dataset removing biases and unreasonable samples, and introduces a novel multi-view CLIP framework for enhanced multi-modal sarcasm detection.

Findings

01. MMSD2.0 is a more reliable benchmark than the original MMSD.
02. Multi-view CLIP significantly outperforms previous baselines.
03. Combining cues from the text, image, and text-image interaction views improves sarcasm detection accuracy.

Abstract

Multi-modal sarcasm detection has attracted much recent attention. Nevertheless, the existing benchmark (MMSD) has shortcomings that hinder the development of reliable multi-modal sarcasm detection systems: (1) MMSD contains spurious cues that bias model learning; (2) the negative samples in MMSD are not always reasonable. To address these issues, we introduce MMSD2.0, a corrected dataset that fixes the shortcomings of MMSD by removing the spurious cues and re-annotating the unreasonable samples. We also present a novel framework called multi-view CLIP, which leverages multi-grained cues from multiple perspectives (i.e., the text, image, and text-image interaction views) for multi-modal sarcasm detection. Extensive experiments show that MMSD2.0 is a valuable benchmark for building reliable multi-modal sarcasm detection systems and…
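
The multi-view idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: random vectors stand in for CLIP text and image embeddings, the "interaction" view is a plain concatenation rather than a learned fusion module, and averaging the per-view probabilities is just one simple way to combine views. All function names (`make_head`, `apply_head`, `multi_view_predict`) and the label encoding are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def make_head(in_dim, rng, n_classes=2):
    """Hypothetical linear classifier head; random weights stand in for trained ones."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(n_classes)]
    bias = [0.0] * n_classes
    return weights, bias


def apply_head(head, features):
    """Return one logit per class for a feature vector."""
    weights, bias = head
    return [sum(w * x for w, x in zip(row, features)) + b for row, b in zip(weights, bias)]


def multi_view_predict(text_feat, image_feat, heads):
    """Score each view separately, then average the class probabilities."""
    views = {
        "text": text_feat,
        "image": image_feat,
        # Assumption: concatenation stands in for a learned text-image interaction module.
        "interaction": text_feat + image_feat,
    }
    per_view = {name: softmax(apply_head(heads[name], feat)) for name, feat in views.items()}
    n_classes = len(next(iter(per_view.values())))
    combined = [sum(p[c] for p in per_view.values()) / len(per_view) for c in range(n_classes)]
    return combined, per_view


rng = random.Random(0)
dim = 8  # toy embedding size; real CLIP embeddings are much larger
text_feat = [rng.gauss(0, 1) for _ in range(dim)]
image_feat = [rng.gauss(0, 1) for _ in range(dim)]
heads = {
    "text": make_head(dim, rng),
    "image": make_head(dim, rng),
    "interaction": make_head(2 * dim, rng),
}
combined, per_view = multi_view_predict(text_feat, image_feat, heads)
# Assumed label encoding: 1 = sarcastic, 0 = not sarcastic.
label = max(range(len(combined)), key=combined.__getitem__)
print(combined, label)
```

Keeping a separate head per view means each view's prediction can be inspected on its own, which helps when checking whether a model leans on a single modality.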

Code & Models

Repositories

joeying1019/mmsd2.0 (PyTorch, official)

Taxonomy

Topics: Sentiment Analysis and Opinion Mining · Natural Language Processing Techniques · Text and Document Classification Technologies

Methods: Contrastive Language-Image Pre-training