¡Qué maravilla! Multimodal Sarcasm Detection in Spanish: a Dataset and a Baseline

Khalid Alnajjar; Mika Hämäläinen

arXiv:2105.05542 · cs.CL · May 13, 2021

TL;DR

This paper introduces the first multimodal sarcasm dataset for Spanish, combining text, audio, and video, and demonstrates that multimodal models outperform text-only models in sarcasm detection.

Contribution

It creates a novel multimodal sarcasm dataset for Spanish and establishes baseline models showing the benefit of combining modalities.

Findings

1. Text-only sarcasm detection achieves 89% accuracy.
2. Adding audio improves accuracy to 91.9%.
3. Combining text, audio, and video yields 93.1% accuracy.
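
The size of each improvement can be worked out directly from the three figures above. The following is a minimal sketch, not code from the paper: the dictionary keys and the function name `gains_over_baseline` are hypothetical, and the relative error reduction assumes the reported figures are accuracies in percent, as the findings state.

```python
# Scores reported in the findings, in percent, keyed by the modalities used.
# The keys are illustrative labels, not names taken from the paper.
reported = {
    "text": 89.0,
    "text+audio": 91.9,
    "text+audio+video": 93.1,
}


def gains_over_baseline(scores, baseline="text"):
    """Return (percentage-point gain, relative error reduction in %) per setup.

    Assumes every score is an accuracy in percent, so the error rate
    of a setup is 100 minus its score.
    """
    base = scores[baseline]
    base_error = 100.0 - base
    out = {}
    for name, score in scores.items():
        if name == baseline:
            continue
        # Round to one decimal to hide floating-point noise.
        point_gain = round(score - base, 1)
        error_reduction = round(100.0 * (score - base) / base_error, 1)
        out[name] = (point_gain, error_reduction)
    return out


summary = gains_over_baseline(reported)
for name, (points, err_red) in summary.items():
    print(f"{name}: +{points} points, {err_red}% fewer errors than text only")
```

Under these assumptions, adding audio is worth 2.9 percentage points over text alone, and adding both audio and video is worth 4.1 points, which corresponds to removing roughly a third of the text-only errors.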

Abstract

We construct the first ever multimodal sarcasm dataset for Spanish. The audiovisual dataset consists of sarcasm-annotated text that is aligned with video and audio. The dataset represents two varieties of Spanish, a Latin American variety and a Peninsular Spanish variety, which ensures wider dialectal coverage for this global language. We present several models for sarcasm detection that will serve as baselines for future research. Our results show that text alone (89%) performs worse than text combined with audio (91.9%). The best results are obtained by combining all three modalities: text, audio, and video (93.1%).
