iSarcasm: A Dataset of Intended Sarcasm

Silviu Oprea; Walid Magdy

arXiv:1911.03123·cs.CL·May 5, 2020

TL;DR

This paper introduces the iSarcasm dataset of tweets labeled for intended sarcasm by authors, highlighting the challenges in detecting sarcasm and the limitations of existing models and datasets.

Contribution

The authors create iSarcasm, a dataset of tweets labeled for intended sarcasm by their own authors, and show that current sarcasm detection models perform worse on it than on previously studied datasets.

Findings

01 Existing sarcasm detection models perform poorly on iSarcasm.

02 Previously studied datasets may be biased or may contain only obvious sarcasm.

03 Detecting intended sarcasm remains a challenging problem in NLP.

Abstract

We consider the distinction between intended and perceived sarcasm in the context of textual sarcasm detection. The former occurs when an utterance is sarcastic from the perspective of its author, while the latter occurs when the utterance is interpreted as sarcastic by the audience. We show the limitations of previous labelling methods in capturing intended sarcasm and introduce the iSarcasm dataset of tweets labeled for sarcasm directly by their authors. Examining the state-of-the-art sarcasm detection models on our dataset showed low performance compared to previously studied datasets, which indicates that these datasets might be biased or obvious and sarcasm could be a phenomenon under-studied computationally thus far. By providing the iSarcasm dataset, we aim to encourage future NLP research to develop methods for detecting sarcasm in text as intended by the authors of the text,…
