TL;DR
This paper introduces MaSaC, a new Hindi-English code-mixed dataset for multi-modal sarcasm detection and humor classification in conversations, and proposes MSH-COMICS, an attention-based neural model that uses contextual information to classify utterances.
Contribution
The paper presents what the authors describe as the first Hindi-English code-mixed dataset for multi-modal sarcasm detection and humor classification in conversational dialog, and introduces a novel attention-rich neural architecture for utterance classification.
Findings
MSH-COMICS outperforms existing models by more than 1 F1-score point in sarcasm detection.
Achieves a 10-point F1-score improvement in humor classification.
Demonstrates the effectiveness of hierarchical and contextual attention mechanisms.
Abstract
Sarcasm detection and humor classification are inherently subtle problems, primarily due to their dependence on the contextual and non-verbal information. Furthermore, existing studies in these two topics are usually constrained in non-English languages such as Hindi, due to the unavailability of qualitative annotated datasets. In this work, we make two major contributions considering the above limitations: (1) we develop a Hindi-English code-mixed dataset, MaSaC, for the multi-modal sarcasm detection and humor classification in conversational dialog, which to our knowledge is the first dataset of its kind; (2) we propose MSH-COMICS, a novel attention-rich neural architecture for the utterance classification. We learn efficient utterance representation utilizing a hierarchical attention mechanism that attends to a small portion of the input sentence at a time. Further, we incorporate…
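The abstract describes a hierarchical attention mechanism that attends to a small portion of the input sentence at a time. The sketch below is one possible reading of that idea, not the MSH-COMICS implementation: it assumes fixed-size, non-overlapping windows, dot-product scoring against a single query vector, and a second attention pass over the window summaries. The names `attend`, `hierarchical_attention`, `window`, and `query` are hypothetical and do not come from the paper.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(vectors, query):
    """Return the softmax-weighted average of `vectors` and the weights used.

    Scores are plain dot products with `query` (an assumption of this sketch).
    """
    weights = softmax([dot(v, query) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return pooled, weights


def hierarchical_attention(tokens, query, window=3):
    """Two-level attention over token vectors.

    Level 1 attends inside each small window of tokens; level 2 attends
    over the resulting window summaries to build one utterance vector.
    The window size and the shared query are illustrative choices.
    """
    windows = [tokens[i:i + window] for i in range(0, len(tokens), window)]
    summaries = [attend(chunk, query)[0] for chunk in windows]
    return attend(summaries, query)


# Toy input: 7 random 4-dimensional "token embeddings" (hypothetical data).
random.seed(0)
tokens = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(7)]
query = [random.uniform(-1, 1) for _ in range(4)]
utterance, window_weights = hierarchical_attention(tokens, query, window=3)
print(len(utterance), len(window_weights))
```

With 7 tokens and a window of 3, the first level produces three window summaries, and `window_weights` shows how much each window contributes to the final utterance vector. A trained model would learn the scoring function rather than use a random query.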
