Impact of emoji exclusion on the performance of Arabic sarcasm detection models

Ghalyah H. Aleryani; Wael Deabes; Khaled Albishre; Alaa E. Abdel-Hakim

arXiv:2405.02195·cs.CL·May 6, 2024

TL;DR

This study explores how removing emojis from Arabic social media data affects sarcasm detection models, finding that emoji exclusion can significantly improve model accuracy and establish new NLP benchmarks.

Contribution

It introduces a novel approach of emoji exclusion in pre-training Arabic language models, enhancing sarcasm detection performance in social media analysis.

Findings

01. Emoji removal improves sarcasm detection accuracy.

02. Enhanced AraBERT models set new benchmarks for Arabic NLP.

03. Insights for social media sarcasm analysis in Arabic.

Abstract

Detecting sarcasm in Arabic on social media is a complex challenge, made harder by the diversity of the language and the nature of sarcastic expressions. Existing models show a significant gap in their ability to interpret sarcasm in Arabic, which calls for more sophisticated and precise detection methods. In this paper, we investigate the impact of a fundamental preprocessing component on sarcasm detection. While emojis play a crucial role in compensating for the absence of body language and facial expressions in modern communication, their impact on automated text analysis, particularly in sarcasm detection, remains underexplored. We investigate how excluding emojis from datasets affects the performance of sarcasm detection models on social media content in Arabic, a language with an exceptionally rich vocabulary. This investigation includes the…
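The abstract does not say how emojis are removed from the datasets, so the following is only a minimal sketch of what such a preprocessing step might look like. It assumes a regular expression over a few common emoji code point ranges; the function name `strip_emojis` and the chosen ranges are illustrative assumptions, not the authors' implementation.

```python
import re

# Assumed emoji ranges (illustrative, not taken from the paper):
# regional indicators (flags), the main pictograph blocks, and
# miscellaneous symbols / dingbats.
_EMOJI = "[\U0001F1E6-\U0001F1FF\U0001F300-\U0001FAFF\u2600-\u27BF]"

# One or more emojis, each optionally followed by variation selector-16
# (U+FE0F) and optionally chained with zero-width joiners (U+200D).
# The joiner is only consumed between emojis, so joiners that appear
# inside ordinary Arabic-script text are left untouched.
_EMOJI_RUN = re.compile(f"(?:{_EMOJI}\uFE0F?(?:\u200D{_EMOJI}\uFE0F?)*)+")


def strip_emojis(text: str) -> str:
    """Remove emoji runs and collapse the whitespace left behind."""
    without_emojis = _EMOJI_RUN.sub(" ", text)
    return " ".join(without_emojis.split())


if __name__ == "__main__":
    # Hypothetical example tweet, not drawn from any dataset in the paper.
    print(strip_emojis("يا سلام 😂😂 على الجو"))
```

A fixed set of ranges like this will miss some emojis and will also remove a few non-emoji symbols in the dingbat range, so a real pipeline would likely rely on a maintained emoji list instead.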

Taxonomy

Topics: Text Readability and Simplification · Natural Language Processing Techniques · Sentiment Analysis and Opinion Mining