ArPanEmo: An Open-Source Dataset for Fine-Grained Emotion Recognition in Arabic Online Content during COVID-19 Pandemic
Maha Jarallah Althobaiti

TL;DR
This paper introduces ArPanEmo, a large, manually labeled Arabic dataset of online posts during COVID-19, enabling fine-grained emotion recognition for Arabic NLP applications.
Contribution
The creation of the first and largest Arabic dataset for fine-grained emotion recognition in COVID-19 related online content, with manual labeling and high inter-annotator agreement.
Findings
Dataset contains 11,128 posts manually labeled for ten emotion categories or neutral.
Achieved a Fleiss' kappa of 0.71, indicating substantial inter-annotator agreement.
Addresses a gap in Arabic emotion recognition resources.
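For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of how Fleiss' kappa is computed from a table of annotator votes. It is not the author's code, and the function name `fleiss_kappa` and the toy vote tables are assumptions made purely for illustration. Each row is one post, each column one label, and each cell counts how many annotators chose that label for that post.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of
    annotators who assigned item i to category j.

    Illustrative helper (hypothetical name); assumes every item was
    rated by the same number of annotators, which must be at least 2.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_items == 0 or n_raters < 2:
        raise ValueError("need at least one item and two raters")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must have the same number of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fell into each category.
    p_j = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]

    # Observed agreement per item: fraction of annotator pairs that agree.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]

    p_bar = sum(p_i) / n_items          # mean observed agreement
    p_e = sum(p * p for p in p_j)       # agreement expected by chance
    if p_e == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_bar - p_e) / (1.0 - p_e)


# Toy data: 2 posts, 3 annotators, 2 labels (made-up values).
perfect = [[3, 0], [0, 3]]   # annotators always agree
split = [[2, 1], [1, 2]]     # annotators disagree on every post
print(fleiss_kappa(perfect))
print(fleiss_kappa(split))
```

On the commonly used Landis and Koch scale, values between 0.61 and 0.80 are read as "substantial" agreement, which is the band the reported 0.71 falls into.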
Abstract
Emotion recognition is a crucial task in Natural Language Processing (NLP) that enables machines to comprehend the feelings conveyed in the text. The applications of emotion recognition are diverse, including mental health diagnosis, student support, and the detection of online suspicious behavior. Despite the substantial amount of literature available on emotion recognition in various languages, Arabic emotion recognition has received relatively little attention, leading to a scarcity of emotion-annotated corpora. This paper presents the ArPanEmo dataset, a novel dataset for fine-grained emotion recognition of online posts in Arabic. The dataset comprises 11,128 online posts manually labeled for ten emotion categories or neutral, with Fleiss' kappa of 0.71. It targets a specific Arabic dialect and addresses topics related to the COVID-19 pandemic, making it the first and largest of its…
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Mental Health via Writing · Emotion and Mood Recognition
