EmoHopeSpeech: An Annotated Dataset of Emotions and Hope Speech in English and Arabic
Wajdi Zaghouani, Md. Rafiul Biswas

TL;DR
This paper presents EmoHopeSpeech, a bilingual dataset of over 33,000 entries annotated for emotions and hope speech in English and Arabic, facilitating research in multi-emotion NLP tasks.
Contribution
It introduces a large, reliably annotated bilingual dataset capturing emotions and hope speech, addressing a significant gap in multi-emotion NLP resources for underrepresented languages.
Findings
High annotation reliability with Fleiss' Kappa of 0.75-0.85
Baseline model achieved a micro-F1 score of 0.67
Dataset enables cross-linguistic emotion and hope speech analysis
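For readers unfamiliar with the reported metric, the following is a minimal sketch of how a micro-averaged F1 score is computed by pooling true positives, false positives, and false negatives across all examples and labels. It assumes each entry carries a set of labels; the summary does not state whether the baseline was evaluated in a single-label or multi-label setting, and the function name `micro_f1` and the label strings are illustrative, not taken from the dataset.

```python
from typing import Iterable, Sequence, Set


def micro_f1(y_true: Sequence[Iterable[str]], y_pred: Sequence[Iterable[str]]) -> float:
    """Micro-averaged F1 over per-example label sets.

    Counts are pooled over every example and label before computing F1,
    so frequent labels weigh more than rare ones.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fp = fn = 0
    for gold, pred in zip(y_true, y_pred):
        gold_set: Set[str] = set(gold)
        pred_set: Set[str] = set(pred)
        tp += len(gold_set & pred_set)   # labels predicted and correct
        fp += len(pred_set - gold_set)   # labels predicted but not in gold
        fn += len(gold_set - pred_set)   # gold labels that were missed

    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical labels, for illustration only.
gold = [{"joy", "hope"}, {"anger"}, {"sadness"}]
pred = [{"joy"}, {"anger"}, {"fear"}]
score = micro_f1(gold, pred)
```

In the single-label case (exactly one label per example), micro-F1 reduces to plain accuracy, since every wrong prediction counts as one false positive and one false negative.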
Abstract
This research introduces a bilingual dataset comprising 23,456 entries for Arabic and 10,036 entries for English, annotated for emotions and hope speech, addressing the scarcity of multi-emotion (emotion and hope) datasets. The dataset provides comprehensive annotations capturing emotion intensity, complexity, and causes, alongside detailed classifications and subcategories for hope speech. To assess annotation reliability, Fleiss' Kappa was computed, showing agreement of 0.75-0.85 among annotators for both Arabic and English. A baseline machine learning model reached a micro-F1 score of 0.67, which supports the quality of the annotations. This dataset offers a valuable resource for advancing natural language processing in underrepresented languages, fostering better cross-linguistic analysis of emotions and hope speech.
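As a point of reference for the agreement figures above, here is a minimal sketch of the standard Fleiss' Kappa computation. It assumes every item was labeled by the same number of annotators and that the input is a count matrix where `counts[i][j]` is the number of annotators who assigned item `i` to category `j`. The function name and the toy matrix are illustrative; the annotation data itself is not reproduced here.

```python
from typing import List


def fleiss_kappa(counts: List[List[int]]) -> float:
    """Fleiss' Kappa for a matrix of per-item category counts.

    counts[i][j] = number of annotators who put item i in category j.
    Every row must sum to the same number of annotators (at least 2).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("need at least two annotators per item")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must have the same number of annotations")

    n_categories = len(counts[0])

    # Share of all assignments that went to each category.
    p_j = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]

    # Observed agreement per item: fraction of annotator pairs that agree.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]

    p_bar = sum(p_i) / n_items        # mean observed agreement
    p_e = sum(p * p for p in p_j)     # agreement expected by chance

    if p_e == 1.0:
        raise ValueError("kappa is undefined when all labels fall in one category")
    return (p_bar - p_e) / (1 - p_e)


# Toy example: 4 items, 3 annotators, 2 categories (made-up counts).
toy = [[3, 0], [0, 3], [2, 1], [1, 2]]
kappa = fleiss_kappa(toy)
```

Values close to 1 indicate agreement well above chance, and values near 0 indicate agreement no better than chance.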
Taxonomy
Topics: Emotion and Mood Recognition · Mental Health via Writing
