MultiProSE: A Multi-label Arabic Dataset for Propaganda, Sentiment, and Emotion Detection
Lubna Al-Henaki, Hend Al-Khalifa, Abdulmalik Al-Salman, Hajar Alqubayshi, Hind Al-Twailay, Gheeda Alghamdi, Hawra Aljasim

TL;DR
MultiProSE is the first large-scale Arabic dataset combining propaganda, sentiment, and emotion annotations, enabling advanced research in Arabic NLP and opinion analysis.
Contribution
It introduces the largest Arabic propaganda dataset with multi-label annotations, including sentiment and emotion, and provides baseline models using LLMs and PLMs.
Findings
Largest Arabic propaganda dataset to date
Baseline models demonstrate effective multi-label classification
Open-source resources facilitate future Arabic NLP research
Abstract
Propaganda is a form of persuasion that has been used throughout history with the goal of influencing people's opinions through rhetorical and psychological persuasion techniques for determined ends. Although Arabic ranks as the fourth most-used language on the internet, resources for propaganda detection in languages other than English, especially Arabic, remain extremely limited. To address this gap, the first Arabic dataset for Multi-label Propaganda, Sentiment, and Emotion (MultiProSE) has been introduced. MultiProSE is an open-source extension of the existing Arabic propaganda dataset, ArPro, with the addition of sentiment and emotion annotations for each text. The dataset comprises 8,000 annotated news articles, making it the largest propaganda dataset to date. For each task, several baselines have been developed using large language models (LLMs), such as GPT-4o-mini,…
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Misinformation and Its Impacts · Hate Speech and Cyberbullying Detection
