MultiProSE: A Multi-label Arabic Dataset for Propaganda, Sentiment, and Emotion Detection

Lubna Al-Henaki; Hend Al-Khalifa; Abdulmalik Al-Salman; Hajar Alqubayshi; Hind Al-Twailay; Gheeda Alghamdi; Hawra Aljasim

arXiv:2502.08319 · cs.CL · February 21, 2025

Open Access

TL;DR

MultiProSE is the first large-scale Arabic dataset combining propaganda, sentiment, and emotion annotations, enabling advanced research in Arabic NLP and opinion analysis.

Contribution

It introduces the largest Arabic propaganda dataset with multi-label annotations, including sentiment and emotion, and provides baseline models using LLMs and PLMs.

Findings

01. Largest Arabic propaganda dataset to date
02. Baseline models demonstrate effective multi-label classification
03. Open-source resources facilitate future Arabic NLP research

Abstract

Propaganda is a form of persuasion that has been used throughout history with the goal of influencing people's opinions through rhetorical and psychological persuasion techniques for determined ends. Although Arabic ranks as the fourth most-used language on the internet, resources for propaganda detection in languages other than English, especially Arabic, remain extremely limited. To address this gap, the first Arabic dataset for Multi-label Propaganda, Sentiment, and Emotion (MultiProSE) has been introduced. MultiProSE is an open-source extension of the existing Arabic propaganda dataset, ArPro, with the addition of sentiment and emotion annotations for each text. The dataset comprises 8,000 annotated news articles, making it the largest propaganda dataset to date. For each task, several baselines have been developed using large language models (LLMs), such as GPT-4o-mini,…

Taxonomy

Topics: Sentiment Analysis and Opinion Mining · Misinformation and Its Impacts · Hate Speech and Cyberbullying Detection