ThatiAR: Subjectivity Detection in Arabic News Sentences
Reem Suwaileh, Maram Hasanain, Fatema Hubail, Wajdi Zaghouani, Firoj Alam

TL;DR
This paper introduces the first large Arabic dataset for subjectivity detection in news sentences, analyzes annotation challenges, and benchmarks various language models, highlighting the effectiveness of LLMs with in-context learning.
Contribution
The paper contributes a new Arabic dataset for subjectivity detection, a detailed analysis of the factors that influence annotation, and an evaluation of multiple models that emphasizes the superior performance of LLMs.
Findings
LLMs with in-context learning outperform other models
Annotators' backgrounds significantly influence annotation quality
The dataset facilitates future research in Arabic NLP
Abstract
Detecting subjectivity in news sentences is crucial for identifying media bias, enhancing credibility, and combating misinformation by flagging opinion-based content. It provides insights into public sentiment, empowers readers to make informed decisions, and encourages critical thinking. While research has developed methods and systems for this purpose, most efforts have focused on English and other high-resourced languages. In this study, we present the first large dataset for subjectivity detection in Arabic, consisting of ~3.6K manually annotated sentences and GPT-4o-based explanations. In addition, we include instructions (in both English and Arabic) to facilitate LLM-based fine-tuning. We provide an in-depth analysis of the dataset, annotation process, and extensive benchmark results, including PLMs and LLMs. Our analysis of the annotation process highlights that annotators were…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Sentiment Analysis and Opinion Mining
