AraStance: A Multi-Country and Multi-Domain Dataset of Arabic Stance Detection for Fact Checking
Tariq Alhindi, Amal Alabdulkarim, Ali Alshehri, Muhammad Abdul-Mageed, and Preslav Nakov

TL;DR
This paper introduces AraStance, an Arabic stance detection dataset of 4,063 claim-article pairs drawn from multiple countries and domains, and benchmarks it with BERT-based models to gauge its difficulty and its usefulness for fact-checking systems.
Contribution
The paper presents AraStance, a new multi-domain Arabic stance detection dataset of 4,063 claim-article pairs, and reports baseline results with BERT-based models to support Arabic fact-checking research.
Findings
Best BERT model achieves 85% accuracy
Dataset covers diverse domains and countries
Stance detection remains a challenging task
Abstract
With the continuing spread of misinformation and disinformation online, it is of increasing importance to develop combating mechanisms at scale in the form of automated systems that support multiple languages. One task of interest is claim veracity prediction, which can be addressed using stance detection with respect to relevant documents retrieved online. To this end, we present our new Arabic Stance Detection dataset (AraStance) of 4,063 claim-article pairs from a diverse set of sources comprising three fact-checking websites and one news website. AraStance covers false and true claims from multiple domains (e.g., politics, sports, health) and several Arab countries, and it is well-balanced between related and unrelated documents with respect to the claims. We benchmark AraStance, along with two other stance detection datasets, using a number of BERT-based models. Our best model…
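As a rough illustration of the setup the abstract describes, the sketch below models a claim-article pair and scores predicted stances by accuracy. It is not the authors' code: the `ClaimArticlePair` class, the helper names, the toy pairs, and the four-way label set are all assumptions made for the example, since the abstract itself only distinguishes related from unrelated documents.

```python
from collections import Counter
from dataclasses import dataclass

# Assumed label set for illustration; the abstract only states that
# documents are either related or unrelated to the claim.
STANCES = ("agree", "disagree", "discuss", "unrelated")


@dataclass(frozen=True)
class ClaimArticlePair:
    """Hypothetical container for one claim-article pair."""
    claim: str
    article: str
    stance: str  # expected to be one of STANCES


def accuracy(gold, predicted):
    """Fraction of pairs whose predicted stance equals the gold stance."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot score an empty set")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


def related_share(pairs):
    """Share of pairs whose article is related to the claim."""
    counts = Counter(p.stance for p in pairs)
    related = sum(n for stance, n in counts.items() if stance != "unrelated")
    return related / len(pairs)


# Toy, made-up pairs (not taken from AraStance).
pairs = [
    ClaimArticlePair("claim A", "article 1", "agree"),
    ClaimArticlePair("claim A", "article 2", "unrelated"),
    ClaimArticlePair("claim B", "article 3", "disagree"),
    ClaimArticlePair("claim B", "article 4", "unrelated"),
]
gold = [p.stance for p in pairs]
predicted = ["agree", "unrelated", "discuss", "unrelated"]

print(accuracy(gold, predicted))  # 0.75
print(related_share(pairs))       # 0.5
```

In a real benchmark the `predicted` list would come from a trained classifier, and accuracy would normally be reported alongside a per-class metric, because a frequent class such as "unrelated" can inflate accuracy.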
