UrduFake@FIRE2020: Shared Track on Fake News Identification in Urdu
Maaz Amjad, Grigori Sidorov, Alisa Zhila, Alexander Gelbukh, Paolo Rosso

TL;DR
This paper gives an overview of the FIRE 2020 shared task on fake news detection in Urdu: 42 teams registered, 9 submitted results, and the best-performing system, a BERT-based approach, achieved an F-score of 0.90.
Contribution
It presents the first shared task at FIRE 2020 on fake news detection in Urdu, together with its annotated dataset, and reports that the BERT-based approach outperformed the other submitted machine learning classifiers.
Findings
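The best-performing system, a BERT-based approach, achieved an F-score of 0.90.
The dataset contains 900 annotated news articles for training and 400 for testing, across five domains: Health, Sports, Showbiz, Technology, and Business.
Participants' methods ranged from feature-based traditional machine learning to neural network techniques.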
Abstract
This paper gives an overview of the first shared task at FIRE 2020 on fake news detection in the Urdu language. This is a binary classification task in which the goal is to identify fake news using a dataset composed of 900 annotated news articles for training and 400 news articles for testing. The dataset contains news in five domains: (i) Health, (ii) Sports, (iii) Showbiz, (iv) Technology, and (v) Business. 42 teams from 6 different countries (India, China, Egypt, Germany, Pakistan, and the UK) registered for the task. 9 teams submitted their experimental results. The participants used various machine learning methods ranging from feature-based traditional machine learning to neural network techniques. The best performing system achieved an F-score of 0.90, showing that the BERT-based approach outperforms other machine learning classifiers.
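For readers unfamiliar with the metric, the following is a minimal sketch of how an F-score can be computed for a binary fake/real classification task. The abstract does not say which F-score variant was used for ranking, so the per-class F1 and its macro average shown here are an assumption, and the label names "fake" and "real" and both function names are hypothetical.

```python
def f1_for_class(y_true, y_pred, label):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        # No true positives: precision and recall are both zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, labels=("fake", "real")):
    """Unweighted mean of per-class F1 (assumed variant, not confirmed by the abstract)."""
    return sum(f1_for_class(y_true, y_pred, lab) for lab in labels) / len(labels)


if __name__ == "__main__":
    # Toy labels for illustration only; not data from the shared task.
    gold = ["fake", "fake", "real", "real"]
    pred = ["fake", "real", "real", "real"]
    print(round(f1_for_class(gold, pred, "fake"), 4))
    print(round(macro_f1(gold, pred), 4))
```

Macro averaging weights both classes equally, which matters when a test set contains unequal numbers of fake and real articles.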
Taxonomy
Topics: Misinformation and Its Impacts · Spam and Phishing Detection · Sentiment Analysis and Opinion Mining
