Fighting an Infodemic: COVID-19 Fake News Dataset
Parth Patwa, Shivam Sharma, Srinivas Pykl, Vineeth Guptha, Gitanjali Kumari, Md Shad Akhtar, Asif Ekbal, Amitava Das, Tanmoy Chakraborty

TL;DR
This paper introduces a manually annotated dataset of 10,700 COVID-19 related social media posts and articles, and benchmarks machine learning models for fake news detection, achieving up to 93.46% F1-score with SVM.
Contribution
It provides a new, publicly available dataset for COVID-19 fake news detection and evaluates baseline machine learning models on this dataset.
Findings
SVM achieved 93.46% F1-score
Dataset includes 10,700 annotated posts and articles
Benchmark results establish baselines for future research
Abstract
Along with the COVID-19 pandemic, we are also fighting an 'infodemic'. Fake news and rumors are rampant on social media. Believing in rumors can cause significant harm. This is further exacerbated at the time of a pandemic. To tackle this, we curate and release a manually annotated dataset of 10,700 social media posts and articles of real and fake news on COVID-19. We benchmark the annotated dataset with four machine learning baselines: Decision Tree, Logistic Regression, Gradient Boost, and Support Vector Machine (SVM). We obtain the best performance of 93.46% F1-score with SVM. The data and code are available at: https://github.com/parthpatwa/covid19-fake-news-dectection
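The benchmarking setup described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn, TF-IDF features, default hyperparameters, and weighted F1, none of which are stated in the abstract. The example posts and the helper name `evaluate_baselines` are hypothetical and are not drawn from the released dataset.

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy posts, invented for illustration (NOT from the dataset).
train_texts = [
    "Health ministry reports new confirmed cases after expanded testing",
    "Officials publish updated guidance on masks and hand washing",
    "Hospital data shows recovery numbers rising this week",
    "Drinking hot water every hour kills the virus instantly",
    "Secret cure hidden by doctors, share before it gets deleted",
    "Eating garlic makes you fully immune to the virus",
]
train_labels = ["real", "real", "real", "fake", "fake", "fake"]
test_texts = [
    "Ministry publishes testing numbers and updated guidance",
    "Share this secret cure that kills the virus",
]
test_labels = ["real", "fake"]

# The four baseline families named in the abstract; settings are assumptions.
BASELINES = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Gradient Boost": GradientBoostingClassifier(random_state=0),
    "SVM": LinearSVC(),
}


def evaluate_baselines(x_train, y_train, x_test, y_test):
    """Fit each baseline on TF-IDF features and return its weighted F1."""
    results = {}
    for name, classifier in BASELINES.items():
        # Assumed feature extraction: word-level TF-IDF with default settings.
        model = make_pipeline(TfidfVectorizer(), classifier)
        model.fit(x_train, y_train)
        predictions = model.predict(x_test)
        results[name] = f1_score(y_test, predictions, average="weighted")
    return results


scores = evaluate_baselines(train_texts, train_labels, test_texts, test_labels)
for name, score in scores.items():
    print(f"{name}: weighted F1 = {score:.4f}")
```

Scores on six toy posts say nothing about the reported 93.46%; reproducing that figure would require the released data and the authors' own preprocessing and settings.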
Taxonomy
Methods: Support Vector Machine · Logistic Regression
