Fighting an Infodemic: COVID-19 Fake News Dataset

Parth Patwa; Shivam Sharma; Srinivas Pykl; Vineeth Guptha; Gitanjali Kumari; Md Shad Akhtar; Asif Ekbal; Amitava Das; Tanmoy Chakraborty

arXiv:2011.03327·cs.CL·May 27, 2021

TL;DR

This paper introduces a manually annotated dataset of 10,700 COVID-19 related social media posts and articles, and benchmarks machine learning models for fake news detection, achieving up to 93.46% F1-score with SVM.

Contribution

It provides a new, publicly available dataset for COVID-19 fake news detection and evaluates baseline machine learning models on this dataset.

Findings

01 SVM achieved 93.46% F1-score

02 Dataset includes 10,700 annotated posts and articles

03 Benchmark results establish baselines for future research

Abstract

Along with the COVID-19 pandemic, we are also fighting an 'infodemic'. Fake news and rumors are rampant on social media. Believing in rumors can cause significant harm. This is further exacerbated at the time of a pandemic. To tackle this, we curate and release a manually annotated dataset of 10,700 social media posts and articles of real and fake news on COVID-19. We benchmark the annotated dataset with four machine learning baselines: Decision Tree, Logistic Regression, Gradient Boost, and Support Vector Machine (SVM). We obtain the best performance of 93.46% F1-score with SVM. The data and code are available at: https://github.com/parthpatwa/covid19-fake-news-dectection
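As a rough illustration of the benchmarking setup the abstract describes, the sketch below trains the four named baseline families on text features and scores each with F1. It is a minimal sketch under stated assumptions, not the authors' pipeline: the use of scikit-learn, TF-IDF features, `LinearSVC` as the SVM, the weighted F1 average, and every hyperparameter are assumptions, and the toy posts are invented for illustration rather than taken from the released dataset.

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy posts, invented for illustration; NOT from the released dataset.
train_texts = [
    "Health ministry reports 500 new confirmed cases today",
    "Drinking bleach cures the virus overnight",
    "Officials confirm vaccine trial enters phase three",
    "5G towers are spreading the virus, share before deleted",
    "Hospital data shows recoveries rising this week",
    "Garlic water kills the virus in one hour",
    "State releases updated testing numbers for the county",
    "Secret cure hidden by doctors, forward to everyone",
]
train_labels = ["real", "fake", "real", "fake", "real", "fake", "real", "fake"]

test_texts = [
    "Ministry confirms new testing numbers this week",
    "Bleach and garlic cure the virus, share now",
]
test_labels = ["real", "fake"]

# The four baseline families named in the abstract; settings are assumptions.
baselines = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Gradient Boost": GradientBoostingClassifier(random_state=0),
    "SVM": LinearSVC(),
}


def evaluate(classifiers, x_train, y_train, x_test, y_test):
    """Fit each classifier on TF-IDF features and return its F1 on the test split."""
    results = {}
    for name, clf in classifiers.items():
        model = make_pipeline(TfidfVectorizer(), clf)
        model.fit(x_train, y_train)
        preds = model.predict(x_test)
        # Weighted averaging is an assumption; the abstract only says "F1-score".
        results[name] = f1_score(y_test, preds, average="weighted", zero_division=0)
    return results


scores = evaluate(baselines, train_texts, train_labels, test_texts, test_labels)
for name, score in scores.items():
    print(f"{name}: F1 = {score:.4f}")
```

Scores on a toy set this small say nothing about the paper's results; to reproduce the reported numbers, the released data and code at the repository linked in the abstract are the reference.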

Code & Models

Datasets

nanyy1025/covid_fake_news
dataset · 410 dl

Taxonomy

Methods: Support Vector Machine · Logistic Regression