Constraint 2021: Machine Learning Models for COVID-19 Fake News Detection Shared Task

Thomas Felber

arXiv:2101.03717·cs.CL·January 14, 2021·22 cites

Open Access

TL;DR

This paper presents a machine learning approach using linguistic features and classical algorithms to detect COVID-19 fake news on social media, achieving a competitive F1 score in a shared task.

Contribution

The author develops a system combining linguistic features with classical ML algorithms, notably a linear SVM, for COVID-19 fake news detection in a shared task setting.

Findings

01 The best system is a linear SVM with a weighted average F1 score of 95.19% on test data

02 Linguistic features improve classification performance

03 The system ranks 80th out of 167 on the leaderboard
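The abstract describes the 95.19% figure as a weighted average F1 score. As a minimal sketch of what that metric means (not the task's official scoring script, which is not shown here), the hypothetical helper `weighted_f1` below computes a per-class F1 and weights each class by its share of the true labels. The label strings and the toy predictions are illustrative assumptions only.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support.

    Hypothetical helper for illustration; not taken from the paper.
    """
    support = Counter(y_true)
    total = len(y_true)
    pairs = list(zip(y_true, y_pred))
    score = 0.0
    for label in sorted(support):
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        # Weight this class's F1 by how often the class occurs in y_true.
        score += f1 * support[label] / total
    return score


# Toy data (made up): two "fake" posts, three "real" posts, one mistake.
y_true = ["fake", "fake", "real", "real", "real"]
y_pred = ["fake", "real", "real", "real", "real"]
print(round(weighted_f1(y_true, y_pred), 4))
```

Because each class is weighted by its support, the score leans toward performance on the more frequent class when the two classes are unevenly represented.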

Abstract

In this system paper, we present our contribution to the Constraint 2021 COVID-19 Fake News Detection Shared Task, which poses the challenge of classifying COVID-19 related social media posts as either fake or real. In our system, we address this challenge by applying classical machine learning algorithms together with several linguistic features, such as n-grams, readability, emotional tone and punctuation. In terms of pre-processing, we experiment with various steps like stop word removal, stemming/lemmatization, link removal and more. We find our best performing system to be based on a linear SVM, which obtains a weighted average F1 score of 95.19% on test data and places in the middle of the leaderboard (place 80 of 167).
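Two of the ingredients named in the abstract, link removal and n-gram and punctuation features, can be sketched with the Python standard library alone. This is an illustrative assumption about how such features might be computed, not the paper's implementation: the names `remove_links`, `word_ngrams` and `extract_features`, the URL pattern, the tokenizer, and the choice of unigrams plus bigrams are all hypothetical. Readability and emotional tone features are omitted.

```python
import re
import string
from collections import Counter

# Simplistic URL pattern, assumed for illustration only.
URL_RE = re.compile(r"https?://\S+")


def remove_links(text):
    """Pre-processing step: strip http(s) links from a post."""
    return URL_RE.sub("", text)


def word_ngrams(tokens, n):
    """Return all contiguous word n-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(post):
    """Count word uni-/bigrams and punctuation marks in one post."""
    cleaned = remove_links(post)
    tokens = re.findall(r"[a-z0-9']+", cleaned.lower())
    features = Counter()
    for n in (1, 2):
        for gram in word_ngrams(tokens, n):
            features["ngram=" + gram] += 1
    for ch in cleaned:
        if ch in string.punctuation:
            features["punct=" + ch] += 1
    return features


feats = extract_features("Masks work! Really? See https://example.com/a")
print(feats["ngram=masks work"], feats["punct=!"], feats["punct=?"])
```

Sparse count dictionaries of this kind are a typical input for a linear classifier such as the linear SVM the paper reports as its best system, once they are mapped to a fixed feature vector.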

Taxonomy

Topics: Misinformation and Its Impacts · Topic Modeling · Sentiment Analysis and Opinion Mining

Methods: Support Vector Machine