Detecting COVID-19 Conspiracy Theories with Transformers and TF-IDF

Haoming Guo; Tianyi Huang; Huixuan Huang; Mingyue Fan; Gerald Friedland

arXiv:2205.00377·cs.CL·May 3, 2022·1 cites

Open Access

TL;DR

This paper explores machine learning techniques, including transformers and TF-IDF, to detect COVID-19 related fake news, emphasizing domain-specific vocabulary and rapid topic changes, with transformer models showing superior performance.

Contribution

It introduces methods for COVID-19 fake news detection using transformers and TF-IDF, highlighting the effectiveness of pre-trained transformers and the potential of trained transformers with smart design.

Findings

1. Pre-trained transformers achieve the best validation accuracy.

2. Trained transformers with smart design can approach pre-trained transformer performance.

3. Support Vector Machines and Random Forest perform less effectively.

Abstract

The sharing of fake news and conspiracy theories on social media has widespread negative effects. By designing and applying different machine learning models, researchers have made progress in detecting fake news from text. However, existing research places a heavy emphasis on general, common-sense fake news, while in reality fake news often involves rapidly changing topics and domain-specific vocabulary. In this paper, we present our methods and results for three fake news detection tasks at the MediaEval 2021 benchmark that specifically involve COVID-19-related topics. We experiment with a group of text-based models including Support Vector Machines, Random Forest, BERT, and RoBERTa. We find that a pre-trained transformer yields the best validation results, but a randomly initialized transformer with smart design can also be trained to reach accuracies close to that of the pre-trained…
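To make the TF-IDF idea named in the title concrete, here is a minimal sketch of how TF-IDF weighting favors rarer, domain-specific terms over words that appear in every post. It is an illustration only, not the authors' pipeline: the function name `tfidf`, the whitespace tokenizer, the smoothed IDF formula, and the toy posts are all assumptions made for this example.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per document (tf * smoothed idf).

    Hypothetical helper for illustration; tokenization is a plain
    lowercase whitespace split, which is an assumption of this sketch.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            # Term frequency times a smoothed inverse document frequency.
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in counts.items()
        })
    return weights

# Toy, made-up posts (not taken from the MediaEval data).
posts = [
    "the vaccine contains a microchip",
    "the virus spread through the city",
    "the tower spreads the virus",
]
weights = tfidf(posts)
# In the first post, the rare term outweighs the ubiquitous one.
print(weights[0]["microchip"] > weights[0]["the"])
```

Vectors like these could then be passed to a classical classifier such as a Support Vector Machine or Random Forest; how the paper actually builds and tunes its features is not stated in this summary.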

Taxonomy

Topics: Misinformation and Its Impacts · Spam and Phishing Detection · Hate Speech and Cyberbullying Detection

Methods: Attention Is All You Need · Linear Layer · Residual Connection · Multi-Head Attention · Layer Normalization · Attention Dropout · Softmax · Dense Connections · Weight Decay