Detecting COVID-19 Conspiracy Theories with Transformers and TF-IDF
Haoming Guo, Tianyi Huang, Huixuan Huang, Mingyue Fan, Gerald Friedland

TL;DR
This paper explores machine learning techniques, including transformers and TF-IDF, for detecting COVID-19 related fake news, a setting where topics change rapidly and the vocabulary is domain-specific. Pre-trained transformer models give the best validation results.
Contribution
The paper presents methods for COVID-19 fake news detection using transformers and TF-IDF, highlighting the effectiveness of pre-trained transformers and the potential of randomly initialized transformers that are trained with a smart design.
Findings
Pre-trained transformers achieve the best validation accuracy.
Randomly initialized transformers trained with a smart design can approach pre-trained transformer performance.
Support Vector Machines and Random Forest perform less effectively than the transformer models.
Abstract
The sharing of fake news and conspiracy theories on social media has widespread negative effects. By designing and applying different machine learning models, researchers have made progress in detecting fake news from text. However, existing research places a heavy emphasis on general, common-sense fake news, while in reality fake news often involves rapidly changing topics and domain-specific vocabulary. In this paper, we present our methods and results for three fake news detection tasks at the MediaEval 2021 benchmark that specifically involve COVID-19 related topics. We experiment with a group of text-based models including Support Vector Machines, Random Forest, BERT, and RoBERTa. We find that a pre-trained transformer yields the best validation results, but a randomly initialized transformer with smart design can also be trained to reach accuracies close to that of the pre-trained…
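To make the TF-IDF side of the comparison concrete, the following is a minimal sketch of TF-IDF weighting in plain Python. It is an illustration, not the authors' pipeline: the whitespace tokenizer, the smoothed `ln(N / df) + 1` form of the inverse document frequency, the function names `tokenize` and `tfidf`, and the toy posts are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace splitting stands in for a real tokenizer.
    return text.lower().split()


def tfidf(docs):
    """Return one {term: weight} dict per document.

    tf  = term count / document length
    idf = ln(N / df) + 1, where df is the number of documents containing the term
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: count each term at most once per document.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * (math.log(n_docs / df[term]) + 1.0)
            for term, count in counts.items()
        })
    return vectors


# Hypothetical toy posts, not taken from the MediaEval data.
docs = [
    "5g towers spread the virus",
    "the vaccine trial reported results",
    "the virus spread in the city",
]
vectors = tfidf(docs)
```

In this sketch a term that appears in only one post, such as "5g", receives a higher weight than a term that appears in every post, such as "the". Sparse vectors of this kind are the sort of input that classical classifiers like Support Vector Machines or Random Forest typically consume, whereas transformer models operate on token sequences directly.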
Taxonomy
Topics: Misinformation and Its Impacts · Spam and Phishing Detection · Hate Speech and Cyberbullying Detection
Methods: Attention Is All You Need · Linear Layer · Residual Connection · Multi-Head Attention · Layer Normalization · Attention Dropout · Softmax · Dense Connections · Weight Decay
