Debunking Fake News One Feature at a Time
Melanie Tosik, Antonio Mallia, Kedar Gangopadhyay

TL;DR
This paper presents a two-stage ensemble model that uses hand-crafted features and a gradient boosting classifier for fake news stance detection. It reaches a score of 78.63% on the Fake News Challenge (Stage 1) dataset and analyzes feature importance and sampling techniques.
Contribution
The paper introduces a novel 2-stage ensemble approach with hand-crafted features for stance detection in fake news, emphasizing feature importance and data sampling strategies.
Findings
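Achieved a score of 9161.5 out of 11651.25 (78.63%) on the official Fake News Challenge (Stage 1) dataset.
Identified the most useful hand-crafted features for detecting fake news.
Discussed how sampling techniques can improve recall on a highly imbalanced dataset.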
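
The summary does not say which sampling techniques the authors used. As a minimal sketch of one common option, the Python below performs random oversampling: minority-class examples are duplicated at random until every class is as large as the largest one. The function name `random_oversample` and the toy labels are assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import Counter, defaultdict


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples until all classes have equal size.

    Illustrative only: the paper's actual sampling method is not specified
    in this summary.
    """
    rng = random.Random(seed)

    # Group samples by their class label.
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_label.values())

    out_samples, out_labels = [], []
    for label, items in by_label.items():
        # Draw the missing examples at random, with replacement.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Toy, made-up label distribution that is heavily skewed toward "unrelated".
toy_labels = ["unrelated"] * 6 + ["discuss"] * 3 + ["agree"] * 2 + ["disagree"]
toy_samples = list(range(len(toy_labels)))
balanced_samples, balanced_labels = random_oversample(toy_samples, toy_labels)
print(Counter(balanced_labels))
```

Oversampling should be applied to the training split only, so that duplicated examples never leak into the data used for evaluation.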
Abstract
Identifying the stance of a news article body with respect to a certain headline is the first step to automated fake news detection. In this paper, we introduce a 2-stage ensemble model to solve the stance detection task. By using only hand-crafted features as input to a gradient boosting classifier, we are able to achieve a score of 9161.5 out of 11651.25 (78.63%) on the official Fake News Challenge (Stage 1) dataset. We identify the most useful features for detecting fake news and discuss how sampling techniques can be used to improve recall on a highly imbalanced dataset.
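
The reported figure is a weighted score, not plain accuracy. The sketch below shows how such a score can be computed, assuming the Fake News Challenge (Stage 1) weighting: 0.25 for correctly separating related from unrelated pairs and a further 0.75 for the exact stance of a related pair. The function names and the toy labels are illustrative assumptions; only the two totals in the last line come from the abstract.

```python
RELATED = {"agree", "disagree", "discuss"}


def fnc_score(gold, predicted):
    """Weighted stance-detection score under the assumed FNC-1 weighting."""
    score = 0.0
    for g, p in zip(gold, predicted):
        if g == p:
            # Exact match: 0.25, plus 0.50 more if the pair is related.
            score += 0.25
            if g != "unrelated":
                score += 0.50
        if g in RELATED and p in RELATED:
            # Related pair recognised as related, whatever the stance.
            score += 0.25
    return score


def max_fnc_score(gold):
    """Best achievable score: the score of a perfect prediction."""
    return fnc_score(gold, gold)


# Toy, made-up labels: one correct "unrelated", one correct "agree",
# one related pair with the wrong stance, one related pair missed entirely.
gold = ["unrelated", "agree", "discuss", "disagree"]
pred = ["unrelated", "agree", "agree", "unrelated"]
print(fnc_score(gold, pred), max_fnc_score(gold))

# Relative score reported in the abstract, as a percentage.
reported_percent = round(9161.5 / 11651.25 * 100, 2)
print(reported_percent)
```

Because a correct related stance is worth four times a correct "unrelated" label, a model can have high plain accuracy on this imbalanced dataset and still score poorly, which is why recall on the minority classes matters.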
Taxonomy
Topics: Misinformation and Its Impacts · Spam and Phishing Detection · Topic Modeling
