Debunking Fake News One Feature at a Time

Melanie Tosik; Antonio Mallia; Kedar Gangopadhyay

arXiv:1808.02831·cs.CL·August 9, 2018·5 cites

TL;DR

This paper presents a two-stage ensemble model that feeds hand-crafted features to a gradient boosting classifier for fake news stance detection. It reaches a relative score of 78.63% on the official Fake News Challenge (Stage 1) dataset, and the authors analyze feature importance and sampling techniques.

Contribution

The paper introduces a novel 2-stage ensemble approach with hand-crafted features for stance detection in fake news, emphasizing feature importance and data sampling strategies.

Findings

01 Scored 9161.5 out of 11651.25 (78.63%) on the official Fake News Challenge (Stage 1) dataset.

02 Identified the features most useful for detecting fake news.

03 Discussed how sampling techniques can improve recall on a highly imbalanced dataset.

Abstract

Identifying the stance of a news article body with respect to a certain headline is the first step to automated fake news detection. In this paper, we introduce a 2-stage ensemble model to solve the stance detection task. By using only hand-crafted features as input to a gradient boosting classifier, we are able to achieve a score of 9161.5 out of 11651.25 (78.63%) on the official Fake News Challenge (Stage 1) dataset. We identify the most useful features for detecting fake news and discuss how sampling techniques can be used to improve recall accuracy on a highly imbalanced dataset.
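The score quoted in the abstract is a weighted total rather than plain accuracy. The sketch below illustrates how such a score can be computed, assuming the publicly documented Fake News Challenge (Stage 1) weighting: 0.25 points for getting the related/unrelated split right, and a further 0.75 points for the exact stance of a related pair. The function names `fnc_score` and `relative_score` are hypothetical and are not taken from the paper or its repository.

```python
# Minimal sketch of FNC-1 style weighted scoring (assumed weighting, see lead-in).
RELATED = {"agree", "disagree", "discuss"}


def fnc_score(gold, predicted):
    """Return the weighted score of predicted stance labels against gold labels."""
    score = 0.0
    for g, p in zip(gold, predicted):
        if g == p:
            # Exact match: 0.25 for any label, plus 0.50 if it is a related stance.
            score += 0.25
            if g != "unrelated":
                score += 0.50
        if g in RELATED and p in RELATED:
            # Both labels are on the "related" side: 0.25 for the coarse split.
            score += 0.25
    return score


def relative_score(gold, predicted):
    """Return the score as a fraction of the best achievable score on gold."""
    best = fnc_score(gold, gold)
    return fnc_score(gold, predicted) / best


if __name__ == "__main__":
    gold = ["unrelated", "agree", "disagree", "discuss"]
    pred = ["unrelated", "agree", "discuss", "unrelated"]
    print(fnc_score(gold, pred), fnc_score(gold, gold))

    # The abstract's figures: 9161.5 points out of a maximum of 11651.25.
    print(round(9161.5 / 11651.25 * 100, 2))
```

Under this weighting a correctly labeled unrelated pair earns 0.25 points, a correctly labeled related pair earns 1.0, and a related pair assigned the wrong related stance still earns 0.25. The maximum of 11651.25 in the abstract is the score a perfect prediction would earn on the test set.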

Code & Models

Repositories

NYU-FNC/FakeNewsChallenge · tf · Official

Taxonomy

Topics: Misinformation and Its Impacts · Spam and Phishing Detection · Topic Modeling