# Automatically Identifying Fake News in Popular Twitter Threads

**Authors:** Cody Buntain, Jennifer Golbeck

arXiv: 1705.01613 · 2018-06-01

## TL;DR

This paper presents an automated method for detecting fake news on Twitter by learning to predict crowdsourced and journalistic accuracy assessments. Models trained on the crowdsourced assessments perform best, and a feature analysis identifies the most predictive features.

## Contribution

It trains fake news classifiers on two credibility-focused Twitter datasets, one crowdsourced (CREDBANK) and one journalistic (PHEME), and evaluates them on Twitter content sourced from BuzzFeed's fake news dataset, where the models trained on crowdsourced assessments perform best.

## Key findings

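- Models trained on crowdsourced assessments (CREDBANK) outperform both models trained on journalists' assessments (PHEME) and models trained on a pooled dataset of the two.
- The features most predictive of crowdsourced and journalistic accuracy assessments are consistent with prior work.
- All three datasets (CREDBANK, PHEME, and the BuzzFeed-sourced Twitter content), aligned into a uniform format, are publicly available.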

## Abstract

Information quality in social media is an increasingly important issue, but web-scale data hinders experts' ability to assess and correct much of the inaccurate content, or "fake news," present in these platforms. This paper develops a method for automating fake news detection on Twitter by learning to predict accuracy assessments in two credibility-focused Twitter datasets: CREDBANK, a crowdsourced dataset of accuracy assessments for events in Twitter, and PHEME, a dataset of potential rumors in Twitter and journalistic assessments of their accuracies. We apply this method to Twitter content sourced from BuzzFeed's fake news dataset and show models trained against crowdsourced workers outperform models based on journalists' assessment and models trained on a pooled dataset of both crowdsourced workers and journalists. All three datasets, aligned into a uniform format, are also publicly available. A feature analysis then identifies features that are most predictive for crowdsourced and journalistic accuracy assessments, results of which are consistent with prior work. We close with a discussion contrasting accuracy and credibility and why models of non-experts outperform models of journalists for fake news detection in Twitter.
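
The evaluation design described in the abstract (train on one labeled dataset, then score the model on a different target dataset) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the nearest-centroid classifier is a stand-in rather than the paper's model, the two-dimensional feature vectors are made-up placeholders rather than CREDBANK, PHEME, or BuzzFeed data, and the function names are hypothetical. The resulting numbers carry no meaning about the paper's results.

```python
from statistics import mean


def fit_centroids(rows):
    """Average the feature vectors of each label.

    rows: list of (feature_vector, label) pairs.
    Returns a dict mapping each label to its centroid.
    """
    by_label = {}
    for features, label in rows:
        by_label.setdefault(label, []).append(features)
    return {
        label: [mean(column) for column in zip(*vectors)]
        for label, vectors in by_label.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean)."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def transfer_accuracy(train_rows, target_rows):
    """Fit on one dataset, then measure accuracy on a different one."""
    centroids = fit_centroids(train_rows)
    hits = sum(predict(centroids, f) == y for f, y in target_rows)
    return hits / len(target_rows)


# Placeholder data only: NOT drawn from any real dataset.
source_a = [([1.0, 0.0], "accurate"), ([0.8, 0.2], "accurate"),
            ([0.0, 1.0], "inaccurate"), ([0.2, 0.8], "inaccurate")]
source_b = [([0.9, 0.3], "accurate"), ([0.3, 0.9], "inaccurate")]
target = [([0.7, 0.3], "accurate"), ([0.3, 0.7], "inaccurate"),
          ([0.6, 0.4], "accurate"), ([0.4, 0.6], "inaccurate")]

# Compare each training source, and the pooled source, on the same target.
training_sets = {"a": source_a, "b": source_b, "pooled": source_a + source_b}
scores = {name: transfer_accuracy(rows, target)
          for name, rows in training_sets.items()}
```

Holding the target set fixed while swapping the training source is what makes the three scores comparable. In the paper's setting, the sources would correspond to the crowdsourced, journalistic, and pooled assessments.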

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1705.01613/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/1705.01613/full.md

## References

23 references — full list in the complete paper: https://tomesphere.com/paper/1705.01613/full.md

---
Source: https://tomesphere.com/paper/1705.01613