# "Liar, Liar Pants on Fire": A New Benchmark Dataset for Fake News Detection

**Authors:** William Yang Wang

arXiv: 1705.00648 · 2017-05-03

## TL;DR

This paper introduces LIAR, a publicly available dataset of 12.8K manually labeled short statements collected from PolitiFact.com for fake news detection, and shows that a hybrid CNN integrating meta-data with text can improve on a text-only deep learning model.

## Contribution

The paper provides a fake news dataset an order of magnitude larger than the previously largest public datasets of similar type, and proposes a hybrid convolutional neural network that combines meta-data with text for improved detection.
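To make the idea of combining meta-data with text concrete, here is a minimal, untrained forward-pass sketch in plain Python. It is not the paper's architecture: the averaging of meta-data embeddings, the layer sizes, the class count, and all names (`Embedding`, `conv_max_pool`, `predict`, the example speaker and party values) are assumptions chosen for illustration. It only shows the general pattern of pooling convolutional text features and concatenating them with a meta-data representation before a softmax layer.

```python
import math
import random

random.seed(0)

EMB_DIM = 4        # embedding size (illustrative assumption)
NUM_FILTERS = 3    # number of convolution filters over the text
FILTER_WIDTH = 2   # each filter spans two adjacent tokens
NUM_CLASSES = 6    # hypothetical label count, not taken from this summary


def rand_vec(n):
    return [random.uniform(-0.5, 0.5) for _ in range(n)]


class Embedding:
    """Lookup table that lazily creates a random vector for each unseen key."""

    def __init__(self, dim):
        self.dim = dim
        self.table = {}

    def __call__(self, key):
        if key not in self.table:
            self.table[key] = rand_vec(self.dim)
        return self.table[key]


def conv_max_pool(rows, filters):
    """Slide each filter over the rows, apply ReLU, then max-pool over positions."""
    pooled = []
    for filt in filters:  # filt is a list of `width` weight vectors
        width = len(filt)
        best = 0.0  # starting at 0 makes the running max equal to max(ReLU(.))
        for start in range(len(rows) - width + 1):
            window = rows[start:start + width]
            act = sum(w * x
                      for w_row, x_row in zip(filt, window)
                      for w, x in zip(w_row, x_row))
            best = max(best, act)
        pooled.append(best)
    return pooled


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


word_emb = Embedding(EMB_DIM)
meta_emb = Embedding(EMB_DIM)
filters = [[rand_vec(EMB_DIM) for _ in range(FILTER_WIDTH)]
           for _ in range(NUM_FILTERS)]
# The output layer sees text features and meta-data features side by side.
out_w = [rand_vec(NUM_FILTERS + EMB_DIM) for _ in range(NUM_CLASSES)]


def predict(statement, metadata):
    """Return class probabilities from random (untrained) weights."""
    rows = [word_emb(tok) for tok in statement.lower().split()]
    text_feat = conv_max_pool(rows, filters)
    meta_vecs = [meta_emb(item) for item in sorted(metadata.items())]
    if meta_vecs:
        # Stand-in for a learned meta-data encoder: average the embeddings.
        meta_feat = [sum(col) / len(meta_vecs) for col in zip(*meta_vecs)]
    else:
        meta_feat = [0.0] * EMB_DIM
    features = text_feat + meta_feat  # concatenation of the two views
    scores = [sum(w * x for w, x in zip(row, features)) for row in out_w]
    return softmax(scores)


probs = predict("taxes went up last year",
                {"speaker": "example-speaker", "party": "example-party"})
```

Because the weights are random, `probs` carries no meaning beyond being a valid probability distribution; a real model would learn the embeddings, filters, and output weights from labeled statements.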

## Key findings

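- The hybrid CNN, which integrates meta-data with text, can improve on a text-only deep learning model
- The dataset is an order of magnitude larger than the previously largest public fake news datasets of similar type
- The 12.8K statements span a decade, are manually labeled, and come with PolitiFact.com analysis reports and links to source documents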

## Abstract

Automatic fake news detection is a challenging problem in deception detection, and it has tremendous real-world political and social impacts. However, statistical approaches to combating fake news have been dramatically limited by the lack of labeled benchmark datasets. In this paper, we present LIAR: a new, publicly available dataset for fake news detection. We collected a decade-long, 12.8K manually labeled short statements in various contexts from PolitiFact.com, which provides a detailed analysis report and links to source documents for each case. This dataset can be used for fact-checking research as well. Notably, this new dataset is an order of magnitude larger than the previously largest public fake news datasets of similar type. Empirically, we investigate automatic fake news detection based on surface-level linguistic patterns. We have designed a novel, hybrid convolutional neural network to integrate meta-data with text. We show that this hybrid approach can improve a text-only deep learning model.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1705.00648/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/1705.00648/full.md

## References

16 references — full list in the complete paper: https://tomesphere.com/paper/1705.00648/full.md

---
Source: https://tomesphere.com/paper/1705.00648