Dataset of Fake News Detection and Fact Verification: A Survey
Taichi Murayama

TL;DR
This survey comprehensively reviews 118 public datasets related to fake news detection and fact verification, highlighting their characteristics, utilization, challenges, and research opportunities to aid future studies.
Contribution
It provides a detailed large-scale overview of datasets for fake news research, aiding researchers in selecting suitable resources and addressing dataset construction challenges.
Findings
Reviewed 118 datasets across multiple fake news tasks
Identified key challenges in dataset construction
Highlighted research opportunities in fake news dataset development
Abstract
The rapid increase in fake news, which causes significant damage to society, has triggered many fake news-related studies, including the development of fake news detection and fact verification techniques. The resources for these studies are mainly available as public datasets taken from Web data. We surveyed 118 datasets related to fake news research on a large scale from three perspectives: (1) fake news detection, (2) fact verification, and (3) other tasks, for example, the analysis of fake news and satire detection. We also describe in detail their utilization tasks and their characteristics. Finally, we highlight the challenges in fake news dataset construction and some research opportunities that address these challenges. Our survey facilitates fake news research by helping researchers find suitable datasets without reinventing the wheel, and thereby improves fake news studies in…
Taxonomy
Topics: Misinformation and Its Impacts · Spam and Phishing Detection · Advanced Malware Detection Techniques
