Fake News Detection: It's All in the Data!

Soveatin Kuntur; Anna Wróblewska; Marcin Paprzycki; Maria Ganzha

arXiv:2407.02122·cs.CL·February 5, 2026

TL;DR

This survey emphasizes the importance of dataset quality and diversity in fake news detection, reviewing datasets, labeling, biases, and ethical issues, and providing a consolidated GitHub repository to support future research.

Contribution

The survey offers a comprehensive overview of datasets and best practices in fake news detection and introduces a unified GitHub repository that makes data resources easier to access.

Findings

01. Highlights key features and biases of datasets

02. Addresses ethical issues in fake news detection

03. Provides a consolidated dataset repository

Abstract

This comprehensive survey serves as an indispensable resource for researchers embarking on the journey of fake news detection. By highlighting the pivotal role of dataset quality and diversity, it underscores the significance of these elements in the effectiveness and robustness of detection models. The survey meticulously outlines the key features of datasets, the various labeling systems employed, and the prevalent biases that can impact model performance. Additionally, it addresses critical ethical issues and best practices, offering a thorough overview of the current state of available datasets. Our contribution to this field is further enriched by the provision of a GitHub repository, which consolidates publicly accessible datasets into a single, user-friendly portal. This repository is designed to facilitate and stimulate further research and development efforts aimed at combating the…
