Monant Medical Misinformation Dataset: Mapping Articles to Fact-Checked Claims

Ivan Srba; Branislav Pecher; Matus Tomlein; Robert Moro; Elena Stefancova; Jakub Simko; Maria Bielikova

arXiv:2204.12294·cs.CL·April 27, 2022

TL;DR

This paper introduces a large, feature-rich dataset of medical news articles and fact-checked claims to facilitate machine learning research in detecting and analyzing medical misinformation, especially relevant during the COVID-19 pandemic.

Contribution

The creation and release of a comprehensive dataset with manual and automatic mappings between articles and claims, enabling new research in medical misinformation detection and analysis.

Findings

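01 Baseline models for claim presence detection and stance detection are evaluated on the manually labelled part of the dataset.

02 The 573 manually labelled and more than 51k automatically labelled claim-article mappings provide a foundation for misinformation studies.

03 The dataset supports research on misinformation characterization and diffusion.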

Abstract

False information has a significant negative influence on individuals as well as on the whole society. Especially in the current COVID-19 era, we witness an unprecedented growth of medical misinformation. To help tackle this problem with machine learning approaches, we are publishing a feature-rich dataset of approx. 317k medical news articles/blogs and 3.5k fact-checked claims. It also contains 573 manually and more than 51k automatically labelled mappings between claims and articles. Mappings consist of claim presence, i.e., whether a claim is contained in a given article, and article stance towards the claim. We provide several baselines for these two tasks and evaluate them on the manually labelled part of the dataset. The dataset enables a number of additional tasks related to medical misinformation, such as misinformation characterisation studies or studies of misinformation…
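
To make the mapping structure described in the abstract concrete, the following is a minimal Python sketch of one claim-article mapping and of selecting the manually labelled subset for evaluation. All class, field, and function names here are hypothetical, and the stance label set is an assumption; none of this is taken from the dataset's actual schema.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Stance(Enum):
    # Assumed label set for illustration; the dataset's real labels may differ.
    SUPPORTING = "supporting"
    CONTRADICTING = "contradicting"
    NEUTRAL = "neutral"


@dataclass(frozen=True)
class ClaimArticleMapping:
    # Hypothetical record layout: one row per (article, claim) pair.
    article_id: int
    claim_id: int
    claim_present: bool          # is the claim contained in the article?
    stance: Optional[Stance]     # article stance towards the claim, if any
    manually_labelled: bool      # True for manual labels, False for automatic


def manual_subset(mappings: List[ClaimArticleMapping]) -> List[ClaimArticleMapping]:
    """Keep only manually labelled mappings, i.e. the part used for evaluation."""
    return [m for m in mappings if m.manually_labelled]


def presence_accuracy(gold: List[bool], predicted: List[bool]) -> float:
    """Fraction of claim-presence predictions that match the gold labels."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and equally long")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)
```

A baseline for the claim presence task would then be scored by calling `presence_accuracy` on the `claim_present` values of `manual_subset(...)` and the model's predictions for the same pairs.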

Code & Models

Repositories

kinit-sk/medical-misinformation-dataset (Official)

Taxonomy

Methods: Diffusion