MiDe22: An Annotated Multi-Event Tweet Dataset for Misinformation Detection

Cagri Toraman; Oguzhan Ozcelik; Furkan Şahinuç; Fazli Can

arXiv:2210.05401 · cs.SI · July 12, 2024 · 6 cites

TL;DR

This paper introduces MiDe22, a human-annotated dataset of English and Turkish tweets labeled for misinformation across several events between 2020 and 2022, and reports baseline results for misinformation detection.

Contribution

MiDe22, a human-annotated dataset of 5,284 English and 5,064 Turkish tweets for misinformation detection, covering events including the Russia-Ukraine war, the COVID-19 pandemic, and refugees.

Findings

01 The dataset includes 10,348 tweets (5,284 English and 5,064 Turkish) with user engagement metrics: likes, replies, retweets, and quotes.

02 The paper provides a detailed data analysis with descriptive statistics.

03 A benchmark evaluation reports baseline results for misinformation detection.

Abstract

The rapid dissemination of misinformation through online social networks poses a pressing issue with harmful consequences jeopardizing human health, public safety, democracy, and the economy; therefore, urgent action is required to address this problem. In this study, we construct a new human-annotated dataset, called MiDe22, having 5,284 English and 5,064 Turkish tweets with their misinformation labels for several recent events between 2020 and 2022, including the Russia-Ukraine war, COVID-19 pandemic, and Refugees. The dataset includes user engagements with the tweets in terms of likes, replies, retweets, and quotes. We also provide a detailed data analysis with descriptive statistics and the experimental results of a benchmark evaluation for misinformation detection.
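
The abstract describes each tweet as carrying a misinformation label plus four engagement counts (likes, replies, retweets, quotes). As a rough illustration only, the sketch below shows one way such a record could be represented and summarized in Python. The class name, field names, label strings, and toy records are all assumptions made for this example; they are not the dataset's actual schema or contents.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class TweetRecord:
    """Hypothetical record layout; field names are assumptions, not MiDe22's schema."""
    tweet_id: str
    language: str   # assumed codes: "en" or "tr"
    event: str      # e.g. "COVID-19"
    label: str      # placeholder label strings, not the dataset's label set
    likes: int = 0
    replies: int = 0
    retweets: int = 0
    quotes: int = 0

    def total_engagement(self) -> int:
        """Sum of the four engagement counts named in the abstract."""
        return self.likes + self.replies + self.retweets + self.quotes


def label_counts_by_language(records):
    """Count records per (language, label) pair."""
    return Counter((r.language, r.label) for r in records)


# Toy records invented for illustration; they are not taken from the dataset.
toy_records = [
    TweetRecord("1", "en", "COVID-19", "misinformation", likes=3, retweets=2),
    TweetRecord("2", "en", "Refugees", "not_misinformation", replies=1),
    TweetRecord("3", "tr", "Russia-Ukraine war", "misinformation", likes=1, quotes=4),
]

counts = label_counts_by_language(toy_records)
for (language, label), n in sorted(counts.items()):
    print(language, label, n)
```

Keeping the engagement counts as separate integer fields, rather than a single total, leaves room to analyze each signal on its own.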

Code & Models

Datasets

ctoraman/misinformation-detection-tweets
dataset · 103 dl

Taxonomy

Topics: Misinformation and Its Impacts · Spam and Phishing Detection · Hate Speech and Cyberbullying Detection