VoterFraud2020: a Multi-modal Dataset of Election Fraud Claims on Twitter

Anton Abilov; Yiqing Hua; Hana Matatov; Ofra Amir; Mor Naaman

arXiv:2101.08210 · cs.SI · April 28, 2021

TL;DR

This paper introduces VoterFraud2020, a multi-modal dataset of 7.6 million tweets and 25.6 million retweets from 2.6 million users related to election fraud claims, enriched with user, content, and network annotations to support research on misinformation and social media dynamics.

Contribution

The paper presents a large, multi-modal dataset with detailed annotations and cluster labels, enabling diverse research on election-related misinformation on Twitter.

Findings

01 User suspension mainly targeted voter fraud claim promoters.

02 Identified most common URLs, images, and YouTube videos in the dataset.

03 Dataset supports analysis of misinformation dissemination and community behavior.

Abstract

The wide spread of unfounded election fraud claims surrounding the U.S. 2020 election undermined trust in the election and culminated in violence inside the U.S. Capitol. Under these circumstances, it is critical to understand the discussions surrounding these claims on Twitter, a major platform where the claims were disseminated. To this end, we collected and released the VoterFraud2020 dataset, a multi-modal dataset with 7.6M tweets and 25.6M retweets from 2.6M users related to voter fraud claims. To make this data immediately useful for a diverse set of research projects, we further enhance the data with cluster labels computed from the retweet graph, each user's suspension status, and the perceptual hashes of tweeted images. The dataset also includes aggregate data for all external links and YouTube videos that appear in the tweets. Preliminary analyses of the data…
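
The abstract mentions perceptual hashes of tweeted images but, as excerpted here, does not say which hashing algorithm was used. As a rough illustration of the general idea only, the sketch below implements an average hash, one simple perceptual hash, on tiny grayscale grids. The function names and the 2x2 sample images are hypothetical, and a real pipeline would first downscale and convert each image to grayscale.

```python
from typing import List


def average_hash(pixels: List[List[int]]) -> int:
    """Toy average hash: one bit per pixel, set when the pixel is brighter
    than the image mean. This is an illustrative stand-in, not necessarily
    the hash used to build the dataset."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


# Hypothetical 2x2 grayscale "images" (values 0-255).
original = [[10, 200], [220, 30]]
brighter = [[20, 210], [230, 40]]    # same picture, uniformly brightened
different = [[200, 10], [30, 220]]   # inverted layout

h_original = average_hash(original)
h_brighter = average_hash(brighter)
h_different = average_hash(different)

# Near-duplicates keep the same hash; unrelated images differ in many bits.
print(hamming_distance(h_original, h_brighter))
print(hamming_distance(h_original, h_different))
```

Because visually similar images map to hashes with a small Hamming distance, hashes of this kind let researchers group re-encoded or lightly edited copies of the same image without comparing raw pixels.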

Code & Models

Repositories

sTechLab/VoterFraud2020 (Official)

Taxonomy

Topics: Hate Speech and Cyberbullying Detection · Internet Traffic Analysis and Secure E-voting · Social Media and Politics