COVID-19 Vaccine Hesitancy on Social Media: Building a Public Twitter Dataset of Anti-vaccine Content, Vaccine Misinformation and Conspiracies

Goran Muric; Yusong Wu; Emilio Ferrara

arXiv:2105.05134·cs.SI·May 18, 2021

TL;DR

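This paper introduces a public Twitter dataset of more than 137 million tweets with a strong anti-vaccine stance: over 1.8 million collected by streaming vaccine-related keywords, and over 135 million drawn from the historical timelines of 70K accounts that actively spread anti-vaccine narratives. The dataset is intended to support research on vaccine hesitancy and the spread of misinformation.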

Contribution

The paper provides the first large-scale, publicly available Twitter dataset focused on anti-vaccine content, including both real-time and historical data, with detailed descriptive analyses.

Findings

01 High volume of anti-vaccine tweets over time

02 Geographical distribution of misinformation spread

03 Identified political leanings of spreading accounts

Abstract

False claims about COVID-19 vaccines can undermine public trust in ongoing vaccination campaigns, thus posing a threat to global public health. Misinformation originating from various sources has been spreading online since the beginning of the COVID-19 pandemic. In this paper, we present a dataset of Twitter posts that exhibit a strong anti-vaccine stance. The dataset consists of two parts: a) a streaming keyword-centered data collection with more than 1.8 million tweets, and b) a historical account-level collection with more than 135 million tweets. The former leverages the Twitter streaming API to follow a set of specific vaccine-related keywords starting from mid-October 2020. The latter consists of all historical tweets of 70K accounts that were engaged in the active spreading of anti-vaccine narratives. We present descriptive analyses showing the volume of activity over time,…
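
To make the keyword-centered part of the collection more concrete, the following is a minimal offline sketch of filtering tweets by a keyword list. It is not the authors' pipeline: the abstract states only that the Twitter streaming API was used to follow vaccine-related keywords, and the API does its matching server-side. The keywords, the `filter_tweets` helper, and the dictionary layout of a tweet are all hypothetical placeholders chosen for illustration.

```python
import re
from typing import Dict, Iterable, Iterator, List

# Hypothetical placeholder keywords. The actual keyword list used by the
# authors is not given in the abstract and is NOT reproduced here.
KEYWORDS: List[str] = ["examplekeyword", "#placeholdertag"]


def build_matcher(keywords: List[str]) -> "re.Pattern[str]":
    """Compile one case-insensitive pattern that matches any keyword."""
    if not keywords:
        # An empty alternation would match every tweet, so reject it.
        raise ValueError("at least one keyword is required")
    # Escape each keyword so characters such as '#' or '.' match literally.
    escaped = [re.escape(k) for k in keywords]
    return re.compile("|".join(escaped), flags=re.IGNORECASE)


def filter_tweets(tweets: Iterable[Dict], keywords: List[str]) -> Iterator[Dict]:
    """Yield tweets whose 'text' field contains any keyword as a substring.

    Assumption: each tweet is a dict with a 'text' key. Substring matching
    is a simplification and may differ from the streaming API's own rules.
    """
    matcher = build_matcher(keywords)
    for tweet in tweets:
        if matcher.search(tweet.get("text", "")):
            yield tweet


if __name__ == "__main__":
    sample = [
        {"id": 1, "text": "An EXAMPLEKEYWORD appears here"},
        {"id": 2, "text": "Nothing relevant in this one"},
        {"id": 3, "text": "Tagged with #PlaceholderTag"},
    ]
    for kept in filter_tweets(sample, KEYWORDS):
        print(kept["id"], kept["text"])
```

The account-level part of the dataset works differently: per the abstract, it consists of all historical tweets of the selected accounts, so no per-tweet keyword filter of this kind applies to it.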

Code & Models

Repositories

gmuric/avax-tweets-dataset (Official)