What Are People Tweeting about Zika? An Exploratory Study Concerning Symptoms, Treatment, Transmission, and Prevention
Michele Miller, Dr. Tanvi Banerjee, RoopTeja Muppalla, Dr. William Romine, Dr. Amit Sheth

TL;DR
This study analyzes over 1.2 million tweets about Zika to identify key topics and misinformation related to symptoms, transmission, prevention, and treatment using NLP and machine learning techniques.
Contribution
It introduces a two-stage classifier system for identifying and categorizing Zika-related tweets and applies topic modeling to uncover main discussion themes.
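The two-stage idea can be sketched as follows. This is only a minimal illustration, not the authors' implementation: the summary does not say which classifier was used, so a tiny multinomial naive Bayes model stands in for it, and the training tweets, the `NaiveBayes` class, and the `classify_tweet` helper are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial naive Bayes with add-one smoothing (illustrative stand-in)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)  # log prior
            for word in text.lower().split():
                if word in self.vocab:  # words never seen in training are ignored
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training data; the study used manually labeled tweets.
relevance_data = [
    ("zika virus causes fever and rash", "relevant"),
    ("mosquito bites spread zika virus", "relevant"),
    ("use repellent to prevent zika", "relevant"),
    ("no vaccine or treatment for zika yet", "relevant"),
    ("great football game tonight", "irrelevant"),
    ("new phone release this week", "irrelevant"),
    ("traffic is terrible this morning", "irrelevant"),
]
category_data = [
    ("zika virus causes fever and rash", "symptoms"),
    ("mosquito bites spread zika virus", "transmission"),
    ("use repellent to prevent zika", "prevention"),
    ("no vaccine or treatment for zika yet", "treatment"),
]

stage1 = NaiveBayes().fit(*zip(*relevance_data))  # relevant vs. irrelevant
stage2 = NaiveBayes().fit(*zip(*category_data))   # four disease categories


def classify_tweet(text):
    """Return a disease category, or None if stage 1 rejects the tweet."""
    if stage1.predict(text) != "relevant":
        return None
    return stage2.predict(text)


print(classify_tweet("zika fever and rash reported"))
print(classify_tweet("football game this week"))
```

Only tweets that pass the relevance stage reach the category stage, so the second model never has to learn what off-topic text looks like.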
Findings
High classifier accuracy for relevancy and disease categories
Identification of five main topics per disease characteristic
Potential to detect misinformation for public health response
Abstract
The purpose of this study was to perform a dataset distribution analysis, a classification performance analysis, and a topical analysis of what people are tweeting about four disease characteristics: symptoms, transmission, prevention, and treatment. A combination of natural language processing and machine learning techniques was used to determine what people are tweeting about Zika. Specifically, a two-stage classifier system was built to find relevant tweets on Zika and then categorize them into the four disease categories. Tweets in each disease category were then examined using latent Dirichlet allocation (LDA) to determine the five main tweet topics for each disease characteristic. Results: 1,234,605 tweets were collected. The shares of tweets by males and females were similar (28% and 23%, respectively). The classifier performed well on the training and test data for relevancy (F=0.87 and…
Taxonomy
Topics: Misinformation and Its Impacts · Data-Driven Disease Surveillance
