# What Are People Tweeting about Zika? An Exploratory Study Concerning Symptoms, Treatment, Transmission, and Prevention

**Authors:** Michele Miller, Dr. Tanvi Banerjee, RoopTeja Muppalla, Dr. William Romine, Dr. Amit Sheth

arXiv: 1701.07490 · 2017-01-27

## TL;DR

This study analyzes over 1.2 million tweets about Zika to identify key topics and misinformation related to symptoms, transmission, prevention, and treatment using NLP and machine learning techniques.

## Contribution

It introduces a two-stage classifier system for identifying and categorizing Zika-related tweets and applies topic modeling to uncover main discussion themes.
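The two-stage idea can be illustrated with a minimal sketch: a first classifier filters out irrelevant tweets, and a second one assigns the remaining tweets to one of the four disease categories. This summary does not say which learning algorithm the authors used, so the small Naive Bayes model below is a stand-in, and the class `NaiveBayes`, the function `classify_tweet`, and the toy training tweets are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing (illustrative stand-in)."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.label_counts = Counter(labels)      # label -> number of documents
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in text.lower().split():
                score += math.log((counts[word] + 1) / denom)  # smoothed likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy data, not taken from the study.
category_data = [
    ("zika causes fever and rash", "symptoms"),
    ("mosquito bites spread zika virus", "transmission"),
    ("use repellent to prevent zika", "prevention"),
    ("no vaccine or treatment for zika yet", "treatment"),
]
relevance_data = [(text, "relevant") for text, _ in category_data] + [
    ("great football game last night", "not_relevant"),
    ("new phone released today", "not_relevant"),
]

stage1 = NaiveBayes().fit(*zip(*relevance_data))  # relevant vs. not relevant
stage2 = NaiveBayes().fit(*zip(*category_data))   # one of four disease categories


def classify_tweet(text):
    """Return a disease category, or None if stage 1 rejects the tweet."""
    if stage1.predict(text) != "relevant":
        return None
    return stage2.predict(text)


print(classify_tweet("fever and rash from zika"))  # symptoms
print(classify_tweet("great football game"))       # None
```

Keeping the two stages separate means the category model is only ever asked about tweets already judged relevant, which is the property the two-stage design relies on.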

## Key findings

- High classifier F-scores for relevancy (0.87 on training data, 0.99 on test data) and for disease categories (0.79 on training data, 0.90 on test data)
- Identification of five main topics per disease characteristic
- Potential to detect misinformation for public health response
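Classifier performance in the abstract is reported as F values. As a minimal sketch of how such a score is computed, assuming the standard F1 definition (the harmonic mean of precision and recall), the function below scores one positive class. The summary does not state which F variant or averaging scheme the authors used, and the name `f1_score` and the toy labels are hypothetical.

```python
def f1_score(true_labels, predicted_labels, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented labels: 1 = relevant tweet, 0 = not relevant.
truth = [1, 1, 1, 0, 0]
preds = [1, 1, 0, 1, 0]
score = f1_score(truth, preds, positive=1)  # precision = recall = 2/3
```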

## Abstract

The purpose of this study was to perform a dataset distribution analysis, a classification performance analysis, and a topical analysis concerning what people are tweeting about four disease characteristics: symptoms, transmission, prevention, and treatment. A combination of natural language processing and machine learning techniques was used to determine what people are tweeting about Zika. Specifically, a two-stage classifier system was built to find relevant tweets on Zika and then categorize them into the four disease categories. Tweets in each disease category were then examined using latent Dirichlet allocation (LDA) to determine the five main tweet topics for each disease characteristic. In total, 1,234,605 tweets were collected. The proportions of tweets by males and females were similar (28% and 23%, respectively). The classifier performed well on the training and test data for relevancy (F = 0.87 and 0.99, respectively) and disease characteristics (F = 0.79 and 0.90, respectively). Five topics for each category were found and discussed, with a focus on the symptoms category. Through this process, we demonstrate how misinformation can be discovered so that public health officials can respond to tweets containing misinformation.
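To make the topic-modeling step concrete, here is a minimal sketch of LDA fitted by collapsed Gibbs sampling on a handful of tokenized tweets. It is not the authors' implementation: the summary does not say which LDA software or hyperparameters they used, so the function names `lda_gibbs` and `top_words`, the values of `alpha`, `beta`, and `n_iter`, and the toy tweets are all assumptions chosen for illustration. Only the choice of five topics comes from the abstract.

```python
import random
from collections import Counter


def lda_gibbs(docs, n_topics=5, n_iter=100, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns one word Counter per topic."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]         # tokens per (doc, topic)
    topic_word = [Counter() for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                       # tokens per topic
    assignments = []
    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + beta * vocab_size)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return topic_word


def top_words(topic_word, n=3):
    """Most frequent words per topic, skipping words with a zero count."""
    return [[w for w, c in counts.most_common(n) if c > 0] for counts in topic_word]


# Invented toy tweets, already lowercased; not taken from the study's data.
tweets = [
    "zika fever rash joint pain",
    "zika rash fever headache",
    "microcephaly babies zika pregnancy",
    "pregnancy zika microcephaly risk",
    "fever pain rash symptoms zika",
]
docs = [t.split() for t in tweets]
topics = lda_gibbs(docs, n_topics=5)
for words in top_words(topics):
    print(words)
```

With so little text the resulting topics are not meaningful; the point is only the mechanics of assigning each token to one of five topics and reading off the top words per topic.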

---
Source: https://tomesphere.com/paper/1701.07490