Detecting influenza outbreaks by analyzing Twitter messages
Aron Culotta

TL;DR
This paper shows that tracking a small set of flu-related keywords in Twitter messages can forecast influenza rates with high accuracy, and it introduces a document classifier that reduces false alarms caused by spurious keyword matches.
Contribution
The study presents a novel approach combining keyword analysis and document classification to improve influenza outbreak detection from social media data.
Findings
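Keyword-based estimates reach a 95% correlation with national health statistics.
In simulated false-alarm experiments, the document classifier cuts error rates by more than half.
Filtering improves robustness to spurious keyword matches, though extremely high noise remains an open problem.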
Abstract
We analyze over 500 million Twitter messages from an eight-month period and find that tracking a small number of flu-related keywords allows us to forecast future influenza rates with high accuracy, obtaining a 95% correlation with national health statistics. We then analyze the robustness of this approach to spurious keyword matches, and we propose a document classification component to filter these misleading messages. We find that this document classifier can reduce error rates by over half in simulated false alarm experiments, though more research is needed to develop methods that are robust in cases of extremely high noise.
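As a rough illustration of the keyword-tracking idea in the abstract, the sketch below computes the weekly fraction of messages that mention a flu-related keyword and then measures how well that series correlates with an official rate series. It is a minimal sketch under stated assumptions, not the paper's method: the keyword list, the function names `keyword_fraction` and `pearson`, and the optional `keep` predicate (a stand-in for a document classifier that filters spurious matches) are all hypothetical, and the paper's actual forecasting model is not reproduced here.

```python
import math
import re

# Hypothetical keyword list; the paper's actual keywords are not listed here.
FLU_KEYWORDS = ("flu", "cough", "sore throat", "headache")


def keyword_fraction(messages, keywords=FLU_KEYWORDS, keep=None):
    """Return the fraction of messages that mention any keyword.

    `keep` is an optional predicate standing in for a document classifier:
    a keyword-matching message is counted only if keep(message) is true.
    """
    if not messages:
        return 0.0
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    hits = 0
    for text in messages:
        if pattern.search(text) and (keep is None or keep(text)):
            hits += 1
    return hits / len(messages)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy data for illustration only; not taken from the paper.
    weekly_messages = [
        ["I have the flu", "nice day", "lunch time", "great game"],
        ["flu season is here", "bad cough today", "at work", "hello"],
        ["home sick with flu", "cough and sore throat", "flu shot line", "movie night"],
    ]
    official_rates = [1.0, 2.0, 3.0]  # made-up weekly rates
    fractions = [keyword_fraction(week) for week in weekly_messages]
    print(fractions, pearson(fractions, official_rates))
```

Passing a predicate such as `keep=lambda text: "shot" not in text.lower()` shows where a classifier would plug in: messages that match a keyword but are judged irrelevant are dropped before the fraction is computed. A hand-written rule like this is only a placeholder for a trained classifier.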
Taxonomy
Topics: Misinformation and Its Impacts · Data-Driven Disease Surveillance · Influenza Virus Research Studies
