Detecting influenza outbreaks by analyzing Twitter messages

Aron Culotta

arXiv:1007.4748 · cs.IR · July 28, 2010 · 93 cites

TL;DR

This paper shows that tracking a small number of flu-related keywords in Twitter messages can forecast influenza rates with high accuracy, and it proposes a document classifier that filters spurious keyword matches to reduce false alarms.

Contribution

The study presents a novel approach combining keyword analysis and document classification to improve influenza outbreak detection from social media data.

Findings

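01. Keyword-based estimates reach a 95% correlation with national health statistics.

02. A document classifier cuts error rates by more than half in simulated false-alarm experiments.

03. Filtering improves robustness to spurious keyword matches, though cases of extremely high noise remain an open problem.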

Abstract

We analyze over 500 million Twitter messages from an eight-month period and find that tracking a small number of flu-related keywords allows us to forecast future influenza rates with high accuracy, obtaining a 95% correlation with national health statistics. We then analyze the robustness of this approach to spurious keyword matches, and we propose a document classification component to filter these misleading messages. We find that this document classifier can reduce error rates by over half in simulated false alarm experiments, though more research is needed to develop methods that are robust in cases of extremely high noise.
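
The keyword-tracking idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the keyword list, the function names `keyword_fraction` and `pearson`, and the choice of a plain Pearson correlation are all assumptions made here for illustration. The sketch computes the share of messages that mention a keyword and correlates a series of such shares with a series of reported rates.

```python
import math
import re

# Hypothetical keyword list for illustration; not the paper's actual list.
FLU_KEYWORDS = ("flu", "cough", "sore throat", "headache")

# Whole-word, case-insensitive match on any keyword.
_KEYWORD_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in FLU_KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def keyword_fraction(messages):
    """Return the fraction of messages that mention at least one keyword."""
    if not messages:
        return 0.0
    hits = sum(1 for text in messages if _KEYWORD_RE.search(text))
    return hits / len(messages)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)
```

In use, one would call `keyword_fraction` once per week of messages and pass the resulting series, together with the official rates for the same weeks, to `pearson`. Whole-word matching means that "influenza" does not count as a match for "flu" in this sketch, which is a design choice of the example rather than a claim about the paper.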

Taxonomy

Topics: Misinformation and Its Impacts · Data-Driven Disease Surveillance · Influenza Virus Research Studies