Detecting Influenza Epidemics on Twitter
Katerina Katsani-Geronymaki, Polyvios Pratikakis

TL;DR
This paper develops a Twitter-based predictive model for influenza outbreaks by analyzing keyword frequencies, correlating them with official health data, and integrating the predictions into an epidemic phase model, enabling early warnings.
Contribution
It combines Twitter keyword analysis with an existing epidemic phase model, FluHMM, to improve early detection of influenza outbreaks.
Findings
Twitter data correlates strongly with official ILI data.
The model can generate earlier epidemic warnings than the existing sentinel protocol, which relies on a network of physicians across Greece.
Twitter-based predictions are effective for real-time epidemic monitoring.
Abstract
This paper presents a predictive model for Influenza-Like Illness (ILI), based on Twitter traffic. We gather data from Twitter based on a set of keywords used in the Influenza Wikipedia page, and perform feature selection over all words used in 3 years' worth of tweets, using real ILI data from the Greek CDC. We select a small set of words with high correlation to the ILI score, and train a regression model to predict the ILI score from the word features. We deploy this model on a streaming application and feed the resulting time series to FluHMM, an existing prediction model for the phases of the epidemic. We find that Twitter traffic offers a good source of information and can generate early warnings compared to the existing sentinel protocol, which uses a set of associated physicians all over Greece.
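The feature-selection and regression steps described in the abstract can be illustrated with a minimal sketch. It assumes weekly word counts arranged as a weeks-by-words matrix, ranks words by Pearson correlation with the ILI score, and fits an ordinary least-squares model. The function names (`select_words`, `fit_linear`, `predict`), the choice of Pearson correlation, and the use of plain linear regression are assumptions made for illustration; the abstract does not specify the authors' exact correlation measure or regression model.

```python
import numpy as np

def select_words(word_counts, ili, k):
    """Return (indices of the k most positively correlated words, all correlations).

    word_counts: array of shape (weeks, words), one column per candidate word.
    ili: array of shape (weeks,), the official ILI score per week.
    Assumption: Pearson correlation is the ranking criterion.
    """
    X = np.asarray(word_counts, dtype=float)
    y = np.asarray(ili, dtype=float)
    Xc = X - X.mean(axis=0)          # center each word column
    yc = y - y.mean()                # center the ILI series
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Zero-variance columns get correlation 0 instead of a division by zero.
    safe = np.where(denom > 0, denom, 1.0)
    r = np.where(denom > 0, (Xc.T @ yc) / safe, 0.0)
    order = np.argsort(-r)           # highest correlation first
    return order[:k], r

def fit_linear(X, y):
    """Fit y ~ intercept + X @ weights by ordinary least squares."""
    X = np.asarray(X, dtype=float)
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return coef                      # coef[0] is the intercept

def predict(coef, X):
    """Predict ILI scores for new rows of selected word counts."""
    return coef[0] + np.asarray(X, dtype=float) @ coef[1:]
```

In use, one would call `select_words` on the historical word-count matrix, fit `fit_linear` on the selected columns only, and then apply `predict` to each new week of counts to produce the time series passed to the phase model.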
Taxonomy
Topics: Data-Driven Disease Surveillance · Influenza Virus Research Studies · Misinformation and Its Impacts
