Combining Search, Social Media, and Traditional Data Sources to Improve Influenza Surveillance
Mauricio Santillana, Andre T. Nguyen, Mark Dredze, Michael J. Paul, John S. Brownstein

TL;DR
This paper introduces a machine learning ensemble approach that combines Google searches, Twitter posts, near real-time hospital visit records, and participatory surveillance data to produce real-time ("nowcast") and forecast estimates of influenza activity in the US, outperforming models built on any single data source.
Contribution
The main contribution is combining influenza-like illness (ILI) estimates, generated independently from each data source, into a single ensemble prediction that improves accuracy and enables forecasts ahead of official reports.
Findings
Ensemble predictions outperform individual data source models.
The methodology produces accurate weekly ILI predictions up to four weeks ahead of the release of CDC's ILI reports.
Incorporating social media and crowd-sourced data improves influenza forecasts.
Abstract
We present a machine learning-based methodology capable of providing real-time ("nowcast") and forecast estimates of influenza activity in the US by leveraging data from multiple sources, including Google searches, Twitter microblogs, nearly real-time hospital visit records, and data from a participatory surveillance system. Our main contribution consists of combining multiple influenza-like illness (ILI) activity estimates, generated independently with each data source, into a single prediction of ILI utilizing machine learning ensemble approaches. Our methodology exploits the information in each data source and produces accurate weekly ILI predictions for up to four weeks ahead of the release of CDC's ILI reports. We evaluate the predictive ability of our ensemble approach during the 2013-2014 (retrospective) and 2014-2015 (live) flu seasons for each of the four weekly time…
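To make the idea of combining independently generated ILI estimates concrete, here is a minimal Python sketch of a stacking ensemble. It rests on several assumptions: the abstract does not say which ensemble learner is used, so ordinary least squares stands in for it; the function names `fit_stacking_weights` and `ensemble_predict` are hypothetical; and all numbers are synthetic, not drawn from the paper. Forecast horizons are not modeled, only a single same-week combination.

```python
import numpy as np


def fit_stacking_weights(source_estimates, observed_ili):
    """Fit an intercept plus one weight per source by ordinary least squares.

    source_estimates: array of shape (weeks, n_sources), one ILI estimate
        per source per week.
    observed_ili: array of shape (weeks,), the reference ILI values.
    NOTE: least squares is an assumed stand-in for the paper's ensemble learner.
    """
    X = np.column_stack([np.ones(len(observed_ili)), source_estimates])
    coef, *_ = np.linalg.lstsq(X, observed_ili, rcond=None)
    return coef


def ensemble_predict(coef, source_estimates):
    """Combine per-source estimates into one prediction using fitted weights."""
    X = np.column_stack([np.ones(source_estimates.shape[0]), source_estimates])
    return X @ coef


def rmse(predicted, actual):
    """Root mean squared error between two equally sized arrays."""
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))


# Synthetic illustration only: a made-up seasonal ILI curve and four noisy
# "sources" standing in for search, Twitter, hospital, and participatory data.
rng = np.random.default_rng(0)
weeks = 60
true_ili = 2.0 + 1.5 * np.sin(np.linspace(0, 3 * np.pi, weeks))
noise_sd = [0.4, 0.5, 0.3, 0.6]  # assumed noise level for each source
sources = np.column_stack(
    [true_ili + rng.normal(0, sd, weeks) for sd in noise_sd]
)

# Fit weights on the earlier weeks, then evaluate on the held-out later weeks.
train, test = slice(0, 40), slice(40, weeks)
coef = fit_stacking_weights(sources[train], true_ili[train])
pred = ensemble_predict(coef, sources[test])

ensemble_rmse = rmse(pred, true_ili[test])
source_rmse = [rmse(sources[test, j], true_ili[test]) for j in range(4)]
print("ensemble RMSE:", round(ensemble_rmse, 3))
print("per-source RMSE:", [round(r, 3) for r in source_rmse])
```

Fitting the weights only on earlier weeks and scoring on later ones mirrors the time ordering a live surveillance setting imposes, where the combiner can only learn from ILI values that have already been reported.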
