Leveraging Synthetic and Genetic Data to Improve Epidemic Forecasting
Dave Osthus, Alexander C. Murph, Emma E. Goldberg, Lauren J. Beesley, William M. Fischer, Nidhi K. Parikh, Lauren A. Castro

TL;DR
Training deep learning models on synthetic data and genetic information improves the accuracy of COVID-19 case forecasts for US states, an approach suited to the early stages of a pandemic, when historical data are limited.
Contribution
The paper integrates synthetic data and genetic information into deep learning epidemic forecasting models and shows that this improves accuracy over models trained on real data alone.
Findings
Models with synthetic data outperform those with only real data.
Genetic information enhances forecast accuracy.
Several models outperform the COVIDHub-4_week_ensemble.
Abstract
Forecasting infectious disease outbreaks is hard. Forecasting emerging infectious diseases with limited historical data is even harder. In this paper, we investigate ways to improve emerging infectious disease forecasting under operational constraints. Specifically, we explore two options likely to be available near the start of an emerging disease outbreak: synthetic data and genetic information. For this investigation, we conducted an experiment where we trained deep learning models on different combinations of real and synthetic data, both with and without genetic information, to explore how these models compare when forecasting COVID-19 cases for US states. All models are developed with an eye towards forecasting the next pandemic. We find that models trained with synthetic data have better forecast accuracy than models trained on real data alone, and models that use genetic…
Taxonomy
Topics: COVID-19 epidemiological studies · Zoonotic diseases and public health · Data-Driven Disease Surveillance
