Estimating SARS-CoV-2 Infections from Deaths, Confirmed Cases, Tests, and Random Surveys
Nicholas J. Irons, Adrian E. Raftery

TL;DR
This paper presents a Bayesian model combining multiple data sources to estimate true SARS-CoV-2 infection prevalence, addressing biases and delays inherent in individual data streams, and calibrates estimates with random survey data from Indiana and Ohio.
Contribution
It introduces a simple Bayesian framework that integrates death, case, testing, and survey data to accurately estimate infection prevalence over time.
Findings
Reported cases significantly underestimate true infections.
The model estimates the infection fatality rate and the number of new infections per day.
Results inform understanding of herd immunity progress.
Abstract
There are many sources of data giving information about the number of SARS-CoV-2 infections in the population, but all have major drawbacks, including biases and delayed reporting. For example, the number of confirmed cases substantially underestimates the number of infections, deaths lag infections by a wide margin, and test positivity rates tend to greatly overestimate prevalence. Representative random prevalence surveys, the only putatively unbiased source, are sparse in time and space, and their results arrive with a long delay. Reliable estimates of population prevalence are necessary for understanding the spread of the virus and the effects of mitigation strategies. We develop a simple Bayesian framework to estimate viral prevalence by combining the main available data sources. It is based on a discrete-time SIR model with a time-varying reproductive parameter. Our model includes likelihood…
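To make the phrase "discrete-time SIR model with a time-varying reproductive parameter" concrete, here is a minimal sketch of a generic textbook daily-step SIR recursion. It is not the authors' parameterization: the function name `simulate_sir`, the recovery rate `gamma=0.1`, the population size, and the reproductive-number schedule are all illustrative assumptions, and the Bayesian likelihood components mentioned in the abstract are not modeled here.

```python
def simulate_sir(population, initial_infected, r_t, gamma=0.1):
    """Run a deterministic daily-step SIR model.

    r_t is a sequence with one reproductive number per day (assumed input).
    gamma is the assumed daily recovery rate, so the transmission rate on
    day t is beta_t = r_t[t] * gamma.
    Returns a list of (S, I, R, new_infections) tuples, one per day.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = []
    for repro in r_t:
        # New infections are capped so S can never become negative.
        new_infections = min(s, repro * gamma * s * i / population)
        recoveries = gamma * i
        s -= new_infections
        i += new_infections - recoveries
        r += recoveries
        history.append((s, i, r, new_infections))
    return history


# Hypothetical schedule: 30 days of growth, then 30 days under mitigation.
history = simulate_sir(1_000_000, 100, [2.0] * 30 + [0.8] * 30)
```

In a setup like this, the daily `new_infections` series is the latent quantity that deaths, confirmed cases, and survey results would each be linked to through their own likelihood; how the paper specifies those links is not shown in the truncated abstract above.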
