When Bayes goes bad: Weakly-regularized covariate adjustment leads to a biased estimate of prevalence
Swen Kuh, Lauren Kennedy, Qixuan Chen, Andrew Gelman

TL;DR
This paper investigates how weak regularization in Bayesian models for prevalence estimation can bias results, especially when adjusting for multiple covariates. It demonstrates the effect with a real case study (SARS-CoV-2 antibody prevalence in an Australian city in late 2020) and a simulation study.
Contribution
It reveals the bias caused by weak priors in Bayesian covariate adjustment models and offers practical recommendations and a simulation framework for understanding this issue.
Findings
Weakly regularized covariate adjustment yields a biased estimate of prevalence.
Adding covariates sequentially to the Bayesian hierarchical model produced an unexpected downward trend in the prevalence estimates.
Simulation studies help identify contributors to bias in complex prevalence estimation models.
Abstract
When estimating population prevalence from a non-random sample, it is important to adjust for differences between sample and population. However, adjustment for multiple factors requires analysis that can be difficult to understand and validate. In this manuscript, we explore an unexpected downward trend of estimates when covariates are added sequentially to a Bayesian hierarchical model for the estimation of the prevalence of SARS-CoV-2 specific antibodies in an Australian city in late 2020. We compare our data analysis to results from a simulation study to understand four potential contributors to this effect: (i) correction for differences between sample and population, (ii) rare-events bias in logistic regression, (iii) inclusion of the uncertainty of test sensitivity and specificity in a multilevel model, and (iv) increasing model dimensionality. We find that weak prior…
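Two of the adjustments named in the abstract, correcting for sample and population differences and accounting for test sensitivity and specificity, can be illustrated with a minimal plug-in sketch. This is not the authors' Bayesian hierarchical model: it treats sensitivity and specificity as known constants rather than uncertain quantities, and the function names and all numbers are hypothetical, chosen only for illustration.

```python
def apparent_to_true(p_obs, sensitivity, specificity):
    """Invert p_obs = p * sens + (1 - p) * (1 - spec) to recover prevalence p.

    Plug-in correction (assumption: sens and spec are known exactly,
    unlike the paper, which models their uncertainty).
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("test must be better than chance: sens + spec > 1")
    p = (p_obs + specificity - 1.0) / denom
    # Clamp to [0, 1]; with rare events the raw value can fall below zero.
    return min(1.0, max(0.0, p))


def poststratify(cell_estimates, cell_population_counts):
    """Population-weighted average of per-cell prevalence estimates."""
    if len(cell_estimates) != len(cell_population_counts):
        raise ValueError("need one population count per cell")
    total = sum(cell_population_counts)
    weighted = sum(p * n for p, n in zip(cell_estimates, cell_population_counts))
    return weighted / total


# Hypothetical example: two demographic cells with different observed
# positivity rates and different population sizes.
observed = [0.05, 0.02]           # apparent positivity in each sample cell
population = [100_000, 300_000]   # population size of each cell
corrected = [apparent_to_true(p, sensitivity=0.90, specificity=0.99)
             for p in observed]
estimate = poststratify(corrected, population)
```

With a low true prevalence, the apparent positivity rate is close to the false-positive rate `1 - specificity`, so small changes in the assumed specificity move the corrected estimate a great deal. This sensitivity is one reason the rare-events setting in the abstract is delicate.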
