Scalable Bayes under Informative Sampling
Terrance D. Savitsky, Sanvesh Srivastava

TL;DR
This paper introduces a scalable Bayesian method for informative survey sampling that divides data into subsets, performs parallel posterior sampling, and combines results efficiently, maintaining accuracy while reducing computational costs.
Contribution
The paper presents a novel scalable Bayesian approach for informative sampling that uses data partitioning and Wasserstein barycenters, with theoretical guarantees and empirical validation.
Findings
The method achieves accuracy comparable to traditional Bayesian methods.
It significantly reduces computation time in large surveys.
Its effectiveness is demonstrated in simulations and on real survey data.
Abstract
The United States Bureau of Labor Statistics collects data using survey instruments under informative sampling designs that assign probabilities of inclusion to be correlated with the response. The bureau extensively uses Bayesian hierarchical models and posterior sampling to impute missing items in respondent-level data and to infer population parameters. Posterior sampling for survey data collected under informative designs is computationally expensive and does not support the production schedules of the bureau. Motivated by this problem, we propose a new method to scale Bayesian computations in informative sampling designs. Our method divides the data into smaller subsets, performs posterior sampling in parallel for every subset, and combines the collection of posterior samples from all the subsets through their mean in the Wasserstein space of order 2. Theoretically, we construct…
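The combination step described in the abstract can be illustrated for the simplest case, a single scalar parameter. For one-dimensional distributions, the equal-weight Wasserstein barycenter of order 2 has a quantile function equal to the average of the input quantile functions, so with the same number of draws per subset it reduces to averaging the sorted draws. The sketch below assumes that setting; the function name `w2_barycenter_1d` and the simulated subset draws are hypothetical, and the code does not reproduce the paper's handling of sampling weights or its multivariate barycenter computation.

```python
import numpy as np

def w2_barycenter_1d(subset_draws):
    """Equal-weight W2 barycenter of 1-D empirical distributions.

    subset_draws: sequence of 1-D arrays of posterior draws, one per subset.
    Assumption for this sketch: every subset has the same number of draws.
    """
    # Sorting each subset's draws gives its empirical quantile function.
    sorted_draws = np.stack(
        [np.sort(np.asarray(d, dtype=float)) for d in subset_draws]
    )
    # In one dimension, the W2 barycenter averages the quantile functions.
    return sorted_draws.mean(axis=0)

# Hypothetical stand-in for subset posteriors: in practice each array would
# come from running a posterior sampler on one data subset in parallel.
rng = np.random.default_rng(0)
subset_draws = [rng.normal(loc=m, scale=1.0, size=1000) for m in (-0.2, 0.0, 0.2)]
combined = w2_barycenter_1d(subset_draws)
```

The returned array `combined` can be treated as draws from the combined posterior approximation, for example to compute credible intervals with `np.quantile(combined, [0.025, 0.975])`.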
