Bayesian Inference under Cluster Sampling with Probability Proportional to Size
Susanna Makela, Yajuan Si, Andrew Gelman

TL;DR
This paper introduces a Bayesian framework for two-stage cluster sampling with probability proportional to size, effectively predicting unknown cluster sizes and improving survey inference accuracy.
Contribution
It develops nonparametric and parametric Bayesian methods to handle unknown cluster sizes in PPS sampling, integrating outcome modeling and size prediction.
Findings
In simulation studies, the integrated Bayesian approach yields efficiency gains over classical design-based methods.
Integrated approach effectively predicts unknown cluster sizes.
Application to the Fragile Families and Child Wellbeing study demonstrates practical utility.
Abstract
Cluster sampling is common in survey practice, and the corresponding inference has been predominantly design-based. We develop a Bayesian framework for cluster sampling and account for the design effect in the outcome modeling. We consider a two-stage cluster sampling design where the clusters are first selected with probability proportional to cluster size, and then units are randomly sampled inside selected clusters. Challenges arise when the sizes of nonsampled clusters are unknown. We propose nonparametric and parametric Bayesian approaches for predicting the unknown cluster sizes, with this inference performed simultaneously with the model for the survey outcome. Simulation studies show that the integrated Bayesian approach outperforms classical methods with efficiency gains. We use Stan for computing and apply the proposal to the Fragile Families and Child Wellbeing study as an…
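To make the two-stage design in the abstract concrete, here is a minimal Python sketch of the sampling step only. It is not the authors' code, and it does not implement their Bayesian model. The function name `two_stage_pps_sample`, the example cluster sizes, and the choice to draw clusters with replacement are all assumptions made for illustration; the abstract does not say whether clusters are drawn with or without replacement.

```python
import random


def two_stage_pps_sample(cluster_sizes, n_clusters, n_units, seed=0):
    """Illustrative two-stage sample (hypothetical helper, not from the paper).

    Stage 1: draw clusters with probability proportional to size.
             Assumption: draws are made WITH replacement.
    Stage 2: take a simple random sample of unit indices inside each
             drawn cluster (without replacement within a cluster).
    """
    rng = random.Random(seed)
    cluster_ids = list(range(len(cluster_sizes)))

    # Stage 1: selection probability of cluster j is size_j / sum(sizes).
    drawn = rng.choices(cluster_ids, weights=cluster_sizes, k=n_clusters)

    # Stage 2: sample up to n_units distinct units from each drawn cluster.
    sample = []
    for j in drawn:
        m = min(n_units, cluster_sizes[j])
        units = rng.sample(range(cluster_sizes[j]), m)
        sample.append((j, units))
    return sample


# Made-up cluster sizes, purely for illustration.
sizes = [120, 40, 300, 75, 15]
sample = two_stage_pps_sample(sizes, n_clusters=3, n_units=10)
for cluster_id, units in sample:
    print(cluster_id, sorted(units))
```

Note that this sketch takes every cluster size as known up front. The difficulty the abstract describes is that, after sampling, the analyst typically knows the sizes of the sampled clusters only, so the sizes of the nonsampled clusters have to be predicted.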
Taxonomy
Topics: Statistical Methods and Bayesian Inference · Census and Population Estimation · Bayesian Methods and Mixture Models
