Inference from Non-Random Samples Using Bayesian Machine Learning

Yutao Liu; Andrew Gelman; Qixuan Chen

arXiv:2104.05192·stat.ME·April 13, 2021·1 cites

Open Access

TL;DR

This paper introduces a Bayesian machine learning approach for inference from non-random samples, leveraging auxiliary variables and propensity scores to improve population estimates with valid uncertainty quantification.

Contribution

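It develops a regularized prediction method based on Bayesian additive regression trees that predicts outcomes across the target population from a large set of auxiliary variables, and extends it by adding each unit's estimated propensity score for sample inclusion as a predictor.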

Findings

1. Valid population mean inference achieved in simulations
2. Coverage rates close to nominal levels
3. Effective application demonstrated in survey and epidemiology data

Abstract

We consider inference from non-random samples in data-rich settings where high-dimensional auxiliary information is available both in the sample and the target population, with survey inference being a special case. We propose a regularized prediction approach that predicts the outcomes in the population using a large number of auxiliary variables such that the ignorability assumption is reasonable while the Bayesian framework is straightforward for quantification of uncertainty. Besides the auxiliary variables, inspired by Little & An (2004), we also extend the approach by estimating the propensity score for a unit to be included in the sample and also including it as a predictor in the machine learning models. We show through simulation studies that the regularized predictions using soft Bayesian additive regression trees yield valid inference for the population means and coverage…
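The workflow described in the abstract (estimate a propensity score for sample inclusion, add it to the auxiliary variables as a predictor, then predict outcomes for the rest of the population) can be illustrated with a minimal sketch. This is not the authors' implementation: ordinary least squares stands in for soft Bayesian additive regression trees, the propensity score is estimated by simple binning, and the simulated population, the function names (`simulate`, `fit_ols`, `solve_linear_system`), and all parameter values are assumptions made for illustration. The sketch yields only a point estimate and does not reproduce the Bayesian uncertainty quantification.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, targets):
    """Least-squares coefficients via the normal equations (stand-in for BART)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def simulate(seed=0, n_pop=5000, n_bins=10):
    """Return (true mean, naive sample mean, prediction-based estimate)."""
    rng = random.Random(seed)
    # Hypothetical population: one auxiliary variable x known for every unit.
    x = [rng.random() for _ in range(n_pop)]
    y = [2.0 + 3.0 * xi + rng.gauss(0.0, 1.0) for xi in x]
    # Non-random selection: units with larger x are more likely to be sampled.
    sampled = [rng.random() < 0.05 + 0.4 * xi for xi in x]

    # Step 1: crude propensity estimate = sampled fraction within each x bin.
    bins = [min(int(xi * n_bins), n_bins - 1) for xi in x]
    pop_count = [0] * n_bins
    samp_count = [0] * n_bins
    for b, s in zip(bins, sampled):
        pop_count[b] += 1
        samp_count[b] += s
    ps = [samp_count[b] / pop_count[b] for b in bins]

    # Step 2: fit the outcome model on sampled units only, with the
    # estimated propensity score included as an extra predictor.
    features = [[1.0, xi, pi] for xi, pi in zip(x, ps)]
    beta = fit_ols(
        [f for f, s in zip(features, sampled) if s],
        [yi for yi, s in zip(y, sampled) if s],
    )

    # Step 3: keep observed y for sampled units, predict y for the rest.
    total = 0.0
    for f, yi, s in zip(features, y, sampled):
        total += yi if s else sum(bj * fj for bj, fj in zip(beta, f))
    estimate = total / n_pop

    naive = sum(yi for yi, s in zip(y, sampled) if s) / sum(sampled)
    # The true mean is available only because the population is simulated.
    true_mean = sum(y) / n_pop
    return true_mean, naive, estimate
```

Calling `simulate()` returns the simulated population mean, the unadjusted sample mean, and the prediction-based estimate, so the effect of the selection mechanism on the unadjusted mean can be compared directly with the adjusted estimate.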

Taxonomy

Topics: Statistical Methods and Bayesian Inference · Statistical Methods and Inference · Advanced Causal Inference Techniques