Using leave-one-out cross-validation (LOO) in a multilevel regression and poststratification (MRP) workflow: A cautionary tale

Swen Kuh; Lauren Kennedy; Qixuan Chen; Andrew Gelman

arXiv:2209.01773 · stat.ME · September 7, 2022 · 1 cites

Open Access

TL;DR

This paper critically examines the use of leave-one-out cross-validation (LOO) methods in validating multilevel regression and poststratification (MRP) models, revealing limitations and cautioning against sole reliance on these techniques for model assessment.

Contribution

It provides an empirical evaluation of LOO-based validation methods in MRP, highlighting their shortcomings and suggesting cautious application in practice.

Findings

1. LOO methods do not reliably recover true model rankings in MRP.
2. Model validation accuracy varies across small areas and priors.
3. LOO-based criteria may mislead model selection in MRP contexts.

Abstract

In recent decades, multilevel regression and poststratification (MRP) has surged in popularity for population inference. However, the validity of the estimates can depend on details of the model, and there is currently little research on validation. We explore how leave-one-out cross-validation (LOO) can be used to compare Bayesian models for MRP. We investigate two approximate calculations of LOO, the Pareto smoothed importance sampling (PSIS-LOO) and a survey-weighted alternative (WTD-PSIS-LOO). Using two simulation designs, we examine how accurately these two criteria recover the correct ordering of model goodness at predicting population and small area level estimands. Focusing first on variable selection, we find that neither PSIS-LOO nor WTD-PSIS-LOO correctly recovers the models' order for an MRP population estimand (although both criteria correctly identify the best and worst…
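To make the two criteria named in the abstract more concrete, here is a minimal sketch of a pointwise LOO estimate and a weighted sum of it. It is an illustration under stated assumptions, not the authors' implementation: it uses plain importance sampling without the Pareto smoothing step of PSIS-LOO, the weighting rule (weights rescaled to sum to the sample size) is an assumed generic form rather than the paper's exact definition of WTD-PSIS-LOO, and the function names and toy numbers are hypothetical.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def pointwise_elpd_loo(log_lik):
    """Plain importance-sampling LOO estimate per observation.

    log_lik[s][i] is log p(y_i | theta_s) for posterior draw s and
    observation i. For each i this returns
        elpd_i = -log( (1/S) * sum_s 1 / p(y_i | theta_s) ).
    Assumption: no Pareto smoothing of the importance ratios is applied.
    """
    n_draws = len(log_lik)
    n_obs = len(log_lik[0])
    return [
        math.log(n_draws) - logsumexp([-log_lik[s][i] for s in range(n_draws)])
        for i in range(n_obs)
    ]


def weighted_elpd(pointwise, weights):
    """Weighted sum of pointwise values.

    Assumption: weights are rescaled to sum to the sample size, so equal
    weights reproduce the unweighted sum.
    """
    n = len(pointwise)
    total = sum(weights)
    return sum(n * w / total * e for w, e in zip(weights, pointwise))


# Made-up log-likelihoods: 3 posterior draws, 2 observations.
log_lik = [[-1.0, -2.0], [-1.0, -2.5], [-1.0, -1.5]]
pointwise = pointwise_elpd_loo(log_lik)
unweighted = sum(pointwise)
weighted = weighted_elpd(pointwise, [1.0, 3.0])  # hypothetical survey weights
```

In practice one would compute each model's unweighted and weighted totals on the same data and compare the resulting rankings; established implementations such as the R `loo` package or ArviZ add the Pareto smoothing and diagnostics omitted here.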

Taxonomy

Topics: Health disparities and outcomes · Statistical Methods and Bayesian Inference · Advanced Causal Inference Techniques