On cross-validation for small area estimators
Qianyu Dong, Zehang Richard Li

TL;DR
This paper introduces a new cross-validation framework for evaluating small area estimators in public health surveys, addressing challenges of model comparison and uncertainty quantification.
Contribution
The paper develops a model-agnostic, theoretically grounded cross-validation method for comparing SAE models, including area-level and unit-level models, under complex survey designs.
Findings
Conventional leave-one-area-out CV can mislead model rankings.
The proposed framework yields more robust model comparisons than leave-one-area-out CV.
Simulation studies validate the effectiveness of the new approach.
Abstract
Subnational monitoring of public health often relies on household surveys where data are sparse at the desired spatial resolution. Small area estimation (SAE) methods address this challenge by borrowing strength across areas and incorporating auxiliary information. However, comparing these estimators remains difficult in the absence of ground truth. We propose a cross-validation framework for evaluating small area estimators that accommodates complex survey designs. Our approach enables model-agnostic comparisons between area-level and unit-level SAE models. Central to our framework is a decomposition of the cross-validated squared error, which reveals both identifiable bias and unidentifiable components that can be bounded. Our theoretical results and simulation studies show that conventional approaches, such as leave-one-area-out cross-validation, can yield misleading model rankings,…
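To make the conventional baseline concrete, here is a minimal Python sketch of leave-one-area-out cross-validation scored against direct estimates. It is not the authors' framework or their decomposition. The working model (ordinary least squares on one area-level covariate), the simulated data, and the names `loao_predictions`, `theta`, and `V` are all assumptions for illustration. The sketch relies only on a standard identity: when the held-out area's sampling error is independent of the prediction, the expected squared error against the direct estimate equals the expected squared error against the truth plus that area's sampling variance.

```python
import numpy as np


def loao_predictions(y, X):
    """Predict each area from an OLS fit on all other areas (leave-one-area-out)."""
    m = len(y)
    preds = np.empty(m)
    for i in range(m):
        keep = np.arange(m) != i  # drop the held-out area
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        preds[i] = X[i] @ beta
    return preds


# Simulated setting (all values hypothetical, chosen only for illustration).
rng = np.random.default_rng(0)
m = 200                                   # number of areas
x = rng.normal(size=m)                    # one area-level covariate
X = np.column_stack([np.ones(m), x])      # design matrix with intercept
theta = 1.0 + 0.5 * x + rng.normal(scale=0.3, size=m)  # true area means
V = rng.uniform(0.1, 0.5, size=m)         # sampling variances of direct estimates
y = theta + rng.normal(scale=np.sqrt(V))  # noisy direct estimates

preds = loao_predictions(y, X)
cv_err = (preds - y) ** 2        # what can be computed from survey data
true_err = (preds - theta) ** 2  # available only because this is a simulation

print("mean CV error vs direct estimates:", cv_err.mean())
print("mean error vs truth:              ", true_err.mean())
print("mean sampling variance:           ", V.mean())
```

In this simulation the error measured against direct estimates is inflated by sampling noise in the held-out area, so it is not a clean proxy for the error against the unknown truth. This is only one simple reason validation against direct estimates needs care; the paper's own analysis of misleading rankings is not reproduced here.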
