Small area estimation using incomplete auxiliary information
Donatas Šlevinskas, Ieva Burakauskaitė, Andrius Čiginas

TL;DR
This paper introduces a two-step small area estimation method that combines incomplete auxiliary data from non-probability sources with survey data, improving precision without modeling selection bias.
Contribution
It develops a novel approach integrating design-based calibration and measurement-error models to effectively utilize incomplete auxiliary information in small area estimation.
Findings
Improved domain-level precision over traditional methods.
Effective integration of administrative and web data sources.
No need to model the selection mechanism of the non-probability sample.
Abstract
Auxiliary information is increasingly available from administrative and other data sources, but it is often incomplete and of non-probability origin. We propose a two-step small area estimation approach in which the first step relies on design-based model calibration and exploits a large non-probability source providing a noisy proxy of the study variable for only part of the population. A unit-level measurement-error working model is fitted on the linked overlap between the probability survey and the external source, and its predictions are incorporated through domain-specific model-calibration constraints to obtain approximately design-unbiased domain totals. These totals and their variance estimates are then used in a Fay-Herriot area-level model with exactly known covariates to produce empirical best linear unbiased predictors. The approach is demonstrated in three enterprise survey…
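The second step described in the abstract, a Fay-Herriot area-level model fed with direct domain estimates and their variance estimates, can be illustrated with a minimal sketch. The code below is not the authors' implementation: it assumes the first-step (model-calibrated) domain estimates and their variances are already available, treats those variances as known, and estimates the area-effect variance with the Prasad-Rao moment estimator, which is one common choice among several. The function name `fay_herriot_eblup` and all numbers in the demo are hypothetical.

```python
import numpy as np

def fay_herriot_eblup(y, X, psi):
    """Sketch of Fay-Herriot EBLUPs (assumed setup, not the paper's code).

    y   : direct domain estimates, shape (m,)
    X   : exactly known area-level covariates, shape (m, p)
    psi : sampling variances of the direct estimates, shape (m,)
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    psi = np.asarray(psi, dtype=float)
    m, p = X.shape

    # Ordinary least squares fit, used only for the moment estimator.
    XtX_inv = np.linalg.inv(X.T @ X)
    resid = y - X @ (XtX_inv @ X.T @ y)
    # Leverages h_ii = x_i' (X'X)^{-1} x_i.
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)

    # Prasad-Rao moment estimator of the area-effect variance, truncated at 0.
    sigma2_v = max(0.0, (resid @ resid - np.sum(psi * (1.0 - h))) / (m - p))

    # Weighted least squares for the regression coefficients.
    w = 1.0 / (sigma2_v + psi)
    beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))

    # Shrinkage weights: close to 1 when the direct estimate is precise.
    gamma = sigma2_v / (sigma2_v + psi)
    synthetic = X @ beta
    eblup = gamma * y + (1.0 - gamma) * synthetic
    return {"eblup": eblup, "synthetic": synthetic,
            "gamma": gamma, "sigma2_v": sigma2_v, "beta": beta}

# Hypothetical inputs for six domains (illustrative numbers only).
y_direct = np.array([10.2, 12.9, 9.1, 15.4, 11.0, 13.7])
X_area = np.column_stack([np.ones(6), [1.0, 2.1, 0.8, 3.0, 1.5, 2.4]])
psi_hat = np.array([0.4, 1.5, 0.3, 2.0, 0.6, 1.1])

result = fay_herriot_eblup(y_direct, X_area, psi_hat)
print(result["eblup"])
print(result["gamma"])
```

Each predictor is a weighted average of the direct estimate and the regression-synthetic estimate, so domains with noisier direct estimates are pulled more strongly toward the regression fit.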
Taxonomy
Topics: Statistical Methods and Bayesian Inference · demographic modeling and climate adaptation · Spatial and Panel Data Analysis
