Variance estimation for nearest neighbor imputation for US Census long form data
Jae Kwang Kim, Wayne A. Fuller, William R. Bell

TL;DR
This paper develops a variance estimator for Census 2000 long form estimates that accounts for imputation uncertainty and for raking to census population controls. Missing items are imputed from two nearest-neighbor donors, and the Kim and Fuller (2004) variance estimator for fractional hot deck imputation is adapted to the long form data.
Contribution
It adapts the Kim and Fuller (2004) variance estimation method for fractional hot deck imputation to the Census long form, where each missing item receives more than one imputed value from neighboring donors and the estimates are raked to census population controls.
Findings
The variance estimator accounts for both uncertainty due to imputation and raking to census population controls.
Fractional imputation with two nearest-neighbor donors is applied to the Census 2000 long form.
Numerical results from the 2000 long form data are presented.
Abstract
Variance estimation for estimators of state, county, and school district quantities derived from the Census 2000 long form is discussed. The variance estimator must account for (1) uncertainty due to imputation, and (2) raking to census population controls. An imputation procedure that imputes more than one value for each missing item, using donors that are neighbors, is described, and the procedure using two nearest neighbors is applied to the Census long form. The Kim and Fuller [Biometrika 91 (2004) 559–578] method for variance estimation under fractional hot deck imputation is adapted for application to the long form data. Numerical results from the 2000 long form data are presented.
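Illustration
The imputation step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the one-dimensional matching variable x, the absolute-difference distance, and the equal fractions of 1/2 per donor are all assumptions made here for the example. It covers only the point estimate; the variance estimation and raking steps are not shown.

```python
import math


def fractional_nn_impute(x, y, n_donors=2):
    """Assign each missing y[i] the values of its nearest observed donors.

    x: fully observed matching variable (hypothetical stand-in for
       whatever defines "neighbor" in a real application).
    y: study variable, with float("nan") marking missing items.
    Returns a list of (recipient, donor, imputed_value, fraction) tuples.
    """
    observed = [j for j, v in enumerate(y) if not math.isnan(v)]
    records = []
    for i, v in enumerate(y):
        if not math.isnan(v):
            continue
        # sorted() is stable, so tied distances keep index order.
        nearest = sorted(observed, key=lambda j: abs(x[j] - x[i]))[:n_donors]
        for j in nearest:
            # Assumption: each donor gets an equal share of the recipient.
            records.append((i, j, y[j], 1.0 / len(nearest)))
    return records


def fractional_mean(y, weights, records):
    """Weighted mean using observed values plus fractionally imputed values."""
    total = sum(w * v for w, v in zip(weights, y) if not math.isnan(v))
    total += sum(weights[i] * frac * value for i, _, value, frac in records)
    return total / sum(weights)
```

Each recipient keeps its full sampling weight, split across its donors, so the imputed data set can be tabulated like a complete one.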
