Variance estimation for nearest neighbor imputation for US Census long form data

Jae Kwang Kim; Wayne A. Fuller; William R. Bell

arXiv:1108.1074·stat.AP·August 5, 2011

TL;DR

This paper develops a variance estimation method for Census 2000 long form data that accounts for imputation uncertainty and for raking to census population controls. Missing items are imputed fractionally from two nearest-neighbor donors, and the Kim and Fuller (2004) variance estimator for fractional hot deck imputation is adapted to this setting.

Contribution

It adapts the Kim and Fuller (2004) variance estimator for fractional hot deck imputation to Census long form data, where each missing item receives more than one imputed value from neighboring donors and the estimates are raked to census population controls.

Findings

01

The method provides variance estimates for estimators of state, county, and school district quantities derived from the Census 2000 long form.

02

Numerical results are presented for the 2000 Census long form data.

03

The variance estimator accounts for both uncertainty due to imputation and raking to census population controls.

Abstract

Variance estimation for estimators of state, county, and school district quantities derived from the Census 2000 long form is discussed. The variance estimator must account for (1) uncertainty due to imputation, and (2) raking to census population controls. An imputation procedure that imputes more than one value for each missing item, using donors that are neighbors, is described, and the procedure using two nearest neighbors is applied to the Census long form. The Kim and Fuller [Biometrika 91 (2004) 559–578] method for variance estimation under fractional hot deck imputation is adapted for application to the long form data. Numerical results from the 2000 long form data are presented.
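
The idea of imputing more than one value per missing item from neighboring donors can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single fully observed matching variable `x`, absolute difference as the distance, and equal fractions of 1/2 for the two donors; the function names `fractional_nn_impute` and `fi_weighted_mean` are hypothetical. The abstract does not specify how neighbors are defined, and the sketch does not include the variance estimator or the raking step.

```python
import numpy as np

def fractional_nn_impute(x, y, n_donors=2):
    """Pick the nearest respondents as donors for each missing item.

    x: fully observed matching variable (assumed, for illustration only).
    y: study variable, with np.nan marking missing items.
    Returns a list of (recipient_index, donor_index, fraction) triples.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    respondents = np.flatnonzero(~np.isnan(y))
    rows = []
    for i in np.flatnonzero(np.isnan(y)):
        # Distance from recipient i to every respondent (assumed metric).
        dist = np.abs(x[respondents] - x[i])
        # Stable sort so ties are broken by position in the file.
        nearest = respondents[np.argsort(dist, kind="stable")[:n_donors]]
        for j in nearest:
            # Equal fractions: each donor value carries weight 1 / n_donors.
            rows.append((int(i), int(j), 1.0 / len(nearest)))
    return rows

def fi_weighted_mean(y, weights, rows):
    """Weighted mean where each missing item is replaced by its
    fraction-weighted donor values."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(weights, dtype=float)
    observed = ~np.isnan(y)
    total = np.sum(w[observed] * y[observed])
    for i, j, frac in rows:
        total += w[i] * frac * y[j]
    return total / np.sum(w)

x = [1.0, 2.0, 3.0, 4.0, 10.0]
y = [10.0, 20.0, np.nan, 40.0, 100.0]
weights = [1.0, 1.0, 1.0, 1.0, 1.0]

rows = fractional_nn_impute(x, y)
estimate = fi_weighted_mean(y, weights, rows)
```

In this toy data the unit with `x = 3.0` is missing, its two nearest respondents are the units with `x = 2.0` and `x = 4.0`, and each donor value enters the estimator with fraction 0.5. Keeping the donor-level fractions explicit, rather than storing a single averaged value, is what lets a replication-style variance estimator adjust the fractions later.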
