Data Fusion for Correcting Measurement Errors
Tracy Schifeling, Jerome P. Reiter, Maria DeYoreo

TL;DR
This paper introduces a data fusion framework that uses high-quality measurements from a separate data source to correct measurement errors in survey data, improving inferences without requiring the conditional independence assumption typically used in data fusion.
Contribution
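It develops a modeling approach for measurement error correction via data fusion that combines models for the rates at which individuals make errors with models for the values they report when errors are made, and applies it to reported educational attainment in the American Community Survey.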
Findings
Improved accuracy in estimating educational attainment distributions.
Effective correction of measurement errors using auxiliary high-quality data.
Sensitivity analysis methods for measurement error models.
Abstract
Often in surveys, key items are subject to measurement errors. Given just the data, it can be difficult to determine the distribution of this error process, and hence to obtain accurate inferences that involve the error-prone variables. In some settings, however, analysts have access to a data source on different individuals with high quality measurements of the error-prone survey items. We present a data fusion framework for leveraging this information to improve inferences in the error-prone survey. The basic idea is to posit models about the rates at which individuals make errors, coupled with models for the values reported when errors are made. This can avoid the unrealistic assumption of conditional independence typically used in data fusion. We apply the approach on the reported values of educational attainments in the American Community Survey, using the National Survey of…
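The basic idea in the abstract (an error-rate model coupled with a reporting model) can be illustrated with a toy calculation. The sketch below is not the paper's model; it is a minimal illustration that assumes a single categorical variable with three levels, a fixed error rate per true level, and a fixed reporting distribution given that an error occurs. All names (`misclassification_matrix`, `error_rate`, `report_probs`) and all numbers are hypothetical.

```python
import numpy as np


def misclassification_matrix(error_rate, report_probs):
    """Return M with M[x, y] = P(reported = y | true = x).

    error_rate[x]      : probability that a person with true level x errs.
    report_probs[x, y] : probability of reporting y given true level x
                         AND that an error occurred (so the diagonal is 0).
    """
    error_rate = np.asarray(error_rate, dtype=float)
    report_probs = np.asarray(report_probs, dtype=float)
    # An erroneous report must differ from the truth, and each row is a distribution.
    assert np.allclose(np.diag(report_probs), 0.0)
    assert np.allclose(report_probs.sum(axis=1), 1.0)
    # No error: report the truth. Error: draw from the reporting model.
    return np.diag(1.0 - error_rate) + error_rate[:, None] * report_probs


def reported_distribution(true_dist, M):
    """Marginal distribution of reported values implied by the error process."""
    return np.asarray(true_dist, dtype=float) @ M


def recover_true_distribution(reported_dist, M):
    """Invert the error process: solve true_dist @ M = reported_dist."""
    return np.linalg.solve(M.T, np.asarray(reported_dist, dtype=float))


# Hypothetical inputs: three levels of an error-prone item.
true_dist = np.array([0.5, 0.3, 0.2])
error_rate = np.array([0.10, 0.05, 0.02])
report_probs = np.array([
    [0.0, 0.8, 0.2],
    [0.5, 0.0, 0.5],
    [0.1, 0.9, 0.0],
])

M = misclassification_matrix(error_rate, report_probs)
reported = reported_distribution(true_dist, M)
recovered = recover_true_distribution(reported, M)

print("reported :", np.round(reported, 4))
print("recovered:", np.round(recovered, 4))
```

In this toy setting the error process is known, so the true distribution is recovered exactly from the reported one. In practice the error rates and reporting probabilities are unknown, which is where a separate source of high-quality measurements would come in.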
