Joint imputation procedures for categorical variables
Hélène Chaput, Guillaume Chauvet, David Haziza, Laurianne Salembier, Julie Solard

TL;DR
This paper introduces a new random imputation method for categorical survey data that preserves variable relationships and improves estimation accuracy, addressing biases caused by traditional marginal imputation methods.
Contribution
It proposes a simple, relationship-preserving random imputation procedure and an efficient version, enhancing the accuracy of bivariate parameter estimates in categorical data.
Findings
The new method maintains relationships between categorical variables.
The efficient version reduces estimation bias and variance.
Simulation results show better performance compared to existing methods.
Abstract
Marginal imputation, which consists of imputing each item requiring imputation separately, is often used in surveys. This type of imputation procedure leads to asymptotically unbiased estimators of simple parameters such as population totals (or means), but tends to distort relationships between variables. As a result, it generally leads to biased estimators of bivariate parameters such as coefficients of correlation or odds ratios. Household and social surveys typically collect categorical variables, for which missing values are usually handled by nearest-neighbour imputation or random hot-deck imputation. In this paper, we propose a simple random imputation procedure, closely related to random hot-deck imputation, which succeeds in preserving the relationship between categorical variables. Also, a fully efficient version of the latter procedure is proposed. A limited simulation study…
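The distortion described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' procedure: it assumes two binary variables, values of y missing completely at random, and compares marginal random hot-deck imputation with a generic hot-deck carried out within classes of x, used here only as a stand-in for a relationship-preserving method. All names (`odds_ratio`, `simulate`) and all probabilities are hypothetical choices made for illustration.

```python
import random


def odds_ratio(pairs):
    """Odds ratio of the 2x2 table built from binary (x, y) pairs."""
    counts = {(a, b): 0 for a in (0, 1) for b in (0, 1)}
    for x, y in pairs:
        counts[(x, y)] += 1
    return (counts[(1, 1)] * counts[(0, 0)]) / (counts[(1, 0)] * counts[(0, 1)])


def simulate(n=20000, seed=0):
    rng = random.Random(seed)

    # Assumed population model: P(y=1 | x=1) = 0.8, P(y=1 | x=0) = 0.2.
    full = []
    for _ in range(n):
        x = int(rng.random() < 0.5)
        y = int(rng.random() < (0.8 if x == 1 else 0.2))
        full.append((x, y))

    # Assumed response mechanism: y is missing completely at random (40%).
    missing = [rng.random() < 0.4 for _ in range(n)]

    # Donor pools: all y-respondents, and y-respondents split by x.
    donors_all = [y for (x, y), m in zip(full, missing) if not m]
    donors_by_x = {
        v: [y for (x, y), m in zip(full, missing) if not m and x == v]
        for v in (0, 1)
    }

    # Marginal random hot-deck: the donor is drawn without looking at x.
    marginal = [
        (x, rng.choice(donors_all)) if m else (x, y)
        for (x, y), m in zip(full, missing)
    ]

    # Hot-deck within classes of x: the donor shares the recipient's x.
    within_class = [
        (x, rng.choice(donors_by_x[x])) if m else (x, y)
        for (x, y), m in zip(full, missing)
    ]

    return odds_ratio(full), odds_ratio(marginal), odds_ratio(within_class)


if __name__ == "__main__":
    or_full, or_marginal, or_within = simulate()
    print(f"complete data:         {or_full:.2f}")
    print(f"marginal hot-deck:     {or_marginal:.2f}")
    print(f"hot-deck within x:     {or_within:.2f}")
```

Under these assumed probabilities the population odds ratio is 16. Marginal imputation pulls the estimate toward 1 because the imputed y values are independent of x, whereas drawing donors within classes of x keeps the estimate close to the complete-data value.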
Taxonomy
Topics: Survey Methodology and Nonresponse · Survey Sampling and Estimation Techniques · Statistical Methods and Bayesian Inference
