Supervised dimensionality reduction for multiple imputation by chained equations
Edoardo Costantini, Kyle M. Lang, Klaas Sijtsma

TL;DR
This paper introduces a supervised dimensionality reduction technique integrated with MICE to improve handling of missing data, demonstrating through simulations that it outperforms traditional PCA in certain scenarios.
Contribution
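It extends the use of PCA within MICE by adding supervision, so that information from the variables under imputation informs the estimation of the principal components used as imputation model predictors, and it reports an extensive simulation study comparing different versions of the approach.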
Findings
Supervised PCA improves imputation accuracy over unsupervised PCA.
The supervised approach reduces dimensionality while retaining information relevant to the variables under imputation.
Simulation results favor supervised PCA in various missing data scenarios.
Abstract
Multivariate imputation by chained equations (MICE) is one of the most popular approaches to address missing values in a data set. This approach requires specifying a univariate imputation model for every variable under imputation. The specification of which predictors should be included in these univariate imputation models can be a daunting task. Principal component analysis (PCA) can simplify this process by replacing all of the potential imputation model predictors with a few components summarizing their variance. In this article, we extend the use of PCA with MICE to include a supervised aspect whereby information from the variables under imputation is incorporated into the principal component estimation. We conducted an extensive simulation study to assess the statistical properties of MICE with different versions of supervised dimensionality reduction and we compared them with…
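The idea in the abstract can be illustrated with a minimal sketch of one common form of supervised PCA: screen the candidate predictors by their correlation with the variable under imputation, extract principal components from the retained predictors only, and use those components in a univariate imputation model. This is an illustration under stated assumptions, not the authors' implementation. The function names (`supervised_pcs`, `impute_once`), the correlation threshold, the number of components, and the toy data are all hypothetical, and the predictor matrix `X` is assumed to be fully observed. The imputation step is a simple stochastic regression draw, which is a simplification of what a full MICE algorithm does.

```python
import numpy as np


def supervised_pcs(X, y, threshold=0.2, n_components=2):
    """Screen predictors by |correlation| with y, then run PCA on the survivors.

    Assumes X is fully observed and y contains NaN where values are missing.
    """
    obs = ~np.isnan(y)
    # Pearson correlation of each predictor with y, using observed cases only.
    Xo = X[obs] - X[obs].mean(axis=0)
    yo = y[obs] - y[obs].mean()
    corr = (Xo.T @ yo) / (np.linalg.norm(Xo, axis=0) * np.linalg.norm(yo))
    keep = np.abs(corr) >= threshold
    if not keep.any():
        # Fall back to the single most correlated predictor.
        keep = np.abs(corr) == np.abs(corr).max()
    # PCA via SVD on the standardized retained predictors (all rows).
    Xk = X[:, keep]
    Xk = (Xk - Xk.mean(axis=0)) / Xk.std(axis=0)
    U, S, _ = np.linalg.svd(Xk, full_matrices=False)
    k = min(n_components, int(keep.sum()))
    scores = U[:, :k] * S[:k]  # component scores for every row
    return scores, keep


def impute_once(X, y, rng, threshold=0.2, n_components=2):
    """One stochastic regression imputation of y on the supervised components."""
    scores, keep = supervised_pcs(X, y, threshold, n_components)
    obs = ~np.isnan(y)
    Z = np.column_stack([np.ones(len(y)), scores])  # intercept + components
    beta, *_ = np.linalg.lstsq(Z[obs], y[obs], rcond=None)
    resid = y[obs] - Z[obs] @ beta
    sigma = resid.std(ddof=Z.shape[1])
    y_imp = y.copy()
    # Predicted value plus residual noise; parameter uncertainty is ignored
    # here, which a proper multiple imputation procedure would not do.
    y_imp[~obs] = Z[~obs] @ beta + rng.normal(0.0, sigma, size=int((~obs).sum()))
    return y_imp, keep


# Toy data (hypothetical): 3 of 10 predictors share a latent factor with y.
rng = np.random.default_rng(0)
n, p = 200, 10
latent = rng.normal(size=n)
X = rng.normal(size=(n, p))
X[:, :3] += 2.0 * latent[:, None]
y = latent + 0.5 * rng.normal(size=n)
y[rng.random(n) < 0.3] = np.nan  # roughly 30% missing completely at random

y_imp, keep = impute_once(X, y, rng)
print("retained predictors:", np.flatnonzero(keep))
print("remaining missing values:", int(np.isnan(y_imp).sum()))
```

In a chained-equations setting, a step like `impute_once` would be repeated for each incomplete variable and cycled over several iterations, with the screening and component estimation redone for each target variable. Unsupervised PCA corresponds to skipping the screening step and extracting components from all predictors.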
Taxonomy
Topics: Sensory Analysis and Statistical Methods · Statistical Methods and Bayesian Inference · Bayesian Modeling and Causal Inference
