Naive imputation implicitly regularizes high-dimensional linear models
Alexis Ayme (LPSM (UMR_8001)), Claire Boyer (LPSM (UMR_8001)), Aymeric Dieuleveut (CMAP), Erwan Scornet (CMAP)

TL;DR
This paper demonstrates that naive zero imputation in high-dimensional linear models acts as an implicit regularizer similar to ridge regression, explaining its surprisingly good predictive performance despite bias.
Contribution
It provides a theoretical analysis linking zero imputation to ridge regularization and recommends averaged SGD on imputed data for effective prediction.
Findings
Zero imputation performs implicit ridge regularization.
Imputation bias diminishes in high-dimensional settings.
Averaged SGD on imputed data yields good generalization bounds.
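The link between zero imputation and ridge can be made concrete with a short calculation. The sketch below is illustrative, not the paper's exact statement: it assumes each coordinate is observed independently with the same probability $\rho$ (a homogeneous MCAR setting, which the text above does not specify), and the symbols $P$, $\rho$, $R$, and $R_{\mathrm{imp}}$ are notation introduced here.

```latex
% Assumptions (ours): P_j ~ Bernoulli(rho) i.i.d., independent of (X, Y);
% P_j = 1 means coordinate j is observed; X_imp = P \odot X is the zero-imputed input.
\begin{align*}
R_{\mathrm{imp}}(\theta)
  &= \mathbb{E}\big[(Y - \theta^\top X_{\mathrm{imp}})^2\big] \\
  &= \mathbb{E}\big[(Y - \rho\,\theta^\top X)^2\big]
     + \rho(1-\rho) \sum_{j=1}^{d} \theta_j^2\, \mathbb{E}[X_j^2]
     && \text{(conditional mean and variance over } P\text{)} \\
  &= R(\theta') + \frac{1-\rho}{\rho} \sum_{j=1}^{d} \theta_j'^{\,2}\, \mathbb{E}[X_j^2],
     && \theta' = \rho\,\theta,\quad R(\theta') = \mathbb{E}\big[(Y - \theta'^\top X)^2\big].
\end{align*}
```

Under these assumptions, minimizing the risk on zero-imputed data amounts to minimizing the complete-data risk plus a weighted squared-norm penalty. If the features are standardized so that $\mathbb{E}[X_j^2] = 1$, the penalty is a ridge penalty with parameter $(1-\rho)/\rho$, which grows as more entries go missing.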
Abstract
Two different approaches exist to handle missing values for prediction: either imputation, prior to fitting any predictive algorithms, or dedicated methods able to natively incorporate missing values. While imputation is widely (and easily) used, it is unfortunately biased when low-capacity predictors (such as linear models) are applied afterward. However, in practice, naive imputation exhibits good predictive performance. In this paper, we study the impact of imputation in a high-dimensional linear model with MCAR missing data. We prove that zero imputation performs an implicit regularization closely related to the ridge method, often used in high-dimensional problems. Leveraging this connection, we establish that the imputation bias is controlled by a ridge bias, which vanishes in high dimension. As a predictor, we argue in favor of the averaged SGD strategy, applied to zero-imputed…
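The strategy described in the abstract, averaged SGD run on zero-imputed data, can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `zero_impute` and `averaged_sgd`, the NaN encoding of missing entries, the constant step size, and the synthetic data in the demo are all hypothetical choices made here.

```python
import numpy as np


def zero_impute(X):
    """Replace missing entries (encoded as NaN, an assumption of this sketch) with 0."""
    return np.where(np.isnan(X), 0.0, X)


def averaged_sgd(X, y, step_size, n_passes=1, seed=0):
    """Constant-step SGD on the squared loss, returning the running average of the iterates.

    The step size is a free parameter here; the source text does not give
    the schedule used in the paper.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)      # current SGD iterate
    theta_bar = np.zeros(d)  # average of all iterates so far
    t = 0
    for _ in range(n_passes):
        for i in rng.permutation(n):
            residual = X[i] @ theta - y[i]
            theta = theta - step_size * residual * X[i]
            t += 1
            theta_bar += (theta - theta_bar) / t  # incremental mean update
    return theta_bar


if __name__ == "__main__":
    # Synthetic demo (hypothetical setup): d > n, entries missing completely at random.
    rng = np.random.default_rng(42)
    n, d, rho = 200, 500, 0.7  # rho = probability that an entry is observed
    X_full = rng.standard_normal((n, d))
    theta_star = rng.standard_normal(d) / np.sqrt(d)
    y = X_full @ theta_star + 0.1 * rng.standard_normal(n)

    # MCAR mask: each entry is dropped independently of the data.
    observed = rng.random((n, d)) < rho
    X_missing = np.where(observed, X_full, np.nan)

    X_imp = zero_impute(X_missing)
    # Conservative step size scaled by the average squared row norm.
    step = 1.0 / (2.0 * np.mean(np.sum(X_imp ** 2, axis=1)))
    theta_hat = averaged_sgd(X_imp, y, step_size=step)
    print("training MSE on imputed data:", np.mean((X_imp @ theta_hat - y) ** 2))
```

Averaging the iterates rather than returning the last one is the design choice the abstract argues for. A single pass over the data keeps the cost linear in the number of samples and features, which suits the high-dimensional regime discussed above.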
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Statistical Methods and Inference
