Imputation Scores
Jeffrey Näf, Meta-Lina Spohn, Loris Michel, Nicolai Meinshausen

TL;DR
This paper introduces Imputation Scores (I-Scores), a framework for evaluating imputation methods. It ranks highest those methods that sample from the true conditional distributions, does not require masking additional observations, and applies to both discrete and continuous data.
Contribution
The paper develops a novel I-Score framework based on density ratios, applicable to discrete and continuous data, and proves its propriety under MCAR and MAR assumptions.
Findings
I-Scores rank the true data highly and avoid a common pitfall of RMSE, which favors conditional-mean imputations over draws from the true conditional distribution.
The method is applicable without masking additional observations.
The R-package Iscores is available on CRAN.
Abstract
Given the prevalence of missing data in modern statistical research, a broad range of methods is available for any given imputation task. How does one choose the "best" imputation method in a given application? The standard approach is to select some observations, set their status to missing, and compare the prediction accuracy of the methods under consideration on these observations. Besides having to somewhat artificially mask observations, a shortcoming of this approach is that imputations based on the conditional mean will rank highest if predictive accuracy is measured with quadratic loss. In contrast, we want to rank highest an imputation that can sample from the true conditional distributions. In this paper, we develop a framework called "Imputation Scores" (I-Scores) for assessing missing value imputations. We provide a specific I-Score based on density ratios and projections, that…
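The quadratic-loss shortcoming described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the paper's method or the Iscores package: it assumes a bivariate standard normal with correlation `rho` (an illustrative choice), treats the second coordinate as masked, and compares a conditional-mean imputation against a draw from the true conditional distribution. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
rho = 0.5  # assumed correlation, chosen only for illustration

# Bivariate standard normal with correlation rho; x2 plays the role of the masked column.
x1 = rng.standard_normal(n)
x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.standard_normal(n)

# Imputation A: the conditional mean E[X2 | X1] = rho * X1.
mean_imp = rho * x1
# Imputation B: an independent draw from the true conditional N(rho * x1, 1 - rho^2).
draw_imp = rho * x1 + np.sqrt(1 - rho**2) * rng.standard_normal(n)


def rmse(a, b):
    """Root mean squared error between two arrays of equal length."""
    return float(np.sqrt(np.mean((a - b) ** 2)))


rmse_mean = rmse(mean_imp, x2)
rmse_draw = rmse(draw_imp, x2)
var_mean = float(np.var(mean_imp))
var_draw = float(np.var(draw_imp))

print(f"RMSE  conditional mean: {rmse_mean:.3f}   conditional draw: {rmse_draw:.3f}")
print(f"Var   conditional mean: {var_mean:.3f}   conditional draw: {var_draw:.3f}")
```

In this setup the conditional mean has the lower RMSE (in theory sqrt(1 - rho^2) versus sqrt(2(1 - rho^2))), yet its imputed values have variance rho^2 rather than 1, so the imputed column no longer follows the distribution of the true data. The draw from the true conditional preserves that distribution but is penalized by RMSE, which is the ranking behavior the abstract argues against.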
Taxonomy
Topics: Statistical Methods and Bayesian Inference · Statistical Methods and Inference · Probability and Risk Models
