Imputation Uncertainty in Interpretable Machine Learning Methods
Pegah Golchian, Marvin N. Wright

TL;DR
This paper investigates how different imputation methods affect the reliability of interpretability techniques in machine learning, highlighting the importance of multiple imputation for accurate variance estimation and confidence intervals.
Contribution
The paper demonstrates that single imputation underestimates the variance of IML explanations and advocates multiple imputation to achieve proper confidence interval coverage.
Findings
Single imputation underestimates variance in IML explanations.
In most cases, only multiple imputation achieves near-nominal confidence interval coverage.
The choice of imputation method affects the reliability of IML explanations, in particular the coverage of their confidence intervals.
Abstract
Missing values occur frequently in real data and affect interpretation with interpretable machine learning (IML) methods. Recent work considers bias and shows that model explanations may differ between imputation methods, but it ignores the additional uncertainty introduced by imputation and its influence on variance and confidence intervals. We therefore compare the effects of different imputation methods on the confidence interval coverage probabilities of three IML methods: permutation feature importance, partial dependence plots, and Shapley values. We show that single imputation leads to underestimation of variance and that, in most cases, only multiple imputation is close to nominal coverage.
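The abstract does not say how the per-imputation results are combined. As a minimal sketch, assuming the standard pooling approach for multiple imputation (Rubin's rules), the following shows why a single imputed dataset understates variance: it only sees the within-imputation variance and misses the between-imputation component. The function name `pool_rubin`, the example numbers, and the use of a normal quantile instead of Rubin's t-distribution degrees of freedom are illustrative assumptions, not taken from the paper.

```python
from statistics import NormalDist, mean, variance


def pool_rubin(estimates, within_variances, level=0.95):
    """Pool m per-imputation estimates with Rubin's rules (illustrative sketch).

    estimates: one point estimate per imputed dataset, e.g. the permutation
        feature importance of a single feature.
    within_variances: the estimated sampling variance of each of those estimates.
    Returns (pooled estimate, total variance, confidence interval).
    """
    m = len(estimates)
    if m < 2 or m != len(within_variances):
        raise ValueError("need at least two imputations and matching lengths")

    pooled = mean(estimates)           # pooled point estimate
    within = mean(within_variances)    # average within-imputation variance
    between = variance(estimates)      # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between

    # Simplification (assumption): normal quantile rather than the
    # t-distribution with Rubin's degrees of freedom.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * total ** 0.5
    return pooled, total, (pooled - half_width, pooled + half_width)


# Hypothetical importance estimates from m = 5 imputed datasets.
example_estimates = [0.10, 0.14, 0.12, 0.16, 0.08]
example_within = [0.0004] * 5

pooled, total_var, ci = pool_rubin(example_estimates, example_within)
single_var = example_within[0]  # what a single imputation would report
```

In this made-up example the total variance exceeds the variance reported by any single imputed dataset, so an interval built from one imputation would be too narrow, which is consistent with the undercoverage the abstract describes for single imputation.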
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning
