Imputation Uncertainty in Interpretable Machine Learning Methods

Pegah Golchian; Marvin N. Wright

arXiv:2512.17689·stat.ML·December 22, 2025

TL;DR

This paper investigates how different imputation methods affect the reliability of interpretability techniques in machine learning, highlighting the importance of multiple imputation for accurate variance estimation and confidence intervals.

Contribution

The paper demonstrates that single imputation underestimates the variance of IML explanations and advocates multiple imputation to achieve proper confidence interval coverage.

Findings

01. Single imputation underestimates variance in IML explanations.

02. Multiple imputation achieves near-nominal confidence interval coverage.

03. Imputation choice significantly impacts interpretability reliability.

Abstract

In real data, missing values occur frequently, which affects the interpretation with interpretable machine learning (IML) methods. Recent work considers bias and shows that model explanations may differ between imputation methods, while ignoring additional imputation uncertainty and its influence on variance and confidence intervals. We therefore compare the effects of different imputation methods on the confidence interval coverage probabilities of the IML methods permutation feature importance, partial dependence plots and Shapley values. We show that single imputation leads to underestimation of variance and that, in most cases, only multiple imputation is close to nominal coverage.
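
To make the variance argument concrete, the sketch below pools an explanation value (for example, a permutation feature importance score) computed on several imputed datasets. It uses Rubin's rules, the standard pooling approach for multiple imputation; the abstract does not state which pooling rule the authors use, so treat this as an assumption. The function name `pool_rubin` and all numbers are hypothetical and are not results from the paper.

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates with Rubin's rules (assumed pooling rule).

    estimates: one explanation value per imputed dataset
    variances: the estimated variance of that value within each imputed dataset
    """
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = q.size
    if m < 2:
        raise ValueError("need at least two imputations")
    q_bar = q.mean()               # pooled point estimate
    u_bar = u.mean()               # average within-imputation variance
    b = q.var(ddof=1)              # between-imputation variance
    t = u_bar + (1 + 1 / m) * b    # total variance
    return q_bar, u_bar, b, t

# Hypothetical importance scores from m = 5 imputed datasets.
estimates = [0.40, 0.50, 0.60, 0.45, 0.55]
variances = [0.010, 0.012, 0.011, 0.009, 0.008]

q_bar, u_bar, b, t = pool_rubin(estimates, variances)
print(f"pooled estimate: {q_bar:.3f}")
print(f"within-imputation variance: {u_bar:.4f}")
print(f"total variance: {t:.4f}")
```

With a single imputation only the within-imputation term is available, so the between-imputation term is silently dropped. That missing term is one way to see why single imputation yields intervals that are too narrow.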

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning