Relating the Partial Dependence Plot and Permutation Feature Importance to the Data Generating Process
Christoph Molnar, Timo Freiesleben, Gunnar König, Giuseppe Casalicchio, Marvin N. Wright, Bernd Bischl

TL;DR
This paper formalizes partial dependence (PD) plots and permutation feature importance (PFI) as statistical estimators of ground truth estimands rooted in the data generating process, analyzes the sources of error in these estimates, and proposes methods to make them more reliable.
Contribution
It introduces a formal framework relating PD and PFI to the data generating process and proposes new estimators that account for model variance and bias.
Findings
PD and PFI estimates deviate from the ground truth due to statistical biases, model variance, and Monte Carlo approximation errors.
The proposed learner-PD and learner-PFI improve estimation accuracy.
New confidence interval estimators enhance interpretability.
Abstract
Scientists and practitioners increasingly rely on machine learning to model data and draw conclusions. Compared to statistical modeling approaches, machine learning makes fewer explicit assumptions about data structures, such as linearity. However, the parameters of machine learning models usually cannot be easily related to the data generating process. To learn about the modeled relationships, partial dependence (PD) plots and permutation feature importance (PFI) are often used as interpretation methods. However, PD and PFI lack a theory that relates them to the data generating process. We formalize PD and PFI as statistical estimators of ground truth estimands rooted in the data generating process. We show that PD and PFI estimates deviate from this ground truth due to statistical biases, model variance and Monte Carlo approximation errors. To account for model variance in PD and PFI estimation, we…
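As a rough illustration of the two quantities the abstract discusses, the sketch below computes the standard Monte Carlo PD estimate and a PFI estimate for a single fixed model. It is a minimal sketch under stated assumptions, not the paper's learner-PD or learner-PFI: the helper names `partial_dependence` and `permutation_importance`, the toy data, the squared-error loss, and the stand-in `predict` function are all hypothetical choices made for this example.

```python
import numpy as np

def partial_dependence(predict, X, feature, grid):
    """Monte Carlo PD estimate: for each grid value, fix `feature` at that
    value in every row and average the model's predictions."""
    pd_values = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature] = value
        pd_values.append(predict(X_mod).mean())
    return np.array(pd_values)

def permutation_importance(predict, X, y, feature, n_repeats=10, seed=0):
    """PFI estimate: mean increase in squared-error loss after permuting
    one feature column, averaged over `n_repeats` permutations."""
    rng = np.random.default_rng(seed)
    base_loss = np.mean((y - predict(X)) ** 2)
    increases = []
    for _ in range(n_repeats):
        X_perm = X.copy()
        X_perm[:, feature] = rng.permutation(X_perm[:, feature])
        perm_loss = np.mean((y - predict(X_perm)) ** 2)
        increases.append(perm_loss - base_loss)
    return float(np.mean(increases))

# Toy data (assumption): y depends linearly on feature 0 only.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(500, 2))
y = 2.0 * X[:, 0] + data_rng.normal(scale=0.1, size=500)

def predict(X):
    # Stand-in for a fitted model (assumption): it matches the true signal.
    return 2.0 * X[:, 0]

grid = np.linspace(-1.0, 1.0, 5)
pd_x0 = partial_dependence(predict, X, 0, grid)
pfi_x0 = permutation_importance(predict, X, y, 0)
pfi_x1 = permutation_importance(predict, X, y, 1)
```

Both functions average over a finite sample, and `permutation_importance` also averages over a finite number of permutations, which is where Monte Carlo approximation error enters. Because `predict` is fixed here, the sketch says nothing about model variance; capturing that would require refitting the model on different samples.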
Taxonomy
Topics: Neural Networks and Applications · Data Analysis with R · Machine Learning and Data Classification
