All Models are Wrong, but Many are Useful: Learning a Variable's Importance by Studying an Entire Class of Prediction Models Simultaneously
Aaron Fisher, Cynthia Rudin, Francesca Dominici

TL;DR
This paper introduces model class reliance (MCR), a new approach to measure variable importance across all well-performing models within a class, providing a comprehensive understanding of variable contributions.
Contribution
It proposes MCR as a novel metric for variable importance that accounts for multiple models, and derives theoretical connections and bounds for permutation-based importance measures.
Findings
MCR captures the range of variable importance across all well-performing models in a prespecified class.
Connections are established between permutation importance, U-statistics, and linear coefficients.
Probabilistic bounds for MCR are derived.
Abstract
Variable importance (VI) tools describe how much covariates contribute to a prediction model's accuracy. However, important variables for one well-performing model (for example, a linear model with a fixed coefficient vector) may be unimportant for another model. In this paper, we propose model class reliance (MCR) as the range of VI values across all well-performing models in a prespecified class. Thus, MCR gives a more comprehensive description of importance by accounting for the fact that many prediction models, possibly of different parametric forms, may fit the data well. In the process of deriving MCR, we show several informative results for permutation-based VI estimates, based on the VI measures used in Random Forests. Specifically, we derive connections between permutation importance estimates for a single prediction model,…
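To make the idea of "a range of VI values across all well-performing models" concrete, here is a minimal Python sketch. It is not the paper's estimator. It assumes a small finite model class (a grid of linear models over two correlated features), squared-error loss, a permutation-based reliance expressed as the ratio of permuted loss to original loss, and a hypothetical tolerance `eps` that decides which models count as well-performing. The function names `permutation_reliance` and `empirical_mcr` are illustrative only.

```python
import numpy as np


def permutation_reliance(predict, X, y, col, rng):
    """Ratio of squared-error loss after permuting column `col` to the original loss.

    Assumption for this sketch: a ratio near 1 means the model barely uses the column.
    """
    e_orig = np.mean((y - predict(X)) ** 2)
    X_perm = X.copy()
    X_perm[:, col] = rng.permutation(X[:, col])  # break the link between this column and y
    e_perm = np.mean((y - predict(X_perm)) ** 2)
    return e_perm / e_orig


def empirical_mcr(models, X, y, col, eps, rng):
    """(min, max) reliance over the models whose loss is within `eps` of the best loss."""
    losses = np.array([np.mean((y - m(X)) ** 2) for m in models])
    good = [m for m, loss in zip(models, losses) if loss <= losses.min() + eps]
    vals = [permutation_reliance(m, X, y, col, rng) for m in good]
    return min(vals), max(vals)


# Synthetic data: the second feature is a near-copy of the first, so models that
# lean on either feature predict y about equally well.
rng = np.random.default_rng(0)
n = 2000
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)
y = x1 + 0.5 * rng.normal(size=n)
X = np.column_stack([x1, x2])

# Hypothetical model class: f_w(x) = w * x1 + (1 - w) * x2 for w on a grid.
models = [lambda X, w=w: w * X[:, 0] + (1 - w) * X[:, 1] for w in np.linspace(0, 1, 11)]

lo, hi = empirical_mcr(models, X, y, col=0, eps=0.05, rng=rng)
print(f"reliance on the first feature ranges from {lo:.2f} to {hi:.2f}")
```

In this toy setting every model on the grid fits about equally well, yet the reliance on the first feature differs sharply between models, which is the situation a single-model importance score would hide.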
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Imbalanced Data Classification Techniques · Machine Learning and Data Classification
