Explaining by Removing: A Unified Framework for Model Explanation
Ian Covert, Scott Lundberg, Su-In Lee

TL;DR
This paper introduces a unified framework for removal-based model explanation methods. It clarifies how these methods relate to one another and what principles and theory underlie them, which helps practitioners understand and choose among explanation techniques.
Contribution
The paper unifies 26 existing removal-based explanation methods in a single framework that characterizes each method by how it removes features, what model behavior it explains, and how it summarizes each feature's influence.
Findings
Unifies 26 explanation methods under a common framework
Links removal-based explanations to cognitive psychology and game theory
Provides conditions for information-theoretic interpretations
Abstract
Researchers have proposed a wide variety of model explanation approaches, but it remains unclear how most methods are related or when one method is preferable to another. We describe a new unified class of methods, removal-based explanations, that are based on the principle of simulating feature removal to quantify each feature's influence. These methods vary in several respects, so we develop a framework that characterizes each method along three dimensions: 1) how the method removes features, 2) what model behavior the method explains, and 3) how the method summarizes each feature's influence. Our framework unifies 26 existing methods, including several of the most widely used approaches: SHAP, LIME, Meaningful Perturbations, and permutation tests. This newly understood class of explanation methods has rich connections that we examine using tools that have been largely overlooked by…
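The three dimensions named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names (`remove_features`, `leave_one_out`, `shapley_values`) are hypothetical, and the specific choices (marginalizing removed features over background samples, explaining a single prediction, summarizing by leave-one-out or by exact Shapley values) are assumptions picked as one example per dimension.

```python
from itertools import combinations
from math import factorial

import numpy as np


def remove_features(model, x, keep, background):
    """Dimension 1 (feature removal), assumed choice: marginalize the
    removed features over background samples. Dimension 2 (model behavior),
    assumed choice: the model's output for the single input x."""
    X = background.copy()
    X[:, keep] = x[keep]  # kept features take their observed values
    return model(X).mean()


def leave_one_out(model, x, background):
    """Dimension 3 (summary), option A: change in output when one feature
    alone is removed from the full feature set."""
    d = x.shape[0]
    full = np.ones(d, dtype=bool)
    base = remove_features(model, x, full, background)
    scores = np.zeros(d)
    for i in range(d):
        keep = full.copy()
        keep[i] = False
        scores[i] = base - remove_features(model, x, keep, background)
    return scores


def shapley_values(model, x, background):
    """Dimension 3 (summary), option B: exact Shapley values by enumerating
    all subsets. Cost grows exponentially, so this suits only small d."""
    d = x.shape[0]
    scores = np.zeros(d)
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for k in range(d):
            weight = factorial(k) * factorial(d - k - 1) / factorial(d)
            for subset in combinations(others, k):
                keep = np.zeros(d, dtype=bool)
                keep[np.array(subset, dtype=int)] = True
                without_i = remove_features(model, x, keep, background)
                keep[i] = True
                with_i = remove_features(model, x, keep, background)
                scores[i] += weight * (with_i - without_i)
    return scores
```

For a linear model with weights w, both summaries in this sketch reduce to w_i times the gap between x_i and the background mean of feature i, so the two functions agree there. They generally differ once the model has feature interactions.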
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Bayesian Modeling and Causal Inference · Advanced Causal Inference Techniques
Methods: Local Interpretable Model-Agnostic Explanations
