True to the Model or True to the Data?
Hugh Chen, Joseph D. Janizek, Scott Lundberg, Su-In Lee

TL;DR
This paper examines how the choice between interventional and observational conditional expectations changes Shapley value-based feature attributions in machine learning. It argues that the right choice depends on the application and illustrates the effects with linear models and real-data examples.
Contribution
The paper clarifies that the choice between interventional and observational approaches for Shapley values is application dependent, and it derives an efficient method for calculating observational conditional expectation Shapley values for linear models.
Findings
Correlation impacts convergence of observational Shapley values
Different value functions perform better in credit risk and biological discovery
Modeling choices significantly influence feature attribution results
Abstract
A variety of recent papers discuss the application of Shapley values, a concept for explaining coalitional games, for feature attribution in machine learning. However, the correct way to connect a machine learning model to a coalitional game has been a source of controversy. The two main approaches that have been proposed differ in the way that they condition on known features, using either (1) an interventional or (2) an observational conditional expectation. While previous work has argued that one of the two approaches is preferable in general, we argue that the choice is application dependent. Furthermore, we argue that the choice comes down to whether it is desirable to be true to the model or true to the data. We use linear models to investigate this choice. After deriving an efficient method for calculating observational conditional expectation Shapley values for linear models, we…
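To make the distinction in the abstract concrete, here is a minimal sketch for a two-feature linear model. It is an illustration under stated assumptions, not the paper's efficient method: the features are assumed to be jointly Gaussian so that the observational conditional expectation has a closed form, and the Shapley values are computed by brute-force subset enumeration. The function names (`shapley_values`, `interventional_value`, `observational_value`) and the example numbers are hypothetical.

```python
from itertools import combinations
from math import factorial

import numpy as np


def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating every coalition (only viable for small n)."""
    phi = np.zeros(n_features)
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(n_features):
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                phi[i] += weight * (value_fn(subset + (i,)) - value_fn(subset))
    return phi


def interventional_value(subset, x, beta, mu):
    """E[f] with features in `subset` fixed to x and the rest drawn from their marginals.
    For a linear model this amounts to filling unknown features with their means."""
    idx = list(subset)
    filled = mu.copy()
    if idx:
        filled[idx] = x[idx]
    return float(beta @ filled)


def observational_value(subset, x, beta, mu, sigma):
    """E[f(X) | X_S = x_S], assuming (illustrative assumption) X ~ N(mu, sigma)."""
    idx = list(subset)
    rest = [j for j in range(len(x)) if j not in idx]
    filled = mu.copy()
    if idx:
        filled[idx] = x[idx]
    if idx and rest:
        # Gaussian conditional mean of the unknown features given the known ones.
        shift = sigma[np.ix_(rest, idx)] @ np.linalg.solve(
            sigma[np.ix_(idx, idx)], x[idx] - mu[idx])
        filled[rest] = mu[rest] + shift
    return float(beta @ filled)


# Hypothetical setup: the model ignores feature 1, but feature 1 is
# correlated (rho = 0.5) with feature 0, which the model does use.
beta = np.array([1.0, 0.0])
mu = np.zeros(2)
sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
x = np.array([1.0, 1.0])

phi_int = shapley_values(lambda s: interventional_value(s, x, beta, mu), 2)
phi_obs = shapley_values(lambda s: observational_value(s, x, beta, mu, sigma), 2)

print("interventional:", phi_int)  # [1.0, 0.0]
print("observational: ", phi_obs)  # [0.75, 0.25]
```

In this toy case the interventional attribution gives no credit to the feature the model ignores (true to the model), while the observational attribution shares credit with it because it is correlated with a feature the model uses (true to the data). Both sets of attributions sum to the same total, the prediction minus its expected value.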
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Bayesian Modeling and Causal Inference · Statistical Methods and Inference
