A Comparative Study of Methods for Estimating Conditional Shapley Values and When to Use Them
Lars Henry Berge Olsen, Ingrid Kristine Glad, Martin Jullum, and Kjersti Aas

TL;DR
This paper compares various methods for estimating conditional Shapley values in machine learning, evaluating their accuracy and practicality through simulations and real data, and provides guidelines for choosing the appropriate approach.
Contribution
It introduces new methods, extends existing approaches, and systematically compares different classes of methods for estimating conditional Shapley values.
Findings
Parametric methods are most accurate when the data distribution is correctly specified.
Regression-based methods are slower to train but faster at explanation time.
Monte Carlo methods are suitable when quick explanations are needed without extensive training.
Abstract
Shapley values originated in cooperative game theory but are extensively used today as a model-agnostic explanation framework to explain predictions made by complex machine learning models in the industry and academia. There are several algorithmic approaches for computing different versions of Shapley value explanations. Here, we focus on conditional Shapley values for predictive models fitted to tabular data. Estimating precise conditional Shapley values is difficult as they require the estimation of non-trivial conditional expectations. In this article, we develop new methods, extend earlier proposed approaches, and systematize the new refined and existing methods into different method classes for comparison and evaluation. The method classes use either Monte Carlo integration or regression to model the conditional expectations. We conduct extensive simulation studies to evaluate how…
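To make the Monte Carlo idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It estimates the contribution function v(S) = E[f(x) | x_S = x*_S] by sampling the unobserved features from their conditional distribution, under the assumption that the features are jointly Gaussian with known mean and covariance. The function name `gaussian_contribution` and all of its parameters are hypothetical names chosen for illustration.

```python
import numpy as np


def gaussian_contribution(predict, x_star, S, mu, Sigma, n_samples=1000, seed=None):
    """Monte Carlo estimate of v(S) = E[f(x) | x_S = x_star[S]].

    Assumption (ours, for illustration): x ~ N(mu, Sigma) with known parameters.
    `predict` maps an (n, M) array to n predictions.
    """
    rng = np.random.default_rng(seed)
    x_star = np.asarray(x_star, dtype=float)
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    S = np.asarray(S, dtype=int)
    S_bar = np.setdiff1d(np.arange(mu.size), S)  # features not conditioned on

    # All features observed: the expectation is just the prediction itself.
    if S_bar.size == 0:
        return float(predict(x_star[None, :])[0])

    if S.size == 0:
        # Nothing observed: sample from the marginal distribution.
        cond_mu, cond_Sigma = mu, Sigma
    else:
        # Standard Gaussian conditioning formulas for x_Sbar given x_S.
        Sigma_SS = Sigma[np.ix_(S, S)]
        Sigma_bS = Sigma[np.ix_(S_bar, S)]
        Sigma_bb = Sigma[np.ix_(S_bar, S_bar)]
        cond_mu = mu[S_bar] + Sigma_bS @ np.linalg.solve(Sigma_SS, x_star[S] - mu[S])
        cond_Sigma = Sigma_bb - Sigma_bS @ np.linalg.solve(Sigma_SS, Sigma_bS.T)
        cond_Sigma = (cond_Sigma + cond_Sigma.T) / 2  # remove numerical asymmetry

    # Fill the unobserved features with conditional samples, keep x_S fixed.
    samples = rng.multivariate_normal(cond_mu, cond_Sigma, size=n_samples)
    X = np.tile(x_star, (n_samples, 1))
    X[:, S_bar] = samples

    # Average the model output over the samples.
    return float(np.mean(predict(X)))


if __name__ == "__main__":
    # Toy linear model f(x) = x_0 + x_1 with correlated Gaussian features.
    f = lambda X: X.sum(axis=1)
    mu = np.zeros(2)
    Sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
    print(gaussian_contribution(f, [1.0, 2.0], [0], mu, Sigma, n_samples=20000, seed=0))
```

In the toy example, conditioning on x_0 = 1 gives a conditional mean of 0.5 for x_1, so the estimate should be close to 1.5. A regression-based alternative would instead fit a model that predicts f(x) directly from x_S, which moves the cost from explanation time to training time.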
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Sports Analytics and Performance · Machine Learning and Data Classification
