Regression-adjusted Monte Carlo Estimators for Shapley Values and Probabilistic Values
R. Teal Witter, Yurong Liu, Christopher Musco

TL;DR
This paper introduces a regression-adjusted Monte Carlo method for estimating probabilistic values such as Shapley values. It combines Monte Carlo sampling with flexible regression models and reports state-of-the-art accuracy on explainable AI tasks.
Contribution
It presents a flexible framework in which linear regression can be replaced by any function family whose probabilistic values can be computed efficiently, including tree-based models such as XGBoost, while the resulting estimates remain unbiased.
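One way to see why a learned model can be swapped in without introducing bias is the linearity of probabilistic values. The sketch below uses the standard textbook definition; the symbols n, v, f, and p_s are notation assumed here for illustration and are not taken from the paper.

```latex
% Probabilistic value of player i for a set function v on N = \{1, \dots, n\},
% with nonnegative weights p_s that depend only on the subset size s.
\phi_i(v) = \sum_{S \subseteq N \setminus \{i\}} p_{|S|} \, \bigl[ v(S \cup \{i\}) - v(S) \bigr],
\qquad \sum_{s=0}^{n-1} \binom{n-1}{s} \, p_s = 1 .

% Shapley and Banzhaf values are the special cases
p_s^{\text{Shapley}} = \frac{s! \, (n-1-s)!}{n!},
\qquad
p_s^{\text{Banzhaf}} = \frac{1}{2^{\,n-1}} .

% Because \phi_i is linear in v, any surrogate f gives the exact identity
\phi_i(v) = \phi_i(f) + \phi_i(v - f) .
```

If the first term is computed exactly for the fitted model f and the second term is estimated with an unbiased Monte Carlo sampler, the sum is unbiased for any f that is held fixed during sampling. A better fit only shrinks the residual v - f and hence the variance.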
Findings
State-of-the-art performance on eight datasets
Error reduction up to 6.5x for Shapley values
Error reduction up to 215x for general probabilistic values
Abstract
With origins in game theory, probabilistic values like Shapley values, Banzhaf values, and semi-values have emerged as a central tool in explainable AI. They are used for feature attribution, data attribution, data valuation, and more. Since all of these values require exponential time to compute exactly, research has focused on efficient approximation methods using two techniques: Monte Carlo sampling and linear regression formulations. In this work, we present a new way of combining both of these techniques. Our approach is more flexible than prior algorithms, allowing for linear regression to be replaced with any function family whose probabilistic values can be computed efficiently. This allows us to harness the accuracy of tree-based models like XGBoost, while still producing unbiased estimates. From experiments across eight datasets, we find that our methods give state-of-the-art…
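The combination of sampling and regression described in the abstract can be illustrated with a minimal sketch. It assumes an additive (linear) surrogate fitted by least squares and plain permutation sampling for the residual. It is not the authors' estimator, and every function name below is hypothetical. The idea is to fit a surrogate f, read off its Shapley values in closed form, and then estimate the Shapley values of the residual v - f by Monte Carlo.

```python
import itertools
import math
import random


def exact_shapley(value, n):
    """Exact Shapley values by enumerating all subsets (feasible only for small n)."""
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = math.factorial(size) * math.factorial(n - 1 - size) / math.factorial(n)
            for subset in itertools.combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    aug = [row[:] + [b[k]] for k, row in enumerate(a)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, m + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * m
    for r in reversed(range(m)):
        x[r] = (aug[r][m] - sum(aug[r][c] * x[c] for c in range(r + 1, m))) / aug[r][r]
    return x


def fit_additive_surrogate(value, n, num_samples, rng, ridge=1e-6):
    """Least-squares fit of f(S) = b + sum_{i in S} w_i on uniformly random subsets."""
    rows, targets = [], []
    for _ in range(num_samples):
        mask = [rng.random() < 0.5 for _ in range(n)]
        rows.append([1.0] + [1.0 if included else 0.0 for included in mask])
        targets.append(value(frozenset(i for i in range(n) if mask[i])))
    d = n + 1
    # Normal equations; the small ridge term keeps the system well conditioned.
    gram = [[sum(r[p] * r[q] for r in rows) + (ridge if p == q else 0.0)
             for q in range(d)] for p in range(d)]
    rhs = [sum(r[p] * t for r, t in zip(rows, targets)) for p in range(d)]
    coef = solve_linear_system(gram, rhs)
    return coef[0], coef[1:]


def regression_adjusted_shapley(value, n, fit_samples, num_permutations, seed=0):
    """Shapley estimate = exact Shapley of the surrogate + Monte Carlo on the residual."""
    rng = random.Random(seed)
    intercept, weights = fit_additive_surrogate(value, n, fit_samples, rng)

    def residual(subset):
        return value(subset) - intercept - sum(weights[i] for i in subset)

    correction = [0.0] * n
    for _ in range(num_permutations):
        order = list(range(n))
        rng.shuffle(order)
        prefix = frozenset()
        prev = residual(prefix)
        for i in order:
            prefix = prefix | {i}
            cur = residual(prefix)
            correction[i] += (cur - prev) / num_permutations
            prev = cur
    # For an additive surrogate, the Shapley value of player i is exactly w_i.
    return [weights[i] + correction[i] for i in range(n)]
```

In this sketch the surrogate is fitted before the permutations are drawn, so the residual estimate is unbiased given the fit, and the surrogate's quality affects only the variance. Swapping the additive model for a richer family, as the abstract describes for tree-based models, would additionally require a routine that computes that family's Shapley values exactly, which is not shown here.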
Taxonomy
Topics: Forecasting Techniques and Applications
