Shapley values for cluster importance: How clusters of the training data affect a prediction

Andreas Brandsæter; Ingrid K. Glad

arXiv:2012.03625·stat.ML·December 9, 2022·1 cites

Open Access

TL;DR

This paper introduces a novel method using Shapley values to quantify how clusters of training data influence model predictions, enhancing interpretability of black-box models.

Contribution

It extends Shapley value concepts to cluster importance, enabling analysis of training data's impact on predictions in a new, insightful way.

Findings

01 Method effectively quantifies cluster influence on predictions.

02 Provides new insights into training data's role in model decisions.

03 Complementary to existing feature importance explanations.

Abstract

This paper proposes a novel approach to explain the predictions made by data-driven methods. Since such predictions rely heavily on the data used for training, explanations that convey information about how the training data affects the predictions are useful. The paper proposes a novel approach to quantify how different data-clusters of the training data affect a prediction. The quantification is based on Shapley values, a concept which originates from coalitional game theory, developed to fairly distribute the payout among a set of cooperating players. A player's Shapley value is a measure of that player's contribution. Shapley values are often used to quantify feature importance, i.e., how features affect a prediction. This paper extends this to cluster importance, letting clusters of the training data act as players in a game where the predictions are the payouts. The novel…
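The game described in the abstract, with training-data clusters as players and the prediction as the payout, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `cluster_shapley` and `fit_mean_model`, the toy mean-predictor model, and the choice of 0.0 as the payout of the empty coalition are all assumptions made here for illustration. The sketch retrains the model on every coalition of clusters and applies the standard exact Shapley formula.

```python
from itertools import combinations
from math import factorial


def fit_mean_model(rows):
    """Toy stand-in for a learner (assumption): predicts the mean training target."""
    targets = [y for _, y in rows]
    mean = sum(targets) / len(targets)
    return lambda x: mean


def cluster_shapley(clusters, x, fit=fit_mean_model, empty_value=0.0):
    """Exact Shapley value of each training cluster for the prediction at x.

    clusters:    dict mapping cluster name -> list of (features, target) rows
    fit:         function that trains on a list of rows and returns a predictor
    empty_value: payout of the empty coalition (an assumption of this sketch)
    """
    names = list(clusters)
    n = len(names)

    def value(coalition):
        # Payout of a coalition: prediction at x from a model trained
        # only on the rows of the clusters in that coalition.
        if not coalition:
            return empty_value
        rows = [row for name in coalition for row in clusters[name]]
        return fit(rows)(x)

    # Train once per coalition (2**n fits) and cache the payouts.
    payout = {}
    for size in range(n + 1):
        for coalition in combinations(names, size):
            payout[frozenset(coalition)] = value(coalition)

    shapley = {}
    for player in names:
        others = [name for name in names if name != player]
        total = 0.0
        for size in range(n):
            # Standard Shapley weight |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (payout[s | {player}] - payout[s])
        shapley[player] = total
    return shapley


# Hypothetical toy data: three clusters with different target levels.
clusters = {
    "A": [((0.0,), 1.0), ((0.1,), 1.0)],
    "B": [((1.0,), 3.0)],
    "C": [((2.0,), 5.0), ((2.1,), 5.0), ((2.2,), 5.0)],
}
phi = cluster_shapley(clusters, x=(1.5,))
print(phi)
```

By the efficiency property of Shapley values, the cluster values sum to the full-data prediction minus the empty-coalition payout. Exact enumeration needs one model fit per coalition, so it is only practical for a small number of clusters.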

Taxonomy

Topics: Data Mining Algorithms and Applications