TL;DR
This paper introduces a method for selecting the baseline in Shapley value explanations of neural networks. The baseline is chosen using a neutrality parameter that reflects how decision-makers interpret model predictions, with the aim of making the resulting attributions easier to interpret.
Contribution
The paper proposes a data-driven baseline selection method for Shapley values based on neutrality, addressing the problem of arbitrary baseline choice in feature attribution.
Findings
The proposed baseline improves interpretability in binary classification.
Empirical validation on synthetic and financial datasets supports the method's effectiveness.
Baseline choice impacts the quality of feature attribution explanations.
Abstract
Deep neural networks have gained momentum based on their accuracy, but their interpretability is often criticised. As a result, they are labelled as black boxes. In response, several methods have been proposed in the literature to explain their predictions. Among the explanatory methods, Shapley values is a feature attribution method favoured for its robust theoretical foundation. However, the analysis of feature attributions using Shapley values requires choosing a baseline that represents the concept of missingness. An arbitrary choice of baseline could negatively impact the explanatory power of the method and possibly lead to incorrect interpretations. In this paper, we present a method for choosing a baseline according to a neutrality value: as a parameter selected by decision-makers, the point at which their choices are determined by the model predictions being either above or…
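To make the role of the baseline concrete, here is a minimal sketch of exact baseline Shapley values for a toy binary classifier. It is not the paper's procedure: the logistic model, its weights `W` and `B`, the neutrality value of 0.5, and the closed-form `neutral_baseline` helper are all assumptions chosen for illustration. The sketch relies only on the standard efficiency property of Shapley values: the attributions sum to the difference between the prediction at the input and the prediction at the baseline.

```python
from itertools import combinations
from math import exp, factorial

# Hypothetical toy logistic classifier; weights are illustrative only.
W = [2.0, -1.0, 0.5]
B = -0.25


def model(x):
    """Return the predicted probability of the positive class."""
    z = sum(w * xi for w, xi in zip(W, x)) + B
    return 1.0 / (1.0 + exp(-z))


def neutral_baseline():
    """Assumed helper: the minimum-norm point where the toy model outputs 0.5.

    For a logistic model this is the point on the hyperplane W.x + B = 0
    closest to the origin. A real network would need a search instead.
    """
    norm_sq = sum(w * w for w in W)
    return [-B * w / norm_sq for w in W]


def shapley_values(f, x, baseline):
    """Exact Shapley values, replacing 'missing' features by baseline values."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # size of the coalition that excludes feature i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                phi[i] += weight * (f(with_i) - f(without_i))
    return phi


x = [1.0, 0.5, -1.0]
baseline = neutral_baseline()
phi = shapley_values(model, x, baseline)

print("prediction at baseline:", model(baseline))
print("attributions:", phi)
print("sum of attributions:", sum(phi))
print("f(x) - f(baseline):", model(x) - model(baseline))
```

Because the baseline prediction sits at the assumed neutrality value, the sign of the summed attributions shows which side of that value the prediction falls on. With an arbitrary baseline such as the all-zeros vector, the same sum would measure distance from a reference prediction that has no particular meaning for the decision-maker.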
