# Don't be fooled: label leakage in explanation methods and the importance of their quantitative evaluation

**Authors:** Neil Jethani, Adriel Saporta, Rajesh Ranganath

arXiv: 2302.12893 · 2023-02-28

## TL;DR

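This paper shows that class-dependent feature attribution methods (such as SHAP, LIME, and Grad-CAM) can leak information about the selected class, making that class appear more likely than it is and producing misleading explanations. It proposes distribution-aware methods, including SHAP-KL and FastSHAP-KL, to improve explanation reliability, and evaluates them on clinical image, biosignal, and text data.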

## Contribution

The work identifies class leakage in existing class-dependent attribution methods and introduces distribution-aware approaches (SHAP-KL and FastSHAP-KL), evaluating seven class-dependent and three distribution-aware methods on three clinical datasets.

## Key findings

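- Class-dependent methods (e.g., SHAP, LIME, Grad-CAM) can leak information about the selected class, making that class appear more likely than it is.
- Distribution-aware methods reduce class leakage by favoring explanations that keep the label's distribution close to its distribution given all features of the input.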
- Evaluation shows improved explanation fidelity.

## Abstract

Feature attribution methods identify which features of an input most influence a model's output. Most widely-used feature attribution methods (such as SHAP, LIME, and Grad-CAM) are "class-dependent" methods in that they generate a feature attribution vector as a function of class. In this work, we demonstrate that class-dependent methods can "leak" information about the selected class, making that class appear more likely than it is. Thus, an end user runs the risk of drawing false conclusions when interpreting an explanation generated by a class-dependent method. In contrast, we introduce "distribution-aware" methods, which favor explanations that keep the label's distribution close to its distribution given all features of the input. We introduce SHAP-KL and FastSHAP-KL, two baseline distribution-aware methods that compute Shapley values. Finally, we perform a comprehensive evaluation of seven class-dependent and three distribution-aware methods on three clinical datasets of different high-dimensional data types: images, biosignals, and text.
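The contrast between the two families can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a hypothetical linear softmax classifier (`W`), assumes that removed features are replaced by a zero baseline, and reads "keeping the label's distribution close" as a negative KL divergence between the full-input prediction and the subset prediction. All names below (`predict_proba`, `class_dependent_value`, `kl_value`) are illustrative only.

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Hypothetical toy classifier: 3 features, 2 classes, linear logits.
W = np.array([[ 2.0, -2.0],
              [ 2.0, -2.0],
              [-3.0,  3.0]])  # shape (n_features, n_classes)

def predict_proba(x, mask):
    """Class distribution when only features with mask == 1 are kept.
    Assumption: removed features are set to a zero baseline."""
    return softmax((x * mask) @ W)

def class_dependent_value(x, mask, y):
    """Score a feature subset by the probability it assigns to class y."""
    return predict_proba(x, mask)[y]

def kl_value(x, mask):
    """Score a feature subset by -KL(p(. | x) || p(. | x_S)).
    The score is 0 when the subset reproduces the full prediction."""
    p_full = predict_proba(x, np.ones_like(x))
    p_sub = predict_proba(x, mask)
    return -np.sum(p_full * np.log(p_full / p_sub))

x = np.array([1.0, 1.0, 1.0])
full_mask = np.ones_like(x)
leak_mask = np.array([0.0, 0.0, 1.0])      # keeps only the class-1 evidence
faithful_mask = np.array([1.0, 0.0, 0.0])  # keeps part of the class-0 evidence

p_class1_full = predict_proba(x, full_mask)[1]
p_class1_leak = class_dependent_value(x, leak_mask, y=1)

print("p(class 1 | all features):   ", round(float(p_class1_full), 3))
print("p(class 1 | leaky subset):   ", round(float(p_class1_leak), 3))
print("KL value of leaky subset:    ", round(float(kl_value(x, leak_mask)), 3))
print("KL value of faithful subset: ", round(float(kl_value(x, faithful_mask)), 3))
```

In this toy setting, a subset chosen to maximize the probability of class 1 makes that class look near-certain even though the model assigns it a low probability on the full input. The KL-based score penalizes that subset and prefers one whose prediction stays close to the full-input distribution.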

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.12893/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/2302.12893/full.md

## References

44 references — full list in the complete paper: https://tomesphere.com/paper/2302.12893/full.md

---
Source: https://tomesphere.com/paper/2302.12893