When Explanations Lie: Why Many Modified BP Attributions Fail

Leon Sixt; Maximilian Granz; Tim Landgraf

arXiv:1912.09818 · cs.LG · February 20, 2024 · 42 cites

TL;DR

This paper critically examines a broad set of modified backpropagation attribution methods. It finds that the explanations of all of them except DeepLIFT are independent of the parameters of later layers, and it offers theoretical and empirical insights into why.

Contribution

The paper analyzes modified BP attribution methods both theoretically and empirically, identifies their limitations, and introduces a new metric, cosine similarity convergence (CSC), to measure how information from later layers is ignored.

Findings

01 Most methods ignore later layer information

02 DeepLIFT does not suffer from this limitation

03 Introduces cosine similarity convergence (CSC) metric

Abstract

Attribution methods aim to explain a neural network's prediction by highlighting the most relevant image areas. A popular approach is to backpropagate (BP) a custom relevance score using modified rules, rather than the gradient. We analyze an extensive set of modified BP methods: Deep Taylor Decomposition, Layer-wise Relevance Propagation (LRP), Excitation BP, PatternAttribution, DeepLIFT, Deconv, RectGrad, and Guided BP. We find empirically that the explanations of all mentioned methods, except for DeepLIFT, are independent of the parameters of later layers. We provide theoretical insights for this surprising behavior and also analyze why DeepLIFT does not suffer from this limitation. Empirically, we measure how information of later layers is ignored by using our new metric, cosine similarity convergence (CSC). The paper provides a framework to assess the faithfulness of new and…
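
The idea behind measuring "cosine similarity convergence" can be illustrated with a toy experiment. The sketch below is not the authors' implementation and makes several assumptions that do not come from the abstract: a small bias-free ReLU MLP with random weights, a z+ style propagation rule standing in for one LRP-like modified BP method, random relevance vectors as starting points, and the absolute cosine similarity as the comparison. All function names are hypothetical. The sketch backpropagates two unrelated relevance vectors from the last layer and records how similar they are after each backward step. If the similarity approaches 1, the result at the input barely depends on what was fed in at the top, which is one way to read "later layers are ignored".

```python
import numpy as np


def abs_cosine(u, v, eps=1e-12):
    """Absolute cosine similarity between two vectors (assumed comparison)."""
    return float(abs(u @ v) / (np.linalg.norm(u) * np.linalg.norm(v) + eps))


def forward(weights, x):
    """Run a bias-free ReLU MLP and return the activations of every layer."""
    acts = [x]
    for w in weights:
        acts.append(np.maximum(w @ acts[-1], 0.0))
    return acts


def zplus_step(w, a_in, r_out, eps=1e-9):
    """One backward step of a z+ style rule (assumed): only positive weights carry relevance."""
    w_pos = np.maximum(w, 0.0)
    z = w_pos @ a_in + eps  # positive pre-activation share per output unit
    return a_in * (w_pos.T @ (r_out / z))


def gradient_step(w, a_in, r_out):
    """One backward step of the plain gradient through ReLU(w @ a_in)."""
    mask = (w @ a_in > 0).astype(r_out.dtype)
    return w.T @ (mask * r_out)


def similarity_trace(weights, x, step, rng):
    """Backpropagate two random relevance vectors; return their similarity after each step."""
    acts = forward(weights, x)
    r1 = rng.standard_normal(acts[-1].shape)
    r2 = rng.standard_normal(acts[-1].shape)
    trace = [abs_cosine(r1, r2)]
    # Walk from the last layer back to the input, pairing each weight matrix
    # with the activation that was its input during the forward pass.
    for w, a_in in zip(reversed(weights), reversed(acts[:-1])):
        r1, r2 = step(w, a_in, r1), step(w, a_in, r2)
        trace.append(abs_cosine(r1, r2))
    return trace


def make_toy_network(depth=8, width=64, seed=0):
    """Random He-scaled weights and a positive input (toy setup, not from the paper)."""
    rng = np.random.default_rng(seed)
    weights = [rng.standard_normal((width, width)) * np.sqrt(2.0 / width)
               for _ in range(depth)]
    x = rng.uniform(0.0, 1.0, size=width)
    return weights, x


if __name__ == "__main__":
    weights, x = make_toy_network()
    for name, step in [("z+ rule", zplus_step), ("gradient", gradient_step)]:
        trace = similarity_trace(weights, x, step, np.random.default_rng(1))
        print(name, [round(s, 3) for s in trace])
```

Running the script prints one similarity trace per rule, ordered from the last layer to the input. The absolute value is used because a rule that collapses everything onto one direction can still flip the sign of the result; whether the paper handles the sign this way is not stated in the abstract.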

Code & Models

Repositories

berleon/when-explanations-lie
tf · Official

Videos

When Explanations Lie: Why Many Modified BP Attributions Fail · slideslive

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Advanced Neural Network Applications