Towards better understanding of gradient-based attribution methods for Deep Neural Networks

Marco Ancona; Enea Ceolini; Cengiz Öztireli; Markus Gross

arXiv:1711.06104·cs.LG·March 8, 2018·30 cites

TL;DR

This paper provides a formal comparison and unified framework for four gradient-based attribution methods in DNNs, introduces a new evaluation metric, and empirically assesses their effectiveness across multiple datasets and architectures.

Contribution

The paper offers a formal analysis of gradient-based attribution methods, a unified framework for comparing them, and a novel evaluation metric, Sensitivity-n, for attribution quality.

Findings

01 Formal conditions of equivalence and approximation between methods
02 A new unified framework for gradient-based attribution methods
03 Empirical evaluation using the Sensitivity-n metric across datasets

Abstract

Understanding the flow of information in Deep Neural Networks (DNNs) is a challenging problem that has gained increasing attention over the last few years. While several methods have been proposed to explain network predictions, there have been only a few attempts to compare them from a theoretical perspective. Moreover, no exhaustive empirical comparison has been performed in the past. In this work, we analyze four gradient-based attribution methods and formally prove conditions of equivalence and approximation between them. By reformulating two of these methods, we construct a unified framework that enables a direct comparison, as well as an easier implementation. Finally, we propose a novel evaluation metric, called Sensitivity-n, and test the gradient-based attribution methods alongside a simple perturbation-based attribution method on several datasets in the domains of image…
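
To make the idea behind Sensitivity-n concrete, the following is a minimal sketch, not the authors' implementation. It assumes the common reading of the metric: for random subsets of n input features, the sum of their attributions is compared (via Pearson correlation) with the change in model output when those features are replaced by a baseline. The function names `gradient_times_input` and `sensitivity_n`, the zero baseline, and the toy linear model are all illustrative assumptions.

```python
import numpy as np

def gradient_times_input(w, x):
    # Assumption: a toy linear model f(x) = w . x, whose gradient w.r.t. x is w.
    # The attribution of each feature is then gradient * input.
    return w * x

def sensitivity_n(f, x, attributions, n, num_subsets=200, baseline=0.0, seed=0):
    # Hypothetical helper: correlate summed attributions with output changes
    # over randomly sampled feature subsets of size n.
    rng = np.random.default_rng(seed)
    attr_sums = np.empty(num_subsets)
    deltas = np.empty(num_subsets)
    fx = f(x)
    for i in range(num_subsets):
        idx = rng.choice(x.size, size=n, replace=False)
        x_removed = x.copy()
        x_removed[idx] = baseline          # "remove" the chosen features
        attr_sums[i] = attributions[idx].sum()
        deltas[i] = fx - f(x_removed)      # output variation caused by removal
    # Pearson correlation between the two series
    return np.corrcoef(attr_sums, deltas)[0, 1]

# Toy setup: random linear model and input (illustrative values only)
rng = np.random.default_rng(1)
w = rng.normal(size=20)
x = rng.normal(size=20)
f = lambda v: float(w @ v)

attr = gradient_times_input(w, x)
score = sensitivity_n(f, x, attr, n=5)
print(score)
```

For a linear model with a zero baseline, the output change from removing a subset equals the sum of that subset's gradient-times-input attributions, so the correlation should be essentially 1. For nonlinear networks the correlation would generally be lower and would vary with n.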

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Domain Adaptation and Few-Shot Learning