Not Just a Black Box: Learning Important Features Through Propagating Activation Differences

Avanti Shrikumar; Peyton Greenside; Anna Shcherbina; Anshul Kundaje

arXiv:1605.01713 · cs.LG · April 12, 2017 · 552 cites

TL;DR

This paper introduces DeepLIFT, a method for interpreting neural networks that compares each neuron's activation to a reference activation and assigns contribution scores according to the difference. The authors report significant advantages over gradient-based approaches.

Contribution

DeepLIFT offers an efficient way to compute feature importance in neural networks by propagating activation differences, improving interpretability over existing gradient-based methods.

Findings

01

The authors report significant advantages for DeepLIFT over gradient-based methods in importance scoring.

02

DeepLIFT is applied to models trained on natural images and on genomic data.

03

Provides more stable and meaningful importance scores.

Abstract

Note: This paper describes an older version of DeepLIFT. See https://arxiv.org/abs/1704.02685 for the newer version. Original abstract follows: The purported "black box" nature of neural networks is a barrier to adoption in applications where interpretability is essential. Here we present DeepLIFT (Learning Important FeaTures), an efficient and effective method for computing importance scores in a neural network. DeepLIFT compares the activation of each neuron to its 'reference activation' and assigns contribution scores according to the difference. We apply DeepLIFT to models trained on natural images and genomic data, and show significant advantages over gradient-based methods.
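The idea of scoring inputs by their difference from a reference can be illustrated on a single ReLU neuron. The following is a minimal sketch, not the authors' implementation: the function name `deeplift_single_neuron`, the all-zeros reference, the fallback to the local gradient when the pre-activation difference is tiny, and the example weights are all assumptions chosen for illustration. Each input receives a score proportional to its weighted difference from the reference, scaled so that the scores sum to the change in the neuron's output.

```python
def relu(z):
    """Rectified linear activation."""
    return max(0.0, z)


def deeplift_single_neuron(x, x_ref, weights, bias, eps=1e-12):
    """Illustrative contribution scores for one neuron y = relu(w . x + b).

    Hypothetical helper for this sketch: returns (contributions, delta_y),
    where delta_y is the output minus the output at the reference input.
    """
    # Pre-activation at the actual input and at the reference input.
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    z_ref = sum(w * ri for w, ri in zip(weights, x_ref)) + bias

    # Difference from reference, before and after the nonlinearity.
    delta_z = z - z_ref
    delta_y = relu(z) - relu(z_ref)

    # Multiplier: ratio of output difference to pre-activation difference.
    # Assumption: use the local ReLU gradient when delta_z is numerically zero.
    if abs(delta_z) > eps:
        multiplier = delta_y / delta_z
    else:
        multiplier = 1.0 if z > 0 else 0.0

    # Each input's score is its weighted difference from reference,
    # scaled by the multiplier of the nonlinearity.
    contributions = [
        multiplier * w * (xi - ri) for w, xi, ri in zip(weights, x, x_ref)
    ]
    return contributions, delta_y


# Example values are made up for illustration.
x = [1.0, 2.0]
x_ref = [0.0, 0.0]          # assumed all-zeros reference
weights = [2.0, -0.5]
bias = -0.5

contributions, delta_y = deeplift_single_neuron(x, x_ref, weights, bias)
print(contributions, delta_y)
```

In this example the contribution scores add up to the difference between the neuron's output and its reference output. A plain gradient-times-input score at the same point would instead give `[2.0, -1.0]`, which does not sum to that difference because the reference input sits on the inactive side of the ReLU.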

Code & Models

Repositories

pytorch/captum
pytorch

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning

Methods: Interpretability