Influence-Directed Explanations for Deep Convolutional Networks

Klas Leino; Shayak Sen; Anupam Datta; Matt Fredrikson; Linyi Li

arXiv:1802.03788 · cs.LG · November 14, 2018 · 6 cites

TL;DR

This paper introduces influence-directed explanations for deep neural networks, which identify highly influential neurons and interpret the concepts they represent, giving insight into the decision-making of convolutional networks trained on ImageNet.

Contribution

It proposes a new influence measure and a method to interpret neurons, revealing influential concepts and decision features in convolutional neural networks.

Findings

01. Identifies influential concepts that generalize across instances

02. Extracts the core learned features of classes

03. Isolates features used for decision-making and class distinction

Abstract

We study the problem of explaining a rich class of behavioral properties of deep neural networks. Distinctively, our influence-directed explanations approach this problem by peering inside the network to identify neurons with high influence on a quantity and distribution of interest, using an axiomatically-justified influence measure, and then providing an interpretation for the concepts these neurons represent. We evaluate our approach by demonstrating a number of its unique capabilities on convolutional neural networks trained on ImageNet. Our evaluation demonstrates that influence-directed explanations (1) identify influential concepts that generalize across instances, (2) can be used to extract the "essence" of what the network learned about a class, and (3) isolate individual features the network uses to make decisions and distinguish related classes.
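
The abstract does not spell out the influence measure, so the following is only a minimal sketch under a stated assumption: that a neuron's influence can be read as the average gradient of the quantity of interest with respect to that neuron's activation, taken over inputs drawn from the distribution of interest. The function name `distributional_influence`, the quantity `g`, and the toy activations are hypothetical and are not taken from the authors' code; finite differences stand in for the automatic differentiation a real framework would provide.

```python
import numpy as np

def distributional_influence(g, Z, eps=1e-5):
    """Estimate each neuron's influence as the mean of dg/dz_j over samples.

    g   : hypothetical quantity of interest; maps activations (n, d) -> (n,)
    Z   : internal-layer activations for n inputs drawn from the
          distribution of interest, shape (n, d)
    eps : step size for central finite differences (stand-in for autodiff)
    """
    Z = np.asarray(Z, dtype=float)
    grads = np.empty_like(Z)
    for j in range(Z.shape[1]):
        step = np.zeros(Z.shape[1])
        step[j] = eps
        # Central difference approximates the partial derivative w.r.t. neuron j.
        grads[:, j] = (g(Z + step) - g(Z - step)) / (2.0 * eps)
    # Averaging over rows approximates the expectation over the distribution.
    return grads.mean(axis=0)

# Toy check: g(z) = z_0^2 + 3 z_1, so dg/dz_0 = 2 z_0 and dg/dz_1 = 3.
Z = np.array([[1.0, 0.0], [3.0, 0.0]])
g = lambda A: A[:, 0] ** 2 + 3.0 * A[:, 1]
influence = distributional_influence(g, Z)   # approximately [4.0, 3.0]
ranking = np.argsort(-np.abs(influence))     # most influential neuron first
```

In this reading, choosing `g` as a single class score versus a difference between two class scores would change which neurons rank highest, which is one way to picture how the approach might separate features that define a class from features that distinguish related classes.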

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification