Cause and Effect: Hierarchical Concept-based Explanation of Neural Networks

Mohammad Nokhbeh Zaeem; Majid Komeili

arXiv:2105.07033·cs.LG·November 9, 2021·1 cites

Open Access

TL;DR

This paper introduces a framework for interpreting neural networks by analyzing the causal relationships between high-level concepts and output classes, and constructing concept hierarchies to understand internal interactions.

Contribution

It proposes four measures for quantifying causality between concepts and classes, and a method to build concept-based decision trees for hierarchical explanations.

Findings

01 Effectively identifies causal relationships between concepts and network outputs

02 Constructs concept hierarchies revealing internal concept interactions

03 Demonstrates improved interpretability of neural networks

Abstract

In many scenarios, human decisions are explained in terms of high-level concepts. In this work, we take a step toward the interpretability of neural networks by examining their internal representations, or neurons' activations, against concepts. A concept is characterized by a set of samples that have specific features in common. We propose a framework to check for the existence of a causal relationship between a concept (or its negation) and task classes. While previous methods focus on the importance of a concept to a task class, we go further and introduce four measures to quantitatively determine the order of causality. Moreover, we propose a method for constructing a hierarchy of concepts in the form of a concept-based decision tree, which can shed light on how various concepts interact inside a neural network when predicting output classes. Through experiments, we demonstrate the…
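
To make the idea of a concept-based decision tree concrete, here is a minimal Python sketch. It is not the authors' construction method, which this page does not describe; it swaps in a generic greedy, information-gain (ID3-style) tree induction. The concept names, the toy samples, and the helpers `entropy` and `build_tree` are all hypothetical, and the sketch assumes that a boolean "concept present" indicator per sample and the network's predicted class are already available.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def build_tree(rows, labels, concepts):
    """Greedy ID3-style tree over binary concept indicators (illustrative only).

    rows:     list of dicts mapping concept name -> bool (hypothetical detections)
    labels:   class predicted by the network for each row
    concepts: concept names still available for splitting
    """
    majority = Counter(labels).most_common(1)[0][0]
    # Leaf: the node is pure or there is nothing left to split on.
    if len(set(labels)) == 1 or not concepts:
        return majority

    def gain(concept):
        present = [y for r, y in zip(rows, labels) if r[concept]]
        absent = [y for r, y in zip(rows, labels) if not r[concept]]
        if not present or not absent:
            return 0.0
        weighted = (len(present) * entropy(present)
                    + len(absent) * entropy(absent)) / len(labels)
        return entropy(labels) - weighted

    best = max(concepts, key=gain)
    if gain(best) == 0.0:
        return majority  # no concept separates the classes any further

    rest = [c for c in concepts if c != best]
    branches = {}
    for value in (True, False):
        subset = [(r, y) for r, y in zip(rows, labels) if r[best] == value]
        branches[value] = build_tree([r for r, _ in subset],
                                     [y for _, y in subset], rest)
    return {"concept": best, "present": branches[True], "absent": branches[False]}


# Toy data: hypothetical concept detections and predicted classes.
rows = [
    {"striped": True, "four_legs": True},
    {"striped": True, "four_legs": True},
    {"striped": False, "four_legs": True},
    {"striped": False, "four_legs": False},
]
labels = ["zebra", "zebra", "horse", "bird"]
tree = build_tree(rows, labels, ["striped", "four_legs"])
print(tree)
```

On this toy data the root splits on the hypothetical `striped` concept and the "absent" branch then splits on `four_legs`, giving a small hierarchy in which each internal node is a concept and each leaf is a predicted class.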

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification