Neuron-based explanations of neural networks sacrifice completeness and interpretability

Nolan Dey, Eric Taylor, Alexander Wong, Bryan Tripp, and Graham W. Taylor

arXiv:2011.03043 · cs.LG · March 20, 2025

Open Access

TL;DR

This paper demonstrates that neuron-based explanations for AlexNet are less complete and interpretable than those based on principal components, due to the distributed nature of neural representations.

Contribution

It provides evidence that principal component-based explanations outperform neuron-based ones in completeness and interpretability for AlexNet.

Findings

01 Principal components offer more complete explanations than neurons.

02 High-variance principal components are more interpretable.

03 Neuron-based explanations sacrifice interpretability and completeness.

Abstract

High-quality explanations of neural networks (NNs) should exhibit two key properties. Completeness ensures that they accurately reflect a network's function, and interpretability makes them understandable to humans. Many existing methods provide explanations of individual neurons within a network. In this work, we provide evidence that for AlexNet pretrained on ImageNet, neuron-based explanation methods sacrifice both completeness and interpretability compared to activation principal components. Neurons are a poor basis for AlexNet embeddings because they don't account for the distributed nature of these representations. By examining two quantitative measures of completeness and conducting a user study to measure interpretability, we show the most important principal components provide more complete and interpretable explanations than the most important neurons. Much of the activation…
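To make the "distributed representation" argument concrete, the following is a minimal sketch on synthetic data, not the paper's experiment. It assumes a simple proxy for completeness (the fraction of activation variance retained by the top-k directions), which is not necessarily either of the two quantitative measures the abstract refers to. The function names `variance_captured` and `make_distributed_activations`, the matrix sizes, and the noise level are all hypothetical choices made for illustration; no AlexNet activations are used.

```python
import numpy as np


def variance_captured(acts, k):
    """Return (pc_fraction, neuron_fraction): the share of total activation
    variance kept by the top-k principal components and by the k
    highest-variance individual neurons (columns of `acts`)."""
    centered = acts - acts.mean(axis=0, keepdims=True)
    total = (centered ** 2).sum()

    # Squared singular values of the centered matrix are proportional to
    # the variance along each principal component, sorted descending.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    pc_fraction = (singular_values[:k] ** 2).sum() / total

    # Per-neuron variance (same scaling), keeping the k largest.
    per_neuron = (centered ** 2).sum(axis=0)
    neuron_fraction = np.sort(per_neuron)[::-1][:k].sum() / total

    return float(pc_fraction), float(neuron_fraction)


def make_distributed_activations(n_samples=500, n_neurons=64, n_factors=5, seed=0):
    """Synthetic stand-in for a layer's activations (an assumption, not
    AlexNet): a few latent factors mixed densely across many neurons."""
    rng = np.random.default_rng(seed)
    latents = rng.normal(size=(n_samples, n_factors))
    mixing = rng.normal(size=(n_factors, n_neurons))
    noise = 0.1 * rng.normal(size=(n_samples, n_neurons))
    return latents @ mixing + noise


acts = make_distributed_activations()
pc_frac, neuron_frac = variance_captured(acts, k=5)
print(f"top-5 PCs keep {pc_frac:.3f} of the variance; top-5 neurons keep {neuron_frac:.3f}")
```

In this toy setting each latent factor is spread over every neuron, so a handful of principal components recovers nearly all of the variance while the same number of single neurons recovers only a small share. Whether real AlexNet layers behave this way is the empirical question the paper addresses.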

Taxonomy

Topics: Neural dynamics and brain function · Neural Networks and Applications · EEG and Brain-Computer Interfaces