TL;DR
NeuroCartography introduces scalable methods to automatically discover, group, and visualize concepts encoded by neurons in deep neural networks, enabling better understanding of complex model representations.
Contribution
The paper presents NeuroCartography, a system with novel scalable clustering and embedding techniques for concept summarization in neural networks, capable of handling large datasets like ImageNet.
Findings
Neuron groups correspond to meaningful concepts
The summarization techniques run in time linear in the number of neurons
System enables surprising insights into neural representations
Abstract
Existing research on making sense of deep neural networks often focuses on neuron-level interpretation, which may not adequately capture the bigger picture of how concepts are collectively encoded by multiple neurons. We present NeuroCartography, an interactive system that scalably summarizes and visualizes concepts learned by neural networks. It automatically discovers and groups neurons that detect the same concepts, and describes how such neuron groups interact to form higher-level concepts and the subsequent predictions. NeuroCartography introduces two scalable summarization techniques: (1) neuron clustering groups neurons based on the semantic similarity of the concepts detected by neurons (e.g., neurons detecting "dog faces" of different breeds are grouped); and (2) neuron embedding encodes the associations between related concepts based on how often they co-occur (e.g., neurons…
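The two ideas named in the abstract, grouping neurons that respond to the same concept and recording how often neurons fire together, can be illustrated with a toy sketch. This is not the paper's implementation: the input formats, the function names `group_neurons` and `cooccurrence`, the Jaccard threshold, and the toy data are all assumptions made for illustration. The greedy comparison below is a naive stand-in and is not the linear-time technique the paper reports. The co-occurrence counts are only the raw statistic an embedding could later be fit to.

```python
from collections import Counter
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_neurons(top_inputs, threshold=0.5):
    """Greedily group neurons whose top-activating input sets overlap.

    top_inputs: dict mapping neuron id -> set of input ids
    (a hypothetical format, assumed for this sketch).
    """
    groups = []  # each entry: (representative input set, [neuron ids])
    for neuron in sorted(top_inputs):
        inputs = top_inputs[neuron]
        for representative, members in groups:
            if jaccard(representative, inputs) >= threshold:
                members.append(neuron)
                break
        else:
            # No existing group is similar enough, so start a new one.
            groups.append((inputs, [neuron]))
    return [members for _, members in groups]


def cooccurrence(active_per_image):
    """Count how often each pair of neurons is active on the same image."""
    counts = Counter()
    for active in active_per_image:
        for pair in combinations(sorted(active), 2):
            counts[pair] += 1
    return counts


# Toy data, invented for illustration only.
top_inputs = {
    "n1": {"img1", "img2", "img3"},
    "n2": {"img1", "img2", "img4"},
    "n3": {"img7", "img8", "img9"},
}
groups = group_neurons(top_inputs)
pairs = cooccurrence([{"n1", "n2"}, {"n1", "n2", "n3"}, {"n3"}])
```

In this toy run, `n1` and `n2` share two of four distinct inputs and land in the same group, while `n3` forms its own group; `pairs` records that `n1` and `n2` were active together on two images.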
