NeuroView: Explainable Deep Network Decision Making
CJ Barberan, Randall Balestriero, Richard G. Baraniuk

TL;DR
NeuroView introduces an interpretable neural network architecture that links unit states directly to classification decisions, enhancing understanding of deep network decision-making.
Contribution
It proposes a neural network design that is explainable by construction: unit outputs are vector quantized and fed into a global linear classifier, so each unit's state is linked directly to the decision.
Findings
NeuroView achieves accuracy comparable to that of standard DNs.
The architecture provides clear interpretability of unit contributions.
Validation on standard datasets demonstrates effective decision explanation.
Abstract
Deep neural networks (DNs) provide superhuman performance in numerous computer vision tasks, yet it remains unclear exactly which of a DN's units contribute to a particular decision. NeuroView is a new family of DN architectures that are interpretable/explainable by design. Each member of the family is derived from a standard DN architecture by vector quantizing the unit output values and feeding them into a global linear classifier. The resulting architecture establishes a direct, causal link between the state of each unit and the classification decision. We validate NeuroView on standard datasets and classification tasks to show how its unit/class mapping aids in understanding the decision-making process.
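The construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a toy two-layer ReLU backbone, a 1-bit quantizer that marks a unit as active when its output is above zero, and hand-picked classifier weights. All function names (`relu_layer`, `quantize`, `neuroview_forward`, `unit_contributions`) and all numbers are hypothetical and chosen only for illustration.

```python
def relu_layer(x, weights):
    """One dense layer of ReLU units; `weights` holds one row per unit."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


def quantize(units, threshold=0.0):
    """Assumed 1-bit quantizer: 1 if the unit output exceeds the threshold, else 0."""
    return [1 if u > threshold else 0 for u in units]


def neuroview_forward(x, layers, clf_weights, clf_bias):
    """Run the toy backbone, quantize every unit, then apply one global linear classifier."""
    codes = []
    h = x
    for weights in layers:
        h = relu_layer(h, weights)
        codes.extend(quantize(h))  # the classifier sees units from all layers
    logits = [
        b + sum(w * c for w, c in zip(row, codes))
        for row, b in zip(clf_weights, clf_bias)
    ]
    return codes, logits


def unit_contributions(codes, clf_weights, class_index):
    """Per-unit additive contribution to one class logit (weight times quantized state)."""
    return [w * c for w, c in zip(clf_weights[class_index], codes)]


# Hypothetical toy network: 2 inputs, 3 units in layer 1, 2 units in layer 2.
layers = [
    [[1.0, -1.0], [-1.0, 1.0], [0.5, 0.5]],
    [[1.0, 0.0, -1.0], [0.0, 1.0, 1.0]],
]
# One classifier row per class, one weight per unit across all layers.
clf_weights = [
    [0.5, -1.0, 0.25, 2.0, -0.5],
    [-0.5, 1.0, 0.75, 0.0, 1.0],
]
clf_bias = [0.0, 0.1]

codes, logits = neuroview_forward([2.0, 1.0], layers, clf_weights, clf_bias)
predicted = logits.index(max(logits))
contribs = unit_contributions(codes, clf_weights, predicted)
print("unit states:", codes)
print("predicted class:", predicted)
print("per-unit contributions:", contribs)
```

Because the classifier is linear in the quantized unit states, each class logit is the bias plus the sum of the per-unit contributions, which is what lets a single weight be read as one unit's vote for one class in this sketch.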
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
