On the limits of neural network explainability via descrambling
Shashank Sule, Richard G. Spencer, Wojciech Czaja

TL;DR
This paper characterizes the exact solutions to neural network descrambling, showing that the principal components of a hidden layer's preactivations are the optimal descramblers for that layer's weights.
Contribution
It provides a formal characterization of neural network descrambling solutions using the Brockett function, connecting explainability to eigendecomposition and principal components.
Findings
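Optimal descramblers match the largest principal components with the lowest-frequency Fourier modes for isotropic hidden data, recover the semantic development in two-layer linear NNs, and act as optimal neuron permutations for CNNs.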
Eigendecomposition reveals underlying transformations of hidden layers.
SVD relates closely to neural network explainability.
Abstract
We characterize the exact solutions to neural network descrambling, a mathematical model for explaining the fully connected layers of trained neural networks (NNs). By reformulating the problem as the minimization of the Brockett function arising in graph matching and complexity theory, we show that the principal components of the hidden layer preactivations can be characterized as the optimal explainers or descramblers for the layer weights, leading to descrambled weight matrices. We show that in typical deep learning contexts these descramblers take diverse and interesting forms including (1) matching largest principal components with the lowest frequency modes of the Fourier basis for isotropic hidden data, (2) discovering the semantic development in two-layer linear NNs for signal recovery problems, and (3) explaining CNNs by optimally permuting the neurons. Our numerical experiments…
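The link between the Brockett function and principal components can be illustrated with a small numerical sketch. It assumes, as an illustration rather than the paper's exact setup, that descrambling minimizes a smoothness criterion ||D Q Y||_F^2 over orthogonal Q, where Y holds the hidden layer preactivations and D is a cyclic first-difference operator. Under that assumption the objective equals the Brockett function tr(A Q B Q^T) with A = D^T D and B = Y Y^T, and a standard result gives the minimizer by pairing the eigenvectors of A and B in opposite eigenvalue order. The names `brockett`, `brockett_minimizer`, `D`, `Y`, `A`, and `B` are hypothetical and chosen only for this example.

```python
import numpy as np


def brockett(A, B, Q):
    """Brockett function tr(A Q B Q^T) for symmetric A, B and orthogonal Q."""
    return float(np.trace(A @ Q @ B @ Q.T))


def brockett_minimizer(A, B):
    """Return an orthogonal Q minimizing tr(A Q B Q^T) and the minimum value.

    The minimizer pairs the smallest eigenvalues of A with the largest
    eigenvalues of B (opposite ordering of the two spectra).
    """
    a, U = np.linalg.eigh(A)  # eigenvalues in ascending order
    b, V = np.linalg.eigh(B)  # eigenvalues in ascending order
    Q = U @ V[:, ::-1].T      # map top eigenvectors of B onto bottom ones of A
    return Q, float(np.sum(a * b[::-1]))


# Illustrative data (assumed, not taken from the paper).
rng = np.random.default_rng(0)
n, m = 6, 40
D = np.eye(n) - np.roll(np.eye(n), 1, axis=1)  # cyclic first difference
Y = rng.standard_normal((n, m))                # stand-in for preactivations
A = D.T @ D                                    # circulant; Fourier eigenvectors
B = Y @ Y.T                                    # eigenvectors = principal components

Q, min_value = brockett_minimizer(A, B)
print("smoothness at minimizer:", np.linalg.norm(D @ Q @ Y) ** 2)
print("predicted minimum:      ", min_value)
```

Because A is circulant here, its smallest-eigenvalue eigenvectors are the lowest-frequency Fourier modes, so this Q sends the largest principal components of Y onto those modes, which mirrors form (1) in the abstract under the stated assumptions.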
Taxonomy
Topics: Neural Networks and Applications · Statistical Mechanics and Entropy · Model Reduction and Neural Networks
