On the limits of neural network explainability via descrambling

Shashank Sule; Richard G. Spencer; Wojciech Czaja

arXiv:2301.07820 · cs.LG · September 4, 2024 · 1 citation

TL;DR

This paper characterizes the mathematical solutions for neural network descrambling, revealing how principal components and eigendecompositions can explain hidden layer transformations and improve interpretability.

Contribution

It provides a formal characterization of neural network descrambling solutions using the Brockett function, connecting explainability to eigendecomposition and principal components.

Findings

1. Descramblers include Fourier basis modes and semantic features.
2. Eigendecomposition reveals underlying transformations of hidden layers.
3. SVD relates closely to neural network explainability.

Abstract

We characterize the exact solutions to neural network descrambling--a mathematical model for explaining the fully connected layers of trained neural networks (NNs). By reformulating the problem to the minimization of the Brockett function arising in graph matching and complexity theory we show that the principal components of the hidden layer preactivations can be characterized as the optimal explainers or descramblers for the layer weights, leading to descrambled weight matrices. We show that in typical deep learning contexts these descramblers take diverse and interesting forms including (1) matching largest principal components with the lowest frequency modes of the Fourier basis for isotropic hidden data, (2) discovering the semantic development in two-layer linear NNs for signal recovery problems, and (3) explaining CNNs by optimally permuting the neurons. Our numerical experiments…
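The link between the Brockett function and principal components can be illustrated with a small numerical sketch. It assumes, as an illustration only and not as the paper's exact formulation, that the descrambling objective has the form ||D P Y||_F^2 over orthogonal P, where Y holds the hidden-layer preactivations and D is a hypothetical first-difference (smoothness) operator. Under that assumption the objective equals the Brockett function tr(P^T A P B) with A = D^T D and B = Y Y^T, and a minimizer pairs the eigenvectors of A and B in opposite eigenvalue order. The names `brockett`, `brockett_minimizer`, `D`, and `Y` are introduced here for illustration, and `Y` is random stand-in data.

```python
import numpy as np

def brockett(Q, A, B):
    """Brockett function tr(Q^T A Q B) for symmetric A and B."""
    return float(np.trace(Q.T @ A @ Q @ B))

def brockett_minimizer(A, B):
    """Orthogonal Q minimizing tr(Q^T A Q B) for symmetric A and B.

    np.linalg.eigh returns eigenvalues in ascending order, so reversing
    the columns of U pairs the largest eigenvalue of A with the smallest
    eigenvalue of B, the next largest with the next smallest, and so on.
    """
    _, U = np.linalg.eigh(A)
    _, V = np.linalg.eigh(B)
    return U[:, ::-1] @ V.T

rng = np.random.default_rng(0)
n, m = 6, 50

# Stand-in for hidden-layer preactivations (random data, an assumption).
Y = rng.standard_normal((n, m))

# Hypothetical smoothness operator: first differences along the neuron axis.
D = np.eye(n) - np.eye(n, k=1)

A = D.T @ D   # symmetric matrix built from the smoothness operator
B = Y @ Y.T   # unnormalized second-moment matrix of the preactivations

P = brockett_minimizer(A, B)

# The Brockett value coincides with the smoothness objective ||D P Y||_F^2.
value = brockett(P, A, B)
smoothness = float(np.linalg.norm(D @ P @ Y) ** 2)
print(value, smoothness)
```

In this sketch the rows of `P @ Y` are the preactivations expressed in the principal directions of `Y`, then rotated into the eigenbasis of `D.T @ D`, so the largest principal component lands on the smoothest mode of the operator. That mirrors, in a toy setting, the pairing of large principal components with low-frequency modes described in the abstract.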

Code & Models

Repositories

shashanksule/esvd (official)

Taxonomy

Topics: Neural Networks and Applications · Statistical Mechanics and Entropy · Model Reduction and Neural Networks