The Mean Dimension of Neural Networks -- What causes the interaction effects?
Roman Hahn, Christoph Feinauer, Emanuele Borgonovo

TL;DR
This paper introduces a method to estimate the mean dimension of neural networks directly from a given dataset, analyzes how interactions evolve across layers and activation functions, and applies the estimate to widely used convolutional architectures for image recognition (LeNet, ResNet, DenseNet).
Contribution
It proposes a novel estimation procedure for mean dimension that accounts for feature correlations and applies it to analyze neural network architectures and training dynamics.
Findings
Mean dimension varies across layers and architectures.
Activation functions influence interaction magnitudes.
Training affects the evolution of mean dimension.
Abstract
Owen and Hoyt recently showed that the effective dimension offers key structural information about the input-output mapping underlying an artificial neural network. Along this line of research, this work proposes an estimation procedure that allows the calculation of the mean dimension from a given dataset, without resampling from external distributions. The design yields total indices when features are independent and a variant of total indices when features are correlated. We show that this variant possesses the zero independence property. With synthetic datasets, we analyse how the mean dimension evolves layer by layer and how the activation function impacts the magnitude of interactions. We then use the mean dimension to study some of the most widely employed convolutional architectures for image recognition (LeNet, ResNet, DenseNet). To account for pixel correlations, we propose…
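For readers unfamiliar with the quantity, the mean dimension under independent inputs is the sum of the (unnormalized) total indices divided by the output variance. The sketch below is not the paper's dataset-based procedure: it uses the standard Jansen resampling estimator on independent uniform inputs, which is the setting the abstract contrasts itself with. The function name `mean_dimension` and its parameters are hypothetical, chosen only for illustration.

```python
import numpy as np

def mean_dimension(f, d, n=20000, seed=0):
    """Estimate the mean dimension of f on [0, 1]^d.

    Assumes independent uniform inputs and resamples one coordinate
    at a time (Jansen estimator). This is an illustrative baseline,
    not the estimation procedure proposed in the paper.
    """
    rng = np.random.default_rng(seed)
    x = rng.random((n, d))   # base sample
    z = rng.random((n, d))   # independent sample used for resampling
    fx = f(x)
    variance = fx.var()
    total = 0.0
    for j in range(d):
        xj = x.copy()
        xj[:, j] = z[:, j]   # replace only coordinate j
        # Jansen estimator of the unnormalized total index of input j
        total += 0.5 * np.mean((fx - f(xj)) ** 2)
    return total / variance

# An additive function has no interactions, so its mean dimension is 1.
additive = lambda x: x.sum(axis=1)
# A product of two inputs has an interaction, so its mean dimension exceeds 1.
product = lambda x: x[:, 0] * x[:, 1]

print(mean_dimension(additive, d=5))
print(mean_dimension(product, d=2))
```

For the product of two independent uniform inputs the exact value is 8/7, so the second estimate should land slightly above 1. When features are correlated, this one-at-a-time resampling is no longer appropriate, which is the case the paper's variant of total indices is designed to handle.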
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Neural Networks and Applications
Methods: Average Pooling · Max Pooling · Convolution · Residual Connection · Global Average Pooling · Batch Normalization · 1x1 Convolution · Kaiming Initialization · Principal Components Analysis
