Normalization effects on deep neural networks
Jiahui Yu, Konstantinos Spiliopoulos

TL;DR
This paper investigates how different normalization schemes, especially mean-field scaling, affect the statistical behavior and test accuracy of deep neural networks, providing insights for hyperparameter tuning.
Contribution
It offers a mathematical analysis of how layer normalization affects a network's output, shows that mean-field scaling gives the best results in terms of output variance and test accuracy on MNIST, and uses that analysis to guide hyperparameter selection.
Findings
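Mean-field scaling (γ=1) is the best choice in terms of both the variance of the network's output and test accuracy.
The network's behavior is more sensitive to the scaling of the outer layer than to the scaling of the inner layers.
The mathematical analysis gives a systematic way to choose hyperparameters.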
Abstract
We study the effect of normalization on the layers of deep neural networks of feed-forward type. A given layer i with N_i hidden units is allowed to be normalized by 1/N_i^{γ_i} with γ_i ∈ [1/2, 1], and we study the effect of the choice of the γ_i on the statistical behavior of the neural network's output (such as variance) as well as on the test accuracy on the MNIST data set. We find that in terms of variance of the neural network's output and test accuracy the best choice is to choose the γ_i's to be equal to one, which is the mean-field scaling. We also find that this is particularly true for the outer layer, in that the neural network's behavior is more sensitive to the scaling of the outer layer than to the scaling of the inner layers. The mechanism for the mathematical analysis is an asymptotic expansion for the neural network's…
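As a rough illustration of why the scaling exponent matters for output variance, the sketch below simulates a single-hidden-layer network g(x) = N^(-γ) · Σ c_i · tanh(w_i · x) at random initialization and estimates the variance of g(x) over many draws. Everything here is an assumption made for illustration and is not taken from the paper: the scalar input, the tanh activation, the standard normal weights, and the function names `scaled_output` and `output_variance`. It covers only the variance at initialization for one hidden layer, not the trained deep networks the paper analyzes. With mean-zero independent weights the variance scales like N^(1-2γ), so it stays of order one for γ=1/2 and shrinks like 1/N for γ=1.

```python
import math
import random
import statistics


def scaled_output(x, n_hidden, gamma, rng):
    """One random draw of g(x) = N^(-gamma) * sum_i c_i * tanh(w_i * x)."""
    total = 0.0
    for _ in range(n_hidden):
        c = rng.gauss(0.0, 1.0)  # outer weight, standard normal (assumed)
        w = rng.gauss(0.0, 1.0)  # inner weight, standard normal (assumed)
        total += c * math.tanh(w * x)
    return total / n_hidden ** gamma


def output_variance(n_hidden, gamma, trials=2000, x=1.0, seed=0):
    """Empirical variance of the output over independent initializations."""
    rng = random.Random(seed)
    samples = [scaled_output(x, n_hidden, gamma, rng) for _ in range(trials)]
    return statistics.pvariance(samples)


if __name__ == "__main__":
    # Compare the exponent 1/2, an intermediate value, and mean-field (1).
    for gamma in (0.5, 0.75, 1.0):
        variances = [output_variance(n, gamma) for n in (10, 100)]
        print(f"gamma={gamma}: variance at N=10, N=100 -> {variances}")
```

Running the script prints one line per exponent; the gap between the N=10 and N=100 columns widens as γ moves from 1/2 toward 1.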
Taxonomy
Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Image and Signal Denoising Methods
