Normalization effects on deep neural networks

Jiahui Yu; Konstantinos Spiliopoulos

arXiv:2209.01018·cs.LG·September 5, 2022

TL;DR

This paper studies how the choice of layer normalization exponent, in particular the mean-field scaling, affects the statistical behavior (such as output variance) and MNIST test accuracy of deep feed-forward neural networks, and uses the analysis to guide hyperparameter choice.

Contribution

It gives a mathematical analysis of normalization effects, finds that mean-field scaling is the best choice in terms of both output variance and test accuracy, and uses the analysis to guide hyperparameter selection.

Findings

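01 Mean-field scaling (γ_i = 1) is the best choice in terms of both output variance and MNIST test accuracy.

02 The network's behavior is more sensitive to the scaling of the outer layer than to the scaling of the inner layers.

03 The mathematical analysis informs a systematic choice of hyperparameters.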

Abstract

We study the effect of normalization on the layers of deep neural networks of feed-forward type. A given layer $i$ with $N_{i}$ hidden units is allowed to be normalized by $1/N_{i}^{\gamma_{i}}$ with $\gamma_{i} \in [1/2, 1]$, and we study the effect of the choice of the $\gamma_{i}$ on the statistical behavior of the neural network's output (such as variance) as well as on the test accuracy on the MNIST data set. We find that in terms of variance of the neural network's output and test accuracy the best choice is to choose the $\gamma_{i}$'s to be equal to one, which is the mean-field scaling. We also find that this is particularly true for the outer layer, in that the neural network's behavior is more sensitive to the scaling of the outer layer than to the scaling of the inner layers. The mechanism for the mathematical analysis is an asymptotic expansion for the neural network's…
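The scaling described in the abstract can be illustrated with a small standard-library sketch. It is not the authors' code: the two-hidden-layer architecture, the `tanh` activation, the standard normal initialization, the fixed input, and the names `forward` and `init_output_variance` are all assumptions made for illustration. It only measures the variance of the output over random initializations, before any training, whereas the paper's results concern the trained network.

```python
import math
import random


def forward(x, w1, w2, c, gamma1, gamma2):
    """Two-hidden-layer net with layer i scaled by 1 / N_i**gamma_i (illustrative)."""
    n1, n2 = len(w1), len(c)
    # First hidden layer: N1 units, each applied to the raw input.
    h1 = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    # Second hidden layer: the sum over the N1 units is normalized by N1**gamma1.
    h2 = [
        math.tanh(sum(w * h for w, h in zip(row, h1)) / n1 ** gamma1)
        for row in w2
    ]
    # Output: the sum over the N2 units is normalized by N2**gamma2.
    return sum(ci * hi for ci, hi in zip(c, h2)) / n2 ** gamma2


def init_output_variance(gamma1, gamma2, n1=10, n2=100, trials=200, seed=0):
    """Sample variance of the output over random initializations (no training)."""
    rng = random.Random(seed)
    x = [1.0, -0.5]  # arbitrary fixed input, chosen for illustration
    outputs = []
    for _ in range(trials):
        # Assumed initialization: i.i.d. standard normal weights.
        w1 = [[rng.gauss(0, 1) for _ in x] for _ in range(n1)]
        w2 = [[rng.gauss(0, 1) for _ in range(n1)] for _ in range(n2)]
        c = [rng.gauss(0, 1) for _ in range(n2)]
        outputs.append(forward(x, w1, w2, c, gamma1, gamma2))
    mean = sum(outputs) / trials
    return sum((o - mean) ** 2 for o in outputs) / (trials - 1)


if __name__ == "__main__":
    for gamma2 in (0.5, 0.75, 1.0):
        v = init_output_variance(gamma1=1.0, gamma2=gamma2)
        print(f"gamma2={gamma2}: output variance at initialization = {v:.3e}")
```

Because the seed is shared, every call draws the same weights, so changing only `gamma2` rescales each output by a constant factor and the variance shrinks by `n2 ** (2 * (gamma2_new - gamma2_old))`. In this toy setting the outer-layer exponent therefore controls the spread of the output at initialization directly, with the mean-field choice `gamma2 = 1` giving the smallest variance in the range [1/2, 1].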

Code & Models

Repositories

kspiliopoulos/NENN_Deep (PyTorch, official)

Taxonomy

Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Image and Signal Denoising Methods

Methods: Test