Normalization effects on shallow neural networks and related asymptotic expansions
Jiahui Yu, Konstantinos Spiliopoulos

TL;DR
This paper analyzes how different normalization schemes in shallow neural networks affect their performance, providing asymptotic expansions and showing that bias and variance decrease with more hidden units, with empirical validation on standard datasets.
Contribution
It introduces a mathematical framework for understanding the impact of normalization on shallow networks, bridging the gap between different scaling regimes and providing asymptotic expansions.
Findings
Bias and variance both decrease as the number of hidden units increases.
Variance decreases as the normalization approaches the mean-field regime.
Test and train accuracy improve as the normalization approaches the mean-field regime.
Abstract
We consider shallow (single hidden layer) neural networks and characterize their performance when trained with stochastic gradient descent as the number of hidden units $N$ and gradient descent steps grow to infinity. In particular, we investigate the effect of different scaling schemes, which lead to different normalizations of the neural network, on the network's statistical output, closing the gap between the $1/\sqrt{N}$ and the mean-field $1/N$ normalization. We develop an asymptotic expansion for the neural network's statistical output pointwise with respect to the scaling parameter as the number of hidden units grows to infinity. Based on this expansion, we demonstrate mathematically that to leading order in $N$, there is no bias-variance trade-off, in that both bias and variance (both explicitly characterized) decrease as the number of hidden units increases and time grows. In…
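To make the role of the scaling parameter concrete, the following is a minimal sketch, not the paper's analysis. It assumes the network output has the form N^(-gamma) * sum_i c_i * sigma(w_i * x), with gamma = 1/2 giving the 1/sqrt(N) normalization and gamma = 1 the mean-field one. The tanh activation, the scalar input, the standard Gaussian initialization, and the function names are all illustrative assumptions. The sketch only estimates how much the output fluctuates at random initialization; it does not model training with stochastic gradient descent.

```python
import math
import random


def shallow_net_output(x, c, w, gamma):
    """Scaled single-hidden-layer output: N**(-gamma) * sum_i c_i * tanh(w_i * x).

    gamma = 0.5 is the 1/sqrt(N) normalization, gamma = 1.0 the mean-field one
    (assumed parameterization; tanh is an illustrative activation).
    """
    n_hidden = len(c)
    total = sum(ci * math.tanh(wi * x) for ci, wi in zip(c, w))
    return total / n_hidden ** gamma


def init_output_variance(n_hidden, gamma, x=1.0, n_trials=500, seed=0):
    """Monte Carlo estimate of the output variance over random initializations."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_trials):
        # Hypothetical initialization: i.i.d. standard Gaussian parameters.
        c = [rng.gauss(0.0, 1.0) for _ in range(n_hidden)]
        w = [rng.gauss(0.0, 1.0) for _ in range(n_hidden)]
        outputs.append(shallow_net_output(x, c, w, gamma))
    mean = sum(outputs) / n_trials
    # Unbiased sample variance across trials.
    return sum((o - mean) ** 2 for o in outputs) / (n_trials - 1)


if __name__ == "__main__":
    for gamma in (0.5, 0.75, 1.0):
        var = init_output_variance(n_hidden=100, gamma=gamma)
        print(f"gamma={gamma:.2f}  variance at initialization ~ {var:.5f}")
```

Under these assumptions the variance at initialization scales like N^(1 - 2*gamma), so it stays of order one for gamma = 1/2 and shrinks like 1/N for gamma = 1. Because the seed is fixed, the gamma = 1 estimate with 100 hidden units is 100 times smaller than the gamma = 1/2 estimate, up to floating-point rounding.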
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Model Reduction and Neural Networks · Neural Networks and Applications
