Batch normalization does not improve initialization
Joris Dannemann, Gero Junike

TL;DR
This paper challenges the common belief that batch normalization enhances neural network initialization, providing a counterexample that demonstrates it does not improve initialization as previously claimed.
Contribution
The paper presents a counterexample to disprove the claim that batch normalization improves neural network initialization.
Findings
Counterexample shows batch normalization does not improve initialization
Challenges prior theoretical claims about batch normalization's role in initialization
Highlights the need to reconsider the theoretical understanding of batch normalization
Abstract
Batch normalization is one of the most important regularization techniques for neural networks, significantly improving training by centering the layers of the neural network. There have been several attempts to provide a theoretical justification for batch normalization. Santurkar and Tsipras (2018) [How does batch normalization help optimization? Advances in neural information processing systems, 31] claim that batch normalization improves initialization. We provide a counterexample showing that this claim is not true, i.e., batch normalization does not improve initialization.
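For readers unfamiliar with the operation under discussion, the following is a minimal sketch of the standard batch normalization transform, which centers and rescales each feature across a batch. It is not the paper's counterexample. The function name `batch_norm`, the `eps` value, and the sample activations are illustrative assumptions, and the learnable scale and shift parameters used in practice are omitted.

```python
import math

def batch_norm(batch, eps=1e-5):
    """Standardize each feature across the batch (hypothetical helper;
    learnable scale and shift parameters are omitted)."""
    n = len(batch)
    dim = len(batch[0])
    # Per-feature mean over the batch
    means = [sum(row[j] for row in batch) / n for j in range(dim)]
    # Per-feature (biased) variance over the batch
    variances = [
        sum((row[j] - means[j]) ** 2 for row in batch) / n for j in range(dim)
    ]
    # Center each feature and divide by its standard deviation;
    # eps guards against division by zero
    return [
        [(row[j] - means[j]) / math.sqrt(variances[j] + eps) for j in range(dim)]
        for row in batch
    ]

# Made-up activations: 3 samples, 2 features on very different scales
activations = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = batch_norm(activations)
```

After the transform, each feature column of `normalized` has mean approximately zero and variance approximately one, regardless of the original scale.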
Taxonomy
Topics: Fault Detection and Control Systems · VLSI and Analog Circuit Testing · Optimal Experimental Design Methods
Methods: Batch Normalization
