Comparison of Batch Normalization and Weight Normalization Algorithms for the Large-scale Image Classification
Igor Gitman, Boris Ginsburg

TL;DR
This paper compares batch normalization and weight normalization algorithms for large-scale image classification, revealing that BN outperforms WN in final accuracy and stability on ImageNet with ResNet-50.
Contribution
The study provides a detailed empirical comparison of BN and WN algorithms on large-scale datasets, highlighting BN's superior regularization effects and stability.
Findings
BN achieves higher final test accuracy than WN.
WN algorithms are less stable and less effective for large-scale training.
BN's regularization effect is crucial for optimal performance.
Abstract
Batch normalization (BN) has become a de facto standard for training deep convolutional networks. However, BN accounts for a significant fraction of training run-time and is difficult to accelerate, since it is a memory-bandwidth bounded operation. Such a drawback of BN motivates us to explore recently proposed weight normalization algorithms (WN algorithms), i.e. weight normalization, normalization propagation and weight normalization with translated ReLU. These algorithms don't slow-down training iterations and were experimentally shown to outperform BN on relatively small networks and datasets. However, it is not clear if these algorithms could replace BN in practical, large-scale applications. We answer this question by providing a detailed comparison of BN and WN algorithms using ResNet-50 network trained on ImageNet. We found that although WN achieves better training accuracy, the…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Neural Network Applications · Industrial Vision Systems and Defect Detection · Advanced Image and Video Retrieval Techniques
Methods*Communicated@Fast*How Do I Communicate to Expedia? · Weight Normalization · Dropout
