Stochastic Whitening Batch Normalization
Shengdong Zhang, Ehsan Nezhadarya, Homa Fashandi, Jiayi Liu, Darin Graham, Mohak Shah

TL;DR
This paper introduces Stochastic Whitening Batch Normalization (SWBN), an online method for whitening activations in deep neural networks that improves convergence and generalization at lower computational overhead than Iterative Normalization (IterNorm).
Contribution
The paper proposes SWBN, an online whitening technique that estimates the whitening matrix gradually across training steps rather than recomputing it independently at each step as IterNorm does, improving convergence and generalization at lower computational overhead.
Findings
SWBN accelerates convergence of DNNs.
SWBN improves generalization in image classification.
SWBN has lower computational overhead than IterNorm.
Abstract
Batch Normalization (BN) is a popular technique for training Deep Neural Networks (DNNs). BN uses scaling and shifting to normalize activations of mini-batches to accelerate convergence and improve generalization. The recently proposed Iterative Normalization (IterNorm) method improves these properties by whitening the activations iteratively using Newton's method. However, since Newton's method initializes the whitening matrix independently at each training step, no information is shared between consecutive steps. In this work, instead of computing the whitening matrix exactly at each time step, we estimate it gradually during training in an online fashion, using our proposed Stochastic Whitening Batch Normalization (SWBN) algorithm. We show that while SWBN improves the convergence rate and generalization of DNNs, its computational overhead is less than that of IterNorm. Due to the high…
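To make the contrast in the abstract concrete, here is a minimal NumPy sketch of the general idea: per-feature standardization (as in BN) followed by a whitening matrix that is carried across mini-batches and nudged a little at each step. This is an illustration under stated assumptions, not the paper's algorithm. The update rule W <- W - lr * (cov(z) - I) W is a generic online decorrelation rule chosen for the example, and the function names, learning rate, and synthetic data are all hypothetical.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Standardize each feature (column) of a mini-batch x with shape (n, d)."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mean) / np.sqrt(var + eps)

def online_whitening_step(w, x, lr=0.05):
    """One stochastic update of a whitening matrix w (d, d) from a mini-batch x (n, d).

    Assumed update rule (illustrative, not taken from the paper):
        w <- w - lr * (cov(z) - I) @ w,  where z = standardized(x) @ w.T
    The matrix w is passed in and returned, so information is shared across steps.
    """
    xs = batch_norm(x)
    z = xs @ w.T                        # activations whitened with the current estimate
    cov_z = z.T @ z / x.shape[0]        # mini-batch covariance of the whitened activations
    w = w - lr * (cov_z - np.eye(w.shape[0])) @ w
    return w, z

def run_demo(steps=500, batch=128, d=4, seed=0):
    """Train w on correlated synthetic data, then report the covariance it produces."""
    rng = np.random.default_rng(seed)
    mix = np.eye(d) + 0.5 * np.ones((d, d))   # hypothetical mixing matrix to correlate features
    w = np.eye(d)                             # start from the identity (plain BN)
    for _ in range(steps):
        x = rng.normal(size=(batch, d)) @ mix.T
        w, _ = online_whitening_step(w, x)
    # Evaluate on a large fresh batch using the final estimate.
    x_eval = rng.normal(size=(20000, d)) @ mix.T
    z_eval = batch_norm(x_eval) @ w.T
    return w, np.cov(z_eval, rowvar=False)

if __name__ == "__main__":
    w, cov = run_demo()
    print(np.round(cov, 2))   # should be close to the identity matrix
```

Because w persists between calls, each step only has to correct the current estimate instead of solving for a whitening matrix from scratch, which is the property the abstract contrasts with IterNorm's per-step Newton iterations.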
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification
Methods: Batch Normalization
