WeightAlign: Normalizing Activations by Weight Alignment
Xiangwei Shi, Yunqiang Li, Xin Liu, Jan van Gemert

TL;DR
WeightAlign normalizes the weights within each filter rather than the activations themselves. Activations are therefore normalized without any sample statistics, and training stays stable across a wide range of batch sizes.
Contribution
The paper proposes WeightAlign, a weight-based normalization method that is independent of batch size and can be combined with existing activation normalization techniques.
Findings
WeightAlign improves training stability across small and large batch sizes.
The method enhances performance on classification, segmentation, and domain adaptation tasks.
Experimental results show consistent gains over traditional normalization methods.
Abstract
Batch normalization (BN) allows training very deep networks by normalizing activations by mini-batch sample statistics, which renders BN unstable for small batch sizes. Current small-batch solutions such as Instance Norm, Layer Norm, and Group Norm use channel statistics, which can be computed even for a single sample. Such methods are less stable than BN as they critically depend on the statistics of a single input sample. To address this problem, we propose a normalization of activations without sample statistics. We present WeightAlign: a method that normalizes the weights by the mean and scaled standard deviation computed within a filter, which normalizes activations without computing any sample statistics. Our proposed method is independent of batch size and stable over a wide range of batch sizes. Because weight statistics are orthogonal to sample statistics, we can directly combine…
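The per-filter weight normalization described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact scaling of the standard deviation, so the factor used here (dividing by sqrt(fan_in * variance)) is an assumption chosen for illustration, and the names `weight_align`, `gain`, and `eps` are hypothetical. The sketch uses only the Python standard library and treats each filter as a flat list of weights.

```python
import math
import random
from statistics import fmean, pvariance


def weight_align(filters, gain=1.0, eps=1e-8):
    """Normalize each flattened filter by its own mean and a scaled std.

    filters: list of filters, each a flat list of fan_in weights.
    Assumption (not from the abstract): the scale is sqrt(fan_in * var),
    so each aligned filter has zero mean and variance gain**2 / fan_in.
    """
    aligned = []
    for w in filters:
        fan_in = len(w)
        mean = fmean(w)                      # mean within this filter
        var = pvariance(w, mu=mean)          # population variance within this filter
        scale = math.sqrt(fan_in * var + eps)
        aligned.append([gain * (x - mean) / scale for x in w])
    return aligned


random.seed(0)
# Two hypothetical 3x3 filters over 4 input channels, flattened to 36 weights each.
filters = [[random.gauss(0.5, 2.0) for _ in range(36)] for _ in range(2)]
aligned = weight_align(filters)
```

The function never sees an input sample or a batch dimension: the statistics come from the weights alone, which is why a normalization of this kind does not depend on batch size.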
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Neural Network Applications
