Group Whitening: Balancing Learning Efficiency and Representational Capacity
Lei Huang, Yi Zhou, Li Liu, Fan Zhu, Ling Shao

TL;DR
This paper introduces group whitening (GW), which keeps the learning-efficiency benefits of whitening while avoiding the drawbacks of normalizing within mini-batches, and reports gains on the ImageNet and COCO benchmarks.
Contribution
The paper proposes group whitening (GW), a method that balances learning efficiency and representational capacity, supported by a theoretical analysis of the constraints that normalization imposes on features and by experiments on ImageNet and COCO.
Findings
GW improves top-1 accuracy on ImageNet by up to 1.49%.
GW enhances bounding box AP on COCO by up to 3.21%.
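Theoretical analysis shows how the batch size (for batch normalization) or the group number (for group normalization) affects a normalized network's representational capacity and, through it, performance.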
Abstract
Batch normalization (BN) is an important technique commonly incorporated into deep learning models to perform standardization within mini-batches. The merits of BN in improving a model's learning efficiency can be further amplified by applying whitening, while its drawbacks in estimating population statistics for inference can be avoided through group normalization (GN). This paper proposes group whitening (GW), which exploits the advantages of the whitening operation and avoids the disadvantages of normalization within mini-batches. In addition, we analyze the constraints imposed on features by normalization, and show how the batch size (group number) affects the performance of batch (group) normalized networks, from the perspective of the model's representational capacity. This analysis provides theoretical guidance for applying GW in practice. Finally, we apply the proposed GW to ResNet…
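The abstract does not spell out the computation, so the following is only a minimal NumPy sketch of one plausible reading of group whitening, not the authors' implementation. It assumes that each sample's channels are split into num_groups groups, that each group is treated as one variable whose observations are the values inside that group, and that ZCA whitening (via an eigendecomposition) is used. The function name group_whiten, the eps parameter, and the (N, C, H, W) layout are all assumptions made for illustration.

```python
import numpy as np


def group_whiten(x, num_groups, eps=1e-5):
    """Whiten the channel groups of each sample independently (illustrative sketch).

    x: array of shape (N, C, H, W); C must be divisible by num_groups.
    Statistics are computed per sample, so no batch statistics are needed.
    """
    n, c, h, w = x.shape
    if c % num_groups != 0:
        raise ValueError("number of channels must be divisible by num_groups")

    # Each group becomes one row; its (C / num_groups) * H * W values are the observations.
    xg = x.reshape(n, num_groups, -1)                      # (N, g, m)
    m = xg.shape[-1]

    # Center every group within every sample.
    xc = xg - xg.mean(axis=-1, keepdims=True)

    # Per-sample g x g covariance between groups, with a small ridge for stability.
    cov = xc @ xc.transpose(0, 2, 1) / m + eps * np.eye(num_groups)

    # ZCA whitening matrix: U diag(1 / sqrt(lambda)) U^T (assumed choice of whitening).
    eigvals, eigvecs = np.linalg.eigh(cov)
    inv_sqrt = (eigvecs / np.sqrt(eigvals)[:, None, :]) @ eigvecs.transpose(0, 2, 1)

    return (inv_sqrt @ xc).reshape(n, c, h, w)


# Example call on random features: 2 samples, 8 channels, 4 groups.
features = np.random.default_rng(0).normal(size=(2, 8, 4, 4))
whitened = group_whiten(features, num_groups=4)
```

Because every statistic here is computed within a single sample, the output for one sample does not depend on the rest of the batch, which is the property the abstract attributes to group normalization. In this sketch the number of groups has to stay below the number of values per group for the covariance to be well conditioned.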
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Machine Learning and Data Classification
Methods: 1x1 Convolution · Batch Normalization · ResNeXt Block · Residual Connection · Residual Block · Convolution · Bottleneck Residual Block · Grouped Convolution · ResNeXt
