Towards Faster Training of Global Covariance Pooling Networks by Iterative Matrix Square Root Normalization
Peihua Li, Jiangtao Xie, Qilong Wang, Zilin Gao

TL;DR
This paper introduces an iterative matrix square root normalization method for global covariance pooling in CNNs, significantly speeding up training and achieving state-of-the-art results on large-scale and fine-grained image benchmarks.
Contribution
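The paper proposes an iterative matrix square root normalization implemented as a meta-layer of three consecutive nonlinear structured layers: pre-normalization, coupled matrix iteration, and post-compensation. Because it avoids EIG and SVD, which have limited GPU support, covariance pooling networks can be trained end-to-end on GPU much faster.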
Findings
Faster training compared to eigendecomposition-based methods.
Achieves competitive performance on ImageNet.
Establishes state-of-the-art results on fine-grained benchmarks.
Abstract
Global covariance pooling in convolutional neural networks has achieved impressive improvement over the classical first-order pooling. Recent works have shown matrix square root normalization plays a central role in achieving state-of-the-art performance. However, existing methods depend heavily on eigendecomposition (EIG) or singular value decomposition (SVD), suffering from inefficient training due to limited support of EIG and SVD on GPU. Towards addressing this problem, we propose an iterative matrix square root normalization method for fast end-to-end training of global covariance pooling networks. At the core of our method is a meta-layer designed with loop-embedded directed graph structure. The meta-layer consists of three consecutive nonlinear structured layers, which perform pre-normalization, coupled matrix iteration and post-compensation, respectively. Our method is much…
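The three stages named in the abstract can be illustrated with a small dependency-free sketch. The abstract does not spell out the formulas, so the choices here are assumptions: pre-normalization divides by the trace, the coupled iteration takes the Newton-Schulz form, and post-compensation multiplies by the square root of the trace. The names `matmul`, `isqrt_normalize`, and `num_iters` are hypothetical and do not come from the authors' code.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def isqrt_normalize(sigma, num_iters=5):
    """Approximate the square root of a symmetric positive definite matrix.

    Assumed scheme (not stated in the summary): trace pre-normalization,
    Newton-Schulz coupled iteration, sqrt(trace) post-compensation.
    """
    n = len(sigma)
    trace = sum(sigma[i][i] for i in range(n))

    # Pre-normalization: scale eigenvalues into (0, 1] so the iteration converges.
    y = [[v / trace for v in row] for row in sigma]
    z = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

    # Coupled iteration: Y approaches A^(1/2) while Z approaches A^(-1/2).
    for _ in range(num_iters):
        zy = matmul(z, y)
        t = [[0.5 * ((3.0 if i == j else 0.0) - zy[i][j]) for j in range(n)]
             for i in range(n)]
        y, z = matmul(y, t), matmul(t, z)

    # Post-compensation: undo the scaling applied in pre-normalization.
    scale = trace ** 0.5
    return [[scale * v for v in row] for row in y]
```

The loop uses only matrix products, which is the kind of operation that maps well to GPUs, unlike EIG or SVD. Squaring the result with `matmul` and comparing against the input gives a quick check of the approximation.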
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and ELM
Methods: Matrix-power Normalization
