Brand New K-FACs: Speeding up K-FAC with Online Decomposition Updates
Constantin Octavian Puiu

TL;DR
This paper introduces an online update for the K-factor inverses in K-FAC that reduces the cost of the inverse computation from cubic to linear in layer size, making natural gradient optimization for deep learning cheaper.
Contribution
It proposes a novel linear-scaling inverse update for K-FAC, applicable mainly to fully connected layers, and demonstrates its effectiveness in accelerating neural network training.
Findings
The new method reduces inverse computation complexity to linear in layer size.
Adding the update to RS-KFAC decreases inversion error with minimal overhead.
The proposed algorithms outperform RS-KFAC on CIFAR10 with VGG16_bn in terms of speed.
Abstract
K-FAC (arXiv:1503.05671, arXiv:1602.01407) is a tractable implementation of Natural Gradient (NG) for Deep Learning (DL), whose bottleneck is computing the inverses of the so-called "Kronecker-Factors" (K-factors). RS-KFAC (arXiv:2206.15397) is a K-FAC improvement which provides a cheap way of estimating the K-factor inverses. In this paper, we exploit the exponential-average construction paradigm of the K-factors, and use online numerical linear algebra techniques to propose an even cheaper (but less accurate) way of estimating the K-factor inverses. In particular, we propose a K-factor inverse update which scales linearly in layer size. We also propose an inverse application procedure which scales linearly as well (the one of K-FAC scales cubically and the one of RS-KFAC scales quadratically). Overall, our proposed algorithm gives an approximate K-FAC implementation whose…
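To make the idea of an online update on an exponentially averaged K-factor concrete, here is a minimal NumPy sketch. It is not taken from the paper. It assumes a K-factor of the form K <- rho * K + (1 - rho) * A A^T / n, kept as a rank-r eigen-approximation U diag(d) U^T, and uses a generic low-rank symmetric update followed by a damped inverse application. The function names, the decay `rho`, the damping `lam`, and the truncation rule are all hypothetical choices for illustration, and the paper's actual algorithm may differ in every one of them.

```python
import numpy as np

def ea_lowrank_update(U, d, A, rho=0.95, rank=None):
    """Update a low-rank eigen-approximation U diag(d) U^T after the
    exponential-average step  K <- rho * K + (1 - rho) * A A^T / n.

    U: (m, r) with orthonormal columns, d: (r,), A: (m, n) with n << m.
    Assumes the part of A outside span(U) has full column rank.
    Cost is O(m (r + n)^2), i.e. linear in the layer size m.
    """
    m, r = U.shape
    n = A.shape[1]
    rank = r if rank is None else rank
    c = (1.0 - rho) / n
    P = U.T @ A                      # (r, n): coordinates of A inside span(U)
    Q, R = np.linalg.qr(A - U @ P)   # (m, n), (n, n): component outside span(U)
    # In the basis [U, Q] the updated factor is a small (r+n) x (r+n) matrix.
    B = np.vstack([P, R])            # A = [U, Q] @ B
    M = c * (B @ B.T)
    M[:r, :r] += rho * np.diag(d)
    w, V = np.linalg.eigh(M)         # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:rank] # keep the `rank` largest eigenpairs
    U_new = np.hstack([U, Q]) @ V[:, idx]
    return U_new, np.clip(w[idx], 0.0, None)

def apply_damped_inverse(U, d, g, lam=0.1):
    """Return (U diag(d) U^T + lam * I)^{-1} g for a vector g in O(m r).

    Uses the identity, valid for orthonormal U:
    (U D U^T + lam I)^{-1} = I / lam - U diag(d / (lam (d + lam))) U^T.
    """
    coeff = d / (lam * (d + lam))
    return g / lam - U @ (coeff * (U.T @ g))
```

Both routines touch the layer dimension m only through products with tall, thin matrices, which is where the linear scaling in this sketch comes from. Truncating back to `rank` eigenpairs after each step is what makes the result approximate.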
Taxonomy
Topics: Medical Imaging Techniques and Applications · Sparse and Compressive Sensing Techniques · Matrix Theory and Algorithms
Methods: Test
