Efficient Approximations of the Fisher Matrix in Neural Networks using Kronecker Product Singular Value Decomposition
Abdoulaye Koroko (IFPEN), Ani Anciaux-Sedrakian (IFPEN), Ibtihel Ben Gharbia (IFPEN), Valérie Garès (IRMAR), Mounir Haddou (IRMAR), Quang Huy Tran (IFPEN)

TL;DR
This paper introduces new methods for approximating the Fisher Information Matrix in neural networks using Kronecker product SVD, leading to more accurate and faster optimization than existing approaches like KFAC.
Contribution
The paper proposes four novel Kronecker product SVD-based methods that improve FIM approximation accuracy and optimization speed over KFAC with minimal additional cost.
Findings
Yields more accurate FIM approximations on auto-encoder benchmarks.
Outperforms KFAC in optimization speed.
Achieves better results than state-of-the-art first-order methods.
Abstract
Several studies have shown the ability of natural gradient descent to minimize the objective function more efficiently than ordinary gradient descent based methods. However, the bottleneck of this approach for training deep neural networks lies in the prohibitive cost of solving a large dense linear system corresponding to the Fisher Information Matrix (FIM) at each iteration. This has motivated various approximations of either the exact FIM or the empirical one. The most sophisticated of these is KFAC, which involves a Kronecker-factored block diagonal approximation of the FIM. With only a slight additional cost, a few improvements of KFAC from the standpoint of accuracy are proposed. The common feature of the four novel methods is that they rely on a direct minimization problem, the solution of which can be computed via the Kronecker product singular value decomposition technique.…
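The abstract mentions a direct minimization problem solved via the Kronecker product singular value decomposition. As a rough illustration of that general technique (not the paper's four specific methods, which this summary does not detail), the sketch below finds the Kronecker product A ⊗ B closest to a given matrix M in Frobenius norm. It rearranges M so that any Kronecker product becomes a rank-one matrix, then takes the leading singular triplet. The function name `nearest_kronecker`, the block shapes, and the random test matrices are all assumptions made for this example.

```python
import numpy as np


def nearest_kronecker(M, shape_a, shape_b):
    """Return A, B minimizing ||M - kron(A, B)||_F (hypothetical helper)."""
    m1, n1 = shape_a
    m2, n2 = shape_b
    assert M.shape == (m1 * m2, n1 * n2), "M must have shape (m1*m2, n1*n2)"

    # Rearrangement: row (i, j) of R holds the flattened (i, j) block of M,
    # so kron(A, B) maps to the rank-one matrix vec(A) vec(B)^T.
    R = M.reshape(m1, m2, n1, n2).transpose(0, 2, 1, 3).reshape(m1 * n1, m2 * n2)

    # The best rank-one approximation of R is its leading singular triplet.
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    scale = np.sqrt(s[0])  # split the singular value evenly between factors
    A = scale * U[:, 0].reshape(m1, n1)
    B = scale * Vt[0].reshape(m2, n2)
    return A, B


# Illustrative data (assumed): an exact Kronecker product, then a noisy one.
rng = np.random.default_rng(0)
A0 = rng.standard_normal((3, 3))
B0 = rng.standard_normal((4, 4))

M_exact = np.kron(A0, B0)
A1, B1 = nearest_kronecker(M_exact, A0.shape, B0.shape)
err_exact = np.linalg.norm(M_exact - np.kron(A1, B1))

M_noisy = M_exact + 0.1 * rng.standard_normal(M_exact.shape)
A2, B2 = nearest_kronecker(M_noisy, A0.shape, B0.shape)
err_noisy = np.linalg.norm(M_noisy - np.kron(A2, B2))
err_true_factors = np.linalg.norm(M_noisy - np.kron(A0, B0))

print("error on exact Kronecker product:", err_exact)
print("error on noisy matrix, fitted factors:", err_noisy)
print("error on noisy matrix, original factors:", err_true_factors)
```

The factors are only determined up to a reciprocal scaling (cA and B/c give the same product), so the sketch splits the leading singular value evenly between them. This version forms M densely and runs a full SVD, which is only practical for small blocks; a large-scale setting would presumably call for an iterative method that computes just the leading singular pair.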
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and ELM · Face and Expression Recognition
Methods: Natural Gradient Descent
