A Novel Structured Natural Gradient Descent for Deep Learning
Weihua Liu, Xiabi Liu

TL;DR
This paper introduces a new structured natural gradient descent method that reconstructs neural networks to approximate natural gradient optimization, improving convergence and performance while maintaining computational efficiency.
Contribution
It proposes reconstructing neural network structures to emulate natural gradient descent, offering a practical alternative that enhances training speed and accuracy.
Findings
Accelerates convergence of deep networks
Achieves better performance than traditional gradient descent
Maintains computational simplicity
Abstract
Natural gradient descent (NGD) has provided deep insights and powerful tools for deep neural networks. However, computing the Fisher information matrix becomes increasingly difficult as the network grows large and complex. This paper proposes a new optimization method whose main idea is to accurately replace natural gradient optimization by reconstructing the network. More specifically, we reconstruct the structure of the deep neural network and optimize the new network using traditional gradient descent (GD). The reconstructed network achieves the effect of optimization with natural gradient descent. Experimental results show that our optimization method can accelerate the convergence of deep network models and achieve better performance than GD while sharing its computational simplicity.
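The idea that plain GD on a reconstructed network can reproduce an NGD update can be illustrated with a standard reparameterization identity. The sketch below is not the authors' construction, which this page does not describe. It assumes a toy linear-regression model with a unit-variance Gaussian likelihood, and all names in it (`A`, `phi`, `natural_gradient_step`, `reparameterized_gd_step`) are hypothetical. If the original parameters are written as theta = A @ phi with A @ A.T equal to the inverse Fisher matrix, then an ordinary GD step on phi moves theta exactly as a natural gradient step would.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear-regression problem with badly scaled, correlated inputs (assumed setup).
n, d = 200, 3
mixing = np.array([[3.0, 0.0, 0.0],
                   [1.0, 0.5, 0.0],
                   [0.0, 0.2, 0.1]])
X = rng.normal(size=(n, d)) @ mixing
theta_true = np.array([1.0, -2.0, 0.5])
y = X @ theta_true


def grad(theta):
    """Gradient of the loss 0.5 * mean((X @ theta - y) ** 2) w.r.t. theta."""
    return X.T @ (X @ theta - y) / n


# Fisher information matrix of a unit-variance Gaussian likelihood for this model.
F = X.T @ X / n

# Choose A so that A @ A.T == inv(F): with F = L @ L.T (Cholesky), take A = inv(L).T.
L = np.linalg.cholesky(F)
A = np.linalg.inv(L).T


def natural_gradient_step(theta, lr):
    """NGD update on the original parameters: theta - lr * inv(F) @ grad."""
    return theta - lr * np.linalg.solve(F, grad(theta))


def reparameterized_gd_step(phi, lr):
    """Plain GD on phi, where theta = A @ phi (chain rule gives A.T @ grad)."""
    return phi - lr * (A.T @ grad(A @ phi))


lr = 0.5
theta = np.zeros(d)
phi = np.linalg.solve(A, theta)  # phi chosen so that A @ phi == theta

for _ in range(5):
    theta = natural_gradient_step(theta, lr)
    phi = reparameterized_gd_step(phi, lr)

# Largest difference between the NGD iterate and the mapped-back GD iterate.
max_gap = np.max(np.abs(theta - A @ phi))
```

Here `max_gap` stays at the level of floating-point round-off, because A @ (A.T @ g) equals inv(F) @ g for any gradient g. In a deep network the Fisher matrix is neither constant nor cheap to factor, so this only shows why a structural change can stand in for a preconditioner, not how to build one at scale.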
Taxonomy
Topics: Advanced Neural Network Applications · Blind Source Separation Techniques · Domain Adaptation and Few-Shot Learning
