Reconstructing Deep Neural Networks: Unleashing the Optimization Potential of Natural Gradient Descent
Weihua Liu, Said Boumaraf, Jianwu Li, Chaochao Lin, Xiabi Liu, Lijuan Niu, Naoufel Werghi

TL;DR
This paper introduces structured natural gradient descent (SNGD), a method that makes natural gradient descent more efficient for training deep neural networks by decomposing the computation of the global Fisher information matrix into local Fisher matrices.
Contribution
The paper proposes SNGD, which transforms NGD into a form equivalent to gradient descent on a reconstructed network, improving computational efficiency and scalability.
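The general idea that a preconditioned update can be rewritten as plain gradient descent in transformed coordinates can be checked on a toy problem. The sketch below is not the paper's construction: it assumes a fixed 2x2 matrix `A` (hypothetical) whose product `A Aᵀ` stands in for a constant inverse Fisher matrix, and a made-up quadratic loss. Under those assumptions, an NGD-style step on `theta` and an ordinary GD step on `phi`, where `theta = A phi`, trace the same trajectory.

```python
# Toy check (illustrative only; all names and numbers here are assumptions,
# not taken from the paper).

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(P, Q):
    Qt = transpose(Q)
    return [[sum(p * q for p, q in zip(row, col)) for col in Qt] for row in P]

# Quadratic loss L(theta) = 0.5 * theta^T H theta - b^T theta.
H = [[3.0, 1.0], [1.0, 2.0]]
b = [1.0, -1.0]

def grad(theta):
    return [h - bi for h, bi in zip(matvec(H, theta), b)]

# Fixed transform A; F_inv = A A^T plays the role of a constant inverse Fisher.
A = [[1.0, 0.0], [0.5, 2.0]]
F_inv = matmul(A, transpose(A))

def ngd_step(theta, lr):
    # Preconditioned step: theta <- theta - lr * F_inv * grad(theta)
    step = matvec(F_inv, grad(theta))
    return [t - lr * s for t, s in zip(theta, step)]

def gd_step_reparam(phi, lr):
    # Plain GD on phi; the chain rule gives dL/dphi = A^T * dL/dtheta.
    g_phi = matvec(transpose(A), grad(matvec(A, phi)))
    return [p - lr * g for p, g in zip(phi, g_phi)]

phi = [0.3, -0.7]
theta = matvec(A, phi)
for _ in range(5):
    theta = ngd_step(theta, 0.1)
    phi = gd_step_reparam(phi, 0.1)

# Largest coordinate-wise difference between the two trajectories.
max_gap = max(abs(x - y) for x, y in zip(theta, matvec(A, phi)))
```

The two trajectories agree up to floating-point error because `A (Aᵀ g) = F_inv g`. In a real network the Fisher matrix changes with the parameters, which is where the paper's structural transformation and local Fisher layers come in.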
Findings
SNGD converges faster than traditional NGD.
SNGD outperforms standard gradient descent in efficiency and effectiveness.
The method maintains comparable solution quality to NGD.
Abstract
Natural gradient descent (NGD) is a powerful optimization technique for machine learning, but the computational complexity of the inverse Fisher information matrix limits its application in training deep neural networks. To overcome this challenge, we propose a novel optimization method for training deep neural networks called structured natural gradient descent (SNGD). Theoretically, we demonstrate that optimizing the original network using NGD is equivalent to using fast gradient descent (GD) to optimize the reconstructed network with a structural transformation of the parameter matrix. Thereby, we decompose the calculation of the global Fisher information matrix into the efficient computation of local Fisher matrices via constructing local Fisher layers in the reconstructed network to speed up the training. Experimental results on various deep networks and datasets demonstrate that…
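To make the cost the abstract refers to concrete, here is a minimal sketch of standard NGD (not SNGD) on a two-parameter logistic regression. The data, the damping term, the step size, and all function names are assumptions chosen for illustration. Each step solves a linear system with the Fisher matrix; for a model with d parameters that matrix is d x d, which is what becomes prohibitive for deep networks.

```python
import math

# Made-up toy data: each row is (bias, feature); labels are 0 or 1.
X = [(1.0, 0.5), (1.0, -1.0), (1.0, 2.0), (1.0, -0.3), (1.0, 1.2), (1.0, -1.5)]
Y = [1, 0, 1, 1, 0, 0]

def predict(theta, x):
    z = theta[0] * x[0] + theta[1] * x[1]
    return 1.0 / (1.0 + math.exp(-z))

def loss(theta):
    # Mean negative log-likelihood.
    total = 0.0
    for x, y in zip(X, Y):
        p = predict(theta, x)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(X)

def gradient(theta):
    g = [0.0, 0.0]
    for x, y in zip(X, Y):
        err = predict(theta, x) - y
        g[0] += err * x[0] / len(X)
        g[1] += err * x[1] / len(X)
    return g

def fisher(theta, damping=1e-3):
    # Fisher matrix of the Bernoulli model: mean of p(1-p) x x^T,
    # plus damping * I (an assumption, added for numerical safety).
    F = [[damping, 0.0], [0.0, damping]]
    for x in X:
        p = predict(theta, x)
        w = p * (1.0 - p) / len(X)
        for i in range(2):
            for j in range(2):
                F[i][j] += w * x[i] * x[j]
    return F

def solve_2x2(F, g):
    # Closed-form solution of F d = g for a 2x2 system.
    det = F[0][0] * F[1][1] - F[0][1] * F[1][0]
    return [(F[1][1] * g[0] - F[0][1] * g[1]) / det,
            (F[0][0] * g[1] - F[1][0] * g[0]) / det]

def ngd(theta, lr=0.5, steps=10):
    for _ in range(steps):
        direction = solve_2x2(fisher(theta), gradient(theta))
        theta = [t - lr * d for t, d in zip(theta, direction)]
    return theta

theta0 = [0.0, 0.0]
theta_final = ngd(theta0)
initial_loss, final_loss = loss(theta0), loss(theta_final)
```

Comparing `initial_loss` with `final_loss` shows whether the update made progress on this toy problem. The closed-form 2x2 solve is only possible because the model is tiny; it is the step that SNGD, per the abstract, replaces with local Fisher computations.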
Taxonomy
Topics: Neural Networks and Applications
Methods: Natural Gradient Descent · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
