Reconstructing Deep Neural Networks: Unleashing the Optimization Potential of Natural Gradient Descent

Weihua Liu; Said Boumaraf; Jianwu Li; Chaochao Lin; Xiabi Liu; Lijuan Niu; Naoufel Werghi

arXiv:2412.07441·cs.LG·December 11, 2024

TL;DR

This paper introduces structured natural gradient descent (SNGD), a novel method that enhances the efficiency of natural gradient descent for training deep neural networks by decomposing Fisher information matrix calculations.

Contribution

The paper proposes SNGD, which transforms NGD into a form equivalent to gradient descent on a reconstructed network, improving computational efficiency and scalability.

Findings

01. SNGD converges faster than traditional NGD.

02. SNGD outperforms standard gradient descent in efficiency and effectiveness.

03. The method maintains comparable solution quality to NGD.

Abstract

Natural gradient descent (NGD) is a powerful optimization technique for machine learning, but the computational complexity of the inverse Fisher information matrix limits its application in training deep neural networks. To overcome this challenge, we propose a novel optimization method for training deep neural networks called structured natural gradient descent (SNGD). Theoretically, we demonstrate that optimizing the original network using NGD is equivalent to using fast gradient descent (GD) to optimize the reconstructed network with a structural transformation of the parameter matrix. Thereby, we decompose the calculation of the global Fisher information matrix into the efficient computation of local Fisher matrices via constructing local Fisher layers in the reconstructed network to speed up the training. Experimental results on various deep networks and datasets demonstrate that…
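The equivalence the abstract describes, NGD on the original parameters versus plain GD on a transformed parameterization, can be illustrated on a toy problem. The sketch below is not the paper's SNGD construction: it assumes a linear model with unit-variance Gaussian noise (so the Fisher matrix is X^T X / n), and the function names, the damping term `eps`, and the substitution w = F^{-1/2} v are all illustrative choices rather than anything taken from the source.

```python
import numpy as np

def fisher_linear_gaussian(X):
    """Fisher matrix for y = X @ w + unit-variance Gaussian noise (assumed model)."""
    return X.T @ X / X.shape[0]

def grad_mse(w, X, y):
    """Gradient of 0.5 * mean((X @ w - y) ** 2) with respect to w."""
    return X.T @ (X @ w - y) / X.shape[0]

def inv_sqrt_psd(F, eps=1e-8):
    """Symmetric inverse square root of (F + eps * I) via eigendecomposition."""
    vals, vecs = np.linalg.eigh(F)
    # Scale eigenvector column j by 1 / sqrt(lambda_j + eps), then rotate back.
    return (vecs / np.sqrt(vals + eps)) @ vecs.T

def run_ngd(w0, X, y, lr, steps, eps=1e-8):
    """Natural gradient descent: w <- w - lr * (F + eps * I)^{-1} grad."""
    F = fisher_linear_gaussian(X) + eps * np.eye(len(w0))
    w = w0.astype(float)
    for _ in range(steps):
        w = w - lr * np.linalg.solve(F, grad_mse(w, X, y))
    return w

def run_reparam_gd(w0, X, y, lr, steps, eps=1e-8):
    """Plain GD on v, where w = A @ v and A = (F + eps * I)^{-1/2}."""
    A = inv_sqrt_psd(fisher_linear_gaussian(X), eps)
    v = np.linalg.solve(A, w0.astype(float))
    for _ in range(steps):
        # Chain rule: dL/dv = A^T dL/dw, and A is symmetric.
        v = v - lr * (A @ grad_mse(A @ v, X, y))
    return A @ v
```

Because A @ A equals the damped inverse Fisher matrix, mapping each GD step on v back through w = A @ v reproduces the NGD step on w, so the two routines should return matching iterates up to floating-point error. In a deep network the Fisher matrix is far too large to factor this way, which is the cost the abstract says SNGD addresses through local Fisher layers.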

Code & Models

Repositories

chaochao-lin/sngd (PyTorch, official)

Taxonomy

Topics: Neural Networks and Applications

Methods: Natural Gradient Descent · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings