Feedback Gradient Descent: Efficient and Stable Optimization with Orthogonality for DNNs

Fanchen Bu; Dong Eui Chang

arXiv:2205.08385 · cs.LG · July 12, 2022

TL;DR

This paper introduces Feedback Gradient Descent (FGD), a novel optimization method for deep neural networks that achieves both high efficiency and stability by enforcing orthogonality through a simple discretization of a dynamical system.

Contribution

FGD is, to the authors' knowledge, the first method to achieve both efficiency and stability in orthogonality-constrained DNN training. It does so by discretizing a continuous-time dynamical system on the tangent bundle of the Stiefel manifold.

Findings

01. FGD outperforms existing methods in accuracy.
02. FGD demonstrates superior efficiency in training.
03. FGD provides enhanced stability during optimization.

Abstract

Optimization with orthogonality has been shown to be useful in training deep neural networks (DNNs). To impose orthogonality on DNNs, both computational efficiency and stability are important. However, existing methods that use Riemannian optimization or hard constraints can only ensure stability, while those that use soft constraints can only improve efficiency. In this paper, we propose a novel method, named Feedback Gradient Descent (FGD), which is, to our knowledge, the first to show high efficiency and stability simultaneously. FGD induces orthogonality based on the simple yet indispensable Euler discretization of a continuous-time dynamical system on the tangent bundle of the Stiefel manifold. In particular, inspired by a numerical integration method on manifolds called Feedback Integrators, we propose to instantiate it on the tangent bundle of the Stiefel manifold for the first time. In…
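The feedback idea mentioned in the abstract can be illustrated with a small NumPy sketch. This is not the FGD algorithm from the paper, which works on the tangent bundle of the Stiefel manifold. It is a simplified first-order analogue on the manifold itself, written under several assumptions: the function names (`feedback_euler_step`, `orthogonality_error`, `run_demo`), the gain `alpha`, the step size, and the toy objective f(X) = -1/2 trace(X^T A X) are all hypothetical choices made for illustration. The sketch takes plain Euler steps along a tangent-projected gradient and adds a feedback term, the gradient of V(X) = 1/4 ||X^T X - I||_F^2, that pulls the iterate back toward X^T X = I without any retraction or matrix factorization inside the loop.

```python
import numpy as np


def orthogonality_error(X):
    """Frobenius norm of X^T X - I; zero exactly on the Stiefel manifold."""
    return np.linalg.norm(X.T @ X - np.eye(X.shape[1]))


def feedback_euler_step(X, G, step=0.05, alpha=2.0):
    """One explicit Euler step of the (assumed, illustrative) flow

        dX/dt = -(G - X sym(X^T G)) - alpha * X (X^T X - I),

    where G is the Euclidean gradient of the objective at X.
    """
    XtG = X.T @ G
    # Projection of G onto the tangent space when X^T X = I.
    tangent_grad = G - X @ (0.5 * (XtG + XtG.T))
    # Gradient of V(X) = 1/4 ||X^T X - I||_F^2: the feedback term.
    feedback = X @ (X.T @ X - np.eye(X.shape[1]))
    return X - step * (tangent_grad + alpha * feedback)


def run_demo(steps=2000, seed=0):
    """Toy problem (an assumption, not from the paper): minimize
    f(X) = -1/2 trace(X^T A X) over 6x2 matrices with X^T X = I."""
    rng = np.random.default_rng(seed)
    A = np.diag([4.0, 3.0, 1.0, 0.5, 0.25, 0.1])
    Q, _ = np.linalg.qr(rng.standard_normal((6, 2)))
    X = Q + 0.1 * rng.standard_normal((6, 2))  # start slightly off the manifold
    initial_error = orthogonality_error(X)
    for _ in range(steps):
        G = -A @ X  # Euclidean gradient of f at X
        X = feedback_euler_step(X, G)
    return initial_error, orthogonality_error(X), np.trace(X.T @ A @ X)


if __name__ == "__main__":
    err_start, err_end, value = run_demo()
    print(f"orthogonality error: {err_start:.3e} -> {err_end:.3e}")
    print(f"trace(X^T A X) at the end: {value:.6f}")
```

In this sketch each step costs only a few matrix products, which is the kind of cost a soft constraint has, while the feedback term keeps driving the orthogonality error toward zero. How the authors achieve this for momentum-style dynamics on the tangent bundle is described in the paper itself.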

Code & Models

Repositories

bokveizen/Feedback-Gradient-Descent (PyTorch, official)

Videos

Feedback Gradient Descent: Efficient and Stable Optimization with Orthogonality for DNNs · Underline