Accelerating Natural Gradient with Higher-Order Invariance
Yang Song, Jiaming Song, and Stefano Ermon

TL;DR
This paper enhances the natural gradient method by employing higher-order integrators and geodesic corrections, improving invariance and efficiency in optimization for deep learning and reinforcement learning tasks.
Contribution
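It studies the invariance of the natural gradient from a combined perspective of Riemannian geometry and numerical differential equation solving, defines the order of invariance of a numerical method as its convergence order to an invariant solution, and proposes higher-order integrators and geodesic corrections to obtain more invariant optimization trajectories.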
Findings
Higher-order integrators improve invariance in natural gradient methods.
Geodesic corrections lead to faster convergence in neural network training.
The proposed geodesic-corrected updates can be as computationally efficient as plain natural gradient, and the proposed techniques outperform traditional natural gradient methods.
Abstract
An appealing property of the natural gradient is that it is invariant to arbitrary differentiable reparameterizations of the model. However, this invariance property requires infinitesimal steps and is lost in practical implementations with small but finite step sizes. In this paper, we study invariance properties from a combined perspective of Riemannian geometry and numerical differential equation solving. We define the order of invariance of a numerical method to be its convergence order to an invariant solution. We propose to use higher-order integrators and geodesic corrections to obtain more invariant optimization trajectories. We prove the numerical convergence properties of geodesic corrected updates and show that they can be as computationally efficient as plain natural gradient. Experimentally, we demonstrate that invariance leads to faster optimization and our techniques…
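The loss of invariance at finite step sizes can be seen on a toy problem. The sketch below is not the authors' code. It assumes a unit-variance Gaussian whose mean mu is fitted to a hypothetical TARGET, and compares the parameterization mu with the reparameterization mu = exp(theta). All function names are illustrative. It takes one step in each parameterization with three update rules: a plain (Euler) natural gradient step, a midpoint integrator as an example of a higher-order method, and a second-order Taylor approximation of the geodesic as one reading of a geodesic correction; the paper's exact updates may differ. The gap between the two resulting means serves as a rough proxy for how invariant each rule is.

```python
import numpy as np

TARGET = 2.0  # hypothetical target mean, chosen for illustration


def nat_grad_mu(mu):
    # Loss 0.5 * (mu - TARGET)^2; Fisher information of a unit-variance
    # Gaussian mean is 1, so the natural gradient equals the gradient.
    return -(mu - TARGET)


def nat_grad_theta(theta):
    # Same model with mu = exp(theta). Fisher becomes (dmu/dtheta)^2.
    mu = np.exp(theta)
    dmu = np.exp(theta)
    grad = (mu - TARGET) * dmu
    fisher = dmu ** 2
    return -grad / fisher


def euler_step(field, x, h):
    # Plain natural gradient update: first-order integrator.
    return x + h * field(x)


def midpoint_step(field, x, h):
    # Explicit midpoint rule: a second-order integrator.
    x_half = x + 0.5 * h * field(x)
    return x + h * field(x_half)


def geodesic_step(field, christoffel, x, h):
    # Second-order Taylor approximation of the geodesic leaving x with
    # velocity v, using the geodesic equation x'' = -Gamma * x'^2.
    v = field(x)
    return x + h * v - 0.5 * h ** 2 * christoffel(x) * v ** 2


def invariance_gap(step, h, mu0=1.0):
    # Difference in the resulting mean after one step in each chart.
    mu_direct = step(nat_grad_mu, mu0, h)
    mu_via_theta = np.exp(step(nat_grad_theta, np.log(mu0), h))
    return abs(mu_direct - mu_via_theta)


def geodesic_gap(h, mu0=1.0):
    # In the mu chart the metric is constant, so Gamma = 0.
    # In the theta chart g = exp(2 * theta), so Gamma = g' / (2 g) = 1.
    mu_direct = geodesic_step(nat_grad_mu, lambda m: 0.0, mu0, h)
    theta_new = geodesic_step(nat_grad_theta, lambda t: 1.0, np.log(mu0), h)
    return abs(mu_direct - np.exp(theta_new))


if __name__ == "__main__":
    for h in (0.1, 0.05):
        print(h, invariance_gap(euler_step, h),
              invariance_gap(midpoint_step, h), geodesic_gap(h))
```

In this toy setting, halving the step size shrinks the Euler gap by roughly a factor of four and the midpoint gap by roughly a factor of eight, which matches the idea of measuring invariance by a convergence order. The example is one-dimensional, so it says nothing about the cost of these updates in real networks.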
Taxonomy
Topics: Model Reduction and Neural Networks · Fluid Dynamics and Turbulent Flows · Gas Dynamics and Kinetic Theory
