Invariance Properties of the Natural Gradient in Overparametrised Systems

Jesse van Oostrum; Johannes Müller; Nihat Ay

arXiv:2206.15273·cs.LG·July 1, 2022

TL;DR

This paper studies the natural gradient in overparametrised systems. It asks when the pushforward of the natural parameter gradient (the ordinary gradient multiplied by the inverse of the Gram matrix) equals the natural gradient on the model, and which invariance properties the natural parameter gradient has.

Contribution

It analyzes the conditions under which the pushforward of the natural parameter gradient equals the natural gradient in overparametrised models, and investigates the invariance properties of the natural parameter gradient.

Findings

01

Studies when the pushforward of the natural parameter gradient equals the natural gradient.

02

Investigates the invariance properties of the natural parameter gradient.

03

Addresses both questions in an overparametrised setting.

Abstract

The natural gradient field is a vector field that lives on a model equipped with a distinguished Riemannian metric, e.g. the Fisher-Rao metric, and represents the direction of steepest ascent of an objective function on the model with respect to this metric. In practice, one tries to obtain the corresponding direction on the parameter space by multiplying the ordinary gradient by the inverse of the Gram matrix associated with the metric. We refer to this vector on the parameter space as the natural parameter gradient. In this paper we study when the pushforward of the natural parameter gradient is equal to the natural gradient. Furthermore we investigate the invariance properties of the natural parameter gradient. Both questions are addressed in an overparametrised setting.
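The construction described in the abstract can be illustrated with a minimal numerical sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the linear map with Jacobian `J` (three parameters, two-dimensional model), the metric `M` standing in for e.g. the Fisher-Rao metric, the quadratic objective, and the use of the Moore-Penrose pseudo-inverse in place of the inverse, since the Gram matrix is singular when the system is overparametrised.

```python
import numpy as np

# Hypothetical overparametrised parametrisation: 3 parameters, 2-dimensional model.
# J is the Jacobian of the assumed linear map phi(theta) = (theta_1 + theta_2, theta_3).
J = np.array([[1.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])

# Assumed symmetric positive definite metric on the model (illustrative stand-in).
M = np.array([[2.0, 0.5],
              [0.5, 1.0]])


def objective_grad(x):
    """Euclidean gradient of the illustrative objective f(x) = -0.5 * ||x - target||^2."""
    target = np.array([1.0, -2.0])
    return -(x - target)


theta = np.array([0.3, -0.1, 0.7])   # arbitrary point in parameter space
x = J @ theta                        # corresponding point on the model

grad_x = objective_grad(x)           # ordinary gradient on the model
grad_theta = J.T @ grad_x            # ordinary gradient in parameters (chain rule)

# Gram matrix of the metric pulled back to parameter space.
# It is 3x3 but has rank 2, so it has no ordinary inverse.
G = J.T @ M @ J

# Natural parameter gradient, using the pseudo-inverse (an assumption of this sketch).
natural_param_grad = np.linalg.pinv(G) @ grad_theta

# Pushforward of the natural parameter gradient to the model.
pushforward = J @ natural_param_grad

# Natural gradient on the model: the metric inverse applied to the ordinary gradient.
natural_grad = np.linalg.solve(M, grad_x)

print("rank of G:", np.linalg.matrix_rank(G))
print("pushforward:", pushforward)
print("natural gradient:", natural_grad)
```

In this toy setup the two printed vectors agree because `J` has full row rank. That is a property of the example, not a statement of the conditions derived in the paper.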
