Geodesic Gradient Descent: A Generic and Learning-rate-free Optimizer on Objective Function-induced Manifolds
Liwei Hu, Guangyao Li, Wenyong Wang, Xiaoming Zhang, Yu Xiang

TL;DR
The paper introduces geodesic gradient descent (GGD), a novel Riemannian optimization method that adapts to complex hypersurface geometries without requiring a learning rate, improving training performance on neural networks.
Contribution
GGD is a generic, learning-rate-free Riemannian optimizer that approximates local geometry with spheres, ensuring updates stay on the hypersurface and enhancing training accuracy.
Findings
GGD reduces test MSE by up to 48.76% on the Burgers' dataset.
GGD decreases cross-entropy loss by up to 11.59% on MNIST.
GGD eliminates the need for a learning rate in gradient updates.
Abstract
Euclidean gradient descent algorithms barely capture the geometry of objective function-induced hypersurfaces and risk driving update trajectories off the hypersurfaces. Riemannian gradient descent algorithms address these issues but fail to represent complex hypersurfaces via a single classic manifold. We propose geodesic gradient descent (GGD), a generic and learning-rate-free Riemannian gradient descent algorithm. At each iteration, GGD uses an n-dimensional sphere to approximate a local neighborhood on the objective function-induced hypersurface, adapting to arbitrarily complex geometries. A tangent vector derived from the Euclidean gradient is projected onto the sphere to form a geodesic, ensuring the update trajectory stays on the hypersurface. Parameter updates are performed using the endpoint of the geodesic. The maximum step size of the gradient in GGD is equal to a quarter of…
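The geodesic update described in the abstract can be illustrated with the standard exponential map on a sphere. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `sphere_exp_step` and the `center` argument are hypothetical, the sphere is taken as given (the abstract does not say how GGD fits it to the local neighborhood), and the arc length is simply set to the norm of the tangent vector.

```python
import numpy as np

def sphere_exp_step(p, grad, center):
    """Move from point p along a great circle of the sphere through p centered at `center`.

    Hypothetical helper for illustration only; how GGD chooses the sphere
    and the step length is not specified in the abstract.
    """
    n = p - center
    r = np.linalg.norm(n)            # sphere radius
    n_hat = n / r                    # unit normal at p
    # Tangent vector: negative Euclidean gradient with its normal component removed.
    v = -grad
    v = v - np.dot(v, n_hat) * n_hat
    v_norm = np.linalg.norm(v)
    if v_norm < 1e-12:               # gradient is purely normal: no tangential motion
        return p.copy()
    theta = v_norm / r               # arc length |v| converted to an angle (assumption)
    # Endpoint of the geodesic (exponential map on the sphere).
    return center + r * (np.cos(theta) * n_hat + np.sin(theta) * v / v_norm)

p = np.array([0.0, 0.0, 1.0])
q = sphere_exp_step(p, grad=np.array([1.0, 0.0, 0.5]), center=np.zeros(3))
print(np.linalg.norm(q))  # the endpoint remains on the unit sphere, up to rounding
```

Because the endpoint is computed on the great circle rather than along a straight line, it stays on the approximating sphere by construction, which is the property the abstract attributes to geodesic updates.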
Taxonomy
Topics: 3D Shape Modeling and Analysis · Advanced Numerical Analysis Techniques · Stochastic Gradient Optimization Techniques
