Tame Riemannian Stochastic Approximation
Johannes Aspman, Vyacheslav Kungurtsev, Reza Roohi Seraji

TL;DR
This paper investigates stochastic approximation methods for tame, non-differentiable functions constrained on Riemannian manifolds, providing theoretical guarantees and demonstrating the effectiveness of a Riemannian stochastic sub-gradient descent approach.
Contribution
It is the first to analyze tame optimization on Riemannian manifolds, linking geometric structure with stochastic sub-gradient descent convergence guarantees.
Findings
Expected function decrease and convergence are theoretically guaranteed.
Numerical experiments validate the Retracted SGD algorithm.
Tame functions model deep neural network loss landscapes effectively.
Abstract
We study the properties of stochastic approximation applied to a tame nondifferentiable function subject to constraints defined by a Riemannian manifold. The objective landscape of tame functions, arising in o-minimal topology extended to a geometric category when generalized to manifolds, exhibits some structure that enables theoretical guarantees of expected function decrease and asymptotic convergence for generic stochastic sub-gradient descent. Recent work has shown that this class of functions faithfully model the loss landscape of deep neural network training objectives, and the autograd operation used in deep learning packages implements a variant of subgradient descent with the correct properties for convergence. Riemannian optimization uses geometric properties of a constraint set to perform a minimization procedure while enforcing adherence to the the optimization variable…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsNeural Networks and Applications · Topological and Geometric Data Analysis · Model Reduction and Neural Networks
