Riemannian stochastic optimization methods avoid strict saddle points
Ya-Ping Hsieh, Mohammad Reza Karimi, Andreas Krause, and Panayotis Mertikopoulos

TL;DR
This paper proves that stochastic Riemannian optimization algorithms almost surely avoid strict saddle points, ensuring convergence to local minima in non-convex manifold problems common in machine learning.
Contribution
It establishes that a broad class of retraction-based stochastic Riemannian methods avoid strict saddle points with probability one, under mild assumptions.
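For readers who want the terms pinned down, the two notions in this statement can be written as follows. This is a sketch in standard notation, not necessarily the paper's own: the symbols $\mathcal{R}$ (retraction), $\gamma_n$ (step size), and $U_n$ (gradient noise) are assumptions introduced here for illustration.

```latex
% A critical point x^* of f on a Riemannian manifold M is a strict saddle
% if the Riemannian Hessian there has at least one negative eigenvalue:
\operatorname{grad} f(x^*) = 0,
\qquad
\lambda_{\min}\bigl(\operatorname{Hess} f(x^*)\bigr) < 0 .

% A generic retraction-based stochastic gradient step (assumed notation):
x_{n+1} = \mathcal{R}_{x_n}\bigl(-\gamma_n \,(\operatorname{grad} f(x_n) + U_n)\bigr).
```

Here the retraction $\mathcal{R}_{x}$ maps a tangent vector at $x$ back onto the manifold; the exponential map is one such choice, and cheaper alternatives exist on many manifolds.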
Findings
Algorithms avoid strict saddle points with probability 1
Results apply to natural policy gradient and mirror descent methods
Results support convergence to local minimizers in Riemannian problems that are not geodesically convex
Abstract
Many modern machine learning applications - from online principal component analysis to covariance matrix identification and dictionary learning - can be formulated as minimization problems on Riemannian manifolds, and are typically solved with a Riemannian stochastic gradient method (or some variant thereof). However, in many cases of interest, the resulting minimization problem is not geodesically convex, so the convergence of the chosen solver to a desirable solution - i.e., a local minimizer - is by no means guaranteed. In this paper, we study precisely this question, that is, whether stochastic Riemannian optimization algorithms are guaranteed to avoid saddle points with probability 1. For generality, we study a family of retraction-based methods which, in addition to having a potentially much lower per-iteration cost relative to Riemannian gradient descent, include other widely…
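As a concrete illustration of the setting described in the abstract, here is a minimal sketch of a retraction-based stochastic gradient method for online principal component analysis on the unit sphere. It is not the authors' code: the function name `riemannian_sgd_sphere`, the diagonal covariance, the step-size schedule `1 / (n + 10)`, and the choice of normalization as the retraction are all assumptions made for this example. The run is started exactly at a strict saddle (a non-leading eigenvector) to show how gradient noise can push the iterate away from it.

```python
import math
import random


def riemannian_sgd_sphere(stds, x0, num_steps, seed=0):
    """Retraction-based stochastic gradient method on the unit sphere.

    Minimizes f(x) = -0.5 * E[(z . x)^2] over unit vectors x, where z has
    independent zero-mean Gaussian coordinates with standard deviations
    `stds` (a hypothetical data model chosen for this sketch).
    """
    rng = random.Random(seed)
    x = list(x0)
    for n in range(num_steps):
        # Draw one sample z; its covariance is diag(stds[i] ** 2).
        z = [s * rng.gauss(0.0, 1.0) for s in stds]
        zx = sum(zi * xi for zi, xi in zip(z, x))
        # Euclidean stochastic gradient of -0.5 * (z . x)^2.
        g = [-zx * zi for zi in z]
        # Project onto the tangent space at x to get the Riemannian gradient.
        gx = sum(gi * xi for gi, xi in zip(g, x))
        rgrad = [gi - gx * xi for gi, xi in zip(g, x)]
        # Vanishing step size (illustrative choice, not from the paper).
        step = 1.0 / (n + 10)
        # Retraction: step in the ambient space, then renormalize.
        y = [xi - step * ri for xi, ri in zip(x, rgrad)]
        norm = math.sqrt(sum(yi * yi for yi in y))
        x = [yi / norm for yi in y]
    return x


# Start exactly at a strict saddle: the second eigenvector of the covariance.
x_final = riemannian_sgd_sphere(
    stds=[2.0, 1.0, 0.5], x0=[0.0, 1.0, 0.0], num_steps=20000
)
print(x_final)
```

With this data model the covariance is diag(4, 1, 0.25), so the first coordinate axis is the minimizer (up to sign) and the second axis is a strict saddle. The normalization retraction costs a single vector norm per step, which is the kind of per-iteration saving over exponential-map updates that the abstract alludes to.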
Taxonomy
Topics: Statistical Methods and Inference · Stochastic Gradient Optimization Techniques · Gaussian Processes and Bayesian Inference
