Avoidance of non-strict saddle points by blow-up
El Mehdi Achour, Umberto L. Hryniewicz, Michael Westdickenberg

TL;DR
This paper introduces a method to analyze non-strict saddle points in optimization by applying a nonlinear rescaling, or 'blow-up', to reveal the higher-order geometry and understand gradient flow behavior.
Contribution
It proposes a novel blow-up technique to study the local geometry of non-strict saddle points, extending classical results to degenerate cases.
Findings
Reveals higher-order structure near non-strict saddle points
Provides insights into gradient flow trajectories avoiding saddle points
Extends classical non-degenerate saddle point analysis
Abstract
It is an old idea to use gradient flows or time-discretized variants thereof as methods for solving minimization problems. In some applications, for example in machine learning contexts, it is important to know that for generic initial data, gradient flow trajectories do not get stuck at saddle points. There are classical results concerned with the non-degenerate situation. But if the Hessian of the objective function has a non-trivial kernel at the critical point, then these results are inconclusive in general. In this paper, we show how relevant information can be extracted by "blowing up" the objective function around the non-strict saddle point, i.e., by a suitable non-linear rescaling that makes the higher order geometry visible.
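To make the idea in the abstract concrete, here is a minimal sketch on a toy objective. Everything in it is an assumption chosen for illustration and is not taken from the paper: the function f(x, y) = x^2 - y^4 + y^6, the anisotropic rescaling x = lam^2 u, y = lam v with division by lam^4, and all helper names. At the origin the Hessian of this f is diag(2, 0), so it has a non-trivial kernel and no negative eigenvalue, yet the origin is a saddle because f decreases along the y-axis. The rescaled function tends to u^2 - v^4 as lam goes to 0, which exposes the quartic descent direction that the Hessian cannot see.

```python
# Toy illustration only: the objective, the rescaling exponents, and all
# function names are hypothetical choices, not the construction of the paper.

def f(x, y):
    # Hessian at the origin is diag(2, 0): degenerate, no negative eigenvalue.
    return x**2 - y**4 + y**6

def grad_f(x, y):
    # Exact gradient of f.
    return (2.0 * x, -4.0 * y**3 + 6.0 * y**5)

def rescaled_f(u, v, lam):
    # Anisotropic rescaling x = lam^2 * u, y = lam * v, normalized by lam^4.
    # Algebraically this equals u^2 - v^4 + lam^2 * v^6.
    return f(lam**2 * u, lam * v) / lam**4

def limit_profile(u, v):
    # Formal limit of rescaled_f as lam -> 0: the quartic term survives.
    return u**2 - v**4

def gradient_descent(x0, y0, step=0.01, n_steps=20000):
    # Explicit Euler discretization of the gradient flow of f.
    x, y = x0, y0
    for _ in range(n_steps):
        gx, gy = grad_f(x, y)
        x, y = x - step * gx, y - step * gy
    return x, y

if __name__ == "__main__":
    for lam in (1.0, 0.1, 0.01):
        print(lam, rescaled_f(1.0, 1.0, lam), limit_profile(1.0, 1.0))
    print("start on the x-axis:", gradient_descent(0.1, 0.0))
    print("start off the x-axis:", gradient_descent(0.1, 0.1))
```

In this toy setting, only initial data with y = 0 converge to the origin. A start with y different from 0 drifts away along the degenerate direction, slowly at first because the escape is driven by the quartic term, and then settles near a minimizer with y^2 = 2/3.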
Taxonomy
Topics: Model Reduction and Neural Networks · Stochastic Gradient Optimization Techniques · Topological and Geometric Data Analysis
