Convergence of the momentum method for semialgebraic functions with locally Lipschitz gradients
Cédric Josz, Lexiao Lai, Xiaopeng Li

TL;DR
This paper introduces a new convergence analysis for the momentum method applied to differentiable semialgebraic functions with locally Lipschitz gradients. It yields convergence guarantees without assuming global Lipschitz continuity of the gradient, coercivity, or a global growth condition.
Contribution
It develops a new length formula governing the iterates of the momentum method. The formula ensures convergence under weaker conditions than in prior work and extends the method's guarantees to principal component analysis, matrix sensing, and linear neural networks.
Findings
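First convergence guarantee for the momentum method from arbitrary initial points on principal component analysis, matrix sensing, and linear neural networks
Establishes local convergence, global convergence, and convergence to local minimizers
Does not require global Lipschitz continuity of the gradient, coercivity, or a global growth condition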
Abstract
We propose a new length formula that governs the iterates of the momentum method when minimizing differentiable semialgebraic functions with locally Lipschitz gradients. It enables us to establish local convergence, global convergence, and convergence to local minimizers without assuming global Lipschitz continuity of the gradient, coercivity, and a global growth condition, as is done in the literature. As a result, we provide the first convergence guarantee of the momentum method starting from arbitrary initial points when applied to principal component analysis, matrix sensing, and linear neural networks.
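As an illustration of the setting described in the abstract, the following is a minimal sketch, not the authors' algorithm or experiments. It assumes the standard heavy-ball form of the momentum update, x_{k+1} = x_k - alpha * grad f(x_k) + beta * (x_k - x_{k-1}), and a toy objective f(x, y) = 0.5 * (x*y - 1)^2, which is semialgebraic, has a locally but not globally Lipschitz gradient, and is not coercive. The function names, step size, momentum parameter, and starting point are all hypothetical choices made for this example. The sketch also accumulates the length of the iterate path, the quantity that a length formula would bound.

```python
import math


def f(p):
    """Toy objective f(x, y) = 0.5 * (x*y - 1)^2 (an assumption, not from the paper)."""
    x, y = p
    return 0.5 * (x * y - 1.0) ** 2


def grad_f(p):
    """Gradient of the toy objective: ((x*y - 1) * y, (x*y - 1) * x)."""
    x, y = p
    r = x * y - 1.0
    return (r * y, r * x)


def heavy_ball(x0, alpha, beta, iters):
    """Run the heavy-ball momentum update and return (final iterate, path length).

    Update (assumed form): x_next = x - alpha * grad_f(x) + beta * (x - x_prev).
    The path length is the sum of the Euclidean norms of successive steps.
    """
    prev = tuple(x0)
    cur = tuple(x0)
    length = 0.0
    for _ in range(iters):
        g = grad_f(cur)
        nxt = tuple(
            c - alpha * gi + beta * (c - p) for c, gi, p in zip(cur, g, prev)
        )
        length += math.hypot(nxt[0] - cur[0], nxt[1] - cur[1])
        prev, cur = cur, nxt
    return cur, length


if __name__ == "__main__":
    # Hypothetical parameters chosen only for this illustration.
    x_final, path_length = heavy_ball((1.5, 0.2), alpha=0.05, beta=0.5, iters=2000)
    print("final iterate:", x_final)
    print("objective value:", f(x_final))
    print("length of iterate path:", path_length)
```

In this toy run the iterates settle on a point of the curve x*y = 1 and the accumulated path length stays finite, which is the kind of behavior the paper's analysis is meant to guarantee; the sketch itself proves nothing beyond this one example.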
Taxonomy
Topics: Matrix Theory and Algorithms · Advanced Numerical Methods in Computational Mathematics · Numerical methods in inverse problems
