Refining asymptotic complexity bounds for nonconvex optimization methods, including why steepest descent is $o(\epsilon^{-2})$ rather than $\mathcal{O}(\epsilon^{-2})$
Serge Gratton, Chee-Khian Sim, Philippe L. Toint

TL;DR
This paper refines the evaluation complexity bounds for nonconvex optimization algorithms, showing they are often asymptotically better than previously thought, especially for steepest descent, which is $o(\frac{1}{\epsilon^2})$ rather than $O(\frac{1}{\epsilon^2})$.
Contribution
The authors provide a refined analysis of complexity bounds, demonstrating that many algorithms have bounds of order $o(\frac{1}{\epsilon^\beta})$, improving upon the standard $O(\frac{1}{\epsilon^\beta})$ bounds.
Findings
Bounds usually stated as $O(\frac{1}{\epsilon^\beta})$ can often be sharpened to $o(\frac{1}{\epsilon^\beta})$.
Refined bounds apply to known algorithms, including steepest descent.
An example shows the standard and refined bounds can be very close.
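The gap between the two kinds of bound can be observed numerically. The sketch below is an illustration only, not the authors' experiment: it assumes fixed-step steepest descent with step $1/L$ on the hypothetical test function $f(x) = \cos(x)$ (nonconvex, gradient Lipschitz constant $L = 1$, lower bound $-1$), counts the iterations $k_\epsilon$ needed to reach $|f'(x_k)| \le \epsilon$, and compares $\epsilon^2 k_\epsilon$ with the constant $2L(f(x_0) - f_{\rm low})$ from the standard bound. The names `iterations_to_accuracy`, `standard_constant`, and `results` are introduced here for illustration.

```python
import math


def iterations_to_accuracy(grad, x0, step, eps, max_iter=10_000):
    """Return the first k with |grad(x_k)| <= eps for fixed-step steepest descent."""
    x = x0
    for k in range(max_iter):
        g = grad(x)
        if abs(g) <= eps:
            return k
        x -= step * g  # steepest-descent step with a fixed step size
    raise RuntimeError("requested accuracy not reached within max_iter")


# Assumed test problem: f(x) = cos(x), so f'(x) = -sin(x), L = 1, f_low = -1.
f = math.cos
grad = lambda x: -math.sin(x)
L, f_low, x0 = 1.0, -1.0, 0.1

# With step 1/L each iteration decreases f by at least |g_k|^2 / (2L),
# so the standard argument gives eps^2 * k_eps <= 2L (f(x0) - f_low).
standard_constant = 2.0 * L * (f(x0) - f_low)

results = []
for eps in (1e-2, 1e-3, 1e-4):
    k_eps = iterations_to_accuracy(grad, x0, 1.0 / L, eps)
    results.append((eps, k_eps, k_eps * eps**2))
    print(f"eps={eps:.0e}  k_eps={k_eps}  eps^2*k_eps={k_eps * eps**2:.3e}")

print(f"standard constant: {standard_constant:.3f}")
```

On this one well-behaved function the scaled count $\epsilon^2 k_\epsilon$ falls far below the standard constant as $\epsilon$ shrinks. A single run cannot establish an asymptotic rate; it only shows what "$o$ rather than $O$" means in practice.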
Abstract
We revisit the standard "telescoping sum" argument ubiquitous in the final steps of analyzing evaluation complexity of algorithms for smooth nonconvex optimization, and obtain a refined formulation of the resulting bound as a function of the requested accuracy $\epsilon$. While bounds obtained using the standard argument typically are of the form $\mathcal{O}(\epsilon^{-\alpha})$ for some positive $\alpha$, the refined results are of the form $o(\epsilon^{-\alpha})$. We then explore to which known algorithms our refined bounds are applicable and finally describe an example showing how close the standard and refined bounds can be.
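For readers unfamiliar with the argument, here is a hedged sketch of how it usually runs for steepest descent, and of one way the $o(\epsilon^{-2})$ statement can be obtained. The notation is assumed for illustration and the derivation is not claimed to be the authors' own: $g_k = \nabla f(x_k)$, $c > 0$ is a sufficient-decrease constant, $f_{\rm low}$ is a lower bound on $f$, and $k_\epsilon$ is the first index with $\|g_k\| \le \epsilon$.

```latex
% Assumed: sufficient decrease at every iteration.
\[
  f(x_k) - f(x_{k+1}) \ge c\,\|g_k\|^2 \qquad (k \ge 0).
\]
% Standard telescoping sum: \|g_k\| > \epsilon for all k < k_\epsilon.
\[
  c\,\epsilon^2\,k_\epsilon
  \le c \sum_{k=0}^{k_\epsilon - 1} \|g_k\|^2
  \le f(x_0) - f(x_{k_\epsilon})
  \le f(x_0) - f_{\rm low}
  \quad\Longrightarrow\quad
  k_\epsilon \le \frac{f(x_0) - f_{\rm low}}{c\,\epsilon^2}
  = \mathcal{O}(\epsilon^{-2}).
\]
% Refinement: the same sum shows that the series converges, so its tails vanish.
\[
  T_j := \sum_{k \ge j} \|g_k\|^2 \le \frac{f(x_0) - f_{\rm low}}{c} < \infty,
  \qquad T_j \to 0 \ \text{as } j \to \infty.
\]
% For any fixed j, split the first k_\epsilon terms at index j.
\[
  \epsilon^2 k_\epsilon \le j\,\epsilon^2 + T_j
  \quad\Longrightarrow\quad
  \limsup_{\epsilon \to 0} \epsilon^2 k_\epsilon \le T_j \ \text{for every } j
  \quad\Longrightarrow\quad
  k_\epsilon = o(\epsilon^{-2}).
\]
```

The key observation is that the standard argument only uses the boundedness of the partial sums, whereas convergence of the series gives the stronger statement that its tails tend to zero.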
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Complexity and Algorithms in Graphs · Sparse and Compressive Sensing Techniques
