Generalized-Smooth Nonconvex Optimization is As Efficient As Smooth Nonconvex Optimization
Ziyi Chen, Yi Zhou, Yingbin Liang, Zhaosong Lu

TL;DR
This paper introduces a new class of generalized-smooth nonconvex functions and shows that gradient-based algorithms can solve problems in this class as efficiently as smooth nonconvex problems.
Contribution
The paper proposes a novel $\alpha$-symmetric generalized-smoothness concept, analyzes its properties, and develops algorithms with optimal complexity for this class of nonconvex problems.
Findings
Normalized gradient descent achieves $\mathcal{O}(\epsilon^{-2})$ iteration complexity.
The SPIDER algorithm attains $\mathcal{O}(\epsilon^{-3})$ sample complexity.
Generalized-smooth problems are as efficiently solvable as smooth problems.
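To make the first finding concrete, here is a minimal sketch of a generic normalized gradient descent loop. It is not the paper's exact algorithm: the function names, the `beta` normalization exponent, the step size, and the quartic test objective are all illustrative assumptions. The quartic is used only because high-order polynomials are among the functions the generalized-smooth class is said to cover.

```python
import math


def normalized_gradient_descent(grad, x0, step=0.01, beta=1.0, iters=1000, tol=1e-8):
    """Generic normalized gradient descent (illustrative sketch).

    Update rule: x <- x - step * g / ||g||**beta, where g = grad(x).
    `beta` is a hypothetical normalization exponent; beta=1 gives the
    classic unit-length step, beta=0 recovers plain gradient descent.
    """
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        norm = math.sqrt(sum(gi * gi for gi in g))
        if norm <= tol:
            # Gradient is numerically zero: treat x as stationary.
            break
        scale = step / norm ** beta
        x = [xi - scale * gi for xi, gi in zip(x, g)]
    return x


def quartic_grad(x):
    """Gradient of the assumed test objective f(x) = sum(x_i ** 4)."""
    return [4.0 * xi ** 3 for xi in x]


# The gradient of the quartic grows cubically, so it is not globally
# Lipschitz; normalizing keeps every step bounded regardless of scale.
x_final = normalized_gradient_descent(quartic_grad, [3.0, -2.0], step=0.01, iters=2000)
```

With a constant step and `beta=1`, every update moves the iterate by exactly `step`, so it settles into a small neighborhood of the minimizer rather than converging exactly. A decaying step size would be the usual remedy.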
Abstract
Various optimal gradient-based algorithms have been developed for smooth nonconvex optimization. However, many nonconvex machine learning problems do not belong to the class of smooth functions, and therefore the existing algorithms are sub-optimal. Instead, these problems have been shown to satisfy certain generalized-smooth conditions, which have not been well understood in the existing literature. In this paper, we propose a notion of $\alpha$-symmetric generalized-smoothness that extends the existing notions and covers many important functions such as high-order polynomials and exponential functions. We study the fundamental properties and establish descent lemmas for the functions in this class. Then, to solve such a large class of nonconvex problems, we design a special deterministic normalized gradient descent algorithm that achieves the optimal iteration complexity…
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Machine Learning and ELM
