Sharp Analysis of Stochastic Optimization under Global Kurdyka-Łojasiewicz Inequality
Ilyas Fatkhullin, Jalal Etesami, Niao He, Negar Kiyavash

TL;DR
This paper analyzes the sample complexity of stochastic optimization under the global Kurdyka-Lojasiewicz inequality and proposes a variance-reduced variant of SGD with improved sample complexity, which is optimal in a case that arises in policy optimization for reinforcement learning.
Contribution
It introduces a general framework for analyzing SGD under KL conditions and proposes a modified SGD with variance reduction that achieves optimal sample complexity.
Findings
Sample complexity of SGD under the $\beta$-PL condition: $\tilde{O}\left(\frac{1}{\epsilon^{(4-\beta)/\beta}}\right)$
Modified SGD with variance reduction achieves $\tilde{O}\left(\frac{1}{\epsilon^{2/\beta}}\right)$ complexity
First optimal algorithm for the $\beta=1$ case in applications like policy optimization
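To make the gap between the two rates concrete, the short sketch below evaluates the exponent of $1/\epsilon$ in each bound for a few values of the PL degree (written `beta` here). It is plain arithmetic on the formulas quoted above, not code from the paper. The function names are hypothetical, and the sampled values assume the degree lies between 1 and 2.

```python
from fractions import Fraction


def sgd_exponent(beta):
    """Exponent p in the quoted SGD bound O~(1/eps^p), with p = (4 - beta) / beta."""
    beta = Fraction(beta)
    return (4 - beta) / beta


def variance_reduced_exponent(beta):
    """Exponent p in the quoted variance-reduced bound O~(1/eps^p), with p = 2 / beta."""
    beta = Fraction(beta)
    return 2 / beta


if __name__ == "__main__":
    # Assumed range for illustration: PL degree between 1 and 2.
    for beta in (Fraction(1), Fraction(3, 2), Fraction(2)):
        print(
            f"beta={beta}: SGD exponent={sgd_exponent(beta)}, "
            f"variance-reduced exponent={variance_reduced_exponent(beta)}"
        )
```

The two exponents coincide at `beta = 2` and are furthest apart at `beta = 1`, where the SGD exponent is 3 and the variance-reduced exponent is 2.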
Abstract
We study the complexity of finding the global solution to stochastic nonconvex optimization when the objective function satisfies global Kurdyka-Lojasiewicz (KL) inequality and the queries from stochastic gradient oracles satisfy mild expected smoothness assumption. We first introduce a general framework to analyze Stochastic Gradient Descent (SGD) and its associated nonlinear dynamics under the setting. As a byproduct of our analysis, we obtain a sample complexity of $\tilde{O}\left(\frac{1}{\epsilon^{(4-\beta)/\beta}}\right)$ for SGD when the objective satisfies the so called $\beta$-PL condition, where $\beta$ is the degree of gradient domination. Furthermore, we show that a modified SGD with variance reduction and restarting (PAGER) achieves an improved sample complexity of $\tilde{O}\left(\frac{1}{\epsilon^{2/\beta}}\right)$ when the objective satisfies the average smoothness assumption. This leads to the first…
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Advanced Bandit Algorithms Research · Sparse and Compressive Sensing Techniques
