Convergence of First-Order Methods for Constrained Nonconvex Optimization with Dependent Data
Ahmet Alacaoglu, Hanbaek Lyu

TL;DR
This paper analyzes stochastic projected gradient methods for constrained nonconvex optimization with dependent data, achieving improved convergence rates under mild mixing conditions and extending results to various algorithms and applications.
Contribution
It provides the first convergence analysis for constrained nonconvex optimization with dependent data under mild mixing conditions, improving complexity bounds and extending to multiple algorithms.
Findings
Achieves a $\tilde{O}(t^{-1/4})$ convergence rate.
Reduces complexity from $\tilde{O}(\varepsilon^{-8})$ to $\tilde{O}(\varepsilon^{-4})$.
Extends results to stochastic proximal gradient, AdaGrad, and heavy ball momentum.
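The rate and the complexity figures are two views of the same bound. A rough sketch of the conversion, assuming the rate is stated for the norm of the stationarity measure at some iterate (written here as $\|G(x_t)\|$ with a constant $C$, notation that is not taken from the source) and ignoring the logarithmic factors hidden in $\tilde{O}$:

$$
\|G(x_t)\| \le C\, t^{-1/4}
\quad\text{and}\quad
C\, t^{-1/4} \le \varepsilon \iff t \ge \left(\frac{C}{\varepsilon}\right)^{4},
$$

so $t = O(\varepsilon^{-4})$ iterations suffice to reach accuracy $\varepsilon$. By the same arithmetic, a complexity of $\tilde{O}(\varepsilon^{-8})$ would correspond to a $t^{-1/8}$ rate under the same assumption.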
Abstract
We focus on analyzing the classical stochastic projected gradient methods under a general dependent data sampling scheme for constrained smooth nonconvex optimization. We show the worst-case rate of convergence and complexity for achieving an $\varepsilon$-near stationary point in terms of the norm of the gradient of the Moreau envelope and the gradient mapping. While the classical convergence guarantee requires i.i.d. data sampling from the target distribution, we only require a mild mixing condition of the conditional distribution, which holds for a wide class of Markov chain sampling algorithms. This improves the existing complexity for constrained smooth nonconvex optimization with dependent data from $\tilde{O}(\varepsilon^{-8})$ to $\tilde{O}(\varepsilon^{-4})$ with a significantly simpler analysis. We illustrate the generality of our…
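To make the setting concrete, here is a minimal sketch of stochastic projected gradient descent driven by Markov chain samples rather than i.i.d. draws. Everything in it is an illustrative assumption and not the authors' code: the box constraint, the lazy Markov chain, the $1/\sqrt{t}$ step size, and the toy quadratic loss (which is convex and chosen only for brevity) are all hypothetical choices. The helper `gradient_mapping_norm` evaluates one of the two stationarity measures named in the abstract, using the known mean of the chain's stationary distribution.

```python
import math
import random


def project_box(x, lo, hi):
    """Euclidean projection onto the box [lo, hi]^d."""
    return [min(max(xi, lo), hi) for xi in x]


def markov_chain_samples(states, stay_prob, n, rng):
    """Yield n dependent samples from a lazy chain on `states`.

    With probability stay_prob the previous state is repeated; otherwise
    the chain jumps to a uniformly chosen state. The stationary
    distribution is uniform, but consecutive samples are correlated.
    """
    s = rng.choice(states)
    for _ in range(n):
        if rng.random() >= stay_prob:
            s = rng.choice(states)
        yield s


def stoch_grad(x, xi):
    """Gradient in x of the toy loss 0.5 * ||x - xi||^2 (assumed loss)."""
    return [a - b for a, b in zip(x, xi)]


def projected_sgd(x0, states, n_iters, lo=-1.0, hi=1.0, stay_prob=0.9, seed=0):
    """Projected SGD where the data stream comes from the Markov chain."""
    rng = random.Random(seed)
    x = list(x0)
    samples = markov_chain_samples(states, stay_prob, n_iters, rng)
    for t, xi in enumerate(samples, start=1):
        step = 1.0 / math.sqrt(t)  # assumed diminishing step size
        g = stoch_grad(x, xi)
        x = project_box([a - step * b for a, b in zip(x, g)], lo, hi)
    return x


def gradient_mapping_norm(x, mean, lam=0.1, lo=-1.0, hi=1.0):
    """Norm of (x - proj(x - lam * grad f(x))) / lam for the toy objective.

    For the toy loss, the full gradient of the expected objective under
    the stationary distribution is x - mean.
    """
    grad = [a - m for a, m in zip(x, mean)]
    y = project_box([a - lam * g for a, g in zip(x, grad)], lo, hi)
    return math.sqrt(sum(((a - b) / lam) ** 2 for a, b in zip(x, y)))


if __name__ == "__main__":
    states = [(3.0, 0.0), (1.0, 0.0)]  # stationary mean is (2, 0)
    x = projected_sgd([0.0, 0.5], states, n_iters=200)
    print(x, gradient_mapping_norm(x, [2.0, 0.0]))
```

In this toy instance the unconstrained minimizer lies outside the box, so the plain gradient does not vanish at the solution while the gradient mapping does. That is the reason constrained analyses measure stationarity through the gradient mapping or the Moreau envelope instead of the raw gradient norm.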
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Markov Chains and Monte Carlo Methods
Methods: AdaGrad
