Dual Averaging Converges for Nonconvex Smooth Stochastic Optimization
Tuo Liu, El Mehdi Saad, Wojciech Kotłowski, Francesco Orabona

TL;DR
This paper proves that stochastic dual averaging (SDA) converges for non-convex smooth stochastic optimization at a rate matching that of SGD, and introduces ADA-DA, a variant that adapts its scaling automatically without prior knowledge of the noise level.
Contribution
It provides the first finite-time convergence guarantee for SDA in non-convex smooth stochastic settings, closing a gap with what was already known for SGD.
Findings
SDA converges at a rate of O(1/T + σ log T/√T) in non-convex smooth stochastic optimization.
A reduction views SDA as SGD applied to a sequence of implicitly regularized objectives.
ADA-DA achieves the same convergence rate without knowledge of the noise variance.
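The reduction in the findings above can be illustrated numerically. The sketch below is a minimal illustration under stated assumptions, not the paper's construction: it uses the textbook dual averaging update x_{t+1} = x_1 - η_t (g_1 + ... + g_t) with a quadratic regularizer centered at x_1, and checks that it produces the same iterates as SGD with step size η_t on f(x) + (λ_t/2)(x - x_1)^2, where λ_t = 1/η_t - 1/η_{t-1}. The test function, the noise level, the schedule η_t = c/√t, and all function names are hypothetical choices made for this example; the paper's exact reduction and tuning may differ.

```python
import math
import random


def noisy_grad(x, rng, sigma=0.1):
    # Stochastic gradient of the assumed test function
    # f(x) = x^2/2 + 2*cos(x), which is smooth and nonconvex near 0.
    return x - 2.0 * math.sin(x) + sigma * rng.gauss(0.0, 1.0)


def eta(t, c=0.5):
    # Assumed decreasing step-size schedule, eta_t = c / sqrt(t).
    return c / math.sqrt(t)


def run_sda(x1, T, seed):
    # Dual averaging: keep the running sum of gradients (theta) and
    # restart from the anchor x1 at every step.
    rng = random.Random(seed)
    x, theta = x1, 0.0
    xs = [x]
    for t in range(1, T + 1):
        theta += noisy_grad(x, rng)
        x = x1 - eta(t) * theta
        xs.append(x)
    return xs


def run_regularized_sgd(x1, T, seed):
    # SGD on f(x) + (lam_t / 2) * (x - x1)^2, where
    # lam_t = 1/eta_t - 1/eta_{t-1} (and lam_1 = 0).
    rng = random.Random(seed)
    x = x1
    xs = [x]
    for t in range(1, T + 1):
        g = noisy_grad(x, rng)
        lam = 0.0 if t == 1 else 1.0 / eta(t) - 1.0 / eta(t - 1)
        x = x - eta(t) * (g + lam * (x - x1))
        xs.append(x)
    return xs


if __name__ == "__main__":
    a = run_sda(1.0, 200, seed=0)
    b = run_regularized_sgd(1.0, 200, seed=0)
    gap = max(abs(u - v) for u, v in zip(a, b))
    print("max iterate gap:", gap)
```

Both runs use the same seed, so they see the same noise draws and the reported gap should be at the level of floating-point rounding. With a constant step size, λ_t is zero and the two methods coincide with plain SGD; the extra regularization appears only because η_t decreases.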
Abstract
Dual averaging and gradient descent with their stochastic variants stand as the two canonical recipe books for first-order optimization: Every modern variant can be viewed as a descendant of one or the other. In the convex regime, these algorithms have been deeply studied, and we know that they are essentially equivalent in terms of theoretical guarantees. On the other hand, in the non-convex setting, the situation is drastically different: While we know that SGD can minimize the gradient of non-convex smooth functions, no finite-time complexity guarantee for Stochastic Dual Averaging (SDA) was known in the same setting. In this paper, we close this gap by a reduction that views SDA as SGD applied to a sequence of implicitly regularized objectives. We show that a tuned SDA exhibits a rate of convergence O(1/T + σ log T/√T), similar to that of SGD under the…
Taxonomy
Topics: Stochastic processes and financial applications · Risk and Portfolio Optimization
