Mirror Descent Strikes Again: Optimal Stochastic Convex Optimization under Infinite Noise Variance
Nuri Mert Vural, Lu Yu, Krishnakumar Balasubramanian, Stanislav Volgushev, Murat A. Erdogdu

TL;DR
This paper analyzes the convergence of stochastic mirror descent for convex optimization when the gradient noise has infinite variance, proves that the resulting rates are optimal, and shows that the algorithm attains them without gradient clipping.
Contribution
It establishes convergence rates for stochastic mirror descent when the stochastic gradient is unbiased and has infinite variance but a uniformly bounded lower-order moment, without needing gradient clipping, and proves that these rates are optimal.
Findings
Convergence rates are quantified in terms of the number of iterations, the dimensionality, and related geometric parameters of the optimization problem.
The algorithm does not require gradient clipping or normalization.
Information-theoretic lower bounds show that no algorithm using only stochastic first-order oracles can achieve improved rates.
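For orientation, the generic stochastic mirror descent update can be written as below. This is the standard textbook form rather than a transcription of the paper's exact algorithm; the symbols (mirror map $\Psi$, step size $\eta_t$, stochastic gradient $g_t$, feasible set $\mathcal{X}$) are notation assumed here.

```latex
x_{t+1} = \arg\min_{x \in \mathcal{X}}
  \left\{ \eta_t \langle g_t, x \rangle + D_{\Psi}(x, x_t) \right\},
\qquad
D_{\Psi}(x, y) = \Psi(x) - \Psi(y) - \langle \nabla \Psi(y),\, x - y \rangle .
```

With $\Psi(x) = \tfrac{1}{2}\|x\|_2^2$ the Bregman divergence $D_{\Psi}$ is half the squared Euclidean distance, and the update reduces to projected stochastic gradient descent.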
Abstract
We study stochastic convex optimization under infinite noise variance. Specifically, when the stochastic gradient is unbiased and has uniformly bounded $(1+\kappa)$-th moment, for some $\kappa \in (0,1]$, we quantify the convergence rate of the Stochastic Mirror Descent algorithm with a particular class of uniformly convex mirror maps, in terms of the number of iterations, dimensionality and related geometric parameters of the optimization problem. Interestingly this algorithm does not require any explicit gradient clipping or normalization, which have been extensively used in several recent empirical and theoretical works. We complement our convergence results with information-theoretic lower bounds showing that no other algorithm using only stochastic first-order oracles can achieve improved rates. Our results have several interesting consequences for devising online/streaming…
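The following is a minimal sketch of unconstrained stochastic mirror descent without clipping, intended only to illustrate the idea in the abstract. Several things are assumptions made here and not taken from the paper: the moment order is written as 1 + kappa with kappa in (0, 1]; the mirror map is psi(x) = (1/r) * ||x||_2^r with r = (1 + kappa) / kappa, a standard example of a uniformly convex function for r >= 2; the step size eta0 / t^(1/(1+kappa)) is a hypothetical schedule; and the function names `grad_psi`, `grad_psi_conj`, and `smd` are illustrative.

```python
import numpy as np


def grad_psi(x, r):
    """Gradient of the assumed mirror map psi(x) = (1/r) * ||x||_2^r."""
    n = np.linalg.norm(x)
    if n == 0.0:
        return np.zeros_like(x)
    return n ** (r - 2.0) * x


def grad_psi_conj(y, r):
    """Inverse of grad_psi, i.e. the gradient of the convex conjugate of psi."""
    n = np.linalg.norm(y)
    if n == 0.0:
        return np.zeros_like(y)
    r_star = r / (r - 1.0)  # conjugate exponent, equal to 1 + kappa
    return n ** (r_star - 2.0) * y


def smd(grad_oracle, x0, kappa, steps, eta0, rng=None):
    """Unconstrained stochastic mirror descent; returns the averaged iterate.

    grad_oracle(x, rng) should return an unbiased stochastic gradient at x.
    No clipping or normalization is applied to the gradient.
    """
    r = (1.0 + kappa) / kappa
    x = np.asarray(x0, dtype=float)
    avg = np.zeros_like(x)
    for t in range(1, steps + 1):
        g = grad_oracle(x, rng)
        eta = eta0 / t ** (1.0 / (1.0 + kappa))  # hypothetical schedule
        y = grad_psi(x, r) - eta * g             # gradient step in the dual space
        x = grad_psi_conj(y, r)                  # map back to the primal space
        avg += (x - avg) / t                     # running average of iterates
    return avg


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = np.array([1.0, -0.5, 0.25])

    def noisy_grad(x, rng):
        # Gradient of 0.5 * ||x - target||^2 plus Student-t noise with
        # 1.5 degrees of freedom: finite mean, infinite variance.
        return (x - target) + rng.standard_t(df=1.5, size=x.shape)

    x_hat = smd(noisy_grad, np.zeros(3), kappa=0.4, steps=5000, eta0=0.1, rng=rng)
    print("distance to minimizer:", np.linalg.norm(x_hat - target))
```

With kappa = 1 the exponent r equals 2, both maps become the identity, and the loop reduces to plain stochastic gradient descent with a 1/sqrt(t) step size. For kappa < 1 the inverse map shrinks large dual vectors sublinearly, which is one intuitive way to see why a heavy-tailed gradient does not need to be clipped explicitly in this sketch.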
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Advanced Bandit Algorithms Research
Methods: Gradient Clipping
