Variance Reduction on General Adaptive Stochastic Mirror Descent
Wenjie Li, Zhanyu Wang, Yichen Zhang, Guang Cheng

TL;DR
This paper introduces SVRAMD, a variance reduction framework for adaptive mirror descent algorithms, improving convergence rates in nonsmooth nonconvex optimization and validating results through deep learning experiments.
Contribution
The paper proposes SVRAMD, a general variance reduction framework for adaptive mirror descent. The framework covers algorithms with time-varying step sizes and self-adaptive algorithms such as AdaGrad and RMSProp, and the paper proves that it lowers their SFO complexity.
Findings
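Variance reduction lowers the stochastic first-order oracle (SFO) complexity of adaptive mirror descent algorithms and thus accelerates their convergence.
SVRAMD's convergence rates recover the best existing rates of non-adaptive variance reduced mirror descent algorithms.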
Experimental results in deep learning support theoretical claims.
Abstract
In this work, we investigate the idea of variance reduction by studying its properties with general adaptive mirror descent algorithms in nonsmooth nonconvex finite-sum optimization problems. We propose a simple yet generalized framework for variance reduced adaptive mirror descent algorithms named SVRAMD and provide its convergence analysis in both the nonsmooth nonconvex problem and the P-L conditioned problem. We prove that variance reduction reduces the SFO complexity of adaptive mirror descent algorithms and thus accelerates their convergence. In particular, our general theory implies that variance reduction can be applied to algorithms using time-varying step sizes and self-adaptive algorithms such as AdaGrad and RMSProp. Moreover, the convergence rates of SVRAMD recover the best existing rates of non-adaptive variance reduced mirror descent algorithms without complicated…
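To make the idea concrete, here is a minimal sketch of variance reduction combined with a self-adaptive step. It is not the authors' SVRAMD algorithm, whose details are not given in this summary. It assumes a smooth least-squares objective with no nonsmooth term, an SVRG-style gradient estimator, and an AdaGrad-style diagonal scaling, which can be read as mirror descent with a time-varying quadratic proximal function. The names `grad_batch`, `loss`, and `svrg_adagrad_sketch`, and all default hyperparameters, are hypothetical.

```python
import math
import random


def grad_batch(A, b, x, idx):
    """Average gradient of 0.5 * (a_i . x - b_i)^2 over the rows listed in idx."""
    d = len(x)
    g = [0.0] * d
    for i in idx:
        residual = sum(a * xj for a, xj in zip(A[i], x)) - b[i]
        for j in range(d):
            g[j] += residual * A[i][j] / len(idx)
    return g


def loss(A, b, x):
    """Finite-sum objective (1 / 2n) * sum_i (a_i . x - b_i)^2."""
    n = len(A)
    total = 0.0
    for i in range(n):
        total += (sum(a * xj for a, xj in zip(A[i], x)) - b[i]) ** 2
    return total / (2 * n)


def svrg_adagrad_sketch(A, b, epochs=20, inner_steps=50, batch=5,
                        lr=0.5, eps=1e-8, seed=0):
    """Illustrative only: SVRG-style estimator with AdaGrad-style scaling."""
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    x = [0.0] * d
    accum = [0.0] * d  # running sum of squared gradient estimates (AdaGrad)
    for _ in range(epochs):
        # Outer loop: take a snapshot and compute its full gradient once.
        snapshot = list(x)
        full_grad = grad_batch(A, b, snapshot, range(n))
        for _ in range(inner_steps):
            idx = [rng.randrange(n) for _ in range(batch)]
            g_x = grad_batch(A, b, x, idx)
            g_snap = grad_batch(A, b, snapshot, idx)
            for j in range(d):
                # Variance-reduced estimate: batch gradient at x, corrected
                # by the same batch at the snapshot plus the full gradient.
                v = g_x[j] - g_snap[j] + full_grad[j]
                accum[j] += v * v
                # Adaptive step: per-coordinate scaling by sqrt(accum).
                x[j] -= lr * v / (math.sqrt(accum[j]) + eps)
    return x
```

The estimator `v` is unbiased for the full gradient, and its variance shrinks as `x` approaches the snapshot. That shrinking variance is the mechanism behind the reduction in SFO complexity described in the abstract. A nonsmooth regularizer would require a proximal step, which this sketch omits.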
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Machine Learning and ELM
Methods: AdaGrad · RMSProp
