Adaptive Stochastic Variance Reduction for Non-convex Finite-Sum Minimization

Ali Kavis; Stratis Skoulakis; Kimon Antonakopoulos; Leello Tadesse Dadi; Volkan Cevher

arXiv:2211.01851 · math.OC · November 4, 2022 · 1 citation

TL;DR

AdaSpider is an adaptive variance-reduction algorithm for non-convex finite-sum optimization that requires no prior knowledge of problem parameters and achieves oracle complexity that is optimal up to logarithmic factors.

Contribution

The paper introduces AdaSpider, which the authors describe as, to their knowledge, the first parameter-free non-convex variance-reduction method; its complexity bound is optimal up to logarithmic factors.

Findings

01

Achieves $\tilde{O}(n + \frac{\sqrt{n}}{\epsilon^2})$ oracle calls to compute an $\epsilon$-stationary point.

02

Does not require knowledge of the smoothness constant $L$, the target accuracy $\epsilon$, or any bound on gradient norms.

03

Matches the lower bound complexity up to logarithmic factors.
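
To give a feel for the size of the bound in the findings above, the following small sketch compares $n + \sqrt{n}/\epsilon^2$ with the standard $n/\epsilon^2$ cost of plain full-gradient descent (about $1/\epsilon^2$ iterations, each evaluating all $n$ component gradients). The comparison baseline, the function names, and the values of `n` and `eps` are illustrative assumptions, not taken from the paper; constants and logarithmic factors are ignored.

```python
import math


def spider_type_bound(n, eps):
    """n + sqrt(n) / eps^2, ignoring constants and logarithmic factors."""
    return n + math.sqrt(n) / eps**2


def full_gradient_bound(n, eps):
    """n / eps^2: about 1/eps^2 iterations, each touching all n components."""
    return n / eps**2


# Illustrative values only (assumed, not from the paper).
n, eps = 1_000_000, 1e-2
ratio = full_gradient_bound(n, eps) / spider_type_bound(n, eps)
print(f"variance-reduced: {spider_type_bound(n, eps):.3g} oracle calls")
print(f"full gradient:    {full_gradient_bound(n, eps):.3g} oracle calls")
print(f"ratio:            {ratio:.0f}x")
```

For these values the variance-reduced bound is smaller by roughly three orders of magnitude, and the gap widens as $n$ grows.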

Abstract

We propose an adaptive variance-reduction method, called AdaSpider, for minimization of $L$-smooth, non-convex functions with a finite-sum structure. In essence, AdaSpider combines an AdaGrad-inspired [Duchi et al., 2011; McMahan & Streeter, 2010], but fairly distinct, adaptive step-size schedule with the recursive stochastic path-integrated estimator proposed in [Fang et al., 2018]. To our knowledge, AdaSpider is the first parameter-free non-convex variance-reduction method, in the sense that it does not require knowledge of problem-dependent parameters such as the smoothness constant $L$, the target accuracy $\epsilon$, or any bound on gradient norms. In doing so, we are able to compute an $\epsilon$-stationary point with $\tilde{O}(n + \sqrt{n}/\epsilon^{2})$ oracle calls, which matches the respective lower bound up to logarithmic factors.
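
The combination described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' reference implementation: it assumes a full-gradient refresh every `n` iterations, a single sampled component between refreshes, and an AdaGrad-norm-style step size of the form $n^{-1/4} / \sqrt{\sqrt{n} + \sum_s \|v_s\|^2}$. The exact constants and refresh schedule in the paper may differ, and the names `adaspider_sketch` and `grad_i` are hypothetical. The demo uses a toy quadratic (which is convex) only because its stationary point is known in closed form.

```python
import numpy as np


def adaspider_sketch(grad_i, x0, n, num_iters, seed=0):
    """Recursive (SPIDER-style) gradient estimator with an adaptive step size.

    grad_i(x, i) must return the gradient of the i-th component at x.
    No smoothness constant or target accuracy is passed in.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    x_prev = x.copy()
    v = np.zeros_like(x)
    sum_sq = 0.0  # running sum of squared estimator norms
    for t in range(num_iters):
        if t % n == 0:
            # Periodic refresh: exact gradient over all n components.
            v = np.mean([grad_i(x, i) for i in range(n)], axis=0)
        else:
            # Recursive update: correct the previous estimate with one sample.
            i = rng.integers(n)
            v = v + grad_i(x, i) - grad_i(x_prev, i)
        sum_sq += float(v @ v)
        # Assumed step-size form; depends only on n and observed estimates.
        step = n ** -0.25 / np.sqrt(np.sqrt(n) + sum_sq)
        x_prev, x = x, x - step * v
    return x


# Toy problem: f_i(x) = 0.5 * ||x - c_i||^2, stationary point is mean(c_i).
c = np.arange(32, dtype=float).reshape(16, 2) / 10.0
x_hat = adaspider_sketch(lambda x, i: x - c[i], np.zeros(2), n=16, num_iters=300)
print(x_hat, c.mean(axis=0))
```

The step size shrinks automatically as gradient estimates accumulate, which is the sense in which no tuning against $L$ or $\epsilon$ is needed in this sketch.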

Videos

Adaptive Stochastic Variance Reduction for Non-convex Finite-Sum Minimization · SlidesLive

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Machine Learning and Algorithms