SARAH: A Novel Method for Machine Learning Problems Using Stochastic Recursive Gradient

Lam M. Nguyen; Jie Liu; Katya Scheinberg; Martin Takáč

arXiv:1703.00102 · stat.ML · September 8, 2017 · 81 cites

Open Access

TL;DR

SARAH is a stochastic recursive gradient method for finite-sum minimization that converges linearly under strong convexity and, unlike SAG/SAGA, does not store past gradients. Numerical experiments demonstrate its efficiency.

Contribution

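The paper presents SARAH, a stochastic recursive gradient algorithm with a proven linear convergence rate under strong convexity, together with a practical variant, SARAH+. It also proves linear convergence of SARAH's inner loop in the strongly convex case, a property that SVRG does not possess.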

Findings

01. SARAH achieves linear convergence under strong convexity.

02. SARAH+ demonstrates practical efficiency in experiments.

03. Unlike SAG/SAGA, SARAH does not require storing past gradients.

Abstract

In this paper, we propose the StochAstic Recursive grAdient algoritHm (SARAH), as well as its practical variant SARAH+, as a novel approach to finite-sum minimization problems. Unlike vanilla SGD and other modern stochastic methods such as SVRG, S2GD, SAG and SAGA, SARAH admits a simple recursive framework for updating stochastic gradient estimates; compared to SAG/SAGA, SARAH does not require storing past gradients. A linear convergence rate for SARAH is proven under a strong convexity assumption. We also prove a linear convergence rate (in the strongly convex case) for the inner loop of SARAH, a property that SVRG does not possess. Numerical experiments demonstrate the efficiency of our algorithm.
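
The recursive framework mentioned in the abstract can be illustrated with a minimal sketch. It assumes the update v_t = ∇f_i(w_t) − ∇f_i(w_{t−1}) + v_{t−1} followed by w_{t+1} = w_t − η v_t, with one full gradient computed at the start of each outer loop. The least-squares objective, the function names (`component_grad`, `full_grad`, `sarah`), and the default step size and loop lengths are illustrative assumptions, not values taken from the paper.

```python
import random


def component_grad(w, x, y):
    """Gradient of f_i(w) = 0.5 * (x . w - y)^2 for one data point (x, y)."""
    residual = sum(xj * wj for xj, wj in zip(x, w)) - y
    return [residual * xj for xj in x]


def full_grad(w, X, Y):
    """Gradient of P(w) = (1/n) * sum_i f_i(w)."""
    n = len(X)
    g = [0.0] * len(w)
    for x, y in zip(X, Y):
        g = [gj + cj / n for gj, cj in zip(g, component_grad(w, x, y))]
    return g


def sarah(X, Y, eta=0.1, m=50, outer_iters=30, seed=0):
    """Sketch of SARAH on least squares (hyperparameters are assumptions)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w_tilde = [0.0] * d
    for _ in range(outer_iters):
        # Outer loop: one full gradient gives the initial estimate v_0.
        w_prev = w_tilde
        v = full_grad(w_prev, X, Y)
        w = [wj - eta * vj for wj, vj in zip(w_prev, v)]
        # Iterates w_0..w_m are kept only to pick the outer-loop output;
        # no per-sample gradient table is stored (unlike SAG/SAGA).
        iterates = [w_prev, w]
        for _ in range(1, m):
            i = rng.randrange(n)
            # Recursive estimate: the same sample i is evaluated at the
            # current and the previous iterate, then added to the old v.
            g_new = component_grad(w, X[i], Y[i])
            g_old = component_grad(w_prev, X[i], Y[i])
            v = [a - b + c for a, b, c in zip(g_new, g_old, v)]
            w_prev, w = w, [wj - eta * vj for wj, vj in zip(w, v)]
            iterates.append(w)
        # Next outer-loop start: an inner iterate chosen uniformly at random.
        w_tilde = rng.choice(iterates)
    return w_tilde


if __name__ == "__main__":
    # Noise-free toy data generated from a hypothetical w_true = [2, -1].
    X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]
    Y = [2.0, -1.0, 1.0, 3.0]
    print(sarah(X, Y))
```

Each inner step needs two component-gradient evaluations and only the previous estimate `v`, which is why no table of past gradients is kept. In SVRG, by contrast, the correction term is always evaluated at the fixed outer-loop snapshot rather than at the previous iterate.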

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Markov Chains and Monte Carlo Methods

Methods: SAGA · Stochastic Gradient Descent