Stop Wasting My Gradients: Practical SVRG

Reza Babanezhad; Mohamed Osama Ahmed; Alim Virani; Mark Schmidt; Jakub Konečný; Scott Sallinen

arXiv:1511.01942 · cs.LG · August 6, 2016 · 44 cites

TL;DR

This paper presents practical improvements to the stochastic variance-reduced gradient (SVRG) method that cut the number of gradient evaluations in both the early and the later iterations while preserving its convergence rate, making it more efficient for large-scale machine learning tasks.

Contribution

The paper proposes SVRG variants that tolerate a decreasing sequence of errors in the control variate, exploit support vectors, and use alternative mini-batch selection strategies to improve efficiency and convergence.
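
To make the phrase "errors in the control variate" concrete, the standard SVRG inner step can be written with an explicit error term. This is a sketch in commonly used notation, which is an assumption here and may differ from the paper's: $x^s$ is the snapshot, $\mu^s$ the control-variate mean, $e^s$ its error, $\eta$ the step size, and $B^s$ a sampled batch.

```latex
% Objective: an average of n component functions
f(x) = \frac{1}{n} \sum_{i=1}^{n} f_i(x)

% Inner step with a possibly inexact control-variate mean \mu^s
x_t = x_{t-1} - \eta \left( \nabla f_{i_t}(x_{t-1}) - \nabla f_{i_t}(x^s) + \mu^s \right),
\qquad
\mu^s = \nabla f(x^s) + e^s

% Growing-batch choice: estimate \mu^s from a batch B^s whose size increases with s
\mu^s = \frac{1}{|B^s|} \sum_{i \in B^s} \nabla f_i(x^s)
```

Setting $e^s = 0$ recovers SVRG with an exact full gradient at every snapshot; a growing batch is one way to make $e^s$ shrink across outer iterations.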

Findings

01 Growing-batch strategies reduce the number of gradient calculations in the early iterations.

02 Exploiting support vectors reduces the number of gradient computations in the later iterations.

03 The commonly used regularized SVRG iteration is justified and improves the convergence rate.
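
The support-vector finding rests on the fact that, for hinge-type losses, examples classified with a large enough margin have a zero gradient. The sketch below illustrates only that fact, using one common form of the Huberized hinge loss. The function names, the smoothing parameter `eps`, and the helper `support_mask` are hypothetical, and the sketch does not reproduce the paper's rule for deciding when an evaluation can be skipped.

```python
import numpy as np

def huberized_hinge_grad(margin, eps=0.5):
    """Derivative of a Huberized hinge loss with respect to the margin y_i * (x_i . w).

    Assumed loss form: 0 if margin > 1 + eps, 1 - margin if margin < 1 - eps,
    and (1 + eps - margin)^2 / (4 * eps) in between.
    """
    margin = np.asarray(margin, dtype=float)
    grad = np.zeros_like(margin)            # large margin: zero loss, zero gradient
    grad[margin < 1 - eps] = -1.0           # linear region
    mid = np.abs(1 - margin) <= eps         # quadratic smoothing region
    grad[mid] = -(1 + eps - margin[mid]) / (2 * eps)
    return grad

def support_mask(X, y, w, eps=0.5):
    """True for examples whose loss gradient at w is nonzero (the support vectors)."""
    return huberized_hinge_grad(y * (X @ w), eps) != 0.0
```

Examples where the mask is `False` at the snapshot contribute nothing to the snapshot gradient, which is what makes it possible to save gradient computations once most examples are classified with a comfortable margin.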

Abstract

We present and analyze several strategies for improving the performance of stochastic variance-reduced gradient (SVRG) methods. We first show that the convergence rate of these methods can be preserved under a decreasing sequence of errors in the control variate, and use this to derive variants of SVRG that use growing-batch strategies to reduce the number of gradient calculations required in the early iterations. We further (i) show how to exploit support vectors to reduce the number of gradient computations in the later iterations, (ii) prove that the commonly-used regularized SVRG iteration is justified and improves the convergence rate, (iii) consider alternate mini-batch selection strategies, and (iv) consider the generalization error of the method.
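
The growing-batch idea described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes an L2-regularized least-squares objective, and the function name `svrg_growing_batch`, the parameters `batch0` and `growth`, and the default step size are all hypothetical choices made for the example.

```python
import numpy as np

def svrg_growing_batch(X, y, lam=1e-3, step=None, epochs=30,
                       inner_steps=None, batch0=8, growth=2.0, seed=0):
    """SVRG on f(w) = (1/n) sum_i 0.5*(x_i.w - y_i)^2 + 0.5*lam*|w|^2,
    with the snapshot gradient estimated from a batch that grows each epoch."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    m = inner_steps or n
    if step is None:
        # Conservative step from the largest per-example smoothness constant.
        L = np.max(np.sum(X ** 2, axis=1)) + lam
        step = 0.1 / L
    w_snap = np.zeros(d)
    batch = float(batch0)
    for _ in range(epochs):
        # Approximate the full gradient at the snapshot using b <= n examples.
        b = min(n, int(batch))
        idx = rng.choice(n, size=b, replace=False)
        mu = X[idx].T @ (X[idx] @ w_snap - y[idx]) / b + lam * w_snap
        w = w_snap.copy()
        for _ in range(m):
            i = rng.integers(n)
            xi = X[i]
            g_cur = xi * (xi @ w - y[i]) + lam * w             # gradient at current iterate
            g_snap = xi * (xi @ w_snap - y[i]) + lam * w_snap  # same example at snapshot
            w -= step * (g_cur - g_snap + mu)                  # variance-reduced step
        w_snap = w
        batch *= growth  # larger batch, hence smaller error, at the next snapshot
    return w_snap
```

Early epochs touch only a few examples when forming the snapshot gradient, which is where the savings come from; once the batch reaches `n`, the loop reduces to standard SVRG with an exact full gradient.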

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Advanced Bandit Algorithms Research · Privacy-Preserving Technologies in Data