The Lingering of Gradients: Theory and Applications

Zeyuan Allen-Zhu; David Simchi-Levi; Xinshang Wang

arXiv:1901.02871 · math.OC · May 29, 2019

Open Access

TL;DR

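This paper refines the complexity analysis of first-order methods by accounting for the "lingering" of gradients: once a gradient is computed at one iterate, gradients at subsequent iterates may be cheaper to compute. The refined analysis yields faster running times for several first-order methods, with applications to a revenue management problem and a support vector machine problem.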

Contribution

The paper develops a theoretical framework for lingering gradients, shows that it improves the running time of several first-order methods, and applies it to large-scale problems in revenue management and support vector machines.

Findings

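01. If the additional time to compute later gradients scales linearly with the traveled distance, the convergence rate of gradient descent improves from $1/T$ to $\exp(-T^{1/3})$.

02. A hypothetical revenue management problem on the Yahoo! Front Page Today Module with 4.6m users is solved to $10^{-6}$ error using only 6 passes of the dataset.

03. A real-life support vector machine problem is solved to an accuracy two orders of magnitude better than existing algorithms.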
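
To give a sense of the gap in finding 01, the two rates can be compared by the time $T$ each needs to reach a target error $\varepsilon$. This is a rough illustration that treats both rates as exact error bounds with all constants set to 1, which is an assumption; the summary does not state the constants.

$$
\frac{1}{T} \le \varepsilon \iff T \ge \frac{1}{\varepsilon},
\qquad
\exp\left(-T^{1/3}\right) \le \varepsilon \iff T \ge \ln^{3}\frac{1}{\varepsilon}.
$$

For $\varepsilon = 10^{-6}$, this gives $T \ge 10^{6}$ for the $1/T$ rate versus $T \ge (6 \ln 10)^{3} \approx 2.6 \times 10^{3}$ for the $\exp(-T^{1/3})$ rate.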

Abstract

Classically, the time complexity of a first-order method is estimated by its number of gradient computations. In this paper, we study a more refined complexity by taking into account the 'lingering' of gradients: once a gradient is computed at $x_{k}$, the additional time to compute gradients at $x_{k+1}, x_{k+2}, \dots$ may be reduced. We show how this improves the running time of several first-order methods. For instance, if the 'additional time' scales linearly with respect to the traveled distance, then the 'convergence rate' of gradient descent can be improved from $1/T$ to $\exp(-T^{1/3})$. On the application side, we solve a hypothetical revenue management problem on the Yahoo! Front Page Today Module with 4.6m users to $10^{-6}$ error using only 6 passes of the dataset; and solve a real-life support vector machine problem to an accuracy that is two orders of magnitude better…
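
The lingering idea in the abstract can be illustrated with a small sketch. It is not the paper's algorithm. It assumes a finite-sum hinge loss $f(x) = \frac{1}{n}\sum_i \max(0, 1 - b_i \langle a_i, x\rangle)$, chosen here only because each component is piecewise linear: its subgradient at $x$ stays the same at every point closer than $|1 - b_i \langle a_i, x\rangle| / \|a_i\|$ to $x$. The function names, the `expiry` bookkeeping, and the step size are all hypothetical. The sketch runs subgradient descent, reuses each cached component gradient until the total traveled distance exceeds that radius, and counts how many components are recomputed per step.

```python
import numpy as np


def hinge_grad_and_radius(A, b, x, idx):
    """Subgradients of the hinge components `idx` at x, and for each one a
    radius within which that subgradient provably does not change."""
    rows = A[idx]
    margins = b[idx] * (rows @ x)
    active = margins < 1.0
    grads = np.where(active[:, None], -b[idx][:, None] * rows, 0.0)
    norms = np.linalg.norm(rows, axis=1)
    # |b_i<a_i,y> - b_i<a_i,x>| <= ||a_i|| * ||y - x||, so the sign of
    # (1 - margin) cannot flip while ||y - x|| < |1 - margin| / ||a_i||.
    radius = np.abs(1.0 - margins) / np.maximum(norms, 1e-12)
    return grads, radius


def lingering_gd(A, b, steps=50, lr=0.1):
    """Subgradient descent that recomputes a component gradient only after
    the iterate may have left that component's lingering radius."""
    n, d = A.shape
    x = np.zeros(d)
    cached = np.zeros((n, d))
    expiry = np.full(n, -np.inf)  # every component is stale at the start
    traveled = 0.0                # total path length walked so far
    recomputed = []
    for _ in range(steps):
        # By the triangle inequality, the distance from the point where a
        # gradient was cached is at most the path length walked since then.
        stale = np.flatnonzero(traveled >= expiry)
        if stale.size:
            grads, radius = hinge_grad_and_radius(A, b, x, stale)
            cached[stale] = grads
            expiry[stale] = traveled + radius
        recomputed.append(int(stale.size))
        step = lr * cached.mean(axis=0)
        x = x - step
        traveled += float(np.linalg.norm(step))
    return x, recomputed


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.normal(size=(200, 5))
    b = np.sign(A @ rng.normal(size=5))
    x, recomputed = lingering_gd(A, b, steps=30)
    print("component gradients recomputed per step:", recomputed)
```

The sketch tracks a single scalar, the traveled distance, so the staleness check costs one comparison per component. It follows the same iterates as plain subgradient descent with fewer component-gradient evaluations. It does not reproduce the data structures or the rates analyzed in the paper.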

Taxonomy

Topics: Optimization and Search Problems · Stochastic Gradient Optimization Techniques · Advanced Bandit Algorithms Research