Surpassing Gradient Descent Provably: A Cyclic Incremental Method with Linear Convergence Rate

Aryan Mokhtari; Mert Gürbüzbalaban; Alejandro Ribeiro

arXiv:1611.00347·math.OC·February 9, 2018

TL;DR

This paper introduces DIAG, a cyclic incremental gradient method that achieves linear convergence and outperforms gradient descent in large-scale convex optimization problems by efficiently approximating the full gradient.

Contribution

The paper proposes a novel cyclic incremental aggregated gradient method (DIAG) with linear convergence, improving upon traditional incremental methods and gradient descent.

Findings

01

DIAG converges linearly to the optimal solution.

02

The worst-case performance of DIAG surpasses that of gradient descent.

03

DIAG evaluates one gradient per iteration, which lowers the per-iteration cost relative to gradient descent while keeping a linear convergence rate.

Abstract

Recently, there has been growing interest in developing optimization methods for solving large-scale machine learning problems. Most of these problems boil down to minimizing an average of a finite set of smooth and strongly convex functions, where the number of functions $n$ is large. The gradient descent method (GD) minimizes such problems at a fast linear rate; however, it is not applicable to the considered large-scale optimization setting because of its high computational complexity. Incremental methods resolve this drawback of gradient methods by replacing the required gradient for the descent direction with an incremental gradient approximation. They operate by evaluating one gradient per iteration and using the average of the $n$ available gradients as a gradient approximation. Although incremental methods reduce the computational cost of GD,…
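The incremental scheme the abstract describes (evaluate one gradient per iteration, then step along the average of the $n$ stored gradients) can be sketched as follows. This is a minimal illustration of a generic cyclic incremental aggregated gradient loop, not the DIAG update itself, which this page does not state. The scalar quadratics $f_i(x) = \tfrac{1}{2} a_i (x - b_i)^2$, the function name `cyclic_iag`, the step size, and the iteration count are all assumptions chosen for the example.

```python
def cyclic_iag(a, b, step, num_iters, x0=0.0):
    """Generic cyclic incremental aggregated gradient sketch (not DIAG).

    Minimizes (1/n) * sum_i 0.5 * a[i] * (x - b[i])**2 for scalar x.
    The quadratic components are a hypothetical test problem.
    """
    n = len(a)

    def grad(i, x):
        # Gradient of the i-th component f_i(x) = 0.5 * a[i] * (x - b[i])**2.
        return a[i] * (x - b[i])

    x = x0
    # Table of the most recently evaluated gradient of each component.
    table = [grad(i, x) for i in range(n)]
    total = sum(table)

    for k in range(num_iters):
        i = k % n                    # cyclic order: 0, 1, ..., n-1, 0, ...
        new_g = grad(i, x)           # the single gradient evaluation this iteration
        total += new_g - table[i]    # refresh the running sum in O(1)
        table[i] = new_g
        x -= step * total / n        # step along the average of stored gradients
    return x


# Hypothetical data: five strongly convex scalar quadratics.
a = [1.0, 2.0, 0.5, 1.5, 1.0]
b = [0.0, 1.0, 2.0, 3.0, 4.0]

x_hat = cyclic_iag(a, b, step=0.05, num_iters=2000)
# Closed-form minimizer of the average, used only to check the sketch.
x_star = sum(ai * bi for ai, bi in zip(a, b)) / sum(a)
```

Each iteration touches one component, so its cost does not grow with $n$, whereas a GD step would evaluate all $n$ gradients. The stored gradients are stale by up to $n - 1$ iterations, which is why the step size here is kept small.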
