Convergence Rate of Incremental Gradient and Newton Methods

Mert Gürbüzbalaban; Asuman Ozdaglar; Pablo Parrilo

arXiv:1510.08562·math.OC·February 9, 2022

TL;DR

This paper analyzes the convergence rates of the incremental gradient (IG) and incremental Newton methods for strongly convex finite sums under constant and diminishing stepsizes, and contrasts their tuning requirements: incremental Newton attains an O(1/k) rate without tuning the stepsize to the strong convexity constant.

Contribution

It provides new convergence rate results for incremental gradient and Newton methods under various stepsize rules, including conditions for optimal rates and their dependence on problem parameters.

Findings

01

With stepsize α_k = Θ(1/k^s), s ∈ (0, 1], the distance of the incremental gradient iterates to the optimal solution converges at rate O(1/k^s).

02

Incremental Newton achieves an O(1/k) rate without tuning the stepsize to the strong convexity constant.

03

Results are tight, with examples confirming the bounds.

Abstract

The incremental gradient method is a prominent algorithm for minimizing a finite sum of smooth convex functions, used in many contexts including large-scale data processing applications and distributed optimization over networks. It is a first-order method that processes the functions one at a time based on their gradient information. The incremental Newton method, on the other hand, is a second-order variant which exploits additionally the curvature information of the underlying functions and can therefore be faster. In this paper, we focus on the case when the objective function is strongly convex and present fast convergence results for the incremental gradient and incremental Newton methods under the constant and diminishing stepsizes. For a decaying stepsize rule $\alpha_k = \Theta(1/k^{s})$ with $s \in (0, 1]$, we show that the distance of the IG iterates to the optimal solution…
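As a rough illustration of the method the abstract describes, the following is a minimal sketch of cyclic incremental gradient on a toy one-dimensional problem, not the authors' implementation. Everything concrete here is an assumption made for the example: the component functions f_i(x) = 0.5 * c_i * (x - b_i)^2, the helper names `minimizer` and `incremental_gradient`, the stepsize form alpha_k = R / k^s held fixed within each cycle, and the numeric values of R, s, c and b.

```python
# Minimal sketch (illustrative assumptions only): cyclic incremental gradient
# on f(x) = sum_i 0.5 * c_i * (x - b_i)^2 with c_i > 0, which is strongly convex.

def minimizer(c, b):
    """Closed-form minimizer of the toy sum: the c-weighted average of b."""
    return sum(ci * bi for ci, bi in zip(c, b)) / sum(c)


def incremental_gradient(c, b, x0, R, s, num_cycles):
    """Run num_cycles passes over the components, one gradient step each.

    The stepsize alpha_k = R / k**s (assumed form of a Theta(1/k^s) rule)
    is kept constant inside cycle k and decays from one cycle to the next.
    """
    x = x0
    for k in range(1, num_cycles + 1):
        alpha = R / k ** s
        for ci, bi in zip(c, b):
            grad_i = ci * (x - bi)  # gradient of the i-th component only
            x -= alpha * grad_i
    return x


if __name__ == "__main__":
    # Hypothetical problem data chosen for the illustration.
    c = [1.0, 2.0, 3.0]
    b = [1.0, -1.0, 2.0]
    x_star = minimizer(c, b)
    for cycles in (10, 100, 1000):
        x = incremental_gradient(c, b, x0=0.0, R=0.3, s=1.0, num_cycles=cycles)
        print(cycles, abs(x - x_star))
```

Running the script prints the distance to the minimizer after 10, 100 and 1000 cycles, which gives a quick empirical feel for how the error shrinks as the stepsize decays. The toy setting says nothing about the constants or conditions in the paper's bounds.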
