Harnessing Smoothness to Accelerate Distributed Optimization

Guannan Qu; Na Li

arXiv:1605.07112·math.OC·May 2, 2017·IEEE Trans. Control. Netw. Syst.

TL;DR

This paper introduces a distributed optimization algorithm that uses function smoothness and history information to match the convergence rates of centralized gradient descent: $O(1/t)$ for convex smooth functions, and linear convergence when the functions are also strongly convex.

Contribution

The paper presents a novel distributed algorithm that effectively harnesses smoothness and history information to accelerate convergence, achieving rates comparable to centralized gradient descent.

Findings

1. Achieves an $O(1/t)$ convergence rate for convex smooth functions.

2. Attains linear convergence for strongly convex and smooth functions.

3. Demonstrates the necessity of history information for fast convergence.

Abstract

There has been a growing effort in studying the distributed optimization problem over a network. The objective is to optimize a global function formed by a sum of local functions, using only local computation and communication. The literature has developed consensus-based distributed (sub)gradient descent (DGD) methods and has shown that they have the same convergence rate $O(\frac{\log t}{\sqrt{t}})$ as the centralized (sub)gradient methods (CGD) when the function is convex but possibly nonsmooth. However, when the function is convex and smooth, it is unclear how to harness the smoothness under the DGD framework to obtain a faster convergence rate comparable to CGD's. In this paper, we propose a distributed algorithm that, despite using the same amount of communication per iteration as DGD, can effectively harness the function smoothness and converge to the optimum…
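
To make the contrast between DGD and a method that uses history information concrete, the following is a minimal sketch under stated assumptions. The abstract above does not spell out the proposed update rule, so the second routine uses a gradient-tracking style update (each agent keeps a running estimate `s` of the average gradient and corrects it with the difference between its new and previous local gradients) as one plausible illustration, not as the authors' exact method. The quadratic local objectives, the four-agent ring, the mixing weights `W`, the step size, and all function names are hypothetical choices made for this example.

```python
# Illustrative setup (all values are assumptions, not taken from the paper):
# four agents on a ring, agent i holds f_i(x) = 0.5 * A[i] * (x - B[i])**2.
A = [1.0, 2.0, 3.0, 4.0]
B = [4.0, -1.0, 2.0, 0.5]
X_STAR = sum(a * b for a, b in zip(A, B)) / sum(A)  # minimizer of sum_i f_i

# Doubly stochastic mixing weights for the ring 0-1-2-3-0.
W = [
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
]


def grad(i, x):
    """Gradient of the local objective f_i at x."""
    return A[i] * (x - B[i])


def mix(v):
    """One round of neighbor averaging: returns the product W v."""
    n = len(v)
    return [sum(W[i][j] * v[j] for j in range(n)) for i in range(n)]


def max_error(x):
    """Largest distance between any agent's iterate and the global minimizer."""
    return max(abs(xi - X_STAR) for xi in x)


def run_dgd(eta=0.02, steps=2000):
    """DGD with a constant step size: average with neighbors, then step
    along the agent's own current local gradient."""
    n = len(A)
    x = [0.0] * n
    for _ in range(steps):
        wx = mix(x)
        x = [wx[i] - eta * grad(i, x[i]) for i in range(n)]
    return max_error(x)


def run_tracking(eta=0.02, steps=2000):
    """Gradient-tracking sketch: s[i] estimates the network-average gradient."""
    n = len(A)
    x = [0.0] * n
    s = [grad(i, x[i]) for i in range(n)]  # initialize with local gradients
    for _ in range(steps):
        wx, ws = mix(x), mix(s)
        x_new = [wx[i] - eta * s[i] for i in range(n)]
        # History enters here: the previous local gradient is subtracted out.
        s = [ws[i] + grad(i, x_new[i]) - grad(i, x[i]) for i in range(n)]
        x = x_new
    return max_error(x)


if __name__ == "__main__":
    print("DGD error:     ", run_dgd())
    print("tracking error:", run_tracking())
```

With these particular values, the constant-step DGD iterates settle near the minimizer but not at it, because each agent's local gradient is nonzero at the global optimum. The tracking iterates keep approaching the minimizer at the same fixed step size. A diminishing step size would remove the DGD bias, at the cost of the slower rate discussed in the abstract.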
