Improving the Transient Times for Distributed Stochastic Gradient Methods

Kun Huang; Shi Pu

arXiv:2105.04851 · math.OC · May 12, 2021 · 5 cites

TL;DR

This paper studies EDAS (exact diffusion with adaptive stepsizes), a distributed stochastic gradient algorithm that asymptotically matches the convergence rate of centralized SGD and needs a transient time of O(n/(1-λ₂)) to approach that rate.

Contribution

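The paper gives a non-asymptotic convergence analysis of EDAS, a distributed stochastic gradient method adapted from Exact Diffusion and NIDS, for strongly convex and smooth objectives. The analysis characterizes both the asymptotic convergence rate and the transient time needed to approach it.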

Findings

01

EDAS attains the same asymptotic convergence rate as centralized SGD.

02

The transient time of EDAS behaves as O(n/(1-λ₂)), where n is the number of agents and 1-λ₂ is the spectral gap.

03

Numerical results confirm theoretical transient time and convergence rate improvements.
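
To make the n/(1-λ₂) scaling in finding 02 concrete, the following Python sketch evaluates that quantity for a ring network. It is an illustration under stated assumptions, not code from the paper: it assumes λ₂ is the second-largest eigenvalue of a symmetric, doubly stochastic mixing matrix W, and it assumes a ring topology with weight 1/3 on each agent and its two neighbors. The function names are hypothetical.

```python
import numpy as np

def ring_mixing_matrix(n):
    """Assumed example topology: a ring where each agent averages itself
    and its two neighbors with equal weight 1/3 (requires n >= 3)."""
    if n < 3:
        raise ValueError("ring_mixing_matrix needs at least 3 agents")
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = 1.0 / 3.0
        W[i, (i - 1) % n] = 1.0 / 3.0
        W[i, (i + 1) % n] = 1.0 / 3.0
    return W

def second_largest_eigenvalue(W):
    """Second-largest eigenvalue of a symmetric matrix W."""
    eigenvalues = np.linalg.eigvalsh(W)  # returned in ascending order
    return eigenvalues[-2]

def transient_time_scale(W):
    """The quantity n / (1 - lambda_2), ignoring the constant hidden in O(.)."""
    n = W.shape[0]
    return n / (1.0 - second_largest_eigenvalue(W))

if __name__ == "__main__":
    for n in (10, 20, 40):
        W = ring_mixing_matrix(n)
        print(n, transient_time_scale(W))
```

For this particular ring matrix the spectral gap is (2/3)(1 - cos(2π/n)), which shrinks roughly like 1/n², so the printed scale grows roughly like n³. Better-connected networks have a larger gap and therefore a smaller value.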

Abstract

We consider the distributed optimization problem where $n$ agents, each possessing a local cost function, collaboratively minimize the average of the $n$ cost functions over a connected network. Assuming stochastic gradient information is available, we study a distributed stochastic gradient algorithm called exact diffusion with adaptive stepsizes (EDAS), adapted from the Exact Diffusion method and NIDS, and perform a non-asymptotic convergence analysis. We not only show that EDAS asymptotically achieves the same network-independent convergence rate as centralized stochastic gradient descent (SGD) for minimizing strongly convex and smooth objective functions, but also characterize the transient time needed for the algorithm to approach the asymptotic convergence rate, which behaves as $K_{T} = \mathcal{O}\left(\frac{n}{1 - \lambda_{2}}\right)$, where $1 - \lambda_{2}$ stands for the spectral gap of…
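
The abstract names Exact Diffusion as one of the methods EDAS is adapted from. As a rough sketch only, the Python below implements a standard adapt-correct-combine Exact Diffusion recursion with an iteration-dependent stepsize. It is not claimed to be the paper's exact EDAS update. The function names, the `grad(X, k)` callback signature, the averaged matrix (I + W)/2, and the decaying schedule with its default values are all assumptions made for illustration.

```python
import numpy as np

def decaying_stepsize(k, theta=2.0, mu=1.0, m=20):
    """Hypothetical O(1/k) schedule; theta, mu and m are illustrative values."""
    return theta / (mu * (k + m))

def exact_diffusion(grad, W, x0, stepsize, num_iters):
    """Exact-diffusion-style iteration (a sketch, not the paper's algorithm).

    grad(X, k): returns an (n, d) array whose row i is agent i's gradient
                (exact or stochastic) evaluated at its own iterate X[i].
    W:          symmetric doubly stochastic mixing matrix, shape (n, n).
    x0:         initial iterates, shape (n, d), one row per agent.
    stepsize:   callable k -> positive float.
    """
    n = W.shape[0]
    W_bar = 0.5 * (np.eye(n) + W)   # averaged mixing matrix (assumed form)
    X = np.array(x0, dtype=float)
    psi_prev = X.copy()             # standard initialization: psi^0 = x^0
    for k in range(num_iters):
        psi = X - stepsize(k) * grad(X, k)  # adapt: local gradient step
        phi = psi + X - psi_prev            # correct: remove accumulated bias
        X = W_bar @ phi                     # combine: average with neighbors
        psi_prev = psi
    return X

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    b = np.array([[0.0], [3.0], [6.0]])     # agent i minimizes 0.5 * (x - b_i)^2
    W = np.full((3, 3), 1.0 / 3.0)          # fully connected, equal weights

    def noisy_grad(X, k):
        # Exact gradient X - b plus zero-mean noise, standing in for a
        # stochastic gradient oracle.
        return X - b + 0.1 * rng.standard_normal(X.shape)

    X = exact_diffusion(noisy_grad, W, np.zeros((3, 1)), decaying_stepsize, 5000)
    print(X.ravel())  # each entry should be close to mean(b) = 3
```

With exact gradients and a small constant stepsize, every agent's iterate in this sketch converges to the minimizer of the average cost, which is the property that distinguishes exact diffusion from plain diffusion. The decaying schedule is what lets the noise average out in the stochastic setting.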

Taxonomy

Topics: Distributed Control Multi-Agent Systems · Stochastic Gradient Optimization Techniques · Neural Networks Stability and Synchronization

Methods: Diffusion