Convergence rates for distributed stochastic optimization over random networks

Dusan Jakovetic; Dragana Bajovic; Anit Kumar Sahu; Soummya Kar

arXiv:1803.07836·math.OC·March 22, 2018·CDC

TL;DR

This paper proves that distributed stochastic gradient methods over random networks with strongly convex costs achieve an order-optimal O(1/k) convergence rate in mean square distance, even when the gradient noise has unbounded support.

Contribution

It establishes the first order-optimal convergence rate for distributed stochastic optimization over random networks with unbounded gradient noise.

Findings

01

Achieves an O(1/k) convergence rate in mean square distance.

02

Validates theoretical results with simulation examples.

03

Handles unbounded gradient noise in distributed settings.

Abstract

We establish the $O(1/k)$ convergence rate for distributed stochastic gradient methods that operate over strongly convex costs and random networks. The considered class of methods is standard: each node performs a weighted average of its own and its neighbors' solution estimates (consensus), and takes a negative step with respect to a noisy version of its local function's gradient (innovation). The underlying communication network is modeled through a sequence of temporally independent identically distributed (i.i.d.) Laplacian matrices connected on average, while the local gradient noises are also i.i.d. in time, have finite second moment, and possibly unbounded support. We show that, after a careful setting of the consensus and innovations potentials (weights), the distributed stochastic gradient method achieves an (order-optimal) $O(1/k)$ convergence rate in the mean…
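The consensus and innovation steps described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes scalar quadratic local costs f_i(x) = (x - c_i)^2 / 2, a ring graph whose links are each active independently with probability `link_prob` at every iteration (so the network is connected on average), Gaussian gradient noise (finite second moment, unbounded support), and weight schedules alpha_k = a/(k+1) and beta_k = b/(k+1)^tau. The function name, the constants, and the schedules are hypothetical choices made for the example.

```python
import random


def run_consensus_innovations(num_nodes=5, num_iters=2000, link_prob=0.5,
                              noise_std=1.0, a=1.0, b=0.3, tau=0.5, seed=0):
    """Sketch of a consensus + innovations stochastic gradient method.

    Assumed per-node update (illustrative, not taken from the paper):
        x_i <- x_i - beta_k * sum_{j active neighbor} (x_i - x_j)
                   - alpha_k * (grad f_i(x_i) + noise_i)
    """
    rng = random.Random(seed)

    # Assumed local costs f_i(x) = 0.5 * (x - c_i)^2 with c_i = i.
    targets = [float(i) for i in range(num_nodes)]
    # The minimizer of sum_i f_i is the average of the c_i.
    x_star = sum(targets) / num_nodes

    # Assumed base topology: a ring; each link fails independently per step.
    ring_edges = [(i, (i + 1) % num_nodes) for i in range(num_nodes)]

    x = [0.0] * num_nodes
    mse_history = []

    for k in range(num_iters):
        alpha = a / (k + 1)          # innovation (gradient) weight
        beta = b / (k + 1) ** tau    # consensus weight, decays more slowly

        # Draw this iteration's random network (i.i.d. in time).
        active = [edge for edge in ring_edges if rng.random() < link_prob]

        # Consensus term: disagreement with currently reachable neighbors.
        disagreement = [0.0] * num_nodes
        for i, j in active:
            disagreement[i] += x[i] - x[j]
            disagreement[j] += x[j] - x[i]

        # Innovation term: noisy local gradient with unbounded-support noise.
        new_x = []
        for i in range(num_nodes):
            noisy_grad = (x[i] - targets[i]) + rng.gauss(0.0, noise_std)
            new_x.append(x[i] - beta * disagreement[i] - alpha * noisy_grad)
        x = new_x

        # Per-node average squared distance to the minimizer.
        mse_history.append(sum((xi - x_star) ** 2 for xi in x) / num_nodes)

    return x, x_star, mse_history
```

Calling `run_consensus_innovations()` returns the final node estimates, the minimizer of the summed costs, and the per-iteration mean square error. Plotting `mse_history` against the iteration index on a log-log scale is one way to compare the observed decay with a 1/k reference slope in this toy setting.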
