A Unified and Refined Convergence Analysis for Non-Convex Decentralized Learning

Sulaiman A. Alghunaim; Kun Yuan

arXiv:2110.09993·cs.DC·July 20, 2022

TL;DR

This paper provides a unified convergence analysis for decentralized non-convex optimization algorithms, showing they are less sensitive to network topology than previously thought, aligning theory with empirical observations.

Contribution

The authors introduce a general stochastic unified decentralized algorithm (SUDA) and establish improved convergence bounds that better reflect the algorithms' robustness to network topology.

Findings

01

SUDA is shown to converge for non-convex objectives and under the Polyak-Łojasiewicz condition.

02

Decentralized methods like Exact-Diffusion are less sensitive to network topology than DSGD.

03

The refined bounds agree with empirical experiments in which these methods remain robust over sparse networks.

Abstract

We study the consensus decentralized optimization problem where the objective function is the average of $n$ agents' private non-convex cost functions; moreover, the agents can only communicate with their neighbors on a given network topology. The stochastic learning setting is considered in this paper, where each agent can only access a noisy estimate of its gradient. Many decentralized methods can solve such problems, including EXTRA, Exact-Diffusion/D$^{2}$, and gradient-tracking. Unlike the famed DSGD algorithm, these methods have been shown to be robust to the heterogeneity across the local cost functions. However, the established convergence rates for these methods indicate that their sensitivity to the network topology is worse than DSGD. Such theoretical results imply that these methods can perform much worse than DSGD over sparse networks, which, however, contradicts empirical…
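The heterogeneity claim in the abstract can be illustrated with a small toy experiment. The sketch below is not taken from the paper: it assumes scalar quadratic local costs $f_i(x) = \tfrac{1}{2} a_i (x - b_i)^2$ (convex, chosen only so the minimizer of the average is known in closed form), a ring of 8 agents with uniform 1/3 weights, the adapt-then-combine form of DSGD, and the standard Exact-Diffusion recursion with the averaged matrix $(I + W)/2$. All function names, step sizes, and problem data are hypothetical choices for illustration, and gradient noise is switched off by default so the bias caused by heterogeneity is visible in isolation.

```python
import random

def ring_weights(n):
    """Symmetric doubly stochastic mixing matrix for a ring (assumed weights: 1/3 each)."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] = 1.0 / 3.0
    return W

def mix(W, v):
    """One communication round: every agent averages its neighbors' values."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def local_grads(x, a, b, sigma, rng):
    """Gradient of f_i(x) = 0.5 * a_i * (x - b_i)^2 at each agent, plus optional Gaussian noise."""
    return [ai * (xi - bi) + sigma * rng.gauss(0.0, 1.0)
            for xi, ai, bi in zip(x, a, b)]

def dsgd(W, a, b, alpha, iters, sigma=0.0, seed=0):
    """DSGD in adapt-then-combine form: x <- W (x - alpha * g)."""
    rng = random.Random(seed)
    x = [0.0] * len(a)
    for _ in range(iters):
        g = local_grads(x, a, b, sigma, rng)
        x = mix(W, [xi - alpha * gi for xi, gi in zip(x, g)])
    return x

def exact_diffusion(W, a, b, alpha, iters, sigma=0.0, seed=0):
    """Exact-Diffusion: adapt, correct with the previous adapt step, then combine with (I + W)/2."""
    rng = random.Random(seed)
    n = len(a)
    W_bar = [[0.5 * W[i][j] + (0.5 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    x = [0.0] * n
    psi_prev = list(x)  # psi^0 = x^0
    for _ in range(iters):
        g = local_grads(x, a, b, sigma, rng)
        psi = [xi - alpha * gi for xi, gi in zip(x, g)]              # adapt
        phi = [p + xi - pp for p, xi, pp in zip(psi, x, psi_prev)]   # correct
        x = mix(W_bar, phi)                                          # combine
        psi_prev = psi
    return x

# Heterogeneous toy data: every agent has a different curvature and a different local minimizer.
n = 8
a = [0.5 + i / (n - 1) for i in range(n)]
b = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
x_star = sum(ai * bi for ai, bi in zip(a, b)) / sum(a)  # minimizer of the average cost

W = ring_weights(n)
dsgd_err = max(abs(xi - x_star) for xi in dsgd(W, a, b, 0.1, 3000))
ed_err = max(abs(xi - x_star) for xi in exact_diffusion(W, a, b, 0.1, 3000))
print("DSGD worst-agent error:           ", dsgd_err)
print("Exact-Diffusion worst-agent error:", ed_err)
```

With exact gradients and a constant step size, DSGD settles at a point that is offset from the minimizer because the local gradients do not vanish there, while the correction step in Exact-Diffusion removes that offset. Passing a positive `sigma` gives a rough view of the stochastic setting; this toy run says nothing about the topology-dependent rates that the paper analyzes.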
