A Unified Analysis of Stochastic Gradient Methods for Nonconvex Federated Optimization

Zhize Li; Peter Richtárik

arXiv:2006.07013·math.OC·June 15, 2020·24 cites

Open Access

TL;DR

This paper provides a unified theoretical framework for analyzing various stochastic gradient descent methods in nonconvex federated optimization, covering both classical and modern variants with compressed communication.

Contribution

It introduces a flexible assumption that models the second moment of stochastic gradients, unifying convergence analysis across multiple SGD variants and distributed algorithms.
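To make the second-moment idea concrete for one setting named in the abstract, SGD with compressed gradients, here is a minimal sketch. It assumes random-k sparsification as the compressor, which is an illustrative choice and not necessarily the operator analyzed in the paper. The function names `rand_k_compress` and `exact_moments` are hypothetical. The sketch enumerates every equally likely coordinate subset, so the expectation is computed exactly rather than sampled, and it checks two standard facts about this compressor: it is unbiased, and its second moment equals (d/k) times the squared norm of the input.

```python
from itertools import combinations


def rand_k_compress(x, kept):
    """Keep the coordinates listed in `kept`, scale them by d/k, zero the rest."""
    d, k = len(x), len(kept)
    kept_set = set(kept)
    return [x[i] * d / k if i in kept_set else 0.0 for i in range(d)]


def exact_moments(x, k):
    """Exact mean and second moment over all equally likely size-k subsets."""
    d = len(x)
    subsets = list(combinations(range(d), k))
    weight = 1.0 / len(subsets)
    mean = [0.0] * d
    second_moment = 0.0
    for kept in subsets:
        c = rand_k_compress(x, kept)
        mean = [m + weight * ci for m, ci in zip(mean, c)]
        second_moment += weight * sum(ci * ci for ci in c)
    return mean, second_moment


# Hypothetical gradient vector, used only for illustration.
x = [3.0, -1.0, 2.0, 0.5]
k = 2
mean, second_moment = exact_moments(x, k)
sq_norm = sum(xi * xi for xi in x)

print("mean of compressed vector:", mean)
print("second moment:", second_moment, "vs (d/k)*||x||^2:", len(x) / k * sq_norm)
```

A bound of this kind, where the second moment of the gradient estimator is controlled by a multiple of the squared gradient norm, is one example of the sort of quantity a second-moment assumption can describe. The exact form of the paper's assumption is not reproduced here.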

Findings

01 Unified convergence analysis for many SGD variants.

02 Improved convergence results for classical methods.

03 New convergence guarantees for distributed compressed methods.

Abstract

In this paper, we study the performance of a large family of SGD variants in the smooth nonconvex regime. To this end, we propose a generic and flexible assumption capable of accurate modeling of the second moment of the stochastic gradient. Our assumption is satisfied by a large number of specific variants of SGD in the literature, including SGD with arbitrary sampling, SGD with compressed gradients, and a wide variety of variance-reduced SGD methods such as SVRG and SAGA. We provide a single convergence analysis for all methods that satisfy the proposed unified assumption, thereby offering a unified understanding of SGD variants in the nonconvex regime instead of relying on dedicated analyses of each variant. Moreover, our unified analysis is accurate enough to recover or improve upon the best-known convergence results of several classical methods, and also gives new convergence…

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Privacy-Preserving Technologies in Data

Methods: SAGA · Stochastic Gradient Descent