Distributed Stochastic Optimization under a General Variance Condition

Kun Huang; Xiao Li; Shi Pu

arXiv:2301.12677 · math.OC · December 15, 2023

Open Access

TL;DR

This paper establishes convergence guarantees for two distributed stochastic optimization algorithms, FedAvg and SCAFFOLD, under only a mild variance condition on the stochastic gradients, and examines how data heterogeneity among agents affects algorithmic performance.

Contribution

It provides the first convergence analysis of these algorithms under a general variance condition, relaxing previous boundedness assumptions.
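
For orientation, the FedAvg pattern analyzed in the paper (local stochastic gradient steps on each agent, followed by server-side averaging) can be sketched as below. This is a toy illustration, not the paper's setup: the scalar quadratic objectives, full participation in every round, the constant stepsize, the Gaussian gradient noise, and all names (`fedavg`, `targets`, `noise_std`) are assumptions made for the example.

```python
import random


def fedavg(targets, rounds=50, local_steps=5, lr=0.1, noise_std=0.0, seed=0):
    """Toy FedAvg on scalar quadratics f_i(x) = 0.5 * (x - targets[i])**2.

    Assumptions (illustrative only): every agent participates in every
    round, the stepsize is constant, and gradient noise is Gaussian.
    """
    rng = random.Random(seed)
    x_global = 0.0
    for _ in range(rounds):
        local_models = []
        for b in targets:
            x = x_global  # each agent starts from the current global model
            for _ in range(local_steps):
                # Stochastic gradient: exact gradient (x - b) plus noise.
                grad = (x - b) + rng.gauss(0.0, noise_std)
                x -= lr * grad
            local_models.append(x)
        # Server step: average the agents' local models.
        x_global = sum(local_models) / len(local_models)
    return x_global


if __name__ == "__main__":
    # Heterogeneous agents: each local objective has a different minimizer.
    print(fedavg([1.0, 2.0, 6.0]))
    print(fedavg([1.0, 2.0, 6.0], noise_std=0.5))
```

In this toy problem the average objective is minimized at the mean of `targets`, so the noiseless run approaches that mean, while the noisy run only fluctuates around it. The spread of `targets` plays the role of data heterogeneity among the agents.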

Findings

01. Convergence to stationary points under mild variance conditions

02. Almost sure convergence established for nonconvex objectives

03. Implications of data heterogeneity measurement discussed

Abstract

Distributed stochastic optimization has drawn great attention recently due to its effectiveness in solving large-scale machine learning problems. Though numerous algorithms have been proposed and successfully applied to general practical problems, their theoretical guarantees mainly rely on certain boundedness conditions on the stochastic gradients, varying from uniform boundedness to the relaxed growth condition. In addition, how to characterize the data heterogeneity among the agents and its impacts on the algorithmic performance remains challenging. In light of such motivations, we revisit the classical Federated Averaging (FedAvg) algorithm (McMahan et al., 2017) as well as the more recent SCAFFOLD method (Karimireddy et al., 2020) for solving the distributed stochastic optimization problem and establish the convergence results under only a mild variance condition on the stochastic…

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Distributed Control Multi-Agent Systems · Machine Learning and ELM