Distributed Stochastic Optimization under a General Variance Condition
Kun Huang, Xiao Li, Shi Pu

TL;DR
This paper establishes convergence guarantees for two distributed stochastic optimization algorithms, FedAvg and SCAFFOLD, under only a mild variance condition on the stochastic gradients, and discusses how data heterogeneity among agents affects algorithmic performance.
Contribution
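It provides the first convergence analysis of these algorithms under a general variance condition, relaxing the boundedness conditions on the stochastic gradients (ranging from uniform boundedness to the relaxed growth condition) that earlier guarantees rely on.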
Findings
Convergence to stationary points under mild variance conditions
Almost sure convergence established for nonconvex objectives
Discussion of how to measure data heterogeneity among agents and what it implies for performance
Abstract
Distributed stochastic optimization has drawn great attention recently due to its effectiveness in solving large-scale machine learning problems. Though numerous algorithms have been proposed and successfully applied to general practical problems, their theoretical guarantees mainly rely on certain boundedness conditions on the stochastic gradients, varying from uniform boundedness to the relaxed growth condition. In addition, how to characterize the data heterogeneity among the agents and its impacts on the algorithmic performance remains challenging. In light of such motivations, we revisit the classical Federated Averaging (FedAvg) algorithm (McMahan et al., 2017) as well as the more recent SCAFFOLD method (Karimireddy et al., 2020) for solving the distributed stochastic optimization problem and establish the convergence results under only a mild variance condition on the stochastic…
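To make the FedAvg scheme mentioned in the abstract concrete, here is a minimal sketch on a toy problem. It is not the paper's setup: the scalar quadratic objectives, the Gaussian gradient noise (a bounded-variance special case, far simpler than the general variance condition), and all names and parameter values (`fedavg`, `targets`, `local_steps`, `lr`, `noise_std`) are assumptions chosen for illustration. Each agent runs a few local stochastic gradient steps from the shared model, and the server then averages the local models.

```python
import random


def fedavg(targets, rounds=200, local_steps=5, lr=0.05, noise_std=0.1, seed=0):
    """Toy FedAvg on scalar quadratics f_i(x) = 0.5 * (x - targets[i])**2.

    Hypothetical illustration only: agent i holds targets[i], and different
    targets stand in for heterogeneous data across agents.
    """
    rng = random.Random(seed)
    x_global = 0.0
    for _ in range(rounds):
        local_models = []
        for b in targets:
            x = x_global  # every agent starts the round from the shared model
            for _ in range(local_steps):
                # Stochastic gradient: exact gradient (x - b) plus zero-mean noise.
                grad = (x - b) + rng.gauss(0.0, noise_std)
                x -= lr * grad
            local_models.append(x)
        # Server step: average the local models.
        x_global = sum(local_models) / len(local_models)
    return x_global


targets = [-1.0, 0.0, 4.0]
x_final = fedavg(targets)
print(f"final model: {x_final:.3f}, minimizer of the average: {sum(targets) / len(targets):.3f}")
```

In this toy case every local objective has the same curvature, so the averaged model settles near the mean of the targets despite the heterogeneity. With differing curvatures, local steps generally drift toward each agent's own minimizer; that drift is what SCAFFOLD's control variates are designed to correct.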
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Distributed Control Multi-Agent Systems · Machine Learning and ELM
