Revisiting LocalSGD and SCAFFOLD: Improved Rates and Missing Analysis
Ruichen Luo, Sebastian U. Stich, Samuel Horváth, Martin Takáč

TL;DR
This paper revisits the convergence analysis of LocalSGD and SCAFFOLD in distributed stochastic optimization and shows that both can converge faster than minibatch SGD under existing or weaker conditions than earlier analyses relied on.
Contribution
The paper gives new convergence guarantees for LocalSGD and SCAFFOLD under weaker assumptions and clarifies when these methods outperform simpler methods such as minibatch SGD.
Findings
LocalSGD converges faster than minibatch SGD (MbSGD) for weakly convex functions, without requiring stronger gradient similarity assumptions.
Higher-order similarity and smoothness enhance LocalSGD performance.
SCAFFOLD outperforms MbSGD for a broader class of non-quadratic functions.
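To make the comparison in the findings concrete, here is a minimal Python sketch of the two update schemes on a toy problem. It is an illustration under stated assumptions, not the paper's experimental setup: the scalar quadratic objectives f_i(x) = 0.5 * a_i * (x - c_i)^2, the helper names (`stochastic_grad`, `local_sgd`, `minibatch_sgd`), and the Gaussian noise model are all hypothetical choices made for this example. Both methods use the same budget of `rounds` communication rounds and `local_steps` stochastic gradients per worker per round.

```python
import numpy as np


def stochastic_grad(x, a_i, c_i, sigma, rng):
    """Noisy gradient of the toy objective f_i(x) = 0.5 * a_i * (x - c_i)**2."""
    return a_i * (x - c_i) + sigma * rng.standard_normal()


def local_sgd(a, c, x0, rounds, local_steps, lr, sigma, seed=0):
    """Each worker runs `local_steps` SGD steps from the shared point,
    then the server averages the local iterates (one communication per round)."""
    rng = np.random.default_rng(seed)
    x = float(x0)
    for _ in range(rounds):
        local_iterates = []
        for a_i, c_i in zip(a, c):
            y = x  # every worker starts from the current global model
            for _ in range(local_steps):
                y -= lr * stochastic_grad(y, a_i, c_i, sigma, rng)
            local_iterates.append(y)
        x = float(np.mean(local_iterates))
    return x


def minibatch_sgd(a, c, x0, rounds, local_steps, lr, sigma, seed=0):
    """Each worker draws `local_steps` stochastic gradients at the SAME shared
    point; the server takes one step with the average of all of them."""
    rng = np.random.default_rng(seed)
    x = float(x0)
    for _ in range(rounds):
        grads = [
            stochastic_grad(x, a_i, c_i, sigma, rng)
            for a_i, c_i in zip(a, c)
            for _ in range(local_steps)
        ]
        x -= lr * float(np.mean(grads))
    return x
```

With a single local step and no noise the two schemes coincide, since averaging one-step iterates equals stepping along the averaged gradient. They differ only when workers take several local steps between communications, which is the regime the convergence results above address.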
Abstract
LocalSGD and SCAFFOLD are widely used methods in distributed stochastic optimization, with numerous applications in machine learning, large-scale data processing, and federated learning. However, rigorously establishing their theoretical advantages over simpler methods, such as minibatch SGD (MbSGD), has proven challenging, as existing analyses often rely on strong assumptions, unrealistic premises, or overly restrictive scenarios. In this work, we revisit the convergence properties of LocalSGD and SCAFFOLD under a variety of existing or weaker conditions, including gradient similarity, Hessian similarity, weak convexity, and Lipschitz continuity of the Hessian. Our analysis shows that (i) LocalSGD achieves faster convergence compared to MbSGD for weakly convex functions without requiring stronger gradient similarity assumptions; (ii) LocalSGD benefits significantly from higher-order…
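For readers unfamiliar with the conditions named in the abstract, the block below writes out forms that are common in the distributed optimization literature. It is a hedged sketch: the symbols $M$, $\zeta$, $\delta$, $\rho$, and $Q$ are notation assumed here, and the paper's exact definitions and constants may differ.

```latex
% Objective: average of M local functions (assumed notation)
f(x) = \frac{1}{M} \sum_{i=1}^{M} f_i(x)

% Gradient similarity (bounded first-order heterogeneity)
\frac{1}{M} \sum_{i=1}^{M} \left\| \nabla f_i(x) - \nabla f(x) \right\|^2 \le \zeta^2
\quad \text{for all } x

% Hessian similarity (bounded second-order heterogeneity)
\left\| \nabla^2 f_i(x) - \nabla^2 f(x) \right\| \le \delta
\quad \text{for all } x \text{ and } i

% Weak convexity with parameter \rho \ge 0
x \mapsto f(x) + \frac{\rho}{2} \|x\|^2 \ \text{is convex}

% Lipschitz continuity of the Hessian
\left\| \nabla^2 f(x) - \nabla^2 f(y) \right\| \le Q \, \|x - y\|
\quad \text{for all } x, y
```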
Taxonomy
Topics: Rural development and sustainability
Methods: Stochastic Gradient Descent
