First Analysis of Local GD on Heterogeneous Data
Ahmed Khaled, Konstantin Mishchenko, Peter Richtárik

TL;DR
This paper presents the first convergence analysis of local gradient descent on heterogeneous data, the setting that arises in federated learning, and shows that in the low-accuracy regime the method matches the communication complexity of gradient descent.
Contribution
It provides the first theoretical analysis of local gradient descent on arbitrary convex functions with heterogeneous data, relevant for federated learning.
Findings
In low accuracy regimes, local gradient descent has the same communication complexity as standard gradient descent.
The analysis applies to smooth, convex, but otherwise arbitrary functions.
This work advances understanding of optimization methods in federated learning with heterogeneous data.
Abstract
We provide the first convergence analysis of local gradient descent for minimizing the average of smooth and convex but otherwise arbitrary functions. Problems of this form and local gradient descent as a solution method are of importance in federated learning, where each function is based on private data stored by a user on a mobile device, and the data of different users can be arbitrarily heterogeneous. We show that in a low accuracy regime, the method has the same communication complexity as gradient descent.
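To make the method concrete, here is a minimal sketch of local gradient descent on a toy problem. It is an illustration under stated assumptions, not the authors' implementation: the function name local_gd, the scalar quadratic objectives f_m(x) = 0.5 * a_m * (x - b_m)^2, the step size, and the round counts are all hypothetical choices made for this example. Each device takes several gradient steps on its own function, and one communication round averages the resulting iterates.

```python
def local_gd(grads, x0, step_size, local_steps, rounds):
    """Local gradient descent; returns the averaged iterate after the last round.

    grads: one gradient function per device (hypothetical interface).
    local_steps: gradient steps each device takes between communications.
    rounds: number of communication (averaging) rounds.
    """
    x = x0
    for _ in range(rounds):
        local_iterates = []
        for grad in grads:
            x_m = x  # every device starts from the shared iterate
            for _ in range(local_steps):
                x_m -= step_size * grad(x_m)
            local_iterates.append(x_m)
        # One communication round: average the local iterates.
        x = sum(local_iterates) / len(local_iterates)
    return x


# Assumed heterogeneous toy data: f_m(x) = 0.5 * a_m * (x - b_m)**2,
# so each device has a different curvature and a different minimizer.
curvatures = [1.0, 2.0, 4.0]
targets = [-1.0, 0.5, 3.0]
grads = [lambda x, a=a, b=b: a * (x - b) for a, b in zip(curvatures, targets)]

# Minimizer of the average objective, available in closed form for quadratics.
x_star = sum(a * b for a, b in zip(curvatures, targets)) / sum(curvatures)

# With one local step per round the method is plain gradient descent.
x_gd = local_gd(grads, x0=0.0, step_size=0.1, local_steps=1, rounds=200)

# With five local steps per round, far fewer communications are used.
x_local = local_gd(grads, x0=0.0, step_size=0.1, local_steps=5, rounds=40)

print(f"x_star  = {x_star:.4f}")
print(f"x_gd    = {x_gd:.4f}, error = {abs(x_gd - x_star):.2e}")
print(f"x_local = {x_local:.4f}, error = {abs(x_local - x_star):.2e}")
```

In this toy instance, the run with one local step recovers the minimizer of the average, while the run with five local steps and the same constant step size settles at a nearby but different point, because the devices drift toward their own minimizers between communications. That gap is a property of this example and is meant only to give intuition for why a low-accuracy regime is the natural setting for the comparison with gradient descent.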
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Sparse and Compressive Sensing Techniques
