First Analysis of Local GD on Heterogeneous Data

Ahmed Khaled; Konstantin Mishchenko; Peter Richtárik

arXiv:1909.04715 · cs.LG · March 19, 2020 · 67 cites

TL;DR

This paper presents the first convergence analysis of local gradient descent for heterogeneous data in federated learning, showing it matches gradient descent's communication complexity in low accuracy regimes.

Contribution

It provides the first theoretical analysis of local gradient descent on arbitrary convex functions with heterogeneous data, relevant for federated learning.

Findings

1. In low accuracy regimes, local gradient descent has the same communication complexity as standard gradient descent.

2. The analysis applies to smooth, convex, but otherwise arbitrary functions.

3. This work advances understanding of optimization methods in federated learning with heterogeneous data.

Abstract

We provide the first convergence analysis of local gradient descent for minimizing the average of smooth and convex but otherwise arbitrary functions. Problems of this form and local gradient descent as a solution method are of importance in federated learning, where each function is based on private data stored by a user on a mobile device, and the data of different users can be arbitrarily heterogeneous. We show that in a low accuracy regime, the method has the same communication complexity as gradient descent.
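The method described in the abstract can be illustrated with a small sketch. Each worker takes several gradient steps on its own function, and the workers then average their iterates, which counts as one communication round. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`local_gd`, `make_quadratic_grad`), the toy quadratic objectives, the step size, and the number of local steps are all hypothetical choices made for this example.

```python
import numpy as np


def local_gd(grads, x0, step_size, local_steps, rounds):
    """Sketch of local gradient descent.

    grads       -- list of gradient functions, one per worker (hypothetical API)
    x0          -- shared starting point
    step_size   -- step size used for every local step
    local_steps -- gradient steps each worker takes between communications
    rounds      -- number of communication (averaging) rounds
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(rounds):
        local_iterates = []
        for grad in grads:
            x_m = x.copy()  # every worker starts from the shared iterate
            for _ in range(local_steps):
                x_m = x_m - step_size * grad(x_m)  # local step on f_m only
            local_iterates.append(x_m)
        x = np.mean(local_iterates, axis=0)  # communication: average iterates
    return x


def make_quadratic_grad(a, b):
    """Gradient of the toy objective f_m(x) = 0.5 * a * ||x - b||^2."""
    return lambda x: a * (x - b)


# Heterogeneous toy data (assumed): each worker has its own curvature and center.
rng = np.random.default_rng(0)
curvatures = [0.5, 1.0, 2.0]
centers = [rng.normal(size=3) for _ in curvatures]
grads = [make_quadratic_grad(a, b) for a, b in zip(curvatures, centers)]

# Minimizer of the average of these quadratics: the curvature-weighted mean.
x_star = sum(a * b for a, b in zip(curvatures, centers)) / sum(curvatures)

# With one local step per round, the method reduces to plain gradient descent
# on the average function.
x_gd = local_gd(grads, np.zeros(3), step_size=0.1, local_steps=1, rounds=500)

# With several local steps per round, workers drift toward their own minimizers
# between communications.
x_local = local_gd(grads, np.zeros(3), step_size=0.1, local_steps=5, rounds=100)

print("distance to minimizer, 1 local step: ", np.linalg.norm(x_gd - x_star))
print("distance to minimizer, 5 local steps:", np.linalg.norm(x_local - x_star))
```

Setting `local_steps=1` recovers ordinary gradient descent on the average, which gives a baseline to compare against when the number of local steps is increased on heterogeneous objectives.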

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Sparse and Compressive Sensing Techniques