Local Adaptivity in Federated Learning: Convergence and Consistency

Jianyu Wang; Zheng Xu; Zachary Garrett; Zachary Charles; Luyang Liu; Gauri Joshi

arXiv:2106.02305 · cs.LG · June 7, 2021 · 21 cites

Open Access

TL;DR

This paper investigates adaptive optimization methods for local client updates in federated learning, showing that they can accelerate convergence but may introduce a non-vanishing solution bias, and proposes correction techniques to improve convergence and accuracy.

Contribution

It introduces correction techniques for local adaptive methods in federated learning, addressing the solution bias these methods can cause and improving training efficiency.

Findings

01. Local adaptive methods can accelerate convergence.

02. Proposed corrections mitigate solution bias.

03. Algorithms achieve faster convergence and higher accuracy.

Abstract

The federated learning (FL) framework trains a machine learning model using decentralized data stored at edge client devices by periodically aggregating locally trained models. Popular optimization algorithms of FL use vanilla (stochastic) gradient descent for both local updates at clients and global updates at the aggregating server. Recently, adaptive optimization methods such as AdaGrad have been studied for server updates. However, the effect of using adaptive optimization methods for local updates at clients is not yet understood. We show in both theory and practice that while local adaptive methods can accelerate convergence, they can cause a non-vanishing solution bias, where the final converged solution may be different from the stationary point of the global objective function. We propose correction techniques to overcome this inconsistency and complement the local adaptive…
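The solution bias described in the abstract can be illustrated with a toy sketch. The code below is not the paper's algorithm: it assumes 1-D quadratic client objectives f_i(x) = 0.5 * a_i * (x - c_i)^2, and it replaces a real adaptive optimizer with a fixed, client-specific gradient divisor (the hypothetical `scales` argument) as a stand-in for the per-client scaling a method such as AdaGrad would accumulate. The function name `fedavg_scaled` and all parameter names are illustrative assumptions.

```python
def fedavg_scaled(centers, curvatures, scales, rounds=100, local_steps=1, lr=0.5):
    """Toy FedAvg on 1-D quadratics f_i(x) = 0.5 * a_i * (x - c_i)**2.

    `scales` is a fixed per-client divisor applied to the local gradient.
    It is a hypothetical stand-in for adaptive (AdaGrad-like) scaling,
    not the method proposed in the paper.
    """
    x = 0.0  # global model, sent to every client at the start of each round
    for _ in range(rounds):
        local_models = []
        for c, a, s in zip(centers, curvatures, scales):
            x_i = x  # client starts from the current global model
            for _ in range(local_steps):
                grad = a * (x_i - c)      # gradient of the client's quadratic
                x_i -= lr * grad / s      # locally rescaled gradient step
            local_models.append(x_i)
        x = sum(local_models) / len(local_models)  # server averages local models
    return x


centers, curvatures = [0.0, 4.0], [1.0, 1.0]

# Minimizer of the average objective: sum(a_i * c_i) / sum(a_i).
x_star = sum(a * c for a, c in zip(curvatures, centers)) / sum(curvatures)

x_plain = fedavg_scaled(centers, curvatures, scales=[1.0, 1.0])   # same scaling everywhere
x_scaled = fedavg_scaled(centers, curvatures, scales=[1.0, 3.0])  # client-specific scaling

print(x_star, x_plain, x_scaled)
```

In this toy setting the equal-scale run settles at the global minimizer 2.0, while the run with client-specific scales settles near 1.0, the stationary point of a reweighted objective rather than of the global one. Lowering `lr` does not move that fixed point with one local step, which is one way to read "non-vanishing" here; the real mechanism with data-dependent adaptive scaling is more involved.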

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Random Matrices and Applications

Methods: AdaGrad