Layer-wise and Dimension-wise Locally Adaptive Federated Learning

Belhal Karimi; Ping Li; Xiaoyun Li

arXiv:2110.00532 · cs.LG · June 24, 2022 · 1 cites

Open Access

TL;DR

This paper introduces a novel federated learning framework with layer-wise adaptivity, improving convergence speed and generalization in training deep neural networks across diverse data distributions.

Contribution

It proposes layer-wise adaptive FL algorithms, Fed-LAMB and Mime-LAMB, with theoretical convergence guarantees and superior empirical performance.

Findings

01 Faster convergence compared to recent adaptive FL methods

02 Better generalization performance on various datasets

03 Linear speedup with the number of workers

Abstract

In the emerging paradigm of Federated Learning (FL), a large number of clients, such as mobile devices, are used to train possibly high-dimensional models on their respective data. Combining (dimension-wise) adaptive gradient methods (e.g., Adam, AMSGrad) with FL has been an active direction, and has been shown to outperform traditional SGD-based FL in many cases. In this paper, we focus on the problem of training federated deep neural networks, and propose a novel FL framework that further introduces layer-wise adaptivity to the local model updates. Our framework can be applied to locally adaptive FL methods, including two recent algorithms, Mime and Fed-AMS. Theoretically, we provide a convergence analysis of our layer-wise FL methods, coined Fed-LAMB and Mime-LAMB, which matches the convergence rate of state-of-the-art results in FL and exhibits linear speedup in terms of the number of…
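To make the idea of combining dimension-wise and layer-wise adaptivity in a local update concrete, here is a minimal pure-Python sketch. It is not the paper's Fed-LAMB or Mime-LAMB algorithm. It assumes a generic LAMB-style rule (Adam-like per-coordinate scaling followed by a per-layer trust ratio), omits bias correction and weight decay, and uses plain model averaging on the server. The names `lamb_style_local_step` and `average_models`, the dict-of-lists model layout, and all hyperparameter defaults are hypothetical choices made for illustration.

```python
import math


def l2_norm(vec):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def lamb_style_local_step(layers, grads, m, v,
                          lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One local step on a client (illustrative, not the paper's exact rule).

    layers, grads, m, v: dicts mapping a layer name to a flat list of floats.
    m and v are updated in place; the new weights are returned.
    """
    new_layers = {}
    for name, w in layers.items():
        g = grads[name]
        # Dimension-wise adaptivity: Adam-style first and second moments.
        m[name] = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m[name], g)]
        v[name] = [beta2 * vi + (1 - beta2) * gi * gi
                   for vi, gi in zip(v[name], g)]
        direction = [mi / (math.sqrt(vi) + eps)
                     for mi, vi in zip(m[name], v[name])]
        # Layer-wise adaptivity: scale the step by ||w|| / ||direction||.
        w_norm, d_norm = l2_norm(w), l2_norm(direction)
        trust = w_norm / d_norm if w_norm > 0 and d_norm > 0 else 1.0
        new_layers[name] = [wi - lr * trust * di
                            for wi, di in zip(w, direction)]
    return new_layers


def average_models(client_models):
    """Server step (assumed here): per-coordinate average of client models."""
    n = len(client_models)
    first = client_models[0]
    return {
        name: [sum(c[name][i] for c in client_models) / n
               for i in range(len(first[name]))]
        for name in first
    }
```

With this rule, each layer's step has norm `lr * ||w||` whenever the trust ratio is active, so layers with very different weight scales move by comparable relative amounts. That is the general motivation for layer-wise scaling in LAMB-style optimizers.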

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Recommender Systems and Techniques · Human Mobility and Location-Based Analysis

Methods: Stochastic Gradient Descent · Adam