Layer-wise and Dimension-wise Locally Adaptive Federated Learning
Belhal Karimi, Ping Li, Xiaoyun Li

TL;DR
This paper introduces a novel federated learning framework with layer-wise adaptivity, improving convergence speed and generalization in training deep neural networks across diverse data distributions.
Contribution
It proposes layer-wise adaptive FL algorithms, Fed-LAMB and Mime-LAMB, with theoretical convergence guarantees and superior empirical performance.
Findings
Faster convergence compared to recent adaptive FL methods
Better generalization performance on various datasets
Linear speedup with the number of workers
Abstract
In the emerging paradigm of Federated Learning (FL), a large number of clients, such as mobile devices, are used to train possibly high-dimensional models on their respective data. Combining (dimension-wise) adaptive gradient methods (e.g., Adam, AMSGrad) with FL has been an active direction, and has been shown to outperform traditional SGD-based FL in many cases. In this paper, we focus on the problem of training federated deep neural networks, and propose a novel FL framework that further introduces layer-wise adaptivity to the local model updates. Our framework can be applied to locally adaptive FL methods, including two recent algorithms, Mime and Fed-AMS. Theoretically, we provide a convergence analysis of our layer-wise FL methods, coined Fed-LAMB and Mime-LAMB, which matches the convergence rate of state-of-the-art results in FL and exhibits linear speedup in terms of the number of…
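To make "layer-wise adaptivity on top of dimension-wise adaptivity" concrete, here is a minimal NumPy sketch of a generic LAMB-style local step followed by plain server averaging. It is an illustration under stated assumptions, not the authors' Fed-LAMB or Mime-LAMB procedure: the function names `lamb_local_step` and `average_layers`, the hyperparameter defaults, the omission of bias correction, and the simple mean at the server are all choices made for this example.

```python
import numpy as np

def lamb_local_step(params, grads, m, v, lr=0.01, beta1=0.9, beta2=0.999,
                    eps=1e-6, weight_decay=0.0):
    """One generic LAMB-style step on lists of per-layer arrays.

    Assumption: no bias correction and no AMSGrad max, to keep the sketch short.
    """
    new_params, new_m, new_v = [], [], []
    for w, g, m_l, v_l in zip(params, grads, m, v):
        # Dimension-wise adaptivity: Adam-style first and second moments.
        m_l = beta1 * m_l + (1 - beta1) * g
        v_l = beta2 * v_l + (1 - beta2) * g * g
        r = m_l / (np.sqrt(v_l) + eps) + weight_decay * w
        # Layer-wise adaptivity: rescale the step by ||w|| / ||r|| for this layer.
        w_norm = np.linalg.norm(w)
        r_norm = np.linalg.norm(r)
        trust = w_norm / r_norm if w_norm > 0 and r_norm > 0 else 1.0
        new_params.append(w - lr * trust * r)
        new_m.append(m_l)
        new_v.append(v_l)
    return new_params, new_m, new_v

def average_layers(client_params):
    """Hypothetical server step: average each layer across clients."""
    return [np.mean(np.stack(layers), axis=0) for layers in zip(*client_params)]
```

With the trust ratio in place, the length of each layer's step is `lr` times the norm of that layer's weights, regardless of the gradient's scale; that per-layer normalization is what distinguishes this kind of update from a purely dimension-wise method such as Adam.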
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Recommender Systems and Techniques · Human Mobility and Location-Based Analysis
Methods: Stochastic Gradient Descent · Adam
