Coordinating Momenta for Cross-silo Federated Learning
An Xu, Heng Huang

TL;DR
This paper introduces a novel momentum fusion technique with double buffers to enhance training performance in cross-silo federated learning, addressing client drift caused by non-i.i.d. data distributions.
Contribution
The paper proposes a new method using double momentum buffers and a momentum fusion technique, with theoretical convergence analysis for cross-silo federated learning.
Findings
Outperforms FedAvg and existing momentum SGD variants in experiments
Improves training stability and convergence in non-i.i.d. data scenarios
Provides the first theoretical convergence analysis involving server and local momentum SGD
Abstract
Communication efficiency is crucial for federated learning (FL). Conducting local training steps in clients to reduce the communication frequency between clients and the server is a common method to address this issue. However, this strategy leads to the client drift problem due to non-i.i.d. data distributions in different clients, which severely deteriorates the performance. In this work, we propose a new method to improve the training performance in cross-silo FL via maintaining double momentum buffers. In our algorithm, one momentum buffer is used to track the server model updating direction, and the other one is adopted to track the local model updating direction. More importantly, we introduce a novel momentum fusion technique to coordinate the server and local momentum buffers. We also derive the first theoretical convergence analysis involving both the server and local…
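To make the double-buffer idea concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm: the abstract does not spell out the fusion rule, so the blend `local_m + beta * server_m`, the weight `beta`, the choice to reset the local buffer each round, and the function names `local_training` and `train` are all assumptions made for illustration. Each client holds a scalar quadratic loss with a different minimiser, which stands in for non-i.i.d. data.

```python
def local_training(x, c, server_m, lr=0.1, mu_local=0.5, beta=0.1, steps=5):
    """Run a few local momentum-SGD steps on the toy loss 0.5 * (x - c)**2."""
    local_m = 0.0  # local momentum buffer, reset every round (assumption)
    for _ in range(steps):
        grad = x - c  # exact gradient of the toy loss
        local_m = mu_local * local_m + grad
        # Hypothetical fusion rule: blend the server buffer into the local
        # update direction with weight beta. Not taken from the paper.
        x = x - lr * (local_m + beta * server_m)
    return x


def train(centers, rounds=100, lr=0.1, mu_server=0.3, beta=0.1, steps=5):
    """Toy cross-silo loop: every client participates in every round."""
    x = 0.0         # server model
    server_m = 0.0  # server momentum buffer
    for _ in range(rounds):
        local_models = [
            local_training(x, c, server_m, lr=lr, beta=beta, steps=steps)
            for c in centers
        ]
        # Average client displacement, rescaled to act like a gradient.
        avg_delta = sum(x - xk for xk in local_models) / len(local_models)
        pseudo_grad = avg_delta / lr
        server_m = mu_server * server_m + pseudo_grad
        x = x - lr * server_m
    return x


# Three clients whose local optima disagree (a stand-in for non-i.i.d. data).
centers = [-1.0, 0.0, 4.0]
final = train(centers)
print(final)
```

With these settings the server model should settle close to the mean of the client optima (1.0 here), which is the minimiser of the averaged toy objective. Setting `beta=0.0` turns the fusion term off and leaves plain server-plus-local momentum, which is a convenient baseline for comparing the two behaviours on this toy problem.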
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Traffic Prediction and Management Techniques
Methods: Stochastic Gradient Descent
