Towards Communication-efficient Federated Learning via Sparse and Aligned Adaptive Optimization
Xiumei Deng, Jun Li, Kang Wei, Long Shi, Zehui Xiong, Ming Ding, Wen Chen, Shi Jin, and H. Vincent Poor

TL;DR
This paper introduces FedAdam-SSM, a communication-efficient federated learning algorithm that sparsifies updates with a shared mask, reducing communication overhead while maintaining convergence and accuracy.
Contribution
The paper proposes FedAdam-SSM, a novel sparsification method with a shared sparse mask to reduce communication in federated Adam without sacrificing convergence.
Findings
FedAdam-SSM converges faster than baseline methods.
FedAdam-SSM achieves over 14.5% higher test accuracy than baseline methods.
FedAdam-SSM reduces communication overhead significantly.
Abstract
Adaptive moment estimation (Adam), as a Stochastic Gradient Descent (SGD) variant, has gained widespread popularity in federated learning (FL) due to its fast convergence. However, federated Adam (FedAdam) algorithms suffer from a threefold increase in uplink communication overhead compared to federated SGD (FedSGD) algorithms, which arises from the necessity to transmit both local model updates and first and second moment estimates from distributed devices to the centralized server for aggregation. Driven by this issue, we propose a novel sparse FedAdam algorithm called FedAdam-SSM, wherein distributed devices sparsify the updates of local model parameters and moment estimates and subsequently upload the sparse representations to the centralized server. To further reduce the communication overhead, the updates of local model parameters and moment estimates incorporate a shared sparse…
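The idea of sparsifying three update vectors with one shared mask can be illustrated with a minimal NumPy sketch. The abstract is truncated before it says how the shared mask is built, so the top-k-by-magnitude rule applied to the model update below is an assumption made only for illustration, as are the function names (`top_k_mask`, `sparsify_with_shared_mask`, `uplink_values`) and the index-plus-value encoding used to count the payload. It is not the authors' implementation.

```python
import numpy as np


def top_k_mask(reference: np.ndarray, k: int) -> np.ndarray:
    """Boolean mask marking the k largest-magnitude entries of a 1-D array."""
    mask = np.zeros(reference.shape, dtype=bool)
    if k > 0:
        # argpartition places the k largest magnitudes in the last k slots.
        idx = np.argpartition(np.abs(reference), -k)[-k:]
        mask[idx] = True
    return mask


def sparsify_with_shared_mask(delta_w, delta_m, delta_v, k):
    """Apply one mask to the model update and both moment-estimate updates.

    ASSUMPTION: the mask is derived from the magnitude of the model update.
    The source text does not state the selection rule.
    """
    idx = np.flatnonzero(top_k_mask(delta_w, k))  # sorted kept positions
    # One index array serves all three value arrays.
    return idx, delta_w[idx], delta_m[idx], delta_v[idx]


def uplink_values(d: int, k: int, shared: bool) -> int:
    """Count of numbers sent uplink under a hypothetical index+value encoding."""
    index_sets = 1 if shared else 3
    return 3 * k + index_sets * k


# Dense baselines for a model with d parameters:
# FedSGD sends d values, FedAdam sends 3 * d (update plus two moments).
d, k = 1_000_000, 10_000
dense_fedsgd = d
dense_fedadam = 3 * d
sparse_shared = uplink_values(d, k, shared=True)
sparse_separate = uplink_values(d, k, shared=False)
```

Under these assumptions, with d = 1,000,000 and k = 10,000, dense FedAdam uploads 3,000,000 numbers per round, three separately masked vectors upload 60,000, and a shared mask uploads 40,000, because the kept positions are transmitted once rather than three times.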
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · IoT and Edge/Fog Computing
Methods: Stochastic Gradient Descent · Adam
