Towards Communication-efficient Federated Learning via Sparse and Aligned Adaptive Optimization

Xiumei Deng; Jun Li; Kang Wei; Long Shi; Zehui Xiong; Ming Ding; Wen Chen; Shi Jin; and H. Vincent Poor

arXiv:2405.17932 · cs.LG · September 22, 2025 · 1 citation

Open Access

TL;DR

This paper introduces FedAdam-SSM, a communication-efficient federated learning algorithm that sparsifies updates with a shared mask, reducing communication overhead while maintaining convergence and accuracy.

Contribution

The paper proposes FedAdam-SSM, a sparse federated Adam algorithm in which the updates of local model parameters and moment estimates are sparsified with a shared sparse mask, reducing uplink communication without sacrificing convergence.

Findings

01. FedAdam-SSM converges faster than baseline methods.

02. FedAdam-SSM achieves over 14.5% higher test accuracy than baseline methods.

03. FedAdam-SSM significantly reduces communication overhead.

Abstract

Adaptive moment estimation (Adam), as a Stochastic Gradient Descent (SGD) variant, has gained widespread popularity in federated learning (FL) due to its fast convergence. However, federated Adam (FedAdam) algorithms suffer from a threefold increase in uplink communication overhead compared to federated SGD (FedSGD) algorithms, which arises from the necessity to transmit both local model updates and first and second moment estimates from distributed devices to the centralized server for aggregation. Driven by this issue, we propose a novel sparse FedAdam algorithm called FedAdam-SSM, wherein distributed devices sparsify the updates of local model parameters and moment estimates and subsequently upload the sparse representations to the centralized server. To further reduce the communication overhead, the updates of local model parameters and moment estimates incorporate a shared sparse…
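
The shared-mask idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the excerpt above is truncated before it says how the mask is chosen, so the sketch assumes, purely for illustration, that the mask keeps the k largest-magnitude entries of the model update. The function names (`top_k_mask`, `sparsify_with_shared_mask`, `densify`, `uplink_scalars`) are hypothetical, and the cost count treats each value and each index as one scalar, which is a simplification.

```python
def top_k_mask(values, k):
    """Return the sorted indices of the k largest-magnitude entries."""
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    return sorted(order[:k])


def sparsify_with_shared_mask(delta_w, delta_m, delta_v, k):
    """Sparsify three update vectors with ONE index set.

    Assumption (not stated in the excerpt): the mask is the top-k of the
    model update delta_w by magnitude, and is reused for both moment updates.
    """
    idx = top_k_mask(delta_w, k)
    return {
        "indices": idx,
        "delta_w": [delta_w[i] for i in idx],
        "delta_m": [delta_m[i] for i in idx],
        "delta_v": [delta_v[i] for i in idx],
    }


def densify(indices, values, dim):
    """Server side: rebuild a dense vector, filling dropped entries with 0."""
    out = [0.0] * dim
    for i, v in zip(indices, values):
        out[i] = v
    return out


def uplink_scalars(dim, k=None, shared_mask=True):
    """Count scalars one device uploads per round (values plus indices)."""
    if k is None:
        return 3 * dim  # dense FedAdam: model update + two moment updates
    index_sets = 1 if shared_mask else 3
    return 3 * k + index_sets * k


if __name__ == "__main__":
    dw = [0.1, -5.0, 2.0, 0.0]
    dm = [1.0, 2.0, 3.0, 4.0]
    dv = [5.0, 6.0, 7.0, 8.0]
    payload = sparsify_with_shared_mask(dw, dm, dv, k=2)
    print(payload["indices"])
    print(densify(payload["indices"], payload["delta_w"], len(dw)))
    print(uplink_scalars(100), uplink_scalars(100, 10, True), uplink_scalars(100, 10, False))
```

Under these assumptions, a dense FedAdam device uploads three vectors of the model dimension, matching the threefold overhead relative to FedSGD noted in the abstract. A shared mask sends one index set instead of three, which is where the extra saving over independently masked vectors comes from.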

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · IoT and Edge/Fog Computing

Methods: Stochastic Gradient Descent · Adam