Delayed Random Partial Gradient Averaging for Federated Learning

Xinyi Hu

arXiv:2412.19987·cs.LG·December 31, 2024

TL;DR

This paper introduces DPGA, a federated learning method that reduces communication costs and latency by sharing partial gradients and enabling parallel computation, improving scalability and efficiency.

Contribution

The paper proposes a novel DPGA method that shares partial gradients and refines update rates over time to enhance federated learning scalability and efficiency.

Findings

01 DPGA reduces communication overhead in federated learning.

02 DPGA enables parallel computation to decrease system run time.

03 Experimental results on CIFAR datasets validate DPGA's effectiveness.

Abstract

Federated learning (FL) is a distributed machine learning paradigm that enables multiple clients to train a shared model collaboratively while preserving privacy. However, the scaling of real-world FL systems is often limited by two communication bottlenecks: (a) while the increasing computing power of edge devices enables the deployment of large-scale Deep Neural Networks (DNNs), limited bandwidth constrains frequent transmission of large DNNs; and (b) high latency greatly degrades the performance of FL. In light of these bottlenecks, we propose Delayed Random Partial Gradient Averaging (DPGA) to enhance FL. Under DPGA, clients share only partial local model gradients with the server. The size of the shared part of a local model is determined by the update rate, which is coarsely initialized and subsequently refined over time. Moreover, DPGA largely…
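The partial-sharing idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not say how the shared coordinates are chosen or how the server combines them, so uniform random selection and per-coordinate averaging over the clients that reported each coordinate are assumptions here. The function names `select_partial_gradient` and `average_partial_gradients` are hypothetical, and the delay mechanism and the refinement of the update rate are not modeled.

```python
import math
import random


def select_partial_gradient(gradient, update_rate, rng):
    """Pick a random subset of gradient coordinates to share.

    Assumption: the shared part is a uniform random sample whose size is
    ceil(update_rate * len(gradient)). Returns {index: value}.
    """
    if not 0.0 < update_rate <= 1.0:
        raise ValueError("update_rate must be in (0, 1]")
    k = max(1, math.ceil(update_rate * len(gradient)))
    indices = rng.sample(range(len(gradient)), k)
    return {i: gradient[i] for i in indices}


def average_partial_gradients(partials, dim):
    """Average each coordinate over the clients that shared it.

    Assumption: coordinates that no client shared get a zero update.
    """
    sums = [0.0] * dim
    counts = [0] * dim
    for partial in partials:
        for index, value in partial.items():
            sums[index] += value
            counts[index] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


if __name__ == "__main__":
    rng = random.Random(0)
    client_gradients = [[1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5]]
    shared = [select_partial_gradient(g, 0.5, rng) for g in client_gradients]
    print(average_partial_gradients(shared, dim=4))
```

With an update rate of 0.5, each client uploads half of its gradient entries (plus their indices), which is where the communication saving would come from in this simplified setting.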

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Random Matrices and Applications