FedCGD: Collective Gradient Divergence Optimized Scheduling for Wireless Federated Learning

Tan Chen; Jintao Yan; Yuxuan Sun; Sheng Zhou; Zhisheng Niu

arXiv:2506.07581·cs.LG·June 10, 2025

Open Access

TL;DR

This paper introduces FedCGD, a novel device scheduling algorithm for wireless federated learning that optimizes collective gradient divergence to improve convergence speed and classification accuracy while reducing device participation.

Contribution

It proves that collective gradient divergence affects FL convergence, models this divergence using the weighted earth moving distance (WEMD), and proposes an algorithm that balances divergence against sampling variance.

Findings

01. Increases CIFAR-10 accuracy by up to 4.2%.

02. Schedules 41.8% fewer devices.

03. Flexibly balances WEMD reduction and sampling variance.

Abstract

Federated learning (FL) is a promising paradigm for multiple devices to cooperatively train a model. When applied in wireless networks, two issues consistently affect the performance of FL: data heterogeneity across devices and limited bandwidth. Many papers have investigated device scheduling strategies that consider both issues. However, most of them treat data heterogeneity as a property of individual devices. In this paper, we prove that the convergence speed of FL is affected by the sum of device-level and sample-level collective gradient divergence (CGD). The device-level CGD refers to the gradient divergence of the scheduled device group as a whole, rather than the sum of the individual device divergences. The sample-level CGD is statistically upper bounded by the sampling variance, which is inversely proportional to the total number of samples scheduled for local update. To derive a…
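The distinction between group-level divergence and the sum of individual divergences can be illustrated with a small sketch. This is not the paper's algorithm: it assumes a classification task, uses the plain L1 distance between label distributions as a stand-in for the paper's WEMD, and uses 1 / (total scheduled samples) as a proxy for the sampling variance term. All function names and the toy data are hypothetical.

```python
import numpy as np

def label_distribution(counts):
    """Normalize per-class sample counts into a probability vector."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum()

def group_divergence(device_counts, scheduled, global_dist):
    """L1 distance between the pooled label distribution of the
    scheduled group and the global distribution (assumed stand-in for WEMD)."""
    pooled = np.sum([device_counts[i] for i in scheduled], axis=0)
    return float(np.abs(label_distribution(pooled) - global_dist).sum())

def sum_individual_divergence(device_counts, scheduled, global_dist):
    """Sum of each scheduled device's own L1 distance to the global distribution."""
    return float(sum(
        np.abs(label_distribution(device_counts[i]) - global_dist).sum()
        for i in scheduled
    ))

def sampling_variance_proxy(device_counts, scheduled):
    """Assumed proxy: inversely proportional to the total scheduled samples."""
    total = sum(np.sum(device_counts[i]) for i in scheduled)
    return 1.0 / float(total)

# Toy data: rows are devices, columns are per-class sample counts.
device_counts = np.array([
    [100, 0],   # device 0 holds only class 0
    [0, 100],   # device 1 holds only class 1
    [50, 50],   # device 2 is balanced
])
global_dist = label_distribution(device_counts.sum(axis=0))

# Devices 0 and 1 are each highly skewed, yet together they match
# the global distribution exactly.
print(group_divergence(device_counts, [0, 1], global_dist))
print(sum_individual_divergence(device_counts, [0, 1], global_dist))
print(sampling_variance_proxy(device_counts, [0, 1]))
print(sampling_variance_proxy(device_counts, [2]))
```

In this toy setting, scheduling the two skewed devices together gives zero group-level divergence even though each device diverges individually, and it also yields a smaller variance proxy than scheduling the single balanced device, because more samples take part in the update.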

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Advanced Data and IoT Technologies · Stochastic Gradient Optimization Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings