FedCGD: Collective Gradient Divergence Optimized Scheduling for Wireless Federated Learning
Tan Chen, Jintao Yan, Yuxuan Sun, Sheng Zhou, Zhisheng Niu

TL;DR
This paper introduces FedCGD, a novel device scheduling algorithm for wireless federated learning that optimizes collective gradient divergence to improve convergence speed and classification accuracy while reducing device participation.
Contribution
The paper proves that collective gradient divergence affects FL convergence, models it using the weighted earth moving distance (WEMD), and proposes an algorithm that balances divergence against sampling variance.
Findings
Increases CIFAR-10 accuracy by up to 4.2%.
Schedules 41.8% fewer devices.
Flexibly balances WEMD reduction and sampling variance.
Abstract
Federated learning (FL) is a promising paradigm for multiple devices to cooperatively train a model. When applied in wireless networks, two issues consistently affect the performance of FL, i.e., data heterogeneity of devices and limited bandwidth. Many papers have investigated device scheduling strategies considering the two issues. However, most of them recognize data heterogeneity as a property of individual devices. In this paper, we prove that the convergence speed of FL is affected by the sum of device-level and sample-level collective gradient divergence (CGD). The device-level CGD refers to the gradient divergence of the scheduled device group, instead of the sum of the individual device divergence. The sample-level CGD is statistically upper bounded by sampling variance, which is inversely proportional to the total number of samples scheduled for local update. To derive a…
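The two quantities named in the abstract can be illustrated with a small Python sketch. It is not the paper's algorithm: `group_divergence` is a hypothetical, unweighted stand-in for WEMD (the exact weighting is not given in this summary), and the label counts, the Gaussian noise model, and all function names are assumptions made for illustration. The first part shows that a scheduled group can have zero collective divergence even when each device alone is strongly skewed. The second part checks numerically that the variance of a sample mean shrinks roughly in proportion to 1/n.

```python
import numpy as np


def label_distribution(counts):
    """Normalize per-class sample counts into a probability vector."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum()


def group_divergence(device_counts, global_dist):
    """L1 distance between the pooled label distribution of a device group
    and the global label distribution (assumed, simplified stand-in for WEMD)."""
    pooled = np.sum(np.asarray(device_counts, dtype=float), axis=0)
    return float(np.abs(label_distribution(pooled) - global_dist).sum())


def variance_of_sample_mean(n, trials, rng):
    """Empirical variance of the mean of n i.i.d. standard normal draws."""
    means = rng.standard_normal((trials, n)).mean(axis=1)
    return float(means.var())


# Device-level view: two devices with complementary label skew (hypothetical data).
device_a = [90, 10]
device_b = [10, 90]
global_dist = np.array([0.5, 0.5])

individual = [group_divergence([d], global_dist) for d in (device_a, device_b)]
collective = group_divergence([device_a, device_b], global_dist)

# Sample-level view: ten times more samples gives roughly ten times less variance.
rng = np.random.default_rng(0)
var_small = variance_of_sample_mean(10, 20000, rng)
var_large = variance_of_sample_mean(100, 20000, rng)
ratio = var_small / var_large

print("individual divergences:", individual)
print("collective divergence:", collective)
print("variance ratio (n=10 vs n=100):", ratio)
```

In this toy setting, summing the individual divergences would overstate the heterogeneity of the scheduled group, because the two skews cancel once the devices are pooled. That is the distinction the abstract draws between group-level and per-device divergence.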
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Advanced Data and IoT Technologies · Stochastic Gradient Optimization Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
