High-Dimensional Stochastic Gradient Quantization for Communication-Efficient Edge Learning

Yuqing Du; Sheng Yang; Kaibin Huang

arXiv:1910.03865·cs.IT·June 24, 2020

TL;DR

This paper introduces a hierarchical stochastic gradient quantization framework for federated edge learning. Combined with a bit-allocation strategy, it reduces communication overhead relative to signSGD while maintaining similar learning accuracy.

Contribution

The paper proposes a hierarchical quantization scheme, supported by a convergence analysis, and a bit-allocation method, with the aim of improving communication efficiency in high-dimensional federated learning.

Findings

01 Reduces communication overhead compared to signSGD.

02 Maintains similar learning accuracy with fewer bits.

03 Provides convergence analysis of the quantization scheme.

Abstract

Edge machine learning involves the deployment of learning algorithms at the wireless network edge so as to leverage massive mobile data for enabling intelligent applications. The mainstream edge learning approach, federated learning, has been developed based on distributed gradient descent. Based on the approach, stochastic gradients are computed at edge devices and then transmitted to an edge server for updating a global AI model. Since each stochastic gradient is typically high-dimensional (with millions to billions of coefficients), communication overhead becomes a bottleneck for edge learning. To address this issue, we propose in this work a novel framework of hierarchical stochastic gradient quantization and study its effect on the learning performance. First, the framework features a practical hierarchical architecture for decomposing the stochastic gradient into its norm and…
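To make the communication bottleneck concrete, the following is a minimal Python sketch of the setting the abstract describes: devices compute stochastic gradients, compress them, and send them to a server that updates a global model. It does not implement the paper's hierarchical scheme. As a stand-in it uses the signSGD-style baseline mentioned in the findings (one sign bit per coefficient, aggregated by majority vote). The toy linear-regression task, the function names, and all parameter values are assumptions chosen for illustration only.

```python
import random


def stochastic_gradient(w, batch):
    """Mini-batch gradient of the squared error 0.5 * (w.x - y)^2."""
    grad = [0.0] * len(w)
    for x, y in batch:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j, xj in enumerate(x):
            grad[j] += err * xj / len(batch)
    return grad


def sign_compress(grad):
    """Keep only the sign of each coefficient (1 bit per coefficient)."""
    return [1 if g >= 0 else -1 for g in grad]


def majority_vote(signs_per_device):
    """Server-side aggregation: elementwise sign of the summed votes."""
    dim = len(signs_per_device[0])
    return [1 if sum(s[j] for s in signs_per_device) >= 0 else -1
            for j in range(dim)]


def uplink_bits(dim, bits_per_coefficient):
    """Bits one device sends per round for a dim-dimensional gradient."""
    return dim * bits_per_coefficient


def run(num_devices=5, dim=4, rounds=300, lr=0.01, seed=0):
    """Toy federated training loop (hypothetical setup, not the paper's)."""
    rng = random.Random(seed)
    w_true = [rng.uniform(-1, 1) for _ in range(dim)]
    # Each device holds its own local dataset of noiseless linear samples.
    datasets = []
    for _ in range(num_devices):
        local = []
        for _ in range(50):
            x = [rng.gauss(0, 1) for _ in range(dim)]
            local.append((x, sum(a * b for a, b in zip(w_true, x))))
        datasets.append(local)

    w = [0.0] * dim
    initial_err = max(abs(a - b) for a, b in zip(w, w_true))
    for _ in range(rounds):
        # Devices: compute a stochastic gradient and transmit only its signs.
        votes = [sign_compress(stochastic_gradient(w, rng.sample(local, 10)))
                 for local in datasets]
        # Server: aggregate the votes and update the global model.
        direction = majority_vote(votes)
        w = [wi - lr * d for wi, d in zip(w, direction)]
    final_err = max(abs(a - b) for a, b in zip(w, w_true))
    return initial_err, final_err


if __name__ == "__main__":
    before, after = run()
    print("max coefficient error before/after:", before, after)
    print("bits per device per round, float32:", uplink_bits(10**6, 32))
    print("bits per device per round, sign only:", uplink_bits(10**6, 1))
```

For a gradient with one million coefficients, sending 32-bit floats costs 32 million bits per device per round, while sign-only compression costs one million. This 1-bit-per-coefficient baseline is the reference point against which the findings above report a further reduction.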
