$\gamma$-FedHT: Stepsize-Aware Hard-Threshold Gradient Compression in Federated Learning
Rongwei Lu, Yutong Jiang, Jinrui Zhang, Chunyang Li, Yifei Zhu, Bin Chen, Zhi Wang

TL;DR
This paper introduces $\gamma$-FedHT, a stepsize-aware gradient compression method with Error-Feedback for federated learning, improving accuracy and convergence under communication constraints and non-IID data.
Contribution
It proposes a novel stepsize-aware hard-threshold compressor with Error-Feedback, providing convergence guarantees and better accuracy in federated learning with non-IID datasets.
Findings
Achieves up to 7.42% accuracy improvement over Top-$k$ compression.
Maintains convergence rates comparable to FedAvg under convex and non-convex settings.
Effectively balances communication efficiency and model accuracy in non-IID federated learning.
Abstract
Gradient compression can effectively alleviate communication bottlenecks in Federated Learning (FL). Contemporary state-of-the-art sparse compressors, such as Top-$k$, exhibit high computational complexity, up to $\mathcal{O}(d\log_2 k)$, where $d$ is the number of model parameters. The hard-threshold compressor, which simply transmits elements with absolute values higher than a fixed threshold, is thus proposed to reduce the complexity to $\mathcal{O}(d)$. However, the hard-threshold compression causes accuracy degradation in FL, where the datasets are non-IID and the stepsize is decreasing for model convergence. The decaying stepsize reduces the updates and causes the compression ratio of the hard-threshold compression to drop rapidly to an aggressive ratio. At or below this ratio, the model accuracy has been observed to degrade severely. To address this, we propose…
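The mechanism the abstract describes can be illustrated with a small sketch. It is not the authors' $\gamma$-FedHT compressor, whose definition is cut off above. It only contrasts a plain fixed-threshold compressor with Top-$k$ and shows how a smaller stepsize shrinks the set of transmitted entries. The function names, the example gradient, the threshold `lam`, and the particular Error-Feedback form (carry untransmitted entries into the next round) are all assumptions made for illustration.

```python
import heapq

def hard_threshold(update, lam):
    """Keep entries whose absolute value exceeds lam: a single O(d) pass."""
    return {i: v for i, v in enumerate(update) if abs(v) > lam}

def top_k(update, k):
    """Keep the k largest-magnitude entries via a size-k heap: O(d log k)."""
    idx = heapq.nlargest(k, range(len(update)), key=lambda i: abs(update[i]))
    return {i: update[i] for i in idx}

def compress_with_error_feedback(grad, stepsize, residual, lam):
    """One client step (assumed form): add the carried-over error,
    apply the fixed threshold, and keep whatever was not sent."""
    corrected = [stepsize * g + e for g, e in zip(grad, residual)]
    sent = hard_threshold(corrected, lam)
    new_residual = [0.0 if i in sent else c for i, c in enumerate(corrected)]
    return sent, new_residual

# Hypothetical gradient and threshold, chosen only to make the effect visible.
grad = [0.9, -0.05, 0.4, -0.7, 0.02, 0.3]
lam = 0.25
zero = [0.0] * len(grad)

# Same gradient, same threshold, two stepsizes.
sent_large, _ = compress_with_error_feedback(grad, 1.0, zero, lam)
sent_small, res_small = compress_with_error_feedback(grad, 0.5, zero, lam)

print(len(sent_large), len(sent_small))  # fewer entries pass at the smaller stepsize
print(top_k(grad, 2))                    # Top-k always sends exactly k entries
```

With a fixed threshold, halving the stepsize halves every scaled entry, so fewer of them clear `lam` and the compression becomes more aggressive. Top-$k$ avoids this by fixing the count instead of the threshold, at the cost of the selection step.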
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications
