$\gamma$-FedHT: Stepsize-Aware Hard-Threshold Gradient Compression in Federated Learning

Rongwei Lu; Yutong Jiang; Jinrui Zhang; Chunyang Li; Yifei Zhu; Bin Chen; Zhi Wang

arXiv:2505.12479·cs.LG·May 20, 2025

Open Access

TL;DR

This paper introduces $\gamma$-FedHT, a stepsize-aware gradient compression method with Error-Feedback for federated learning, improving accuracy and convergence under communication constraints and non-IID data.

Contribution

It proposes a novel stepsize-aware hard-threshold compressor with Error-Feedback, providing convergence guarantees and better accuracy in federated learning with non-IID datasets.

Findings

01

Achieves up to 7.42% accuracy improvement over Top-$k$ compression.

02

Maintains convergence rates comparable to FedAvg under convex and non-convex settings.

03

Effectively balances communication efficiency and model accuracy in non-IID federated learning.

Abstract

Gradient compression can effectively alleviate communication bottlenecks in Federated Learning (FL). Contemporary state-of-the-art sparse compressors, such as Top-$k$, exhibit high computational complexity, up to $O(d \log_{2} k)$, where $d$ is the number of model parameters. The hard-threshold compressor, which simply transmits elements with absolute values higher than a fixed threshold, is thus proposed to reduce the complexity to $O(d)$. However, the hard-threshold compression causes accuracy degradation in FL, where the datasets are non-IID and the stepsize $\gamma$ is decreasing for model convergence. The decaying stepsize reduces the updates and causes the compression ratio of the hard-threshold compression to drop rapidly to an aggressive ratio. At or below this ratio, the model accuracy has been observed to degrade severely. To address this, we propose…
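The effect described in the abstract can be illustrated with a minimal sketch. It covers only the fixed hard-threshold baseline and a Top-$k$ baseline, not the proposed $\gamma$-FedHT compressor, whose definition is cut off above. The function names `hard_threshold` and `top_k`, the threshold `lam`, the synthetic Gaussian gradient, and the stepsize values are all assumptions made for illustration.

```python
import numpy as np

def hard_threshold(update, lam):
    """Keep entries whose magnitude is strictly above lam (one O(d) pass)."""
    mask = np.abs(update) > lam
    return update * mask, int(mask.sum())

def top_k(update, k):
    """Keep the k largest-magnitude entries (selection-based baseline)."""
    idx = np.argpartition(np.abs(update), -k)[-k:]
    out = np.zeros_like(update)
    out[idx] = update[idx]
    return out

# Synthetic stand-in for a local gradient (assumption: i.i.d. Gaussian).
rng = np.random.default_rng(0)
grad = rng.standard_normal(10_000)

lam = 0.05          # fixed threshold (hypothetical value)
kept_counts = []    # number of transmitted entries per stepsize
for gamma in (0.1, 0.05, 0.01):   # decaying stepsize (hypothetical schedule)
    # The update is gamma * grad, so a smaller gamma shrinks every entry
    # and fewer entries survive the same fixed threshold.
    _, kept = hard_threshold(gamma * grad, lam)
    kept_counts.append(kept)

sparse_topk = top_k(grad, 100)    # Top-k always keeps exactly k entries
```

In this sketch the number of entries kept by the fixed threshold falls as `gamma` decays, while Top-$k$ keeps a constant `k` regardless of the stepsize. That contrast is the behavior the abstract attributes to the hard-threshold compressor under a decreasing stepsize.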

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications