Magnitude Matters: Fixing SIGNSGD Through Magnitude-Aware Sparsification in the Presence of Data Heterogeneity

Richeng Jin; Xiaofan He; Caijun Zhong; Zhaoyang Zhang; Tony Quek; Huaiyu Dai

arXiv:2302.09634 · cs.LG · February 21, 2023

Open Access

TL;DR

This paper introduces a magnitude-aware sparsification method for SIGNSGD that addresses convergence issues caused by data heterogeneity in federated learning, improving communication efficiency without requiring error feedback.

Contribution

The paper proposes a novel magnitude-driven sparsification scheme for SIGNSGD that ensures convergence under data heterogeneity and enhances communication efficiency in federated learning.

Findings

1. Convergence is achieved with the proposed sparsification scheme.
2. Communication overhead is reduced compared to traditional methods.
3. Experimental results validate improved performance on multiple datasets.

Abstract

Communication overhead has become one of the major bottlenecks in the distributed training of deep neural networks. To alleviate the concern, various gradient compression methods have been proposed, and sign-based algorithms are of surging interest. However, SIGNSGD fails to converge in the presence of data heterogeneity, which is commonly observed in the emerging federated learning (FL) paradigm. Error feedback has been proposed to address the non-convergence issue. Nonetheless, it requires the workers to locally keep track of the compression errors, which renders it not suitable for FL since the workers may not participate in the training throughout the learning process. In this paper, we propose a magnitude-driven sparsification scheme, which addresses the non-convergence issue of SIGNSGD while further improving communication efficiency. Moreover, the local update scheme is further…
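
The non-convergence issue described in the abstract can be illustrated with a toy, one-coordinate example. The sketch below is not the paper's algorithm: the function `magnitude_aware_ternary`, the shared `bound`, and the gradient values are all hypothetical and chosen only to show why discarding magnitudes can flip the aggregated direction, and how keeping a sign with probability proportional to its magnitude can restore the correct direction on average.

```python
import random


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def majority_vote(grads):
    """Plain sign-based aggregation: sign of the sum of the workers' signs."""
    return sign(sum(sign(g) for g in grads))


def magnitude_aware_ternary(g, bound, rng):
    """Hypothetical sparsifier (an assumption, not the paper's exact scheme):
    transmit sign(g) with probability |g| / bound, otherwise transmit 0."""
    keep_prob = min(abs(g) / bound, 1.0)
    return sign(g) if rng.random() < keep_prob else 0


# One coordinate, three workers with heterogeneous local gradients (made up).
grads = [-1.0, -1.0, 3.0]

true_direction = sign(sum(grads))  # sign of the average gradient
vote = majority_vote(grads)        # what plain sign aggregation produces

# Average the sparsified aggregate over many rounds to estimate its mean.
rng = random.Random(0)
bound = max(abs(g) for g in grads)  # assumed known to every worker
trials = 20000
avg_sparse = sum(
    sum(magnitude_aware_ternary(g, bound, rng) for g in grads)
    for _ in range(trials)
) / trials

print("sign of mean gradient:", true_direction)
print("majority vote of signs:", vote)
print("mean sparsified aggregate:", round(avg_sparse, 3))
```

In this toy setting the two small negative gradients outvote the single large positive one, so the majority vote points away from the average gradient. The sparsified message of each worker has expectation `g / bound`, so the expected aggregate is proportional to the sum of the gradients and keeps its sign, and zeros need not be transmitted at all.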

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Machine Learning and ELM · Stochastic Gradient Optimization Techniques