Distributed Learning With Sparsified Gradient Differences

Yicheng Chen, Rick S. Blum, Martin Takac, and Brian M. Sadler

arXiv:2202.02491 · cs.LG · February 8, 2022

Open Access

TL;DR

GD-SEC is a distributed learning algorithm that reduces communication cost by having each worker transmit sparsified gradient differences with error correction, while maintaining convergence speed and accuracy across a range of optimization problems.

Contribution

This paper introduces GD-SEC, a gradient descent method for worker-server architectures that reduces the number of bits per worker-to-server communication without degrading the order of the convergence rate or the accuracy.

Findings

01. GD-SEC achieves a convergence rate of the same order as standard gradient descent.

02. GD-SEC transmits significantly fewer bits from workers to the server than existing algorithms.

03. Numerical experiments validate the effectiveness and efficiency of GD-SEC.

Abstract

A very large number of communications are typically required to solve distributed learning tasks, and this critically limits scalability and convergence speed in wireless communications applications. In this paper, we devise a Gradient Descent method with Sparsification and Error Correction (GD-SEC) to improve the communications efficiency in a general worker-server architecture. Motivated by a variety of wireless communications learning scenarios, GD-SEC reduces the number of bits per communication from worker to server with no degradation in the order of the convergence rate. This enables larger-scale model learning without sacrificing convergence or accuracy. At each iteration of GD-SEC, instead of directly transmitting the entire gradient vector, each worker computes the difference between its current gradient and a linear combination of its previously transmitted gradients, and…
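The worker-side step described in the abstract can be illustrated with a minimal sketch. The abstract is truncated here, so everything beyond "difference between the current gradient and a linear combination of previously transmitted gradients" is an assumption made for illustration: the magnitude threshold used for sparsification, the exponential weighting `beta` used to form the linear combination, the error-feedback rule, and all names (`WorkerState`, `sparsified_difference`) are hypothetical and should not be read as the paper's exact algorithm.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WorkerState:
    """Per-worker memory for this sketch (all names are hypothetical)."""
    dim: int
    beta: float = 0.5        # assumed weight of the running linear combination
    threshold: float = 0.1   # assumed magnitude cutoff for sparsification
    reference: List[float] = field(init=False)  # combination of past reconstructions
    error: List[float] = field(init=False)      # suppressed components carried forward

    def __post_init__(self) -> None:
        self.reference = [0.0] * self.dim
        self.error = [0.0] * self.dim


def sparsified_difference(state: WorkerState, grad: List[float]) -> List[float]:
    """Return the sparse message a worker would send for one iteration."""
    message = []
    for i, g in enumerate(grad):
        # Difference from the reference, plus whatever was suppressed earlier.
        delta = g - state.reference[i] + state.error[i]
        # Assumed sparsification rule: drop components below the threshold.
        sent = delta if abs(delta) >= state.threshold else 0.0
        # Error correction: remember the part that was not transmitted.
        state.error[i] = delta - sent
        # The server can mirror this update, since it only depends on `sent`.
        state.reference[i] += state.beta * sent
        message.append(sent)
    return message


state = WorkerState(dim=3)
first = sparsified_difference(state, [1.0, 0.05, -0.5])
second = sparsified_difference(state, [1.0, 0.06, -0.5])
```

In this sketch the second component is suppressed in the first round because it falls below the threshold, but its residual is carried forward and pushes the accumulated difference over the threshold in the second round. Zero entries in the message are the ones a worker would not need to transmit.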

Taxonomy

Topics: Cooperative Communication and Network Coding · Energy Harvesting in Wireless Networks · Advanced MIMO Systems Optimization
