Temporal Predictive Coding for Gradient Compression in Distributed Learning

Adrian Edin, Zheng Chen, Michel Kieffer, and Mikael Johansson

arXiv:2410.02478 · cs.IT · October 4, 2024

TL;DR

This paper introduces a prediction-based gradient compression method for distributed learning that leverages temporal correlation in gradients to reduce communication costs while maintaining convergence.

Contribution

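The paper proposes a gradient compression scheme that combines a linear predictor with event-triggered communication: each agent always sends its predictor coefficients to the parameter server and sends the prediction residual only when the residual's norm exceeds a threshold.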

Findings

01 Achieves a significant reduction in communication without sacrificing convergence.

02 Outperforms existing gradient compression methods in experiments.

03 Maintains model accuracy with less data transmitted.

Abstract

This paper proposes a prediction-based gradient compression method for distributed learning with event-triggered communication. Our goal is to reduce the amount of information transmitted from the distributed agents to the parameter server by exploiting temporal correlation in the local gradients. We use a linear predictor that combines past gradients to form a prediction of the current gradient, with coefficients that are optimized by solving a least-squares problem. In each iteration, every agent transmits the predictor coefficients to the server so that the predicted local gradient can be computed. The difference between the true local gradient and the predicted one, termed the prediction residual, is only transmitted when its norm is above some threshold. When this additional communication step is omitted, the server uses the prediction as the estimated gradient.…
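
The scheme described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`fit_predictor`, `agent_step`, `server_step`), the array layout, and the choice to let the agent and the server predict from the same history of past gradient estimates are all hypothetical.

```python
import numpy as np


def fit_predictor(history, current):
    """Least-squares coefficients a minimizing ||current - history.T @ a||.

    history: (p, d) array, one past gradient estimate per row (assumed layout).
    current: (d,) array, the true local gradient at this iteration.
    """
    coeffs, *_ = np.linalg.lstsq(history.T, current, rcond=None)
    return coeffs


def agent_step(history, current, threshold):
    """Return what the agent transmits: coefficients, plus residual if triggered."""
    coeffs = fit_predictor(history, current)
    residual = current - history.T @ coeffs
    # Event trigger: send the residual only when its norm exceeds the threshold.
    if np.linalg.norm(residual) > threshold:
        return coeffs, residual
    return coeffs, None


def server_step(history, coeffs, residual):
    """Reconstruct the gradient estimate from the received message."""
    prediction = history.T @ coeffs
    # Without a residual, the prediction itself serves as the estimate.
    return prediction if residual is None else prediction + residual


# Hypothetical example: two past gradients in three dimensions.
history = np.array([[1.0, 0.0, 2.0],
                    [0.0, 1.0, 1.0]])
current = 0.5 * history[0] + 0.25 * history[1]  # perfectly predictable
coeffs, residual = agent_step(history, current, threshold=1e-6)
estimate = server_step(history, coeffs, residual)
```

In this sketch every iteration costs p coefficients, and the full d-dimensional residual is sent only when the trigger fires, so the saving depends on how well past gradients predict the current one.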

Taxonomy

Topics: Advanced Data Compression Techniques · Neural Networks and Applications