Preserved central model for faster bidirectional compression in distributed settings

Constantin Philippenko; Aymeric Dieuleveut

arXiv:2102.12528·cs.LG·June 17, 2022

TL;DR

This paper introduces MCM, a distributed learning algorithm that compresses communication in both directions while preserving the global model on the central server, and that achieves the same convergence rate as algorithms compressing only the uplink.

Contribution

The paper presents MCM, an algorithm in which downlink compression affects only the local models: the global model on the central server is preserved, and the convergence rate matches that of uplink-only compression.

Findings

01 Achieves the same convergence rate as uplink-only compression algorithms.

02 Introduces a memory mechanism to control the perturbation of local models.

03 Enables the incorporation of worker-dependent randomized models and partial participation.

Abstract

We develop a new approach to tackle communication constraints in a distributed learning problem with a central server. We propose and analyze a new algorithm that performs bidirectional compression and achieves the same convergence rate as algorithms using only uplink (from the local workers to the central server) compression. To obtain this improvement, we design MCM, an algorithm such that the downlink compression only impacts local models, while the global model is preserved. As a result, and contrary to previous works, the gradients on local servers are computed on perturbed models. Consequently, convergence proofs are more challenging and require precise control of this perturbation. To ensure it, MCM additionally combines model compression with a memory mechanism. This analysis opens new doors, e.g., incorporating worker-dependent randomized models and partial participation.
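
The mechanism described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `rand_sparsify` and `mcm_sketch`, the parameters `p_up`, `p_down`, and `alpha`, and the choice of random sparsification as the compressor are all assumptions made for illustration, and the uplink is compressed without memory for brevity. The sketch shows the three ingredients the abstract names: a global model that is never compressed, a perturbed local model on which gradients are computed, and a downlink memory.

```python
import random


def rand_sparsify(vec, p, rng):
    """Unbiased compressor (assumed for illustration): keep each
    coordinate with probability p and rescale it by 1/p."""
    return [x / p if rng.random() < p else 0.0 for x in vec]


def mcm_sketch(grads, w0, steps, lr, p_up, p_down, alpha, seed=0):
    """Bidirectional compression with a preserved central model.

    grads  : list of per-worker gradient functions, each mapping a model
             (list of floats) to a gradient (list of floats)
    alpha  : hypothetical step size of the downlink memory
    Returns the global model kept on the server.
    """
    rng = random.Random(seed)
    w = list(w0)              # global model: lives on the server, never compressed
    h = [0.0] * len(w0)       # downlink memory, known to server and workers

    for _ in range(steps):
        # Downlink: compress the gap between the global model and the memory.
        omega = rand_sparsify([wi - hi for wi, hi in zip(w, h)], p_down, rng)
        # Workers rebuild a perturbed local model from memory + compressed gap.
        w_hat = [hi + oi for hi, oi in zip(h, omega)]
        # Both sides move the memory toward the global model.
        h = [hi + alpha * oi for hi, oi in zip(h, omega)]

        # Uplink: each worker compresses its gradient taken at the perturbed model.
        agg = [0.0] * len(w)
        for g in grads:
            cg = rand_sparsify(g(w_hat), p_up, rng)
            agg = [a + c for a, c in zip(agg, cg)]

        # The server updates the uncompressed global model.
        w = [wi - lr * a / len(grads) for wi, a in zip(w, agg)]
    return w
```

With `p_up = p_down = 1.0` and `alpha = 1.0` nothing is dropped, the local model equals the global one, and the loop reduces to plain distributed gradient descent. Lowering `p_down` perturbs only `w_hat`, while `w` itself is still updated without downlink compression error being written into it.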

Code & Models

Videos

Preserved central model for faster bidirectional compression in distributed settings · SlidesLive

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Mobile Crowdsensing and Crowdsourcing