Preserved central model for faster bidirectional compression in distributed settings
Constantin Philippenko, Aymeric Dieuleveut

TL;DR
This paper introduces MCM, a distributed learning algorithm that compresses communication in both directions while keeping the global model on the central server uncompressed, and that matches the convergence rate of algorithms that compress only the uplink.
Contribution
The paper presents MCM, an algorithm in which downlink compression affects only the local models: the global model on the central server is preserved, and the convergence rate of uplink-only compression is retained.
Findings
Achieves the same convergence rate as algorithms that compress only the uplink.
Introduces a memory mechanism that controls the perturbation of the local models caused by downlink compression.
Opens the way to worker-dependent randomized models and partial participation.
Abstract
We develop a new approach to tackle communication constraints in a distributed learning problem with a central server. We propose and analyze a new algorithm that performs bidirectional compression and achieves the same convergence rate as algorithms using only uplink (from the local workers to the central server) compression. To obtain this improvement, we design MCM, an algorithm such that the downlink compression only impacts local models, while the global model is preserved. As a result, and contrary to previous works, the gradients on local servers are computed on perturbed models. Consequently, convergence proofs are more challenging and require a precise control of this perturbation. To ensure it, MCM additionally combines model compression with a memory mechanism. This analysis opens new doors, e.g. incorporating worker dependent randomized-models and partial participation.
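The mechanism described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the random-sparsification compressor, the noiseless least-squares toy problem, the plain (memory-free) uplink compression, and every name and hyperparameter (`rand_sparsify`, `make_workers`, `mcm_sketch`, `lr`, `alpha`, `keep_prob`) are assumptions made for illustration. What it tries to show is the structure: the server updates an uncompressed global model, workers only ever see a perturbed copy rebuilt from a compressed message plus a shared memory, and gradients are computed at that perturbed copy.

```python
import random


def rand_sparsify(vec, keep_prob, rng):
    """Unbiased compressor (assumed): keep each coordinate with probability
    keep_prob and rescale it by 1 / keep_prob, otherwise send zero."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in vec]


def local_gradient(w, data):
    """Gradient of the mean of 0.5 * (x . w - y)^2 over one worker's samples."""
    grad = [0.0] * len(w)
    for x, y in data:
        err = sum(xi * wi for xi, wi in zip(x, w)) - y
        for j, xj in enumerate(x):
            grad[j] += err * xj / len(data)
    return grad


def make_workers(w_star, n_workers, n_samples, seed):
    """Hypothetical noiseless toy data: each worker draws its own inputs."""
    rng = random.Random(seed)
    workers = []
    for _ in range(n_workers):
        data = []
        for _ in range(n_samples):
            x = [rng.uniform(-1.0, 1.0) for _ in w_star]
            y = sum(xi * wi for xi, wi in zip(x, w_star))
            data.append((x, y))
        workers.append(data)
    return workers


def mcm_sketch(workers, dim, steps=3000, lr=0.1, alpha=0.25, keep_prob=0.5, seed=0):
    rng = random.Random(seed)
    w = [0.0] * dim       # global model: stays uncompressed on the server
    memory = [0.0] * dim  # downlink memory, known to server and workers
    for _ in range(steps):
        # Downlink: compress the gap between the global model and the memory.
        diff = [wi - hi for wi, hi in zip(w, memory)]
        omega = rand_sparsify(diff, keep_prob, rng)
        # Workers rebuild a perturbed local model from the message and memory.
        w_hat = [hi + oi for hi, oi in zip(memory, omega)]
        # Both sides move the memory toward the global model.
        memory = [hi + alpha * oi for hi, oi in zip(memory, omega)]
        # Uplink: each worker compresses its gradient taken at the perturbed model.
        agg = [0.0] * dim
        for data in workers:
            g = rand_sparsify(local_gradient(w_hat, data), keep_prob, rng)
            agg = [a + gi / len(workers) for a, gi in zip(agg, g)]
        # The server applies the averaged compressed gradient to the global model.
        w = [wi - lr * ai for wi, ai in zip(w, agg)]
    return w
```

For example, `mcm_sketch(make_workers([1.0, -2.0, 0.5], 4, 10, seed=1), dim=3)` should return a vector close to the generating weights on this noiseless problem. The memory is what keeps the compressed quantity small: as it tracks the global model, the gap being compressed shrinks, and so does the perturbation seen by the workers.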
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Mobile Crowdsensing and Crowdsourcing
