Error Feedback Reloaded: From Quadratic to Arithmetic Mean of Smoothness Constants

Peter Richtárik, Elnur Gasanov, Konstantin Burlachenko

arXiv:2402.10774·cs.LG·February 19, 2024·1 cites

TL;DR

This paper sharpens the theory of EF21, a modern form of Error Feedback for distributed training. Its communication complexity is shown to depend on the arithmetic mean of the smoothness constants rather than on their quadratic mean. The gain is largest when the constants differ widely, as in heterogeneous data scenarios.

Contribution

We refine the analysis of EF21, replacing the dependence on the quadratic mean of the smoothness constants with a dependence on their arithmetic mean. We also introduce a new weighted variant of EF21 that avoids impractical cloning, and we extend the results to stochastic and partial participation settings.
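To make the gap between the two means concrete, here is a small illustrative sketch. The smoothness constants below are hypothetical values chosen for the example, not numbers from the paper; the only fact relied on is the standard inequality that the arithmetic mean never exceeds the quadratic mean, with equality when all values coincide.

```python
import math


def arithmetic_mean(values):
    """Plain average: (1/n) * sum of the values."""
    return sum(values) / len(values)


def quadratic_mean(values):
    """Root mean square: sqrt((1/n) * sum of the squared values)."""
    return math.sqrt(sum(v * v for v in values) / len(values))


# Hypothetical per-worker smoothness constants (assumption for illustration).
# One worker has a much larger constant, mimicking heterogeneous data.
smoothness = [1.0, 2.0, 10.0]

l_am = arithmetic_mean(smoothness)
l_qm = quadratic_mean(smoothness)

print(f"arithmetic mean: {l_am:.3f}")
print(f"quadratic mean:  {l_qm:.3f}")
```

When all constants are equal the two means coincide, so any improvement from switching means comes entirely from spread among the constants.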

Findings

01

Reduced the dependence of EF21's communication complexity from the quadratic mean to the arithmetic mean of the smoothness constants.

02

Developed a cloning-free weighted EF21 variant.

03

Validated theoretical improvements with experiments.

Abstract

Error Feedback (EF) is a highly popular and immensely effective mechanism for fixing convergence issues which arise in distributed training methods (such as distributed GD or SGD) when these are enhanced with greedy communication compression techniques such as TopK. While EF was proposed almost a decade ago (Seide et al., 2014), and despite concentrated effort by the community to advance the theoretical understanding of this mechanism, there is still a lot to explore. In this work we study a modern form of error feedback called EF21 (Richtárik et al., 2021) which offers the currently best-known theoretical guarantees, under the weakest assumptions, and also works well in practice. In particular, while the theoretical communication complexity of EF21 depends on the quadratic mean of certain smoothness parameters, we improve this dependence to their arithmetic mean, which is always…
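For readers unfamiliar with the mechanism, the following is a minimal sketch of EF21 with a TopK compressor, written from the usual description of the method (each worker keeps a gradient estimate and sends only a compressed correction to it). It is an illustration, not the authors' implementation and not the weighted variant from this paper. The toy quadratic objectives, the constants, the step size, and all function names are assumptions made for the example.

```python
import numpy as np


def top_k(v, k):
    """TopK compressor: keep the k largest-magnitude entries, zero the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out


def ef21(grad_fns, x0, step, k, iters):
    """Sketch of EF21: workers track estimates g_i and send compressed updates."""
    x = x0.copy()
    # Each worker initializes its estimate with a compressed local gradient.
    g_local = [top_k(grad(x), k) for grad in grad_fns]
    for _ in range(iters):
        # The server averages the workers' estimates and takes a step.
        g = np.mean(g_local, axis=0)
        x = x - step * g
        # Each worker compresses the difference between its fresh gradient
        # and its current estimate, then adds it to the estimate.
        g_local = [g_i + top_k(grad(x) - g_i, k)
                   for g_i, grad in zip(g_local, grad_fns)]
    return x


# Toy problem (assumption): f_i(x) = 0.5 * L_i * ||x - b_i||^2,
# so worker i has smoothness constant L_i and gradient L_i * (x - b_i).
dim = 5
constants = [1.0, 2.0, 10.0]
targets = [np.linspace(i, i + 1.0, dim) for i in range(len(constants))]
grad_fns = [lambda x, L=L, b=b: L * (x - b) for L, b in zip(constants, targets)]


def full_grad(x):
    """Gradient of the average objective, used only to monitor progress."""
    return np.mean([grad(x) for grad in grad_fns], axis=0)


x_start = np.zeros(dim)
# Conservative hand-picked step size for this toy problem (assumption).
x_final = ef21(grad_fns, x_start, step=0.01, k=1, iters=5000)

print("initial gradient norm:", np.linalg.norm(full_grad(x_start)))
print("final gradient norm:  ", np.linalg.norm(full_grad(x_final)))
```

Each worker transmits only k of the dim coordinates per round, which is the communication saving that TopK compression provides.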

Taxonomy

Topics: Statistical and numerical algorithms · Control Systems and Identification