BEER: Fast $O(1/T)$ Rate for Decentralized Nonconvex Optimization with Communication Compression
Haoyu Zhao, Boyue Li, Zhize Li, Peter Richtárik, Yuejie Chi

TL;DR
This paper introduces BEER, a communication-compressed decentralized optimization algorithm that achieves an optimal $O(1/T)$ convergence rate, improving over the $O((G/T)^{2/3})$ rate of previous methods, especially in heterogeneous data settings.
Contribution
BEER is the first algorithm to attain an $O(1/T)$ convergence rate with communication compression in nonconvex decentralized optimization, matching uncompressed rates even with data heterogeneity.
Findings
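BEER converges at an $O(1/T)$ rate, faster than the previous $O((G/T)^{2/3})$ rate, where $G$ measures the data heterogeneity across clients and $T$ is the number of communication rounds.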
Numerical results confirm BEER's superior performance in heterogeneous data scenarios.
BEER effectively combines communication compression with gradient tracking for improved efficiency.
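The combination of communication compression and gradient tracking named in the findings can be illustrated with a small sketch. This is not the authors' implementation: the top-k compressor, the ring mixing matrix, the toy quadratic local objectives, the step sizes `gamma` and `eta`, and all function names are assumptions chosen for illustration. Each agent keeps a compressed surrogate `H` of its iterate and `G` of its gradient tracker, and only compressed differences would be sent to neighbors.

```python
import numpy as np

def top_k(x, k):
    """Keep the k largest-magnitude entries of a 1-D vector, zero the rest."""
    out = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-k:]
    out[idx] = x[idx]
    return out

def compress_rows(M, k):
    """Apply top-k independently to each agent's row (hypothetical helper)."""
    return np.stack([top_k(row, k) for row in M])

def ring_mixing_matrix(n):
    """Doubly stochastic mixing matrix for a ring of n >= 3 agents (assumed topology)."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = W[i, (i - 1) % n] = W[i, (i + 1) % n] = 1.0 / 3.0
    return W

def beer_style_sketch(n=4, d=5, k=2, gamma=0.2, eta=0.05, steps=50, seed=0):
    """Compressed gradient tracking on toy objectives f_i(x) = 0.5 * ||x - b_i||^2.

    Rows of X, H, V, G hold one agent each. The b_i differ across agents,
    which mimics heterogeneous local data. All parameter values are assumptions.
    """
    rng = np.random.default_rng(seed)
    B = rng.normal(size=(n, d))          # heterogeneous local targets b_i
    grad = lambda X: X - B               # stacked local gradients
    L = ring_mixing_matrix(n) - np.eye(n)

    X = np.zeros((n, d))                 # local iterates
    H = np.zeros((n, d))                 # compressed surrogate of X
    V = grad(X)                          # gradient tracker, initialized to local gradients
    G = np.zeros((n, d))                 # compressed surrogate of V

    for _ in range(steps):
        X_new = X + gamma * (L @ H) - eta * V          # mix surrogates, then descend
        H = H + compress_rows(X_new - H, k)            # send only a compressed difference
        V = V + gamma * (L @ G) + grad(X_new) - grad(X)  # gradient tracking update
        G = G + compress_rows(V - G, k)                # compressed tracker difference
        X = X_new
    return X, V, B

if __name__ == "__main__":
    X, V, B = beer_style_sketch()
    # The average tracker should match the average local gradient.
    print(np.abs(V.mean(axis=0) - (X - B).mean(axis=0)).max())
```

Because the mixing matrix is doubly stochastic, the mixing terms cancel when averaged over agents, so the mean of `V` follows the mean local gradient regardless of what the compressor drops. The sketch makes no claim about convergence speed for these particular step sizes.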
Abstract
Communication efficiency has been widely recognized as the bottleneck for large-scale decentralized machine learning applications in multi-agent or federated environments. To tackle the communication bottleneck, there have been many efforts to design communication-compressed algorithms for decentralized nonconvex optimization, where the clients are only allowed to communicate a small amount of quantized information (aka bits) with their neighbors over a predefined graph topology. Despite significant efforts, the state-of-the-art algorithm in the nonconvex setting still suffers from a slower rate of convergence $O((G/T)^{2/3})$ compared with its uncompressed counterparts, where $G$ measures the data heterogeneity across different clients, and $T$ is the number of communication rounds. This paper proposes BEER, which adopts communication compression with gradient tracking, and shows it…
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Energy Efficient Wireless Sensor Networks
