Communication Compression for Byzantine Robust Learning: New Efficient Algorithms and Improved Rates

Ahmad Rammal; Kaja Gruntkowska; Nikita Fedin; Eduard Gorbunov; Peter Richtárik

arXiv:2310.09804·math.OC·March 12, 2024·1 cites

TL;DR

This paper introduces new Byzantine-robust algorithms with communication compression, achieving improved convergence rates and robustness in large-scale distributed and federated learning scenarios.

Contribution

It proposes two novel algorithms, Byz-DASHA-PAGE and Byz-EF21, with enhanced theoretical guarantees and practical performance for Byzantine-robust learning with compression.

Findings

01 Byz-DASHA-PAGE has better convergence rates than previous methods.

02 Byz-EF21 and Byz-EF21-BC demonstrate effective communication compression with error feedback.

03 Experimental results confirm the theoretical improvements.

Abstract

Byzantine robustness is an essential feature of algorithms for certain distributed optimization problems, typically encountered in collaborative/federated learning. These problems are usually huge-scale, implying that communication compression is also imperative for their resolution. These factors have spurred recent algorithmic and theoretical developments in the literature on Byzantine-robust learning with compression. In this paper, we contribute to this research area in two main directions. First, we propose a new Byzantine-robust method with compression, Byz-DASHA-PAGE, and prove that the new method has a better convergence rate (for non-convex and Polyak-Łojasiewicz smooth optimization problems), a smaller neighborhood size in the heterogeneous case, and tolerates more Byzantine workers under over-parametrization than the previous method with SOTA theoretical convergence guarantees…
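To make the two ingredients named in the abstract concrete, here is a minimal sketch that combines a compressor with a robust aggregator. It is not Byz-DASHA-PAGE, Byz-EF21, or any algorithm from the paper: the Top-K sparsifier, the coordinate-wise median, the function names `top_k` and `coordinate_median`, and all numbers are illustrative assumptions, chosen only to show why plain averaging fails when one worker sends arbitrary vectors.

```python
from statistics import median

def top_k(vec, k):
    """Top-K sparsifier (illustrative compressor): keep the k
    largest-magnitude entries and zero out the rest."""
    order = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vec)]

def coordinate_median(vectors):
    """Coordinate-wise median (illustrative robust aggregator)."""
    return [median(column) for column in zip(*vectors)]

# Hypothetical gradients from three honest workers.
honest = [
    [1.0, 0.25, -2.0],
    [1.5, 0.0, -1.5],
    [0.5, -0.25, -2.5],
]
# A Byzantine worker may send anything; here, a large outlier.
byzantine = [[100.0, 100.0, 100.0]]

# Honest workers compress before sending; the server aggregates robustly.
messages = [top_k(g, 2) for g in honest] + byzantine
robust = coordinate_median(messages)
naive = [sum(column) / len(messages) for column in zip(*messages)]

print("median aggregate:", robust)
print("mean aggregate:  ", naive)
```

In this toy setting the mean is dragged far from the honest gradients by a single outlier, while the coordinate-wise median stays close to them. The compressor also introduces its own error (the small second coordinate is dropped), which is the kind of distortion that error-feedback schemes are designed to correct.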

Code & Models

Repositories

nikosimus/cc-for-br-learning (official)

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Cooperative Communication and Network Coding