ByRDiE: Byzantine-resilient distributed coordinate descent for decentralized learning

Zhixiong Yang; Waheed U. Bajwa

arXiv:1708.08155·cs.LG·July 7, 2020

TL;DR

This paper introduces ByRDiE, a new distributed coordinate descent algorithm designed to be resilient against Byzantine failures, enabling reliable high-dimensional decentralized learning despite malicious or faulty network nodes.

Contribution

The paper presents the first practical Byzantine-resilient distributed coordinate descent algorithm for high-dimensional decentralized learning, with theoretical guarantees and empirical validation.

Findings

01. ByRDiE achieves Byzantine resilience in distributed learning.

02. The algorithm performs well in convex and nonconvex settings.

03. Numerical experiments confirm its effectiveness in high-dimensional scenarios.

Abstract

Distributed machine learning algorithms enable learning of models from datasets that are distributed over a network without gathering the data at a centralized location. While efficient distributed algorithms have been developed under the assumption of faultless networks, failures that can render these algorithms nonfunctional occur frequently in the real world. This paper focuses on the problem of Byzantine failures, which are the hardest to safeguard against in distributed algorithms. While Byzantine fault tolerance has a rich history, existing work does not translate into efficient and practical algorithms for high-dimensional learning in fully distributed (also known as decentralized) settings. In this paper, an algorithm termed Byzantine-resilient distributed coordinate descent (ByRDiE) is developed and analyzed that enables distributed learning in the presence of Byzantine…
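
The abstract as shown here does not spell out the update rule. To give a rough sense of how a Byzantine-resilient coordinate-wise update can work, the sketch below assumes a trimmed-mean screening step: a node sorts the values it receives for one coordinate, discards the b smallest and b largest, averages the rest with its own value, and then takes a gradient step on that coordinate. The function name, its parameters, and the assumption that the local gradient entry is supplied by the caller are all hypothetical and chosen for illustration; this is not taken from the paper's text above.

```python
from typing import Sequence


def screened_coordinate_update(
    own_value: float,
    neighbor_values: Sequence[float],
    max_byzantine: int,
    local_gradient_k: float,
    step_size: float,
) -> float:
    """Hypothetical screened update of one coordinate at one node.

    Assumes at most `max_byzantine` of the received values are corrupted.
    """
    b = max_byzantine
    if b < 0:
        raise ValueError("max_byzantine must be non-negative")
    if len(neighbor_values) <= 2 * b:
        raise ValueError("need more than 2*b neighbor values to screen")

    ordered = sorted(neighbor_values)
    # Drop the b smallest and b largest received values (trimmed-mean screening).
    kept = ordered[b:len(ordered) - b]
    # Average the surviving values together with the node's own value.
    consensus = (sum(kept) + own_value) / (len(kept) + 1)
    # Gradient step on this single coordinate.
    return consensus - step_size * local_gradient_k


if __name__ == "__main__":
    # Two wildly wrong values (1000.0 and -1000.0) are screened out.
    new_value = screened_coordinate_update(
        own_value=1.0,
        neighbor_values=[0.9, 1.1, 1000.0, -1000.0, 1.0],
        max_byzantine=1,
        local_gradient_k=0.5,
        step_size=0.1,
    )
    print(new_value)
```

Updating one coordinate at a time keeps the screening step scalar-valued, so sorting and trimming stay cheap even when the model has many dimensions.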
