A simplified convergence theory for Byzantine resilient stochastic gradient descent

Lindon Roberts; Edward Smyth

arXiv:2208.11879·cs.LG·August 26, 2022

TL;DR

This paper provides a simplified convergence analysis for Byzantine resilient stochastic gradient descent, demonstrating convergence to stationary points even with malicious nodes in distributed learning.

Contribution

It offers a simpler convergence theory for the generic Byzantine Resilient SGD method of Blanchard et al. [NeurIPS 2017], covering possibly nonconvex objective functions under flexible assumptions on the stochastic gradients.

Findings

01. Convergence to stationary points under standard assumptions.

02. Applicable to nonconvex optimization problems.

03. Supports flexible stochastic gradient conditions.

Abstract

In distributed learning, a central server trains a model according to updates provided by nodes holding local data samples. In the presence of one or more malicious servers sending incorrect information (a Byzantine adversary), standard algorithms for model training such as stochastic gradient descent (SGD) fail to converge. In this paper, we present a simplified convergence theory for the generic Byzantine Resilient SGD method originally proposed by Blanchard et al. [NeurIPS 2017]. Compared to the existing analysis, we show convergence to a stationary point in expectation under standard assumptions on the (possibly nonconvex) objective function and flexible assumptions on the stochastic gradients.
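
To make the failure mode in the abstract concrete, the following is a minimal sketch, not the paper's method or analysis. It assumes a toy objective f(x) = 0.5 * ||x||^2, eight honest nodes sending noisy gradients, and two Byzantine nodes sending a fixed large vector; all of these choices, and every function name, are illustrative assumptions. The robust aggregator is a Krum-style selection rule in the spirit of Blanchard et al. (pick the vector whose nearest neighbours are closest), contrasted with plain averaging.

```python
import random

def mean_aggregate(grads):
    """Plain averaging: a single arbitrary vector can shift the result."""
    n, d = len(grads), len(grads[0])
    return [sum(g[j] for g in grads) / n for j in range(d)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def krum_aggregate(grads, f):
    """Krum-style rule (assumed here for illustration): return the vector
    with the smallest total squared distance to its n - f - 2 nearest
    neighbours, where f is the assumed number of Byzantine nodes."""
    n = len(grads)
    k = n - f - 2
    if k < 1:
        raise ValueError("need n >= f + 3")
    best_idx, best_score = None, float("inf")
    for i, gi in enumerate(grads):
        dists = sorted(sq_dist(gi, gj) for j, gj in enumerate(grads) if j != i)
        score = sum(dists[:k])
        if score < best_score:
            best_idx, best_score = i, score
    return grads[best_idx]

def run_sgd(aggregate, steps=200, lr=0.1, n_honest=8, n_byz=2, seed=0):
    """Toy distributed SGD on f(x) = 0.5 * ||x||^2, whose gradient is x.
    Honest nodes add Gaussian noise; Byzantine nodes send a fixed vector."""
    rng = random.Random(seed)
    x = [5.0, -3.0]
    for _ in range(steps):
        honest = [[xi + rng.gauss(0.0, 0.1) for xi in x] for _ in range(n_honest)]
        byzantine = [[-100.0, 100.0] for _ in range(n_byz)]
        g = aggregate(honest + byzantine)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x

def norm(v):
    return sum(t * t for t in v) ** 0.5

x_mean = run_sgd(mean_aggregate)
x_krum = run_sgd(lambda grads: krum_aggregate(grads, f=2))
print("averaging, distance from minimiser:", norm(x_mean))
print("Krum-style, distance from minimiser:", norm(x_krum))
```

In this toy setting the averaged iterates settle far from the minimiser at the origin, while the Krum-style iterates stay close to it. The sketch only illustrates why a robust aggregation rule is needed; it says nothing about the assumptions or rates established in the paper.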

Taxonomy

Methods: Stochastic Gradient Descent