From Inexact Gradients to Byzantine Robustness: Acceleration and Optimization under Similarity

Renaud Gaucher; Aymeric Dieuleveut; Hadrien Hendrikx

arXiv:2602.03329·cs.LG·February 4, 2026

Open Access

TL;DR

This paper models Byzantine-robust federated learning as an inexact gradient optimization problem, introduces accelerated algorithms, and demonstrates reduced communication complexity both theoretically and empirically.

Contribution

It formulates Byzantine robustness as an inexact gradient optimization problem and proposes accelerated algorithms leveraging this framework, improving convergence speed and communication efficiency.

Findings

1. GD with robust aggregation achieves optimal asymptotic error.
2. Proposed accelerated schemes significantly reduce communication complexity.
3. Theoretical and empirical results confirm faster convergence and efficiency.

Abstract

Standard federated learning algorithms are vulnerable to adversarial nodes, a.k.a. Byzantine failures. To solve this issue, robust distributed learning algorithms have been developed, which typically replace parameter averaging with robust aggregation. While generic conditions on these aggregations exist to guarantee the convergence of (Stochastic) Gradient Descent (SGD), the analyses remain rather ad hoc. This hinders the development of more complex robust algorithms, such as accelerated ones. In this work, we show that Byzantine-robust distributed optimization can, under standard generic assumptions, be cast as general optimization with inexact gradient oracles (with both additive and multiplicative error terms), an active field of research. This makes it possible, for instance, to show directly that GD on top of standard robust aggregation procedures obtains optimal asymptotic error in the…
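To make the idea of "GD on top of a robust aggregation" concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the algorithm from the paper: the coordinate-wise trimmed mean is one standard robust aggregator chosen here for simplicity, and the quadratic local losses, the names `trimmed_mean` and `run_gd`, and the Byzantine attack (large random vectors) are all hypothetical.

```python
import numpy as np

def trimmed_mean(grads, n_trim):
    """Coordinate-wise trimmed mean: per coordinate, drop the n_trim
    smallest and n_trim largest values, then average the rest."""
    ordered = np.sort(grads, axis=0)
    return ordered[n_trim:grads.shape[0] - n_trim].mean(axis=0)

def run_gd(targets, n_byz, aggregate, step=0.5, n_steps=50, seed=0):
    """Gradient descent where honest worker i holds the (assumed) local loss
    0.5 * ||x - targets[i]||^2 and n_byz extra workers send arbitrary vectors."""
    rng = np.random.default_rng(seed)
    x = np.zeros(targets.shape[1])
    for _ in range(n_steps):
        honest = x - targets  # row i is the gradient of honest worker i
        # Hypothetical attack: large random vectors unrelated to the losses.
        byzantine = rng.normal(100.0, 10.0, size=(n_byz, x.size))
        # The aggregate acts as an inexact gradient of the honest average loss.
        x = x - step * aggregate(np.vstack([honest, byzantine]))
    return x

# Eight honest workers with heterogeneous targets, two Byzantine workers.
targets = np.linspace(-1.0, 1.0, 8)[:, None] * np.ones((1, 3))
x_star = targets.mean(axis=0)  # minimizer of the average honest loss

x_robust = run_gd(targets, 2, lambda g: trimmed_mean(g, 2))
x_naive = run_gd(targets, 2, lambda g: g.mean(axis=0))

robust_err = np.linalg.norm(x_robust - x_star)
naive_err = np.linalg.norm(x_naive - x_star)
print(f"trimmed mean error: {robust_err:.3f}, plain mean error: {naive_err:.3f}")
```

In this toy setup the trimmed-mean iterate settles close to, but not exactly at, the honest minimizer. The trimming also discards some honest gradients, so heterogeneity between workers leaves a residual bias, which is one way to picture the additive error term of an inexact gradient oracle. Plain averaging, by contrast, is pulled far away by the two Byzantine workers.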

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Distributed Sensor Networks and Detection Algorithms