Byzantine Failures Harm the Generalization of Robust Distributed Learning Algorithms More Than Data Poisoning

Thomas Boudou; Batiste Le Bars; Nirupam Gupta; Aurélien Bellet

arXiv:2506.18020 · cs.LG · October 17, 2025


TL;DR

Using an algorithmic stability analysis, this paper shows that Byzantine failures yield strictly worse generalization guarantees for robust distributed learning than data poisoning, establishing a fundamental gap between the two threat models.

Contribution

The paper provides the first theoretical comparison showing that Byzantine failures degrade generalization more than data poisoning in robust distributed learning.

Findings

01

Byzantine failures cause strictly worse generalization degradation than data poisoning.

02

The degradation under data poisoning is additive and proportional to f/(n-f).

03

The degradation under Byzantine failures is at least proportional to the square root of f/(n-2f).
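To get a feel for the size of the gap, the two scaling terms quoted in the findings can be compared numerically. The sketch below is only illustrative: it assumes that n is the total number of workers and f the number of misbehaving ones (the summary above does not define them), it drops all constants and any other problem-dependent factors, and the function names are hypothetical rather than taken from the paper.

```python
import math


def poisoning_term(n: int, f: int) -> float:
    """Scaling term f/(n-f) quoted for data poisoning (constants omitted)."""
    if not 0 <= f < n:
        raise ValueError("need 0 <= f < n")
    return f / (n - f)


def byzantine_term(n: int, f: int) -> float:
    """Scaling term sqrt(f/(n-2f)) quoted for Byzantine failures (constants omitted)."""
    if not 0 <= 2 * f < n:
        raise ValueError("need 0 <= 2f < n")
    return math.sqrt(f / (n - 2 * f))


if __name__ == "__main__":
    n = 100  # assumed total number of workers, chosen only for illustration
    for f in (1, 5, 10, 20, 40):
        p = poisoning_term(n, f)
        b = byzantine_term(n, f)
        print(f"f={f:2d}  f/(n-f)={p:.3f}  sqrt(f/(n-2f))={b:.3f}")
```

For any 1 <= f < n/2, the ratio f/(n-f) is below 1 and below f/(n-2f), so the square-root term is always the larger of the two. This only compares the quoted scaling terms, not the full bounds.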

Abstract

Robust distributed learning algorithms aim to maintain reliable performance despite the presence of misbehaving workers. Such misbehaviors are commonly modeled as Byzantine failures, allowing arbitrarily corrupted communication, or as data poisoning, a weaker form of corruption restricted to local training data. While prior work shows similar optimization guarantees for both models, an important question remains: How do these threat models impact generalization? Empirical evidence suggests a gap, yet it remains unclear whether it is unavoidable or merely an artifact of suboptimal attacks. We show, for the first time, a fundamental gap in generalization guarantees between the two threat models: Byzantine failures yield strictly worse rates than those achievable under data poisoning. Our findings leverage a tight algorithmic stability analysis of robust distributed learning. Specifically,…
