On the Byzantine Fault Tolerance of signSGD with Majority Vote
Emanuele Mengoli, Luzius Moll, Virgilio Strozzi, El-Mahdi El-Mhamdi

TL;DR
This paper analyzes the robustness of signSGD with majority vote in distributed learning, demonstrating its resilience against omniscient and colluding Byzantine adversaries through theoretical bounds and empirical validation.
Contribution
It provides the first comprehensive proof of signSGD's fault tolerance against the strongest adversaries, including omniscient and colluding attackers, with explicit probabilistic bounds.
Findings
signSGD with majority vote maintains convergence even when some workers are omniscient, colluding Byzantine adversaries.
Theoretical bounds quantify the maximum damage from adversaries.
Experimental results on MNIST validate the robustness claims.
Abstract
In distributed learning, sign-based compression algorithms such as signSGD with majority vote provide a lightweight alternative to SGD with an additional advantage: fault tolerance (almost) for free. However, for signSGD with majority vote, this fault tolerance has been shown to cover only the case of weaker adversaries, i.e., ones that are not omniscient or cannot collude to base their attack on common knowledge and strategy. In this work, we close this gap and provide new insights into how signSGD with majority vote can be resilient against omniscient and colluding adversaries, which craft an attack after communicating with other adversaries, thus having better information to perform the most damaging attack based on a common optimal strategy. Our core contribution is in providing a proof that begins by defining the omniscience framework and the strongest possible damage against…
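To make the setting in the abstract concrete, the following is a minimal sketch of one signSGD step with majority vote in which a group of colluding workers votes against the honest majority. It is not the paper's construction: the function names (`majority_vote`, `signsgd_step`, `run_demo`), the toy quadratic objective, the tie-break at zero, and the "negate the honest majority" attack are all assumptions chosen for illustration, and the attack shown is not claimed to be the optimal strategy analyzed in the paper.

```python
import random

def sign(x):
    # Map a real number to +1 or -1 (0 is mapped to +1, an arbitrary tie-break).
    return 1 if x >= 0 else -1

def majority_vote(sign_vectors):
    # Coordinate-wise majority over the sign vectors sent by all workers.
    dim = len(sign_vectors[0])
    return [sign(sum(v[i] for v in sign_vectors)) for i in range(dim)]

def signsgd_step(params, honest_grads, num_byzantine, lr):
    # Each honest worker transmits only the sign of its stochastic gradient.
    honest_signs = [[sign(g) for g in grad] for grad in honest_grads]
    # Illustrative colluding attack (an assumption, not the paper's strategy):
    # the adversaries see the honest votes and all send the opposite sign.
    honest_majority = majority_vote(honest_signs)
    byzantine_signs = [[-s for s in honest_majority] for _ in range(num_byzantine)]
    vote = majority_vote(honest_signs + byzantine_signs)
    # The server moves every coordinate by a fixed step against the voted sign.
    return [p - lr * v for p, v in zip(params, vote)]

def run_demo(num_honest=7, num_byzantine=3, steps=400, lr=0.01, seed=0):
    # Toy objective: sum of (p - t)^2, with Gaussian noise on each gradient.
    rng = random.Random(seed)
    target = [1.0, -2.0, 0.5]
    params = [0.0, 0.0, 0.0]
    for _ in range(steps):
        grads = [
            [2 * (p - t) + rng.gauss(0, 0.1) for p, t in zip(params, target)]
            for _ in range(num_honest)
        ]
        params = signsgd_step(params, grads, num_byzantine, lr)
    return params, target
```

In this toy setup, a Byzantine minority can flip the vote only on coordinates where the honest workers are already nearly split by gradient noise. If the adversaries outnumber the honest workers, the same attack reverses every update.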
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques
Methods: Stochastic Gradient Descent · Balanced Selection
