Byzantine-Robust Optimization under $(L_0, L_1)$-Smoothness
Arman Bolatov, Samuel Horváth, Martin Takáč, Eduard Gorbunov

TL;DR
This paper introduces Byz-NSGDM, a robust distributed optimization algorithm designed to withstand Byzantine attacks under a generalized smoothness condition, with a proven convergence guarantee and experimental validation.
Contribution
The paper presents Byz-NSGDM, a novel Byzantine-robust stochastic gradient method that handles $(L_0,L_1)$-smoothness by combining momentum normalization with robust aggregation enhanced by Nearest Neighbor Mixing (NNM).
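The combination of momentum, NNM pre-aggregation, and a normalized step can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice of coordinate-wise median as the robust aggregator, the placement of momentum on the workers, and the default values of `lr` and `beta` are all assumptions made for illustration.

```python
import math
from statistics import median


def local_momentum(m, g, beta=0.9):
    """Assumed honest-worker update: m <- beta * m + (1 - beta) * g."""
    return [beta * mi + (1.0 - beta) * gi for mi, gi in zip(m, g)]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def mean_vec(vectors):
    """Coordinate-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def nnm(vectors, n_byz):
    """Nearest Neighbor Mixing: replace each vector by the mean of its
    n - n_byz nearest neighbors (the vector itself included)."""
    k = len(vectors) - n_byz
    mixed = []
    for v in vectors:
        nearest = sorted(vectors, key=lambda u: sq_dist(u, v))[:k]
        mixed.append(mean_vec(nearest))
    return mixed


def coordinate_median(vectors):
    """Stand-in robust aggregator (an assumption, not taken from the paper)."""
    return [median(col) for col in zip(*vectors)]


def server_step(x, messages, n_byz, lr=0.1, eps=1e-12):
    """Aggregate the received vectors, then take a step of length ~lr
    along the normalized aggregate."""
    agg = coordinate_median(nnm(messages, n_byz))
    norm = math.sqrt(sum(a * a for a in agg))
    return [xi - lr * ai / (norm + eps) for xi, ai in zip(x, agg)]
```

Because the update direction is normalized, the step length is set by `lr` alone, so a Byzantine worker cannot enlarge the step by sending a huge vector; it can only try to bias the direction, which the mixing and aggregation stages are meant to limit.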
Findings
Achieves an $O(K^{-1/4})$ convergence rate, up to a Byzantine bias floor proportional to the robustness coefficient and gradient heterogeneity.
Effective against various Byzantine attack strategies in experiments.
Robust across different momentum and learning rate settings.
Abstract
We consider distributed optimization under Byzantine attacks in the presence of $(L_0, L_1)$-smoothness, a generalization of standard $L$-smoothness that captures functions with state-dependent gradient Lipschitz constants. We propose Byz-NSGDM, a normalized stochastic gradient descent method with momentum that achieves robustness against Byzantine workers while maintaining convergence guarantees. Our algorithm combines momentum normalization with Byzantine-robust aggregation enhanced by Nearest Neighbor Mixing (NNM) to handle both the challenges posed by $(L_0, L_1)$-smoothness and Byzantine adversaries. We prove that Byz-NSGDM achieves a convergence rate of $O(K^{-1/4})$ up to a Byzantine bias floor proportional to the robustness coefficient and gradient heterogeneity. Experimental validation on heterogeneous MNIST classification, synthetic $(L_0, L_1)$-smooth optimization, and…
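For orientation, one widely used formulation of $(L_0, L_1)$-smoothness for a twice-differentiable function $f$ is given below. The paper may work with a first-order variant, so this is a reference point rather than the authors' exact assumption:

$$\|\nabla^2 f(x)\| \le L_0 + L_1 \|\nabla f(x)\| \quad \text{for all } x.$$

Setting $L_1 = 0$ recovers standard $L$-smoothness with $L = L_0$. When $L_1 > 0$, the permitted curvature grows with the gradient norm, which is the sense in which the gradient Lipschitz constant is state-dependent.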
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning
