Nesterov-Accelerated Robust Federated Learning Over Byzantine Adversaries
Lihan Xu, Yanjie Dong, Gang Wang, Runhao Zeng, Xiaoyi Fan, and Xiping Hu

TL;DR
This paper introduces Byrd-NAFL, a federated learning algorithm that combines Nesterov acceleration with Byzantine-resilient aggregation, achieving fast, robust convergence against malicious adversaries while maintaining communication efficiency.
Contribution
The paper proposes Byrd-NAFL, which integrates Nesterov's momentum with Byzantine-resilient aggregation rules, providing the first finite-time convergence guarantee for such robust federated learning with non-convex, smooth loss functions under a relaxed assumption on the aggregated gradients.
Findings
Byrd-NAFL outperforms existing methods in convergence speed.
It demonstrates high accuracy despite Byzantine attacks.
The method is effective across various attack strategies.
Abstract
We investigate robust federated learning, where a group of workers collaboratively train a shared model under the orchestration of a central server in the presence of Byzantine adversaries capable of arbitrary and potentially malicious behaviors. To simultaneously enhance communication efficiency and robustness against such adversaries, we propose a Byzantine-resilient Nesterov-Accelerated Federated Learning (Byrd-NAFL) algorithm. Byrd-NAFL integrates Nesterov's momentum into the federated learning process alongside Byzantine-resilient aggregation rules to achieve fast convergence that is safeguarded against gradient corruption. We establish a finite-time convergence guarantee for Byrd-NAFL under non-convex and smooth loss functions with a relaxed assumption on the aggregated gradients. Extensive numerical experiments validate the effectiveness of Byrd-NAFL and demonstrate the…
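The general idea of pairing Nesterov's momentum with a robust aggregation rule can be illustrated with a toy sketch. The code below is not the Byrd-NAFL algorithm itself, whose exact update rule is not given in this summary. It assumes a coordinate-wise median as the aggregation rule (one common Byzantine-resilient choice), a simple quadratic loss shared by all honest workers, and a standard look-ahead form of Nesterov's update. All names (`coordinate_median`, `run_sketch`, and so on) and all parameter values are hypothetical.

```python
import random
from statistics import median


def coordinate_median(vectors):
    """Coordinate-wise median of equal-length vectors (assumed aggregation rule)."""
    return [median(coords) for coords in zip(*vectors)]


def honest_gradient(point, target):
    """Gradient of the toy loss 0.5 * ||point - target||^2."""
    return [p - t for p, t in zip(point, target)]


def byzantine_gradient(dim, rng, scale=100.0):
    """Arbitrary vector sent by a Byzantine worker (here: large uniform noise)."""
    return [rng.uniform(-scale, scale) for _ in range(dim)]


def run_sketch(num_honest=7, num_byzantine=3, steps=200, lr=0.1, beta=0.9, seed=0):
    rng = random.Random(seed)
    target = [1.0, -2.0, 3.0]   # minimizer of the toy loss
    dim = len(target)
    x = [0.0] * dim             # model held by the server
    v = [0.0] * dim             # momentum buffer

    for _ in range(steps):
        # Nesterov look-ahead: gradients are evaluated at x + beta * v.
        lookahead = [xi + beta * vi for xi, vi in zip(x, v)]

        # Each worker reports a gradient; Byzantine workers report garbage.
        grads = [honest_gradient(lookahead, target) for _ in range(num_honest)]
        grads += [byzantine_gradient(dim, rng) for _ in range(num_byzantine)]

        # The server aggregates robustly instead of averaging.
        agg = coordinate_median(grads)

        # Momentum update followed by the model update.
        v = [beta * vi - lr * gi for vi, gi in zip(v, agg)]
        x = [xi + vi for xi, vi in zip(x, v)]

    return x, target


if __name__ == "__main__":
    model, target = run_sketch()
    print("final model:", model)
    print("target:     ", target)
```

In this toy setting the honest workers form a strict majority and report identical gradients, so the coordinate-wise median recovers the honest gradient exactly and the iterates approach the target despite the corrupted reports. Real federated workloads have heterogeneous, noisy gradients, where the choice of aggregation rule and the assumptions on it matter considerably more.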
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning
