Federated Stochastic Minimax Optimization under Heavy-Tailed Noises
Xinwen Zhang, Hongchang Gao

TL;DR
This paper introduces two federated minimax optimization algorithms designed to handle heavy-tailed gradient noise, providing the first theoretical convergence guarantees in this challenging setting, validated by extensive experiments.
Contribution
Proposes two novel algorithms, Fed-NSGDA-M and FedMuon-DA, with theoretical convergence guarantees for federated minimax optimization under heavy-tailed noise.
Findings
Both algorithms achieve a convergence rate of O(1/(TNp)^{(s-1)/(2s)}).
These are the first federated minimax algorithms with rigorous guarantees under heavy-tailed noise.
Experimental results confirm the effectiveness of the proposed methods.
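To give a feel for the stated rate, the exponent can be evaluated at a few values of s. This summary does not define T, N, p, or s; the block below assumes, as is common in the heavy-tailed literature, that s in (1, 2] is the order of the bounded noise moment, so s = 2 corresponds to bounded variance. That reading is an assumption, not a statement from the paper.

\[
\text{rate}(s) = O\!\left(\frac{1}{(TNp)^{\frac{s-1}{2s}}}\right),
\qquad
\frac{s-1}{2s}\bigg|_{s=2} = \frac{1}{4},
\qquad
\frac{s-1}{2s}\bigg|_{s=3/2} = \frac{1}{6},
\qquad
\lim_{s \to 1^{+}} \frac{s-1}{2s} = 0 .
\]

Under that assumption, heavier tails (smaller s) shrink the exponent, so the bound decays more slowly in TNp.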
Abstract
Heavy-tailed noise has attracted growing attention in nonconvex stochastic optimization, as numerous empirical studies suggest it offers a more realistic assumption than the standard bounded-variance assumption. In this work, we investigate nonconvex-PL minimax optimization under heavy-tailed gradient noise in federated learning. We propose two novel algorithms: Fed-NSGDA-M, which integrates normalized gradients, and FedMuon-DA, which leverages the Muon optimizer for local updates. Both algorithms are designed to effectively address heavy-tailed noise in federated minimax optimization, under a milder condition. We theoretically establish that both algorithms achieve a convergence rate of O(1/(TNp)^{(s-1)/(2s)}). To the best of our knowledge, these are the first federated minimax optimization algorithms with rigorous theoretical guarantees under heavy-tailed noise. Extensive…
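The abstract mentions normalized gradients for local updates without giving the update rule. The following is a minimal sketch of the general idea (normalized stochastic gradient descent ascent with momentum, plus periodic server averaging), not the authors' Fed-NSGDA-M. Everything in it is an assumption made for illustration: the toy objective, the Student-t noise with infinite variance, the step sizes, the choice to average momentum buffers at the server, and all function names.

```python
import numpy as np

def grad(x, y):
    # Toy objective (assumed): f(x, y) = 0.5*||x||^2 + x.y - 0.5*||y||^2,
    # with its unique saddle point at x = y = 0.
    return x + y, x - y

def heavy_tailed(rng, shape, df=1.5, scale=1.0):
    # Student-t noise with df = 1.5 has a finite mean but infinite variance.
    return scale * rng.standard_t(df, size=shape)

def normalize(v, eps=1e-12):
    # Unit-norm direction, so one extreme noise sample cannot blow up the step.
    return v / (np.linalg.norm(v) + eps)

def fed_normalized_sgda(x0, y0, num_clients=4, rounds=200, local_steps=5,
                        lr_x=0.05, lr_y=0.05, beta=0.1, seed=0):
    rng = np.random.default_rng(seed)
    x, y = x0.copy(), y0.copy()
    mx, my = np.zeros_like(x), np.zeros_like(y)
    for _ in range(rounds):
        xs, ys, mxs, mys = [], [], [], []
        for _ in range(num_clients):
            # Each client starts the round from the current server state.
            xi, yi, mxi, myi = x.copy(), y.copy(), mx.copy(), my.copy()
            for _ in range(local_steps):
                gx, gy = grad(xi, yi)
                gx = gx + heavy_tailed(rng, gx.shape)
                gy = gy + heavy_tailed(rng, gy.shape)
                # Momentum averages the noisy gradients over time.
                mxi = (1 - beta) * mxi + beta * gx
                myi = (1 - beta) * myi + beta * gy
                # Normalized descent on x, normalized ascent on y.
                xi = xi - lr_x * normalize(mxi)
                yi = yi + lr_y * normalize(myi)
            xs.append(xi); ys.append(yi); mxs.append(mxi); mys.append(myi)
        # Server step (assumed): plain averaging of iterates and momentum.
        x, y = np.mean(xs, axis=0), np.mean(ys, axis=0)
        mx, my = np.mean(mxs, axis=0), np.mean(mys, axis=0)
    return x, y

if __name__ == "__main__":
    x0, y0 = 3.0 * np.ones(5), -2.0 * np.ones(5)
    x, y = fed_normalized_sgda(x0, y0)
    print("distance to saddle:", np.linalg.norm(np.concatenate([x, y])))
```

Because every local step has length exactly lr_x or lr_y, a single huge noise sample can only misdirect the iterate, not throw it arbitrarily far. That bounded-step property is the usual motivation for normalization when the noise variance may be infinite.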
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Sparse and Compressive Sensing Techniques
