SAGDA: Achieving $\mathcal{O}(\epsilon^{-2})$ Communication Complexity in Federated Min-Max Learning
Haibo Yang, Zhuqing Liu, Xin Zhang, Jia Liu

TL;DR
This paper introduces SAGDA, a new algorithm for federated min-max learning that significantly reduces communication complexity to $\mathcal{O}(\epsilon^{-2})$, even with non-i.i.d. data and partial client participation.
Contribution
The paper proposes SAGDA, a novel algorithmic framework that achieves $\mathcal{O}(\epsilon^{-2})$ communication complexity for federated min-max learning with infrequent communication, and extends the theoretical understanding of communication complexity in this setting.
Findings
SAGDA attains $\mathcal{O}(\epsilon^{-2})$ communication complexity.
SAGDA achieves linear speedup with respect to the number of clients and local updates.
Standard FSGDA is a special case of SAGDA, inheriting its communication complexity results.
Abstract
To lower the communication complexity of federated min-max learning, a natural approach is to utilize the idea of infrequent communications (through multiple local updates), as in conventional federated learning. However, due to the more complicated inner-outer problem structure in federated min-max learning, theoretical understanding of communication complexity for federated min-max learning with infrequent communications remains very limited in the literature. This is particularly true for settings with non-i.i.d. datasets and partial client participation. To address this challenge, in this paper, we propose a new algorithmic framework called stochastic sampling averaging gradient descent ascent (SAGDA), which i) assembles stochastic gradient estimators from randomly sampled clients as control variates and ii) leverages two learning rates on both server and client sides. We show…
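The two ingredients named in the abstract (control variates assembled from sampled clients, and separate server-side and client-side learning rates) can be illustrated with a small sketch. This is not the paper's algorithm: the correction term below is a generic SCAFFOLD/SVRG-style one chosen for illustration, and the toy objective, the function names (`client_field`, `federated_gda_sketch`, `saddle_point`), and all hyperparameter values are assumptions. The exact SAGDA update rule may differ.

```python
import numpy as np

def client_field(z, a_i, b_i, sigma, rng):
    """Stochastic descent-ascent field of a hypothetical toy client objective
    f_i(x, y) = 0.5*x**2 + x*y - 0.5*y**2 + a_i*x - b_i*y.
    Returns (df/dx, -df/dy) so that z - lr * field descends in x and ascends in y."""
    x, y = z
    grad_x = x + y + a_i
    grad_y = x - y - b_i
    noise = sigma * rng.normal(size=2)  # models stochastic-gradient noise
    return np.array([grad_x, -grad_y]) + noise

def federated_gda_sketch(a, b, rounds=100, local_steps=5, clients_per_round=None,
                         lr_client=0.1, lr_server=1.0, sigma=0.0, seed=0):
    """Illustrative federated GDA with a sampled-client control variate
    and two learning rates (assumed structure, not the paper's exact method)."""
    rng = np.random.default_rng(seed)
    num_clients = len(a)
    m = num_clients if clients_per_round is None else clients_per_round
    z = np.zeros(2)  # server model: z = (x, y)
    for _ in range(rounds):
        # Partial participation: sample m clients without replacement.
        sampled = rng.choice(num_clients, size=m, replace=False)
        # Each sampled client evaluates its field at the server model;
        # their average serves as the control variate for this round.
        anchors = {i: client_field(z, a[i], b[i], sigma, rng) for i in sampled}
        control = np.mean(list(anchors.values()), axis=0)
        deltas = []
        for i in sampled:
            z_i = z.copy()
            for _ in range(local_steps):
                g = client_field(z_i, a[i], b[i], sigma, rng)
                # Client-side step: swap the client's own anchor for the
                # averaged one to reduce drift from non-i.i.d. data.
                z_i = z_i - lr_client * (g - anchors[i] + control)
            deltas.append(z_i - z)
        # Server-side step with its own learning rate.
        z = z + lr_server * np.mean(deltas, axis=0)
    return z

def saddle_point(a, b):
    """Closed-form saddle point of the averaged toy objective."""
    a_bar, b_bar = np.mean(a), np.mean(b)
    return np.array([(b_bar - a_bar) / 2, -(a_bar + b_bar) / 2])
```

With full participation and no gradient noise, the sketch converges to the saddle point of the averaged toy objective. With `clients_per_round` smaller than the number of clients, or `sigma > 0`, the iterates only reach a neighborhood of it, since the control variate is then a noisy estimate of the global field.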
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques
