Stability and Generalization for Distributed SGDA
Miaoxi Zhu, Yan Sun, Li Shen, Bo Du, Dacheng Tao

TL;DR
This paper develops a stability-based framework to analyze the generalization performance of distributed minimax algorithms like Local-SGDA and Local-DSGDA, providing insights into their trade-offs and optimal hyperparameters.
Contribution
The paper introduces a unified stability-analysis framework for distributed minimax algorithms, addressing their generalization performance, which was previously underexplored.
Findings
Theoretical bounds on stability error and generalization gap.
Trade-off analysis between generalization gap and optimization error.
Validation of theoretical results through numerical experiments.
Abstract
Minimax optimization is gaining increasing attention in modern machine learning applications. Driven by large-scale models and massive volumes of data collected from edge devices, as well as the need to preserve client privacy, communication-efficient distributed minimax optimization algorithms have become popular, such as Local Stochastic Gradient Descent Ascent (Local-SGDA) and Local Decentralized SGDA (Local-DSGDA). While most existing research on distributed minimax algorithms focuses on convergence rates, computation complexity, and communication efficiency, their generalization performance remains underexplored, even though generalization ability is a pivotal indicator of a model's overall performance on unseen data. In this paper, we propose the stability-based generalization analytical framework for Distributed-SGDA, which unifies two popular distributed…
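To make the Local-SGDA pattern named in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes a toy scalar objective per client, f_i(x, y) = 0.5*a_i*x^2 + b_i*x*y - 0.5*c_i*y^2, which does not come from the paper; the function name `local_sgda` and all of its parameters are hypothetical. Each client runs a few local descent steps on x and ascent steps on y from the shared iterate, and the server then averages the client iterates.

```python
import numpy as np


def local_sgda(a, b, c, rounds=200, local_steps=5, lr=0.05, noise_std=0.0, seed=0):
    """Toy Local-SGDA sketch (illustrative assumption, not the paper's code).

    Client i holds f_i(x, y) = 0.5*a_i*x**2 + b_i*x*y - 0.5*c_i*y**2,
    minimized over x and maximized over y.
    """
    rng = np.random.default_rng(seed)
    a, b, c = (np.asarray(v, dtype=float) for v in (a, b, c))
    n_clients = a.shape[0]
    x, y = 1.0, 1.0  # shared global iterate

    for _ in range(rounds):
        # Every client starts the round from the current global iterate.
        xs = np.full(n_clients, x)
        ys = np.full(n_clients, y)
        for _ in range(local_steps):
            # Gradients of f_i, with optional Gaussian noise standing in
            # for stochastic (mini-batch) gradients.
            gx = a * xs + b * ys + noise_std * rng.standard_normal(n_clients)
            gy = b * xs - c * ys + noise_std * rng.standard_normal(n_clients)
            xs = xs - lr * gx  # descent step on the min variable
            ys = ys + lr * gy  # ascent step on the max variable
        # Communication round: the server averages the client iterates.
        x, y = xs.mean(), ys.mean()
    return x, y


if __name__ == "__main__":
    x, y = local_sgda([1.0, 2.0, 3.0], [0.5, -0.5, 1.0], [1.0, 2.0, 3.0])
    print(x, y)
```

With the noise switched off, the iterates in this toy problem contract toward the saddle point at (0, 0). Raising `local_steps` reduces how often clients communicate, which is the kind of hyperparameter whose effect on generalization the paper sets out to analyze.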
Taxonomy
Topics: Advanced Control Systems Optimization
Methods: Softmax · Attention Is All You Need
