Stability and Generalization for Decentralized Markov SGD

Jiahuan Wang; Ziqing Wen; Ping Luo; Dongsheng Li; Tao Sun

arXiv:2605.01701 · cs.LG · May 5, 2026

TL;DR

This paper analyzes how Markov chain sampling and decentralized communication affect the stability and generalization of stochastic gradient methods (SGD and SGDA), and derives non-asymptotic generalization bounds for both.

Contribution

The paper uses a stability-based framework to characterize how Markovian dependence and decentralized communication jointly influence generalization in decentralized SGD and SGDA.

Findings

01. Established non-asymptotic generalization bounds for decentralized Markov SGD and SGDA.

02. Extended existing Markov stochastic gradient results to decentralized and minimax settings.

03. Analyzed the effects of network topology and Markov chain mixing on learning stability.
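
As a rough illustration of the two quantities named in the last finding, the sketch below computes a spectral gap for a gossip (mixing) matrix and for a Markov transition matrix. The ring topology with uniform 1/3 weights, the helper names, and the use of the second-largest eigenvalue modulus as the summary statistic are all assumptions made for this example; the listing does not say which topology or mixing measure the paper's bounds are stated in.

```python
import numpy as np


def ring_gossip_matrix(n):
    """Hypothetical helper: doubly stochastic mixing matrix for a ring of n >= 3 nodes.

    Each node averages itself and its two ring neighbours with weight 1/3.
    """
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] += 1.0 / 3.0
        W[i, (i - 1) % n] += 1.0 / 3.0
        W[i, (i + 1) % n] += 1.0 / 3.0
    return W


def second_largest_modulus(P):
    """Second-largest eigenvalue modulus of a stochastic matrix P."""
    moduli = np.sort(np.abs(np.linalg.eigvals(P)))[::-1]
    return moduli[1]


def spectral_gap(P):
    """1 minus the second-largest eigenvalue modulus; larger means faster mixing."""
    return 1.0 - second_largest_modulus(P)


def two_state_chain(a, b):
    """Two-state Markov chain: leave state 0 with prob. a, leave state 1 with prob. b."""
    return np.array([[1.0 - a, a], [b, 1.0 - b]])
```

Under these assumptions, a larger ring has a smaller gap (information spreads more slowly between nodes), and a "sticky" sampling chain with small `a` and `b` has a small gap (consecutive samples are strongly dependent). For instance, `spectral_gap(ring_gossip_matrix(16))` is smaller than `spectral_gap(ring_gossip_matrix(8))`.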

Abstract

Stochastic gradient methods are central to large-scale learning, yet their generalization theory typically relies on independent sampling assumptions. In many practical applications, data are generated by Markov chains and learning is performed in a decentralized manner, which introduces significant analytical challenges. In this work, we investigate the stability and generalization of decentralized stochastic gradient descent (SGD) and stochastic gradient descent ascent (SGDA) under Markov chain sampling. Leveraging a stability-based framework, we characterize how Markovian dependence and decentralized communication jointly influence generalization behavior. Our analysis captures the effects of network topology, Markov chain mixing properties, and primal-dual dynamics. We establish non-asymptotic generalization bounds for both algorithms, extending existing results on Markov stochastic…
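
To make the setting in the abstract concrete, here is a minimal sketch of decentralized SGD with Markov chain sampling on a toy least-squares problem. It is not taken from the paper: the update rule (gossip averaging followed by a local gradient step), the ring topology, the lazy random walk over local sample indices, the squared loss, and every function and parameter name are assumptions chosen for illustration.

```python
import numpy as np


def ring_gossip_matrix(n):
    """Hypothetical helper: doubly stochastic ring mixing matrix (n >= 3), weights 1/3."""
    eye = np.eye(n)
    return (eye + np.roll(eye, 1, axis=1) + np.roll(eye, -1, axis=1)) / 3.0


def lazy_cycle_chain(m):
    """Hypothetical sampling chain: stay with prob. 1/2, else move to an adjacent index."""
    eye = np.eye(m)
    return 0.5 * eye + 0.25 * np.roll(eye, 1, axis=1) + 0.25 * np.roll(eye, -1, axis=1)


def decentralized_markov_sgd(X, y, W, P, steps=4000, lr=0.02, seed=0):
    """Sketch of decentralized SGD where each node samples its data along a Markov chain.

    X: (n_nodes, m, d) local features, y: (n_nodes, m) local targets.
    W: (n_nodes, n_nodes) doubly stochastic gossip matrix.
    P: (m, m) transition matrix over local sample indices.
    Returns the (n_nodes, d) array of local parameter vectors.
    """
    rng = np.random.default_rng(seed)
    n_nodes, m, d = X.shape
    theta = np.zeros((n_nodes, d))
    state = rng.integers(m, size=n_nodes)  # current sample index at each node
    nodes = np.arange(n_nodes)
    for _ in range(steps):
        xs = X[nodes, state]  # (n_nodes, d): one sample per node
        ys = y[nodes, state]
        residual = np.einsum("nd,nd->n", xs, theta) - ys
        grads = residual[:, None] * xs  # gradient of 0.5 * (x @ theta - y) ** 2
        # Assumed update: average with neighbours, then take a local gradient step.
        theta = W @ theta - lr * grads
        # Advance each node's sampling chain by one transition.
        state = np.array([rng.choice(m, p=P[s]) for s in state])
    return theta
```

One way to try it is to generate noise-free data from a fixed weight vector, for example `X` of shape `(4, 20, 3)` and `y = X @ w_true`, and call `decentralized_markov_sgd(X, y, ring_gossip_matrix(4), lazy_cycle_chain(20))`. Because consecutive samples at a node come from neighbouring states of the chain rather than independent draws, this toy setup exhibits the kind of dependence that the stability analysis described above has to account for.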
