High-Probability Bounds for SGD under the Polyak-Lojasiewicz Condition with Markovian Noise
Avik Kar, Siddharth Chandak, Rahul Singh, Eric Moulines, Shalabh Bhatnagar, Nicholas Bambos

TL;DR
This paper establishes the first uniform-in-time high-probability bound for stochastic gradient descent (SGD) under the Polyak-Lojasiewicz (PL) condition with Markovian noise, extending finite-time guarantees to settings that are common in machine learning.
Contribution
The paper introduces an analysis framework for SGD with Markovian noise under the PL condition that allows the noise magnitude to grow with the function value, and it demonstrates the framework's applicability to practical problems.
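For reference, the standard form of the PL condition and a generic SGD iteration with a two-part noise term can be written as below. The symbols \mu, \alpha_k, M_{k+1}, and \xi_{k+1} are illustrative notation assumed here, not necessarily the notation used in the paper.

```latex
% PL condition with constant \mu > 0, where f^* = \inf_x f(x):
\frac{1}{2}\,\|\nabla f(x)\|^2 \;\ge\; \mu\,\bigl(f(x) - f^*\bigr)
\quad \text{for all } x.

% Generic SGD step with step size \alpha_k (assumed notation):
%   M_{k+1}   : martingale difference noise component
%   \xi_{k+1} : Markovian noise component
x_{k+1} = x_k - \alpha_k \bigl(\nabla f(x_k) + M_{k+1} + \xi_{k+1}\bigr).
```

The PL condition does not require convexity; it only asks that the squared gradient norm dominate the suboptimality gap.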
Findings
Uniform-in-time high-probability bounds for SGD under the PL condition with Markovian noise.
Expected suboptimality decays at a rate of 1/k.
Applicable to decentralized linear regression, privacy-preserving learning, and online system identification.
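The setting in these findings can be illustrated with a minimal sketch: SGD on a least-squares objective (which satisfies the PL condition when the data matrix has full column rank), where the sampled data row follows a lazy random walk on a ring of nodes, so consecutive gradient samples are Markovian rather than independent. This is a toy illustration under assumed choices, not the paper's algorithm or experiments; the function names, the ring topology, and the step-size schedule c / (k + k0) are all hypothetical.

```python
import numpy as np


def markov_sgd_least_squares(A, b, num_steps=20000, c=5.0, k0=50.0,
                             stay_prob=0.5, seed=0):
    """SGD on f(x) = (1/2n) * sum_i (a_i^T x - b_i)^2.

    The row index i is not sampled i.i.d.; it follows a lazy random walk
    on a ring of n nodes (an assumed toy model of Markovian sampling).
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    i = 0  # current node of the Markov chain
    for k in range(num_steps):
        # Stochastic gradient from the single row held by the current node.
        grad = (A[i] @ x - b[i]) * A[i]
        # Diminishing step size of order 1/k (assumed schedule).
        x = x - (c / (k + k0)) * grad
        # Lazy random walk: stay put, or move to a ring neighbour.
        if rng.random() >= stay_prob:
            i = (i + 1) % n if rng.random() < 0.5 else (i - 1) % n
    return x


def suboptimality(A, b, x):
    """Return f(x) - f(x*) for the least-squares objective above."""
    x_star, *_ = np.linalg.lstsq(A, b, rcond=None)

    def f(z):
        return 0.5 * np.mean((A @ z - b) ** 2)

    return f(x) - f(x_star)


# Small synthetic problem: 20 nodes, 3 features, unit-norm rows.
data_rng = np.random.default_rng(1)
A = data_rng.standard_normal((20, 3))
A /= np.linalg.norm(A, axis=1, keepdims=True)
x_true = np.array([1.0, -2.0, 0.5])
b = A @ x_true + 0.1 * data_rng.standard_normal(20)

sub_init = suboptimality(A, b, np.zeros(3))
x_hat = markov_sgd_least_squares(A, b)
sub_final = suboptimality(A, b, x_hat)
print(f"initial gap: {sub_init:.4f}, final gap: {sub_final:.6f}")
```

Row normalization keeps the initial step stable, and the random walk on a ring mimics a token passed between neighbouring nodes in decentralized regression. A single run like this only illustrates the setting; it says nothing about the high-probability guarantee itself.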
Abstract
We present the first uniform-in-time high-probability bound for SGD under the PL condition, where the gradient noise contains both Markovian and martingale difference components. This significantly broadens the scope of finite-time guarantees, as the PL condition arises in many machine learning and deep learning models while Markovian noise naturally arises in decentralized optimization and online system identification problems. We further allow the magnitude of noise to grow with the function value, enabling the analysis of many practical sampling strategies. In addition to the high-probability guarantee, we establish a matching decay rate for the expected suboptimality. Our proof technique relies on the Poisson equation to handle the Markovian noise and a probabilistic induction argument to address the lack of almost-sure bounds on the objective. Finally, we demonstrate the…
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Advanced Bandit Algorithms Research
