High-Probability Bounds for SGD under the Polyak-Lojasiewicz Condition with Markovian Noise

Avik Kar; Siddharth Chandak; Rahul Singh; Eric Moulines; Shalabh Bhatnagar; Nicholas Bambos

arXiv:2603.14514·cs.LG·March 17, 2026

Open Access

TL;DR

This paper establishes the first uniform-in-time high-probability bounds for stochastic gradient descent (SGD) under the Polyak-Lojasiewicz (PL) condition with Markovian noise, extending finite-time guarantees to settings such as decentralized optimization and online system identification.

Contribution

It introduces a novel analysis framework for SGD with Markovian noise under the PL condition that allows the noise magnitude to grow with the function value, and demonstrates its applicability to practical problems.
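
For reference, the PL condition is commonly stated as below. This is the standard textbook form, not copied from the paper; the symbols $f$ (objective), $f^*$ (its minimum value), and $\mu > 0$ (PL constant) are assumed here for illustration, and the paper's exact assumptions may differ.

$$
\frac{1}{2}\,\|\nabla f(x)\|^2 \;\ge\; \mu\,\bigl(f(x) - f^*\bigr) \quad \text{for all } x.
$$

Every $\mu$-strongly convex function satisfies this inequality, but the condition does not require convexity, which is one reason it covers a broader class of models.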

Findings

01. High-probability bounds for SGD under the PL condition with Markovian noise.

02. Expected suboptimality decays at a rate of 1/k.

03. Applicable to decentralized linear regression, privacy-preserving learning, and online system identification.
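
The setting in the findings above can be illustrated with a small simulation. The following is a minimal sketch, not the paper's algorithm or experiments: it runs SGD on a toy least-squares problem (strongly convex, hence PL) where the sample index follows a Markov chain instead of being drawn independently. The data `A`, `b`, the transition matrix `P`, and the step size `2 / (k + 10)` are all hypothetical choices made for illustration.

```python
import numpy as np


def run_markov_sgd(num_steps=5000, seed=0):
    """SGD on least squares with sample indices drawn along a Markov chain.

    Returns the suboptimality f(x_k) - f* for k = 0, ..., num_steps.
    All problem data below are hypothetical and chosen for illustration.
    """
    rng = np.random.default_rng(seed)

    # Four data points in two dimensions, with slightly noisy targets.
    A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]])
    b = A @ np.array([2.0, -1.0]) + 0.1 * rng.standard_normal(4)

    # Lazy random walk on a ring of four states. The matrix is doubly
    # stochastic, so its stationary distribution is uniform and the
    # stationary average of the per-sample gradients is the full gradient.
    P = np.array([
        [0.50, 0.25, 0.00, 0.25],
        [0.25, 0.50, 0.25, 0.00],
        [0.00, 0.25, 0.50, 0.25],
        [0.25, 0.00, 0.25, 0.50],
    ])

    def f(x):
        # Objective: half the mean squared residual over all data points.
        residual = A @ x - b
        return 0.5 * np.mean(residual ** 2)

    # Reference minimizer and minimum value, used only to measure progress.
    x_opt, *_ = np.linalg.lstsq(A, b, rcond=None)
    f_opt = f(x_opt)

    x = np.zeros(2)
    state = 0
    gaps = np.empty(num_steps + 1)
    gaps[0] = f(x) - f_opt

    for k in range(num_steps):
        # Gradient of the single-sample loss at the current Markov state.
        grad = (A[state] @ x - b[state]) * A[state]
        # Diminishing step size of order 1/k (assumed schedule).
        x = x - 2.0 / (k + 10) * grad
        # Advance the Markov chain: the next sample depends on the current one.
        state = rng.choice(4, p=P[state])
        gaps[k + 1] = f(x) - f_opt

    return gaps


gaps = run_markov_sgd()
print("initial suboptimality:", gaps[0])
print("final suboptimality:  ", gaps[-1])
```

Because consecutive samples are correlated, each stochastic gradient is biased conditional on the past, which is the difficulty that Markovian-noise analyses have to handle. Plotting `gaps` against `k` on a log-log scale is one way to compare the observed decay with a 1/k reference line.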

Abstract

We present the first uniform-in-time high-probability bound for SGD under the PL condition, where the gradient noise contains both Markovian and martingale difference components. This significantly broadens the scope of finite-time guarantees, as the PL condition arises in many machine learning and deep learning models while Markovian noise naturally arises in decentralized optimization and online system identification problems. We further allow the magnitude of noise to grow with the function value, enabling the analysis of many practical sampling strategies. In addition to the high-probability guarantee, we establish a matching $1/k$ decay rate for the expected suboptimality. Our proof technique relies on the Poisson equation to handle the Markovian noise and a probabilistic induction argument to address the lack of almost-sure bounds on the objective. Finally, we demonstrate the…

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Privacy-Preserving Technologies in Data · Advanced Bandit Algorithms Research