Shuffling the Stochastic Mirror Descent via Dual Lipschitz Continuity and Kernel Conditioning

Junwen Qiu; Leilei Mei; Junyu Zhang

arXiv:2603.16042·math.OC·March 18, 2026

Open Access

TL;DR

This paper introduces dual kernel conditioning (DKC) to extend the convergence analysis of stochastic mirror descent to problems without global Lipschitz smoothness, yielding new complexity bounds and convergence guarantees.

Contribution

The paper proposes the dual kernel conditioning (DKC) regularity condition, which enables the analysis of stochastic mirror descent without primal Lipschitz smoothness, and establishes convergence results in this broader setting.

Findings

01. DKC is satisfied by many commonly used kernels.

02. DKC is closed under affine and conic operations.

03. The paper gives the first complexity bounds for random reshuffling mirror descent in non-Lipschitz settings.
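
For readers unfamiliar with the method named in the last finding, the following is a minimal sketch of random reshuffling mirror descent. It is not the paper's algorithm, step-size rule, or problem class. It assumes the negative-entropy kernel on the probability simplex (where the mirror step becomes a multiplicative update) and a toy least-squares finite sum. The names `rr_mirror_descent`, `make_grad`, and `objective`, and the data `A` and `x_star`, are hypothetical and chosen only for illustration.

```python
import math
import random


def rr_mirror_descent(grads, x0, step, epochs, seed=0):
    """Random-reshuffling mirror descent on the probability simplex.

    grads[i](x) returns the gradient of the i-th component f_i at x.
    Assumed kernel: h(x) = sum_j x_j * log(x_j) (negative entropy), for
    which the mirror step is the exponentiated-gradient update.
    """
    rng = random.Random(seed)
    x = list(x0)
    n = len(grads)
    for _ in range(epochs):
        order = list(range(n))
        rng.shuffle(order)  # draw a fresh permutation for this epoch
        for i in order:     # one pass: every component is used exactly once
            g = grads[i](x)
            w = [xj * math.exp(-step * gj) for xj, gj in zip(x, g)]
            total = sum(w)
            x = [wj / total for wj in w]  # normalize back onto the simplex
    return x


# Toy finite sum (illustrative data): f_i(x) = 0.5 * (a_i . x - b_i)^2,
# with b_i generated from a point x_star inside the simplex.
A = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0], [1.0, 1.0, 0.0], [2.0, 0.0, 1.0]]
x_star = [0.7, 0.2, 0.1]
b = [sum(aj * sj for aj, sj in zip(a, x_star)) for a in A]


def make_grad(a, bi):
    def grad(x):
        r = sum(aj * xj for aj, xj in zip(a, x)) - bi  # residual a.x - b
        return [r * aj for aj in a]
    return grad


def objective(x):
    terms = [
        0.5 * (sum(aj * xj for aj, xj in zip(a, x)) - bi) ** 2
        for a, bi in zip(A, b)
    ]
    return sum(terms) / len(terms)


grads = [make_grad(a, bi) for a, bi in zip(A, b)]
x0 = [1.0 / 3.0] * 3
x_final = rr_mirror_descent(grads, x0, step=0.1, epochs=200)
```

The only difference from plain stochastic mirror descent in this sketch is the sampling scheme: indices are drawn without replacement through a per-epoch permutation rather than independently at each step.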

Abstract

The global Lipschitz smoothness condition underlies most convergence and complexity analyses via two key consequences: the descent lemma and the gradient Lipschitz continuity. How to study the performance of optimization algorithms in the absence of Lipschitz smoothness remains an active area. The relative smoothness framework from Bauschke-Bolte-Teboulle (2017) and Lu-Freund-Nesterov (2018) provides an extended descent lemma, ensuring convergence of Bregman-based proximal gradient methods and their vanilla stochastic counterparts. However, many widely used techniques (e.g., momentum schemes, random reshuffling, and variance reduction) additionally require the Lipschitz-type bound for gradient deviations, leaving their analysis under relative smoothness an open area. To resolve this issue, we introduce the dual kernel conditioning (DKC) regularity condition to regulate the local…
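
As a reference for the terms used in the abstract, the standard forms of these inequalities can be written as follows. This is a sketch in conventional notation, not taken from the paper: $f$ is the objective, $L$ a smoothness constant, $h$ a convex kernel, and $D_h$ its Bregman divergence; the paper's own notation and constants may differ.

```latex
% Two consequences of global L-Lipschitz smoothness
% Descent lemma:
f(y) \le f(x) + \langle \nabla f(x),\, y - x \rangle + \frac{L}{2}\,\|y - x\|^2
% Gradient Lipschitz continuity:
\|\nabla f(x) - \nabla f(y)\| \le L\,\|x - y\|

% Extended descent lemma under L-smoothness relative to a kernel h
f(y) \le f(x) + \langle \nabla f(x),\, y - x \rangle + L\, D_h(y, x),
\qquad
D_h(y, x) = h(y) - h(x) - \langle \nabla h(x),\, y - x \rangle
```

Choosing $h(x) = \tfrac{1}{2}\|x\|^2$ gives $D_h(y, x) = \tfrac{1}{2}\|y - x\|^2$, so the extended descent lemma reduces to the classical one. Relative smoothness by itself supplies no counterpart to the second inequality, which is the gap the abstract describes.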

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Markov Chains and Monte Carlo Methods