Sharp asymptotic theory for Q-learning with LDTZ learning rate and its generalization

Soham Bonnerjee, Zhipeng Lou, and Wei Biao Wu

arXiv:2604.04218·stat.ML·April 7, 2026

TL;DR

This paper develops a theoretical framework for Q-learning with a class of learning rates that decay to zero, including the recently proposed LD2Z schedule, and provides sharp non-asymptotic error bounds, a central limit theorem (CLT), and Gaussian approximation results.

Contribution

It introduces a unified analysis of power-law decay-to-zero (PD2Z-$\nu$) learning-rate schedules in Q-learning, establishing error bounds and limit theorems that support statistical inference.

Findings

01

Sharp non-asymptotic error bounds for Q-learning with PD2Z-$\nu$.

02

Central limit theorem for tail Polyak-Ruppert averaging estimator.

03

Time-uniform Gaussian approximation for Q-learning iterates.
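
To make the objects in these findings concrete, here is a minimal sketch of tabular Q-learning run with the PD2Z-$\nu$ step size $\eta_{t,n} = \eta (1 - t/n)^{\nu}$ from the abstract, returning both the last iterate and an average over the final portion of the trajectory. Several choices are assumptions made for illustration and are not taken from the paper: the synchronous (generative-model) update of every state-action pair at each step, the zero-based time index, the `tail_frac` parameter that defines the averaging window, and the function name `q_learning_pd2z` itself.

```python
import random


def q_learning_pd2z(P, R, gamma, n, eta, nu=1.0, tail_frac=0.5, seed=0):
    """Synchronous tabular Q-learning with PD2Z-nu step sizes (illustrative sketch).

    P[s][a] is a list of next-state probabilities, R[s][a] a deterministic reward.
    Returns (last iterate, average of the iterates in the final tail_frac of steps).
    The tail window is a hypothetical choice, not the paper's definition.
    """
    rng = random.Random(seed)
    S, A = len(R), len(R[0])
    states = list(range(S))
    Q = [[0.0] * A for _ in range(S)]
    tail_sum = [[0.0] * A for _ in range(S)]
    tail_start = int((1.0 - tail_frac) * n)

    for t in range(n):
        # PD2Z-nu step size: eta * (1 - t/n)^nu, which decays to zero as t -> n.
        step = eta * (1.0 - t / n) ** nu
        # Greedy state values from the current iterate, held fixed during this sweep.
        V = [max(row) for row in Q]
        for s in range(S):
            for a in range(A):
                # Draw one next state from P[s][a] (generative-model assumption).
                s_next = rng.choices(states, weights=P[s][a])[0]
                target = R[s][a] + gamma * V[s_next]
                Q[s][a] += step * (target - Q[s][a])
        if t >= tail_start:
            for s in range(S):
                for a in range(A):
                    tail_sum[s][a] += Q[s][a]

    count = n - tail_start
    Q_tail = [[x / count for x in row] for row in tail_sum]
    return Q, Q_tail
```

Calling `q_learning_pd2z(P, R, gamma=0.5, n=2000, eta=0.5)` on a small MDP gives a pair of $S \times A$ tables. Comparing the two against a value-iteration solution is one simple way to see how the last iterate and the averaged estimator behave under different values of `nu`.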

Abstract

Despite the sustained popularity of Q-learning as a practical tool for policy determination, a majority of relevant theoretical literature deals with either constant ($\eta_{t} \equiv \eta$) or polynomially decaying ($\eta_{t} = \eta t^{-\alpha}$) learning schedules. However, it is well known that these choices suffer from either persistent bias or prohibitively slow convergence. In contrast, the recently proposed linear decay to zero (LD2Z: $\eta_{t,n} = \eta (1 - t/n)$) schedule has shown appreciable empirical performance, but its theoretical and statistical properties remain largely unexplored, especially in the Q-learning setting. We address this gap in the literature by first considering a general class of power-law decay to zero (PD2Z-$\nu$: $\eta_{t,n} = \eta (1 - t/n)^{\nu}$). Proceeding step-by-step, we present a sharp non-asymptotic error bound for Q-learning with…
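
The schedules named in the abstract can be written down directly. The sketch below implements the constant, polynomially decaying, LD2Z, and PD2Z-$\nu$ formulas as stated. The function names, the argument order, and the convention that `t` runs from 1 to `n` are assumptions made here for illustration.

```python
def constant(t, n, eta):
    """Constant schedule: eta_t = eta for every step."""
    return eta


def polynomial(t, n, eta, alpha):
    """Polynomial decay: eta_t = eta * t^(-alpha); requires t >= 1."""
    return eta * t ** (-alpha)


def pd2z(t, n, eta, nu=1.0):
    """Power-law decay to zero: eta_{t,n} = eta * (1 - t/n)^nu."""
    return eta * (1.0 - t / n) ** nu


def ld2z(t, n, eta):
    """Linear decay to zero: the nu = 1 member of the PD2Z family."""
    return pd2z(t, n, eta, nu=1.0)
```

Unlike the first two schedules, the decay-to-zero schedules depend on the horizon `n`, so the total number of steps has to be fixed before training starts. With `nu = 1` the PD2Z formula reduces to LD2Z, and larger `nu` pushes the step size toward zero earlier in the run.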
