On Gaussian approximation for entropy-regularized Q-learning with function approximation

Artemy Rubtsov; Rahul Singh; Eric Moulines; Alexey Naumov; and Sergey Samsonov

arXiv:2605.17678 · stat.ML · May 19, 2026


TL;DR

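This paper bounds the rate of convergence in the high-dimensional central limit theorem for Polyak-Ruppert averaged iterates of entropy-regularized asynchronous Q-learning with linear function approximation. The Gaussian approximation error, measured in the convex distance, is of order $n^{-1/4}$ up to polylogarithmic factors, where $n$ is the number of samples.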

Contribution

The paper derives the first high-dimensional Gaussian approximation bounds for Polyak-Ruppert averaged iterates of entropy-regularized asynchronous Q-learning with linear function approximation.

Findings

01

Gaussian approximation bound in the convex distance with rate $n^{-1/4}$, up to polylogarithmic factors in $n$

02

High-order moment bounds for the last iterate of the algorithm

03

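Analysis assumes that the observed triples $(s_{k}, a_{k}, s_{k+1})_{k \geq 0}$ form a uniformly geometrically ergodic Markov chain and that the projected soft Bellman equation satisfies suitable regularity conditions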
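
For orientation, the first finding can be written schematically as below. The notation is an assumption made here for illustration and is not taken from the paper: $\bar{\theta}_n$ denotes the Polyak-Ruppert average of the first $n$ iterates, $\theta^\star$ the solution of the projected soft Bellman equation, $\Sigma_\infty$ an unspecified limiting covariance matrix, and $\mathrm{Conv}(\mathbb{R}^d)$ the collection of convex subsets of $\mathbb{R}^d$. The supremum is the standard definition of the convex distance.

$$
\sup_{B \in \mathrm{Conv}(\mathbb{R}^d)} \left| \mathbb{P}\left( \sqrt{n}\,(\bar{\theta}_n - \theta^\star) \in B \right) - \mathbb{P}\left( \Sigma_\infty^{1/2} \eta \in B \right) \right| \lesssim n^{-1/4} \cdot \mathrm{polylog}(n), \qquad \eta \sim \mathcal{N}(0, I_d).
$$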

Abstract

In this paper, we derive rates of convergence in the high-dimensional central limit theorem for Polyak-Ruppert averaged iterates generated by entropy-regularized asynchronous Q-learning with linear function approximation and a polynomial stepsize $k^{-\omega}$, $\omega \in (1/2, 1)$. Assuming that the sequence of observed triples $(s_{k}, a_{k}, s_{k+1})_{k \geq 0}$ forms a uniformly geometrically ergodic Markov chain, and under suitable regularity conditions for the projected soft Bellman equation, we establish a Gaussian approximation bound in the convex distance with rate of order $n^{-1/4}$, up to polylogarithmic factors in $n$, where $n$ is the number of samples used by the algorithm. To obtain this result, we combine a linearization of the soft Bellman recursion with a Gaussian approximation for the leading martingale term. Finally, we derive high-order moment bounds for the algorithm's…
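
The kind of procedure the abstract describes can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact algorithm: the toy MDP, the uniform behaviour policy, the one-hot features (a special case of linear function approximation), the log-sum-exp form of the soft value, and the constants `c0`, `gamma`, and `temperature` are all hypothetical choices made here. Only the stepsize $k^{-\omega}$ with $\omega \in (1/2, 1)$, the asynchronous single-trajectory updates, and the Polyak-Ruppert averaging come from the abstract.

```python
import math
import random


def soft_value(q_values, temperature):
    """Soft value temperature * log(sum_a exp(q_a / temperature)), computed stably."""
    m = max(q_values)
    return m + temperature * math.log(
        sum(math.exp((q - m) / temperature) for q in q_values)
    )


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def soft_q_learning(n_steps, omega=0.75, c0=1.0, gamma=0.9, temperature=0.5, seed=0):
    """Run soft Q-learning on one trajectory; return (last iterate, averaged iterate)."""
    rng = random.Random(seed)
    n_states, n_actions = 5, 3
    dim = n_states * n_actions

    # Hypothetical toy MDP: random transition weights and rewards in [0, 1).
    weights, reward = {}, {}
    for s in range(n_states):
        for a in range(n_actions):
            weights[s, a] = [rng.random() + 1e-3 for _ in range(n_states)]
            reward[s, a] = rng.random()

    def phi(s, a):
        # One-hot feature vector (assumption); Q_theta(s, a) = <phi(s, a), theta>.
        v = [0.0] * dim
        v[s * n_actions + a] = 1.0
        return v

    theta = [0.0] * dim
    theta_bar = [0.0] * dim
    s = 0
    for k in range(1, n_steps + 1):
        a = rng.randrange(n_actions)  # uniform behaviour policy (assumption)
        s_next = rng.choices(range(n_states), weights=weights[s, a])[0]

        # Temporal-difference error with the soft value of the next state.
        q_next = [dot(phi(s_next, b), theta) for b in range(n_actions)]
        f = phi(s, a)
        td = reward[s, a] + gamma * soft_value(q_next, temperature) - dot(f, theta)

        step = c0 * k ** (-omega)  # polynomial stepsize k^{-omega}
        theta = [t + step * td * x for t, x in zip(theta, f)]

        # Polyak-Ruppert average of theta_1, ..., theta_k, updated in place.
        theta_bar = [b + (t - b) / k for b, t in zip(theta_bar, theta)]
        s = s_next
    return theta, theta_bar
```

Calling `soft_q_learning(2000)` returns the last iterate and its running average. The Gaussian approximation result in the abstract concerns the averaged iterate, which is why the sketch tracks `theta_bar` separately from `theta`.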
