Discretization error from regularized Reinforcement Learning to continuous-time stochastic control

Huyên Pham; Yuming Paul Zhang; Yuhua Zhu

arXiv:2604.21179·math.OC·April 24, 2026

TL;DR

This paper rigorously analyzes the discretization error in regularized reinforcement learning when approximating continuous-time stochastic control, providing convergence rates and stability insights.

Contribution

It establishes a quantitative connection between regularized discrete-time RL algorithms and continuous-time stochastic control, with explicit convergence rates for the discretization error.

Findings

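01 Derives quantitative convergence rates for the discretization error.

02 Uses these rates to give a rigorous foundation for understanding the stability and implementation of exploratory RL policies in stochastic continuous-time environments.

03 Quantifies the gap between the optimal policy induced by the regularized discrete-time Bellman equation and the true optimal feedback control of the continuous-time problem.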

Abstract

This paper establishes a rigorous connection between regularized discrete-time reinforcement learning (RL) and continuous-time stochastic optimal control. Specifically, classical RL algorithms typically solve a regularized discrete-time Bellman equation. We study the discretization error, namely, the gap between the optimal policy induced by the regularized discrete-time Bellman equation and the true optimal feedback control of the underlying continuous-time stochastic control problem. By deriving quantitative convergence rates for this gap, we provide a rigorous foundation for understanding the stability and implementation of exploratory RL policies in stochastic continuous-time environments.
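
For orientation, here is a sketch of what a regularized discrete-time Bellman equation commonly looks like when the regularizer is an entropy term. The abstract does not state the paper's exact setup, so everything below is an assumption made for illustration: the controlled diffusion with drift $b$ and volatility $\sigma$, the running reward $r$, the discount rate $\beta$, the time step $h$, the temperature $\lambda$, the action set $A$, and the infinite-horizon discounted criterion. In particular, how $\lambda$ is scaled relative to $h$ is a modeling choice that this sketch leaves open.

```latex
% Assumed continuous-time problem (illustrative, not taken from the paper):
%   dX_t = b(X_t, a_t)\,dt + \sigma(X_t, a_t)\,dW_t
\[
  V(x) \;=\; \sup_{a_\cdot}\;
  \mathbb{E}\!\left[\int_0^\infty e^{-\beta t}\, r(X_t, a_t)\,dt \;\middle|\; X_0 = x\right].
\]

% Assumed one-step action value with time step h, where X_h^{x,a} is the state
% reached from x after holding the action a over [0, h]:
\[
  Q_h^\lambda(x,a) \;=\; r(x,a)\,h \;+\; e^{-\beta h}\,
  \mathbb{E}\!\left[V_h^\lambda\!\big(X_h^{x,a}\big)\right].
\]

% Entropy-regularized Bellman equation: maximizing
%   \int_A \pi(a)\,Q_h^\lambda(x,a)\,da \;-\; \lambda \int_A \pi(a)\log\pi(a)\,da
% over probability densities \pi on A gives the log-sum-exp form
\[
  V_h^\lambda(x) \;=\; \lambda \log \int_A
  \exp\!\left(\frac{Q_h^\lambda(x,a)}{\lambda}\right) da,
\]
% attained by the Gibbs (softmax) policy
\[
  \pi_h^\lambda(a \mid x) \;=\;
  \frac{\exp\!\big(Q_h^\lambda(x,a)/\lambda\big)}
       {\int_A \exp\!\big(Q_h^\lambda(x,a')/\lambda\big)\,da'}.
\]
```

In this notation, the discretization error discussed in the abstract corresponds to comparing the policy $\pi_h^\lambda$ with the optimal feedback control of the continuous-time problem as the step size $h$ shrinks.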
