Discretization error from regularized Reinforcement Learning to continuous-time stochastic control
Huyên Pham, Yuming Paul Zhang, Yuhua Zhu

TL;DR
This paper analyzes the discretization error of regularized discrete-time reinforcement learning as an approximation of continuous-time stochastic control, deriving quantitative convergence rates for the gap between the discrete-time optimal policy and the continuous-time optimal feedback control.
Contribution
It establishes a quantitative connection between regularized discrete-time RL and continuous-time stochastic optimal control, with quantitative convergence rates for the discretization error.
Findings
Derived convergence rates for discretization error
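Provided a foundation for understanding the stability and implementation of exploratory RL policies in stochastic continuous-time environments
Quantified the gap between the optimal policy of the regularized discrete-time Bellman equation and the optimal feedback control of the continuous-time problem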
Abstract
This paper establishes a rigorous connection between regularized discrete-time reinforcement learning (RL) and continuous-time stochastic optimal control. Specifically, classical RL algorithms typically solve a regularized discrete-time Bellman equation. We study the discretization error, namely, the gap between the optimal policy induced by the regularized discrete-time Bellman equation and the true optimal feedback control of the underlying continuous-time stochastic control problem. By deriving quantitative convergence rates for this gap, we provide a rigorous foundation for understanding the stability and implementation of exploratory RL policies in stochastic continuous-time environments.
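Illustration (not from the paper): the abstract does not say which regularizer is used. As a rough sketch only, the code below assumes entropy regularization with a temperature `lam` on a small finite MDP, which is one common instance of a regularized discrete-time Bellman equation. The function name `soft_value_iteration`, its parameters, and the toy inputs are all hypothetical. The sketch shows how such an equation induces a stochastic (softmax) policy; it does not model the continuous-time limit or the convergence rates the paper studies.

```python
import math

def soft_value_iteration(P, R, gamma, lam, tol=1e-10, max_iter=10000):
    """Entropy-regularized value iteration on a finite MDP (illustrative assumption).

    P[s][a][t]: probability of moving from state s to state t under action a.
    R[s][a]:    one-step reward.
    gamma:      discount factor in (0, 1).
    lam:        entropy temperature > 0 (hypothetical name).
    Returns the regularized value function and the softmax policy it induces.
    """
    n_states = len(R)
    n_actions = len(R[0])
    V = [0.0] * n_states

    def q_values(V):
        # Q(s, a) = R(s, a) + gamma * E[V(next state)]
        return [[R[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n_states))
                 for a in range(n_actions)] for s in range(n_states)]

    for _ in range(max_iter):
        Q = q_values(V)
        V_new = []
        for s in range(n_states):
            m = max(Q[s])  # subtract the max for a numerically stable log-sum-exp
            V_new.append(m + lam * math.log(sum(math.exp((q - m) / lam) for q in Q[s])))
        diff = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if diff < tol:
            break

    # The regularized optimal policy is a softmax of Q / lam.
    Q = q_values(V)
    policy = []
    for s in range(n_states):
        m = max(Q[s])
        w = [math.exp((q - m) / lam) for q in Q[s]]
        z = sum(w)
        policy.append([x / z for x in w])
    return V, policy

# Toy problem: action a moves deterministically to state a; action 1 pays reward 1.
P = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]
R = [[0.0, 1.0], [0.0, 1.0]]
V, policy = soft_value_iteration(P, R, gamma=0.9, lam=1.0)
```

In this toy problem, a larger `lam` spreads probability across both actions (exploration), while a small `lam` concentrates the policy on the rewarding action, approaching the unregularized greedy policy.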
