PhiBE: A PDE-based Bellman Equation for Continuous Time Policy Evaluation

Yuhua Zhu

arXiv:2405.12535·math.OC·February 23, 2026

TL;DR

This paper introduces PhiBE, a PDE-based Bellman equation for continuous-time policy evaluation in RL, offering more accurate value function approximation and improved sample complexity by leveraging the dynamics' smoothness.

Contribution

The paper develops PhiBE, a novel PDE-based Bellman equation for continuous-time RL, with theoretical guarantees and a model-free algorithm that improves sample complexity and handles model misspecification.
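For orientation, the objects being compared can be written out as follows. This is a sketch under assumed notation (state $s_t$, drift $\mu$, diffusion $\sigma$, reward $r$, discount rate $\beta$, sampling interval $\Delta t$), none of which is defined on this page. The last equation is the first-order form of PhiBE as commonly stated and should be checked against the paper itself.

```latex
% Assumed notation; not taken verbatim from the summary above.
\begin{align*}
ds_t &= \mu(s_t)\,dt + \sigma(s_t)\,dB_t
  && \text{(unknown dynamics)} \\
V(s) &= \mathbb{E}\!\left[\int_0^\infty e^{-\beta t}\, r(s_t)\,dt \;\middle|\; s_0 = s\right]
  && \text{(true value function)} \\
\tilde V(s) &= r(s)\,\Delta t + e^{-\beta \Delta t}\,
  \mathbb{E}\!\left[\tilde V(s_{\Delta t}) \mid s_0 = s\right]
  && \text{(discrete-time BE)} \\
\beta\,\hat V(s) &= r(s) + \hat\mu(s)\cdot\nabla \hat V(s)
  + \tfrac12\,\hat\Sigma(s) : \nabla^2 \hat V(s)
  && \text{(first-order PhiBE)}
\end{align*}
% Coefficients estimated from discrete-time transitions only:
\[
\hat\mu(s) = \frac{1}{\Delta t}\,\mathbb{E}\!\left[s_{\Delta t} - s_0 \mid s_0 = s\right],
\qquad
\hat\Sigma(s) = \frac{1}{\Delta t}\,
  \mathbb{E}\!\left[(s_{\Delta t} - s_0)(s_{\Delta t} - s_0)^{\top} \mid s_0 = s\right].
\]
```

The PDE keeps the structure of the exact equation for $V$ (with $\mu$ and $\sigma\sigma^{\top}$ in place of $\hat\mu$ and $\hat\Sigma$), while its coefficients depend only on one-step differences that can be computed from discrete-time data.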

Findings

01. PhiBE provides a provably more accurate approximation to the true value function than the discrete-time Bellman equation.

02. The model-free algorithm for PhiBE converges with finite-sample guarantees.

03. Sample complexity improves to O(Δt^{-1}) by exploiting the smoothness of the dynamics.

Abstract

In this paper, we study policy evaluation in continuous-time reinforcement learning (RL), where the state follows an unknown stochastic differential equation (SDE), but only discrete-time data are available. We first highlight that the discrete-time Bellman equation (BE) is not always a reliable approximation to the true value function because it ignores the underlying continuous-time structure. We then introduce a new Bellman equation, PhiBE, which integrates the discrete-time information into a continuous-time PDE formulation. By leveraging the smooth structure of the underlying dynamics, PhiBE provides a provably more accurate approximation to the true value function, especially in scenarios where the underlying dynamics change slowly or the reward oscillates. Moreover, we extend PhiBE to higher orders, providing increasingly accurate approximations. We further develop a model-free…
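The accuracy claim for slowly changing dynamics can be checked on a toy problem with closed-form answers. The sketch below is not the paper's algorithm. It assumes deterministic linear dynamics ds/dt = λs, reward r(s) = s, and discount rate β > λ, so that the true value function, the discrete-time BE solution, and a first-order PhiBE solution are all of the form c·s. The function names and parameter values are hypothetical, and the drift estimate (s_Δt − s)/Δt is an assumed reading of the first-order construction.

```python
import math

def true_coeff(beta, lam):
    # V(s) = integral_0^inf e^{-beta t} * s * e^{lam t} dt = s / (beta - lam)
    return 1.0 / (beta - lam)

def discrete_be_coeff(beta, lam, dt):
    # Discrete-time BE: V(s) = r(s) dt + e^{-beta dt} V(s_dt), s_dt = s e^{lam dt}.
    # Substituting V(s) = c s gives c = dt / (1 - e^{(lam - beta) dt}).
    return dt / (1.0 - math.exp((lam - beta) * dt))

def phibe_coeff(beta, lam, dt):
    # Assumed first-order drift estimate from one-step data:
    # b_hat(s) = (s_dt - s) / dt = lam_hat * s.
    lam_hat = (math.exp(lam * dt) - 1.0) / dt
    # PDE: beta V = r + b_hat V'. Substituting V(s) = c s gives c = 1 / (beta - lam_hat).
    return 1.0 / (beta - lam_hat)

beta, lam = 1.0, 0.05  # slowly changing dynamics (illustrative values)
for dt in (0.5, 0.1, 0.02):
    c_true = true_coeff(beta, lam)
    be_err = abs(discrete_be_coeff(beta, lam, dt) - c_true)
    phibe_err = abs(phibe_coeff(beta, lam, dt) - c_true)
    print(f"dt={dt:<5} BE error={be_err:.2e}  PhiBE error={phibe_err:.2e}")
```

In this setting both errors shrink linearly in Δt, but the leading term of the BE error is proportional to (β − λ), while the PhiBE error is proportional to λ², which is why the gap is large when λ is small. The example says nothing about the stochastic case or the sample-based algorithm.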

Taxonomy

Topics: Climate Change Policy and Economics