A Relative Value Iteration Algorithm for Non-degenerate Controlled Diffusions
Ari Arapostathis, Vivek S. Borkar

TL;DR
This paper introduces a continuous-time relative value iteration algorithm for solving ergodic control problems in non-degenerate controlled diffusions, establishing its convergence to the Hamilton-Jacobi-Bellman equation solution.
Contribution
It develops a nonlinear parabolic evolution equation as a continuous analog of White's relative value iteration for ergodic control, with proven convergence under stability conditions.
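For orientation, the objects named above can be written out as follows. This is a sketch in assumed notation, since this summary does not reproduce the paper's equations: the drift b, diffusion matrix \sigma, running cost r, control set U, reference point x_0, and initial condition V_0 are placeholder symbols. The last line is one natural way to carry the "subtract the value at a reference state" step of relative value iteration over to continuous time. It should not be read as a verbatim statement of the paper's equation.

```latex
\begin{align*}
  % Controlled diffusion, with the control entering only through the drift
  dX_t &= b(X_t, U_t)\,dt + \sigma(X_t)\,dW_t, \\
  % Controlled generator for a fixed control value u
  L^u f(x) &= \tfrac{1}{2}\operatorname{tr}\!\big(\sigma(x)\sigma(x)^{\mathsf T}\nabla^2 f(x)\big)
              + b(x,u)\cdot\nabla f(x), \\
  % Ergodic HJB equation: unknowns are the function V^* and the optimal average cost \beta^*
  \min_{u \in U}\big[L^u V^*(x) + r(x,u)\big] &= \beta^*, \\
  % Relative-value-iteration flow (assumed form), normalized at the reference point x_0
  \partial_t V(t,x) &= \min_{u \in U}\big[L^u V(t,x) + r(x,u)\big] - V(t,x_0),
  \qquad V(0,x) = V_0(x).
\end{align*}
```

In this form a stationary solution satisfies the HJB equation with \beta^* = V(x_0), which mirrors how the discrete iteration recovers the optimal average cost from the reference-state value.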
Findings
Convergence of the proposed algorithm to the HJB solution is established.
The method applies to non-degenerate controlled diffusions with drift control.
The proof uses the theory of monotone dynamical systems and, as an alternative route, the theory of reverse martingales.
Abstract
The ergodic control problem for a non-degenerate controlled diffusion controlled through its drift is considered under a uniform stability condition that ensures the well-posedness of the associated Hamilton-Jacobi-Bellman (HJB) equation. A nonlinear parabolic evolution equation is then proposed as a continuous time continuous state space analog of White's 'relative value iteration' algorithm for solving the ergodic dynamic programming equation for the finite state finite action case. Its convergence to the solution of the HJB equation is established using the theory of monotone dynamical systems and also, alternatively, by using the theory of reverse martingales.
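To make the finite state, finite action starting point concrete, here is a minimal sketch of relative value iteration for an average-cost Markov decision process. It uses one common form of the iteration, in which the current value at a reference state is subtracted after each Bellman update. The function name, array layout, and the two-state example are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def relative_value_iteration(P, c, ref_state=0, tol=1e-10, max_iter=10_000):
    """Average-cost relative value iteration (illustrative sketch).

    P: array of shape (A, S, S), P[a, s, t] = probability of s -> t under action a.
    c: array of shape (S, A), one-step cost of taking action a in state s.
    Returns (gain, h, policy): estimated optimal average cost, relative value
    function, and a greedy policy.
    """
    h = np.zeros(c.shape[0])
    for _ in range(max_iter):
        # Q[s, a] = c[s, a] + sum_t P[a, s, t] * h[t]
        Q = c + np.einsum("ast,t->sa", P, h)
        # Bellman update, then subtract the current value at the reference state.
        h_new = Q.min(axis=1) - h[ref_state]
        converged = np.max(np.abs(h_new - h)) < tol
        h = h_new
        if converged:
            break
    Q = c + np.einsum("ast,t->sa", P, h)
    policy = Q.argmin(axis=1)
    # At a fixed point h = T(h) - h[ref_state], so h[ref_state] is the average cost.
    gain = h[ref_state]
    return gain, h, policy


# Hypothetical two-state, two-action example: both actions move uniformly
# between the states, and action 1 is strictly more expensive than action 0.
P = np.array([[[0.5, 0.5], [0.5, 0.5]],
              [[0.5, 0.5], [0.5, 0.5]]])
c = np.array([[1.0, 5.0],
              [3.0, 5.0]])
gain, h, policy = relative_value_iteration(P, c)
print(gain, h, policy)
```

Subtracting the reference-state value keeps the iterates bounded, whereas plain value iteration grows roughly linearly in the number of steps for average-cost problems. The sketch assumes the chain is well behaved (for example, aperiodic and with a single recurrent class under every policy); without such conditions the iteration may fail to converge.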
