A Relative Value Iteration Algorithm for Non-degenerate Controlled Diffusions

Ari Arapostathis; Vivek S. Borkar

arXiv:1110.1273·math.OC·March 20, 2019·SIAM J. Control. Optim.

TL;DR

This paper introduces a continuous-time relative value iteration algorithm for the ergodic control problem of non-degenerate controlled diffusions and proves that it converges to the solution of the associated Hamilton-Jacobi-Bellman (HJB) equation.

Contribution

It proposes a nonlinear parabolic evolution equation as a continuous-time, continuous-state-space analog of White's relative value iteration for ergodic control, and proves its convergence under a uniform stability condition.
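
To make the analogy concrete, the block below sketches what such an equation could look like. It is not quoted from the paper: the symbols (diffusion matrix a, controlled drift b, running cost r, action set U, average cost \beta, initial condition V_0, reference point 0) are assumed notation, and the evolution equation is a candidate form written by analogy with the offset term in the discrete algorithm.

```latex
% Ergodic HJB equation (standard form); the unknowns are V and the
% optimal average cost \beta. Only the drift b depends on the control u.
\[
  \min_{u \in U} \bigl[ L^{u} V(x) + r(x,u) \bigr] = \beta,
  \qquad
  L^{u} f(x) = \tfrac{1}{2} \sum_{i,j} a^{ij}(x)\, \partial_{ij} f(x)
             + \sum_{i} b^{i}(x,u)\, \partial_{i} f(x).
\]

% Candidate continuous-time relative value iteration (assumed form):
% the offset V(t,0) plays the role of the subtracted reference value.
\[
  \partial_t V(t,x) = \min_{u \in U} \bigl[ L^{u} V(t,x) + r(x,u) \bigr] - V(t,0),
  \qquad V(0,x) = V_0(x).
\]
```

Under this assumed form, a stationary solution satisfies the HJB equation with \beta = V(0), which is the sense in which the flow would "solve" the ergodic problem.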

Findings

01

Convergence of the proposed algorithm to the HJB solution is established.

02

The method applies to non-degenerate diffusions controlled through their drift.

03

The convergence proof uses the theory of monotone dynamical systems and, alternatively, the theory of reverse martingales.

Abstract

The ergodic control problem for a non-degenerate controlled diffusion controlled through its drift is considered under a uniform stability condition that ensures the well-posedness of the associated Hamilton-Jacobi-Bellman (HJB) equation. A nonlinear parabolic evolution equation is then proposed as a continuous-time, continuous-state-space analog of White's 'relative value iteration' algorithm for solving the ergodic dynamic programming equation for the finite-state, finite-action case. Its convergence to the solution of the HJB equation is established using the theory of monotone dynamical systems and also, alternatively, by using the theory of reverse martingales.
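
For readers unfamiliar with the finite-state, finite-action algorithm that the abstract refers to, here is a minimal Python sketch of a White-style relative value iteration for an average-cost Markov decision process. It illustrates the discrete algorithm only, not the paper's continuous-time scheme. The function name, argument layout, tolerance, and the toy data in the usage note are all assumptions made for illustration.

```python
import numpy as np


def relative_value_iteration(P, r, ref_state=0, tol=1e-10, max_iter=10_000):
    """Average-cost relative value iteration (illustrative sketch).

    P: array of shape (A, S, S); P[a, s, t] is the probability of
       moving from state s to state t under action a.
    r: array of shape (S, A); r[s, a] is the running cost.
    Returns (gain, h, policy) with h[ref_state] == 0.
    """
    h = np.zeros(r.shape[0])
    for _ in range(max_iter):
        # One Bellman backup: (T h)(s) = min_a [ r(s,a) + sum_t P(t|s,a) h(t) ]
        q = r + np.einsum("ast,t->sa", P, h)
        th = q.min(axis=1)
        # Subtract the value at the reference state to keep iterates bounded.
        h_new = th - th[ref_state]
        converged = np.max(np.abs(h_new - h)) < tol
        h = h_new
        if converged:
            break
    # At a fixed point, gain + h = T h, so the gain is (T h)(ref_state).
    q = r + np.einsum("ast,t->sa", P, h)
    gain = q.min(axis=1)[ref_state]
    policy = q.argmin(axis=1)
    return gain, h, policy
```

As a usage example, with a single action, `P = np.array([[[0.5, 0.5], [0.5, 0.5]]])` and `r = np.array([[1.0], [3.0]])`, the chain spends half its time in each state, so the returned gain is the average of the two costs. Subtracting the reference-state value at every step is what distinguishes this from plain value iteration, whose iterates grow linearly in the number of steps.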
