A policy iteration algorithm for non-Markovian control problems
Dylan Possamaï, Ludovic Tangpi

TL;DR
This paper introduces a new policy iteration algorithm for continuous-time stochastic control problems, capable of handling non-Markovian dynamics and volatility control, with proven exponential convergence using probabilistic methods.
Contribution
The paper presents a novel policy iteration method that converges exponentially for non-Markovian control problems, including multi-dimensional volatility control, with purely probabilistic proofs that are simpler than the classical arguments.
Findings
The algorithm achieves exponential convergence for both the value function and the controls.
It applies to non-Markovian dynamics and to multi-dimensional volatility control.
Each iterate is a linear-quadratic control problem that can be solved explicitly; in the Markovian case this only requires solving linear PDEs recursively.
Abstract
In this paper, we propose a new policy iteration algorithm to compute the value function and the optimal controls of continuous time stochastic control problems. The algorithm relies on successive approximations using linear-quadratic control problems which can all be solved explicitly, and only require to solve recursively linear PDEs in the Markovian case. Though our procedure fails in general to produce a non-decreasing sequence like the standard algorithm, it can be made arbitrarily close to being monotone. More importantly, we recover the standard exponential speed of convergence for both the value and the controls, through purely probabilistic arguments which are significantly simpler than in the classical case. Our proof also accommodates non-Markovian dynamics as well as volatility control, allowing us to obtain the first convergence results in the latter case for a state…
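For orientation only, here is a minimal sketch of the classical (Howard) policy iteration on a small finite, discounted Markov decision process. It is a discrete-time stand-in for the "standard algorithm" the abstract compares against, not the authors' continuous-time method. The names `policy_iteration`, `P`, `r` and `gamma` are illustrative assumptions, as is the randomly generated test problem. The evaluation step solves a linear system, which plays the role of the linear PDE in the Markovian case, and the resulting values are non-decreasing across iterations.

```python
import numpy as np


def policy_iteration(P, r, gamma=0.9, max_iter=1000):
    """Classical Howard policy iteration (illustrative, finite discounted MDP).

    P: array of shape (A, S, S); P[a, s, t] is the probability of moving
       from state s to state t under action a.
    r: array of shape (A, S); reward for taking action a in state s.
    Returns the final policy, its value vector, and the values per iteration.
    """
    n_actions, n_states, _ = P.shape
    states = np.arange(n_states)
    policy = np.zeros(n_states, dtype=int)
    history = []
    for _ in range(max_iter):
        # Policy evaluation: solve the linear system (I - gamma * P_pi) v = r_pi.
        P_pi = P[policy, states]  # shape (S, S)
        r_pi = r[policy, states]  # shape (S,)
        v = np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)
        history.append(v)
        # Policy improvement: act greedily with respect to the current value.
        q = r + gamma * (P @ v)  # shape (A, S)
        new_policy = np.argmax(q, axis=0)
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
    return policy, v, history


# Hypothetical test problem: 3 actions, 5 states, random transitions and rewards.
rng = np.random.default_rng(0)
P = rng.random((3, 5, 5))
P /= P.sum(axis=2, keepdims=True)  # normalise rows into probabilities
r = rng.random((3, 5))
policy, v, history = policy_iteration(P, r, gamma=0.9)
```

In this discrete setting each improvement step can only increase the value, which is the monotonicity the abstract says the new procedure gives up in general while remaining arbitrarily close to it.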
Taxonomy
Topics: Advanced Control Systems Optimization · Optimization and Search Problems
