On the policy improvement algorithm in continuous time
Saul D. Jacka, Aleksandar Mijatović

TL;DR
This paper develops a general framework for the Policy Improvement Algorithm (PIA) in continuous-time stochastic control, gives sufficient conditions for the algorithm to be well-defined and to converge without time discretisation, and identifies weak stochastic control as its natural setting.
Contribution
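The paper gives a general approach to the PIA in continuous time that assumes only that the controls lie in a compact metric space, provides sufficient conditions for convergence, and describes diffusion-based classes of problems to which the algorithm applies.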
Findings
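The PIA is well-defined and convergent in continuous time, without time discretisation, under general sufficient conditions that assume only that the controls lie in a compact metric space.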
Weak stochastic control is the natural setting for continuous-time PIA.
Examples of control problems demonstrate the need for the weak formulation.
Abstract
We develop a general approach to the Policy Improvement Algorithm (PIA) for stochastic control problems for continuous-time processes. The main results assume only that the controls lie in a compact metric space and give general sufficient conditions for the PIA to be well-defined and converge in continuous time (i.e. without time discretisation). It emerges that the natural context for the PIA in continuous time is weak stochastic control. We give examples of control problems demonstrating the need for the weak formulation as well as diffusion-based classes of problems where the PIA in continuous time is applicable.
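To make the evaluate-then-improve loop behind the PIA concrete, here is a minimal sketch on a toy problem. It is not the paper's setting or method: it assumes a finite-state continuous-time Markov chain, a finite (hence compact) action set, and a discounted running cost to be minimised. The names `gens`, `cost`, `alpha`, `evaluate`, `improve` and `policy_improvement` are hypothetical and chosen only for illustration. Time is never discretised: each policy is evaluated by solving the linear system (alpha I - Q^pi) V = f^pi directly.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def solve(a: Matrix, b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def evaluate(policy: List[int], gens: List[Matrix], cost: Matrix,
             alpha: float) -> List[float]:
    """Discounted cost of a stationary policy: (alpha*I - Q^pi) V = f^pi."""
    n = len(policy)
    a = [[(alpha if i == j else 0.0) - gens[policy[i]][i][j]
          for j in range(n)] for i in range(n)]
    b = [cost[i][policy[i]] for i in range(n)]
    return solve(a, b)


def improve(v: List[float], gens: List[Matrix], cost: Matrix,
            policy: List[int]) -> List[int]:
    """In each state pick the action minimising f(x, a) + (Q^a V)(x)."""
    n = len(v)
    new_policy = []
    for i in range(n):
        def q(a: int) -> float:
            return cost[i][a] + sum(gens[a][i][j] * v[j] for j in range(n))
        best = min(range(len(gens)), key=q)
        # Keep the current action on (near) ties so the loop terminates.
        if q(policy[i]) <= q(best) + 1e-12:
            best = policy[i]
        new_policy.append(best)
    return new_policy


def policy_improvement(gens: List[Matrix], cost: Matrix, alpha: float,
                       policy: List[int]) -> Tuple[List[int], List[float]]:
    """Alternate evaluation and improvement until the policy is stable."""
    while True:
        v = evaluate(policy, gens, cost, alpha)
        new_policy = improve(v, gens, cost, policy)
        if new_policy == policy:
            return policy, v
        policy = new_policy


# Toy data (assumed): two states, two actions.
# Action 0 jumps to the other state at rate 1, action 1 at rate 5.
gens = [[[-1.0, 1.0], [1.0, -1.0]],
        [[-5.0, 5.0], [5.0, -5.0]]]
# cost[state][action]: state 0 is expensive, and action 1 costs 1 extra.
cost = [[4.0, 5.0], [0.0, 1.0]]
best_policy, best_value = policy_improvement(gens, cost, alpha=1.0,
                                             policy=[0, 0])
```

On this toy data the loop stabilises after one improvement step at the policy `[1, 0]` (leave the expensive state quickly, linger in the cheap one), with discounted costs 10/7 and 5/7 in states 0 and 1. With finitely many states and actions, termination is automatic. The paper's contribution concerns the much harder case of general continuous-time processes, where well-definedness and convergence need the sufficient conditions described in the abstract.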
Taxonomy
Topics: Stochastic processes and financial applications · Risk and Portfolio Optimization · Markov Chains and Monte Carlo Methods
