The Policy Iteration Algorithm for Average Continuous Control of Piecewise Deterministic Markov Processes
O.L.V. Costa, F. Dufour

TL;DR
This paper extends the policy iteration algorithm to solve long-term average control problems for piecewise deterministic Markov processes with general state spaces, establishing convergence and optimality conditions.
Contribution
It applies the policy iteration algorithm to continuous-time PDMPs, derives properties of an associated pseudo-Poisson equation, and proves convergence to a solution of the optimality equation, which yields an optimal feedback control strategy.
Findings
Proves convergence of the policy iteration algorithm for PDMPs.
Establishes the existence of an optimal feedback control strategy.
Provides theoretical foundations for long-term average control of PDMPs.
Abstract
The main goal of this paper is to apply the so-called policy iteration algorithm (PIA) to the long-run average continuous control problem of piecewise deterministic Markov processes (PDMPs) taking values in a general Borel space and with a compact action space depending on the state variable. To do so, we first derive some important properties of a pseudo-Poisson equation associated with the problem. It is then shown that, under some classical hypotheses, the PIA converges to a solution satisfying the optimality equation, and that this optimal solution yields an optimal control strategy, in feedback form, for the average control problem of the continuous-time PDMP.
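To make the evaluate-then-improve structure of policy iteration concrete, here is a minimal sketch for a much simpler setting than the paper's: a finite-state, finite-action, discrete-time Markov decision process with average cost. This is not the authors' PDMP algorithm. The transition array P, the cost array c, the function names, and the assumption that every stationary policy induces a single recurrent class (unichain) are all illustrative assumptions. The linear system solved in evaluate is the finite-state Poisson equation g + h(s) = c(s, a) + sum over s' of P(s' | s, a) h(s'), which plays a role loosely analogous to the pseudo-Poisson equation mentioned in the abstract.

```python
import numpy as np


def evaluate(P, c, policy):
    """Solve g + h = c_pi + P_pi h with the normalization h[0] = 0.

    P has shape (n_states, n_actions, n_states), c has shape (n_states, n_actions).
    Assumes the chain induced by `policy` is unichain, so the system is solvable.
    """
    n = P.shape[0]
    idx = np.arange(n)
    P_pi = P[idx, policy]          # transition matrix under the policy, shape (n, n)
    c_pi = c[idx, policy]          # one-step cost under the policy, shape (n,)
    A = np.eye(n) - P_pi
    A[:, 0] = 1.0                  # h[0] is fixed to 0, so column 0 carries the gain g
    x = np.linalg.solve(A, c_pi)
    g = x[0]
    h = x.copy()
    h[0] = 0.0
    return g, h


def policy_iteration(P, c, max_iter=100, tol=1e-10):
    """Alternate policy evaluation and greedy improvement until the policy is stable."""
    n = c.shape[0]
    idx = np.arange(n)
    policy = np.zeros(n, dtype=int)
    for _ in range(max_iter):
        g, h = evaluate(P, c, policy)
        q = c + P @ h              # q[s, a] = c[s, a] + sum_s' P[s, a, s'] * h[s']
        # Change an action only on strict improvement, which avoids cycling on ties.
        improve = q.min(axis=1) < q[idx, policy] - tol
        new_policy = policy.copy()
        new_policy[improve] = q.argmin(axis=1)[improve]
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
    return policy, g, h


# Hypothetical two-state, two-action example (numbers chosen only for illustration).
P = np.array([
    [[0.9, 0.1], [0.5, 0.5]],      # transitions from state 0 under actions 0 and 1
    [[0.5, 0.5], [0.9, 0.1]],      # transitions from state 1 under actions 0 and 1
])
c = np.array([
    [1.0, 0.5],                    # costs in state 0
    [3.0, 4.0],                    # costs in state 1
])

policy, g, h = policy_iteration(P, c)
print("policy:", policy, "average cost:", g, "relative values:", h)
```

When the loop stops, the pair (g, h) satisfies the finite-state average-cost optimality equation g + h(s) = min over a of [c(s, a) + sum over s' of P(s' | s, a) h(s')], and the returned policy is a feedback rule: it selects the action from the current state alone.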
