On the Convergence of the Policy Iteration for Infinite-Horizon Nonlinear Optimal Control Problems
Tobias Ehring, Behzad Azmi, Bernard Haasdonk

TL;DR
This paper analyzes the convergence of policy iteration for infinite-horizon nonlinear optimal control problems, emphasizing the importance of domain invariance and solution regularity for ensuring well-posedness and convergence.
Contribution
It introduces a constructive method to maintain domain invariance and establishes conditions for the existence of regular solutions to the Generalized Hamilton-Jacobi-Bellman (GHJB) equation at each iteration.
Findings
The proposed method guarantees forward invariance of the computational domain.
Sufficient conditions for regular GHJB solutions are identified.
Numerical results support the theoretical convergence analysis.
Abstract
Policy iteration (PI) is a widely used algorithm for synthesizing optimal feedback control policies across many engineering and scientific applications. When PI is deployed on infinite-horizon, nonlinear, autonomous optimal-control problems, however, a number of significant theoretical challenges emerge, particularly when the computational state space is restricted to a bounded domain. In this paper, we investigate these challenges and show that the viability of PI in this setting hinges on the existence, uniqueness, and regularity of solutions to the Generalized Hamilton-Jacobi-Bellman (GHJB) equation solved at each iteration. To ensure a well-posed iterative scheme, the GHJB solution must possess sufficient smoothness, and the domain on which the GHJB equation is solved must remain forward-invariant under the closed-loop dynamics induced by the current policy. Although fundamental to…
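To make the evaluate-then-improve loop described in the abstract concrete, the following is a minimal sketch on a scalar linear-quadratic problem, not the nonlinear setting the paper analyzes. It assumes dynamics x' = a*x + b*u, running cost q*x^2 + r*u^2, a linear policy u = -k*x, and a quadratic value function V(x) = p*x^2, so the GHJB equation V'(x)(a*x + b*u) + q*x^2 + r*u^2 = 0 collapses to a single algebraic equation for p. All function names and parameter values are hypothetical and chosen for illustration only.

```python
import math


def evaluate_policy(a, b, q, r, k):
    """Solve the scalar GHJB equation for V(x) = p*x^2 under u = -k*x.

    Substituting into V'(x)(a*x + b*u) + q*x^2 + r*u^2 = 0 gives
    2*p*(a - b*k) + q + r*k^2 = 0, which has a positive solution
    only if the closed loop a - b*k is stable (negative).
    """
    closed_loop = a - b * k
    if closed_loop >= 0:
        raise ValueError("policy u = -k*x is not stabilizing")
    return (q + r * k**2) / (-2.0 * closed_loop)


def improve_policy(b, r, p):
    """Policy improvement: u = -(1/(2r)) * b * V'(x) = -(b*p/r) * x."""
    return b * p / r


def policy_iteration(a, b, q, r, k0, iterations=8):
    """Alternate GHJB evaluation and policy improvement; return all p values."""
    k = k0
    history = []
    for _ in range(iterations):
        p = evaluate_policy(a, b, q, r, k)
        history.append(p)
        k = improve_policy(b, r, p)
    return history


def riccati_solution(a, b, q, r):
    """Positive root of 2*a*p - (b^2/r)*p^2 + q = 0, the optimal p."""
    return r * (a + math.sqrt(a**2 + b**2 * q / r)) / b**2


# Illustrative parameters (assumptions): an unstable open loop (a > 0)
# with an initial gain k0 that stabilizes it.
history = policy_iteration(1.0, 1.0, 1.0, 1.0, k0=2.0)
p_star = riccati_solution(1.0, 1.0, 1.0, 1.0)
```

In this toy case every improved policy keeps the closed loop stable, so any interval [-R, R] is forward-invariant and each GHJB solution is smooth. These are exactly the two properties that the abstract says must be secured separately in the nonlinear, bounded-domain setting.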
