Convergence Analysis of Policy Iteration
Ali Heydari

TL;DR
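This paper analyzes the convergence of policy iteration for optimal control of nonlinear systems with known deterministic dynamics and an undiscounted infinite-horizon cost. It establishes convergence and optimality of the limit when the iterations start from a stabilizing control, compares the convergence speed with that of value iteration, and extends the results to multi-step look-ahead policy iteration.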
Contribution
It provides a rigorous convergence analysis of policy iteration initiated with a stabilizing control, compares its convergence speed with that of value iteration, and extends the results to multi-step look-ahead policy iteration.
Findings
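Policy iteration initiated with a stabilizing control converges, and the limit functions are optimal; optimality follows from the established uniqueness of the solution to the Bellman equation.
A theoretical comparison of the convergence speed of policy iteration and value iteration is presented.
The convergence results are extended to multi-step look-ahead policy iteration.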
Abstract
Adaptive optimal control of nonlinear dynamic systems with deterministic and known dynamics under a known undiscounted infinite-horizon cost function is investigated. A policy iteration scheme initiated with a stabilizing initial control is analyzed for solving the problem. The convergence of the iterations and the optimality of the limit functions, which follows from the established uniqueness of the solution to the Bellman equation, are the main results of this study. Furthermore, a theoretical comparison of the speed of convergence of policy iteration with that of value iteration is presented. Finally, the convergence results are extended to the case of multi-step look-ahead policy iteration.
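To make the two schemes named in the abstract concrete, the following is a minimal sketch and not the paper's method or setting. It assumes a scalar, discrete-time, linear system x_{k+1} = a*x_k + b*u_k with undiscounted cost sum(q*x_k^2 + r*u_k^2), chosen only because policy evaluation and policy improvement then have closed forms; the paper itself treats nonlinear systems. All function names and the numeric values of a, b, q, r, and the initial gain are illustrative assumptions. Policies are linear feedbacks u = -k*x and value functions are V(x) = p*x^2.

```python
import math


def evaluate_policy(k, a, b, q, r):
    """Policy evaluation: cost-to-go coefficient p of u = -k*x, so V(x) = p*x^2.

    The policy must be stabilizing (|a - b*k| < 1), otherwise the
    undiscounted infinite-horizon cost is unbounded.
    """
    a_cl = a - b * k  # closed-loop dynamics x_{k+1} = a_cl * x_k
    if abs(a_cl) >= 1.0:
        raise ValueError("policy is not stabilizing")
    return (q + r * k * k) / (1.0 - a_cl * a_cl)


def improve_policy(p, a, b, r):
    """Policy improvement: gain minimizing q*x^2 + r*u^2 + p*(a*x + b*u)^2 over u."""
    return a * b * p / (r + b * b * p)


def policy_iteration(k0, a, b, q, r, iters):
    """Alternate evaluation and improvement, starting from a stabilizing gain k0."""
    k, history = k0, []
    for _ in range(iters):
        p = evaluate_policy(k, a, b, q, r)
        history.append(p)
        k = improve_policy(p, a, b, r)
    return history


def value_iteration(a, b, q, r, iters, p0=0.0):
    """Repeatedly apply the Bellman operator to V(x) = p*x^2, starting from p0."""
    p, history = p0, []
    for _ in range(iters):
        p = q + a * a * p * r / (r + b * b * p)
        history.append(p)
    return history


def optimal_p(a, b, q, r):
    """Positive root of the scalar Bellman (Riccati) equation, for reference."""
    # b^2 p^2 + (r - q b^2 - a^2 r) p - q r = 0
    c2, c1, c0 = b * b, r - q * b * b - a * a * r, -q * r
    return (-c1 + math.sqrt(c1 * c1 - 4.0 * c2 * c0)) / (2.0 * c2)


if __name__ == "__main__":
    a, b, q, r = 1.2, 1.0, 1.0, 1.0  # illustrative values; open loop is unstable
    k0 = 1.0                         # stabilizing: |a - b*k0| = 0.2 < 1
    print("reference p*:", optimal_p(a, b, q, r))
    print("policy iteration:", policy_iteration(k0, a, b, q, r, 6))
    print("value iteration: ", value_iteration(a, b, q, r, 6))
```

Printing the two histories side by side shows how each sequence approaches the reference value in this toy case. Note that policy iteration needs a stabilizing initial gain, whereas this value iteration sketch starts from the zero value function.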
Taxonomy
Topics: Adaptive Dynamic Programming Control · Frequency Control in Power Systems · Optimization and Variational Analysis
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
