Tailoring Reproducing Kernels for Optimal Control via Policy Iteration
Shengyuan Niu, Ali Bouland, Haoran Wang, Filippos Fotiadis, Andrew Kurdila, Andrea L'Afflitto, Sai Tej Paruchuri, Kyriakos G. Vamvoudakis

TL;DR
This paper formulates actor-critic policy iteration for nonlinear optimal control in reproducing kernel Hilbert spaces (RKHSs), tailoring the reproducing kernel to the system dynamics to obtain formal error bounds and practical basis selection strategies.
Contribution
It develops a novel RKHS-based framework for policy iteration in optimal control, with guaranteed error bounds and basis selection methods tailored to system dynamics.
Findings
Provides formal, computable error bounds for policy iteration steps
Demonstrates practical effectiveness through numerical experiments
Introduces a basis selection strategy for actor-critic networks
Abstract
This paper presents a novel approach to formulating the actor-critic method for optimal control by casting policy iteration in reproducing kernel Hilbert spaces (RKHSs -- also known as native spaces). By tailoring the reproducing kernel and RKHS to the dynamics of the nonlinear optimal control problem, we leverage recent advancements in characterizing error bounds from statistical and machine learning theory. These approximations define a general strategy to select the bases of the actor-critic networks, and we formally guarantee for the first time that this basis selection procedure leads to closed-form error bounds for the individual steps of policy iteration. These bounds often have a geometric and computable form, making them potentially useful for a priori or a posteriori evaluation of candidate collections of scattered bases. Numerical studies subsequently provide qualitative…
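The abstract does not state the bounds themselves, so the following is only a rough illustration of what a "geometric and computable" quantity for a set of scattered bases can look like in an RKHS setting. It computes two standard objects from kernel interpolation theory: the fill distance of a set of centers and the power function P(x), which satisfies |f(x) - (If)(x)| <= P(x) ||f||_H for the kernel interpolant If of any f in the RKHS H. The Gaussian kernel, the grid-shaped centers, the domain [-1, 1]^2, and names such as `length_scale` and `jitter` are assumptions made for this sketch and are not taken from the paper.

```python
import numpy as np


def gaussian_kernel(X, Y, length_scale=0.3):
    """Kernel matrix K[i, j] = exp(-||X[i] - Y[j]||^2 / (2 * length_scale^2))."""
    sq_dists = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * length_scale ** 2))


def fill_distance(centers, eval_points):
    """Largest distance from any evaluation point to its nearest center."""
    dists = np.linalg.norm(eval_points[:, None, :] - centers[None, :, :], axis=-1)
    return dists.min(axis=1).max()


def power_function(centers, eval_points, length_scale=0.3, jitter=1e-10):
    """P(x) = sqrt(k(x, x) - k(x, C) K^{-1} k(C, x)) at each evaluation point."""
    # Small diagonal jitter (an assumption of this sketch) keeps the solve stable.
    K = gaussian_kernel(centers, centers, length_scale) + jitter * np.eye(len(centers))
    k_xc = gaussian_kernel(eval_points, centers, length_scale)
    solved = np.linalg.solve(K, k_xc.T)  # shape (n_centers, n_eval)
    # k(x, x) = 1 for the Gaussian kernel.
    p_sq = 1.0 - np.sum(k_xc * solved.T, axis=1)
    return np.sqrt(np.clip(p_sq, 0.0, None))


def grid(n):
    """n-by-n grid of points on [-1, 1]^2 (hypothetical stand-in for scattered centers)."""
    axis = np.linspace(-1.0, 1.0, n)
    return np.array([[a, b] for a in axis for b in axis])


eval_points = grid(25)
coarse_centers, fine_centers = grid(3), grid(7)

h_coarse = fill_distance(coarse_centers, eval_points)
h_fine = fill_distance(fine_centers, eval_points)
power_coarse = power_function(coarse_centers, eval_points)
power_fine = power_function(fine_centers, eval_points)

print("fill distance (coarse, fine):", h_coarse, h_fine)
print("max power function (coarse, fine):", power_coarse.max(), power_fine.max())
```

Both quantities depend only on the kernel and the placement of the centers, not on the function being approximated, which is why quantities of this kind can be evaluated before any learning takes place (a priori) or used afterwards to compare candidate sets of centers (a posteriori).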
