An Output Feedback Q-learning Algorithm for Optimal Control of Nonlinear Systems with Koopman Linear Embedding
Victor G. Lopez, Malte Heinrich, Matthias A. Müller

TL;DR
This paper introduces an output-feedback Q-learning algorithm for nonlinear systems with Koopman linear embedding, providing strong theoretical guarantees without requiring system models or function approximation.
Contribution
The paper shows that a known output-feedback Q-learning algorithm can be applied to nonlinear systems that admit a Koopman linear embedding, preserving the same theoretical guarantees as for LTI systems without any function approximation.
Findings
The algorithm works with input-output data only.
Theoretical guarantees are preserved for Koopman-embedded nonlinear systems.
Simulation confirms the method's applicability.
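To make the notion of a Koopman linear embedding concrete, here is a minimal sketch using a textbook-style toy system, not the system or the algorithm from the paper. The dynamics, the coefficients, the lifting functions, and all names (`nonlinear_step`, `lift`, `max_embedding_error`) are assumptions chosen for illustration. The point is only that a nonlinear system can evolve exactly linearly in suitably lifted coordinates, while its output stays a linear function of the lifted state.

```python
import numpy as np

# Hypothetical coefficients for a toy system (illustrative, not from the paper).
A_COEF, B_COEF, C_COEF = 0.9, 0.5, 1.0


def nonlinear_step(x, u):
    """One step of the toy nonlinear system:
    x1+ = a*x1,  x2+ = b*x2 + c*x1^2 + u."""
    x1, x2 = x
    return np.array([A_COEF * x1, B_COEF * x2 + C_COEF * x1**2 + u])


def lift(x):
    """Assumed lifting functions z = (x1, x2, x1^2)."""
    x1, x2 = x
    return np.array([x1, x2, x1**2])


# In the lifted coordinates the dynamics are exactly linear: z+ = A z + B u.
A_LIFT = np.array([
    [A_COEF, 0.0,    0.0],
    [0.0,    B_COEF, C_COEF],
    [0.0,    0.0,    A_COEF**2],  # (x1+)^2 = a^2 * x1^2
])
B_LIFT = np.array([0.0, 1.0, 0.0])
C_OUT = np.array([0.0, 1.0, 0.0])  # measured output y = x2 = C z


def max_embedding_error(x0, inputs):
    """Simulate both models and return the largest gap between
    the lifted nonlinear state and the linear lifted state."""
    x = np.asarray(x0, dtype=float)
    z = lift(x)
    worst = 0.0
    for u in inputs:
        x = nonlinear_step(x, u)
        z = A_LIFT @ z + B_LIFT * u
        worst = max(worst, float(np.max(np.abs(lift(x) - z))))
    return worst


if __name__ == "__main__":
    inputs = np.linspace(-1.0, 1.0, 50)
    print("max embedding error:", max_embedding_error([1.0, -0.5], inputs))
```

In this sketch the input-output behavior (u to y = x2) of the nonlinear system coincides with that of the lifted linear model, which is the kind of structure that would let a data-driven method designed for LTI systems operate on input-output data without knowing the lifting functions.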
Abstract
In the reinforcement learning literature, strong theoretical guarantees have been obtained for algorithms applicable to LTI systems. However, in the nonlinear case only weaker results have been obtained for algorithms that mostly rely on function approximation strategies such as neural networks. In this paper, we study the applicability of a known output-feedback Q-learning algorithm to the class of nonlinear systems that admit a Koopman linear embedding. This algorithm uses only input-output data, and no knowledge of either the system model or the Koopman lifting functions is required. Moreover, no function approximation techniques are used, and the same theoretical guarantees as for LTI systems are preserved. Furthermore, we analyze the performance of the algorithm when the Koopman linear embedding is only an approximation of the real nonlinear system. A…
