Data-Driven Continuous-Time Linear Quadratic Regulator via Closed-Loop and Reinforcement Learning Parameterizations
Armin Gießler, Felix Thömmes, Sören Hohmann

TL;DR
This paper develops data-driven methods for the continuous-time linear quadratic regulator (LQR) based on a closed-loop (CL) parameterization and an integral reinforcement learning (IRL) parameterization, covering policy iteration, a data-driven Riccati equation, and convex reformulations.
Contribution
The paper adapts key results on the closed-loop (CL) parameterization from discrete time to continuous time, derives a data-driven continuous-time algebraic Riccati equation (CARE), and treats the different approaches within one unified framework.
Findings
Developed a continuous-time policy iteration scheme.
Derived a data-driven algebraic Riccati equation.
Provided convex reformulations and a unified framework.
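For orientation, the standard model-based CARE that a data-driven variant replaces is usually written as below. The symbols are the conventional ones and are assumptions here, not notation taken from this summary: system matrices A and B, state and input weights Q and R, value matrix P, and feedback gain K for the control law u = -Kx.

```latex
A^\top P + P A - P B R^{-1} B^\top P + Q = 0,
\qquad
K = R^{-1} B^\top P
```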
Abstract
This paper studies data-driven approaches to the continuous-time linear quadratic regulator (LQR) problem based on two existing parameterizations, namely a closed-loop (CL) parameterization from behavioral system theory and an integral reinforcement learning (IRL) parameterization. The CL parameterization characterizes the closed-loop system via a matrix that satisfies equality constraints. While this parameterization has been extensively studied for discrete-time systems, we adapt key results to the continuous-time setting and develop a policy iteration (PI) scheme, derive a data-driven continuous-time algebraic Riccati equation (CARE), and introduce an alternative convex problem formulation. The IRL parameterization utilizes off-policy data to perform policy evaluation, which is then used for PI or value iteration. Within the IRL framework, we derive a policy gradient flow and propose…
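To make the policy iteration idea concrete, here is a minimal sketch of the classical model-based (Kleinman) iteration for continuous-time LQR. It is not the paper's data-driven scheme: it assumes the system matrices `A` and `B` are known, which the data-driven methods avoid. The function names, the double-integrator example, and the initial gain `K0` are illustrative assumptions.

```python
import numpy as np


def lyapunov_solve(A_cl, M):
    """Solve A_cl^T P + P A_cl + M = 0 for P via Kronecker vectorization."""
    n = A_cl.shape[0]
    I = np.eye(n)
    # Row-major vectorization: vec(X P Y) = kron(X, Y^T) vec(P)
    L = np.kron(A_cl.T, I) + np.kron(I, A_cl.T)
    return np.linalg.solve(L, -M.flatten()).reshape(n, n)


def policy_iteration(A, B, Q, R, K0, iters=20):
    """Model-based Kleinman iteration; K0 must stabilize A - B K0."""
    K = K0
    for _ in range(iters):
        A_cl = A - B @ K
        # Policy evaluation: cost matrix of the current gain K
        P = lyapunov_solve(A_cl, Q + K.T @ R @ K)
        # Policy improvement: K = R^{-1} B^T P
        K = np.linalg.solve(R, B.T @ P)
    return P, K


# Hypothetical example: double integrator with unit weights
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.array([[1.0, 2.0]])  # assumed stabilizing initial gain

P, K = policy_iteration(A, B, Q, R, K0)

# Residual of the algebraic Riccati equation at the returned P
care_residual = A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T @ P) + Q
print("K =", K)
print("CARE residual norm =", np.linalg.norm(care_residual))
```

For this example the CARE can be solved by hand, giving K = [1, sqrt(3)], so the iteration can be checked against a known answer. Each pass alternates a Lyapunov solve (policy evaluation) with a gain update (policy improvement); broadly, data-driven variants obtain the evaluation step from measured trajectories rather than from `A` and `B`.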
