Stability-Certified On-Policy Data-Driven LQR via Recursive Learning and Policy Gradient

Lorenzo Sforni; Guido Carnevale; Ivano Notarnicola; Giuseppe Notarstefano

arXiv:2403.05367·eess.SY·April 13, 2026·1 cites

TL;DR

This paper presents Relearn LQR, an on-policy, data-driven method for LQR control of unknown linear systems. It combines a recursive least squares method with a gradient-based policy search, comes with formal stability guarantees, and is demonstrated in aircraft control simulations.

Contribution

It introduces Relearn LQR, a procedure that combines a recursive least squares method with a gradient-based direct policy search and provides stability certificates for the overall learning and control scheme while the policy is applied to the unknown linear system.

Findings

01

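Formal stability guarantees for the interconnected learning and control scheme are obtained through a Lyapunov-based analysis that exploits averaging and timescale separation.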

02

The method successfully controls aircraft models with static and drifting parameters.

03

Numerical simulations validate the stability and effectiveness of the approach.

Abstract

In this paper, we investigate a data-driven framework to solve Linear Quadratic Regulator (LQR) problems when the dynamics are unknown, with the additional challenge of providing stability certificates for the overall learning and control scheme. Specifically, in the proposed on-policy learning framework, the control input is applied to the actual (unknown) linear system while being iteratively optimized. We propose a learning and control procedure, termed Relearn LQR, that combines a recursive least squares method with a direct policy search based on the gradient method. The resulting scheme is analyzed by modeling it as a feedback-interconnected nonlinear dynamical system. A Lyapunov-based approach, exploiting averaging and timescale separation theories for nonlinear systems, allows us to provide formal stability guarantees for the whole interconnected scheme. The effectiveness of the…
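The interplay described in the abstract (a recursive least squares estimate of the dynamics feeding a gradient step on the feedback gain, while the current policy drives the real system) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the function names (`dlyap`, `lqr_gradient`, `relearn_sketch`), the Gaussian dither, the warm-up period, the stability check on the estimated closed loop, and all numeric values are assumptions made for illustration. The gradient is the standard model-based expression for the LQR cost under u = -Kx with unit initial-state covariance, evaluated on the estimated model.

```python
import numpy as np

def dlyap(M, W):
    """Solve X = M X M^T + W by vectorization (adequate for small matrices)."""
    n = M.shape[0]
    x = np.linalg.solve(np.eye(n * n) - np.kron(M, M), W.reshape(-1))
    return x.reshape(n, n)

def lqr_gradient(A, B, K, Q, R):
    """Gradient of the LQR cost for u = -K x, unit initial-state covariance."""
    Acl = A - B @ K
    P = dlyap(Acl.T, Q + K.T @ R @ K)        # cost matrix of the policy K
    Sigma = dlyap(Acl, np.eye(A.shape[0]))   # accumulated state covariance
    return 2.0 * ((R + B.T @ P @ B) @ K - B.T @ P @ A) @ Sigma

def relearn_sketch(A, B, Q, R, steps=3000, warmup=20, gamma=2e-3, seed=0):
    """Hypothetical on-policy loop: RLS model estimate plus slow gradient step."""
    rng = np.random.default_rng(seed)
    n, m = B.shape
    theta = np.zeros((n + m, n))   # RLS estimate of [A B]^T
    cov = 1e4 * np.eye(n + m)      # RLS covariance-like matrix (assumed init)
    K = np.zeros((m, n))           # assumes A is stable, so K = 0 is stabilizing
    x = np.zeros(n)
    for t in range(steps):
        u = -K @ x + rng.standard_normal(m)   # current policy plus assumed dither
        x_next = A @ x + B @ u                # real plant; the learner never reads A, B
        z = np.concatenate([x, u])
        gain = cov @ z / (1.0 + z @ cov @ z)  # RLS update, no forgetting
        theta += np.outer(gain, x_next - theta.T @ z)
        cov -= np.outer(gain, z @ cov)
        A_hat, B_hat = theta[:n].T, theta[n:].T
        # Safeguard of this sketch only: skip the step if the estimated loop is unstable.
        stable = np.max(np.abs(np.linalg.eigvals(A_hat - B_hat @ K))) < 1.0
        if t >= warmup and stable:
            K = K - gamma * lqr_gradient(A_hat, B_hat, K, Q, R)
        x = x_next
    return A_hat, B_hat, K

if __name__ == "__main__":
    A = np.array([[0.9, 0.2], [0.0, 0.8]])   # illustrative stable system
    B = np.array([[0.0], [1.0]])
    A_hat, B_hat, K = relearn_sketch(A, B, np.eye(2), np.eye(1))
    print("estimated A:\n", A_hat)
    print("learned gain K:", K)
```

The small step size `gamma` relative to the estimator is a crude stand-in for the timescale separation the abstract refers to: the model estimate settles quickly while the gain moves slowly. The sketch carries none of the paper's stability certificates.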
