Data-Driven LQR with Finite-Time Experiments via Extremum-Seeking Policy Iteration

Guido Carnevale; Nicola Mimmo; Giuseppe Notarstefano

arXiv:2412.02758 · math.OC · June 13, 2025

Open Access

TL;DR

This paper introduces EXP-LQR, a novel iterative method for solving LQR problems that relies solely on finite-time experiments and extremum seeking, without requiring knowledge of the system or cost matrices.

Contribution

The paper proposes EXP-LQR, an extremum-seeking based iterative algorithm that converges to near-optimal solutions using only finite-time cost approximations.
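For orientation, the "finite-time cost approximation" can be written out under a standard formulation. The block below is a sketch that assumes a discrete-time linear system and a static state-feedback policy with the sign convention u_t = -K x_t. The symbols A, B, Q, R, K, and T are notation chosen here, not taken from this page.

```latex
% Assumed setting: discrete-time linear dynamics with state feedback.
x_{t+1} = A x_t + B u_t, \qquad u_t = -K x_t

% Infinite-horizon LQR cost of the policy K from initial state x_0.
J(K) = \sum_{t=0}^{\infty} \left( x_t^\top Q x_t + u_t^\top R u_t \right)

% Truncated cost measured by an experiment of length T.
J_T(K) = \sum_{t=0}^{T-1} \left( x_t^\top Q x_t + u_t^\top R u_t \right)
```

Under these assumptions, if K is stabilizing, the state decays geometrically, so the gap J(K) - J_T(K) shrinks geometrically as T grows. This is why a finite experiment can stand in for the infinite-horizon cost.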

Findings

01 EXP-LQR converges exponentially to near-optimal gain matrices.

02 The method does not require direct knowledge of system or cost matrices.

03 Numerical simulations validate the theoretical convergence results.

Abstract

In this paper, we address Linear Quadratic Regulator (LQR) problems through a novel iterative algorithm named EXtremum-seeking Policy iteration LQR (EXP-LQR). The distinguishing feature of EXP-LQR is that it only needs access to a truncated approximation of the infinite-horizon cost associated with a given policy. Hence, EXP-LQR does not require direct knowledge of either the system matrices or the cost matrices. In particular, at each iteration, EXP-LQR refines the current policy using a truncated LQR cost obtained from finite-time virtual or real experiments in which a perturbed version of that policy is applied. The perturbation is generated by an extremum-seeking mechanism and makes the overall algorithm a time-varying nonlinear system. Using a Lyapunov-based approach that exploits averaging theory, we show that EXP-LQR exponentially converges to an arbitrarily small…
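The general idea of tuning a feedback gain from truncated-cost experiments with a perturbed policy can be illustrated with a minimal sketch. The code below is not the paper's EXP-LQR update, whose details are not given on this page. It is a plain sinusoidal-dither extremum-seeking gradient step, assuming discrete-time dynamics and the convention u = -K x. The names truncated_cost and extremum_seeking_gain, the scalar test system, and all tuning values are hypothetical. The matrices A, B, Q, R appear only inside the simulated experiment; the update itself sees only the measured cost.

```python
import numpy as np

def truncated_cost(A, B, Q, R, K, x0, T):
    """Simulate a T-step experiment with u_t = -K x_t and return the summed cost."""
    x = np.asarray(x0, dtype=float)
    cost = 0.0
    for _ in range(T):
        u = -K @ x                              # assumed sign convention
        cost += float(x @ Q @ x + u @ R @ u)    # stage cost x'Qx + u'Ru
        x = A @ x + B @ u                       # assumed discrete-time dynamics
    return cost

def extremum_seeking_gain(cost_fn, K0, iters=3000, gamma=5e-4, delta=0.1):
    """Generic dither-based gradient descent on a measured cost (illustrative only)."""
    K = np.array(K0, dtype=float)
    m, n = K.shape
    p = m * n
    # One dither frequency per gain entry, chosen so the sinusoids are
    # mutually orthogonal over a common period of 2p + 1 iterations.
    freqs = 2.0 * np.pi * np.arange(1, p + 1) / (2 * p + 1)
    for j in range(iters):
        D = np.sin(freqs * j).reshape(m, n)     # dither direction at iteration j
        J = cost_fn(K + delta * D)              # experiment with the perturbed gain
        # Averaged over a dither period, (2/delta) * J * D approximates the gradient.
        K -= gamma * (2.0 / delta) * J * D
    return K

# Hypothetical scalar example: x_{t+1} = 1.2 x_t + u_t with Q = R = 1.
A = np.array([[1.2]])
B = np.array([[1.0]])
Q = np.array([[1.0]])
R = np.array([[1.0]])
x0 = np.array([1.0])
T = 20
K0 = np.array([[1.5]])                          # stabilizing initial gain

K_es = extremum_seeking_gain(lambda K: truncated_cost(A, B, Q, R, K, x0, T), K0)
print("initial cost:", truncated_cost(A, B, Q, R, K0, x0, T))
print("final gain:", K_es, "final cost:", truncated_cost(A, B, Q, R, K_es, x0, T))
```

For this scalar system the Riccati-optimal gain is about 0.79, which gives a reference point for the printed result. The step size gamma is kept small relative to delta because each update is scaled by 1/delta, so a larger gamma can push the perturbed gain out of the stabilizing range.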

Taxonomy

Topics: Extremum Seeking Control Systems · Advanced Control Systems Optimization · Fault Detection and Control Systems