Data-Driven LQR with Finite-Time Experiments via Extremum-Seeking Policy Iteration
Guido Carnevale, Nicola Mimmo, Giuseppe Notarstefano

TL;DR
This paper introduces EXP-LQR, a novel iterative method for solving LQR problems that relies solely on finite-time experiments and extremum-seeking, without requiring system or cost matrix knowledge.
Contribution
The paper proposes EXP-LQR, an extremum-seeking based iterative algorithm that converges to near-optimal solutions using only finite-time cost approximations.
Findings
EXP-LQR converges exponentially to near-optimal gain matrices.
The method does not require direct knowledge of system or cost matrices.
Numerical simulations validate the theoretical convergence results.
Abstract
In this paper, we address Linear Quadratic Regulator (LQR) problems through a novel iterative algorithm named EXtremum-seeking Policy iteration LQR (EXP-LQR). The peculiarity of EXP-LQR is that it only needs access to a truncated approximation of the infinite-horizon cost associated with a given policy. Hence, EXP-LQR does not need direct knowledge of either the system or the cost matrices. In particular, at each iteration, EXP-LQR refines the maintained policy using a truncated LQR cost retrieved by performing finite-time virtual or real experiments in which a perturbed version of the current policy is employed. This perturbation follows an extremum-seeking mechanism and makes the overall algorithm a time-varying nonlinear system. By using a Lyapunov-based approach exploiting averaging theory, we show that EXP-LQR exponentially converges to an arbitrarily small…
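To make the mechanism in the abstract concrete, here is a minimal sketch on a scalar system. It is not the paper's exact update rule: the system values (a, b, q, r), the sinusoidal dither, the step size gamma, the amplitude delta, and all function names are assumptions chosen for illustration. The update loop only ever queries truncated_cost, which plays the role of one finite-time experiment with a perturbed gain; riccati_gain uses the model and is included only as a reference for comparison.

```python
import math


def truncated_cost(k, a=1.2, b=1.0, q=1.0, r=1.0, x0=1.0, horizon=30):
    """Finite-horizon approximation of the LQR cost of the policy u = -k * x.

    Stands in for one finite-time experiment. The values of a, b, q, r are
    hypothetical and are hidden from the update loop below.
    """
    x, cost = x0, 0.0
    for _ in range(horizon):
        u = -k * x
        cost += q * x * x + r * u * u
        x = a * x + b * u
    return cost


def es_policy_iteration(k0, iters=10000, gamma=1e-4, delta=0.05, omega=1.0):
    """Simplified extremum-seeking update of a scalar feedback gain.

    Each iteration evaluates the truncated cost at a sinusoidally perturbed
    gain and correlates the result with the dither. Averaged over many
    iterations, the correction approximates a gradient step on the cost.
    Assumes k0 is a stabilizing gain.
    """
    k = k0
    for j in range(iters):
        dither = math.sin(omega * j)
        cost = truncated_cost(k + delta * dither)   # one perturbed experiment
        k -= gamma * (2.0 / delta) * dither * cost  # demodulated correction
    return k


def riccati_gain(a=1.2, b=1.0, q=1.0, r=1.0, iters=500):
    """Model-based optimal gain via scalar Riccati iteration (reference only)."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)


if __name__ == "__main__":
    k_es = es_policy_iteration(k0=0.5)
    print("extremum-seeking gain:", k_es)
    print("Riccati reference gain:", riccati_gain())
```

With these assumed settings the extremum-seeking gain should settle in a small neighborhood of the Riccati gain rather than exactly on it, since the dither keeps the iterate oscillating; shrinking gamma and delta tightens that neighborhood at the price of slower convergence.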
Taxonomy
Topics: Extremum Seeking Control Systems · Advanced Control Systems Optimization · Fault Detection and Control Systems
