Policy Gradient Reinforcement Learning for Uncertain Polytopic LPV Systems based on MHE-MPC

Hossein Nejatbakhsh Esfahani; Sebastien Gros

arXiv:2206.05089 · eess.SY · June 13, 2022

TL;DR

This paper introduces a reinforcement learning-based approach to improve Model Predictive Control and Moving Horizon Estimation for uncertain polytopic LPV systems with inexact models, enhancing robustness and performance.

Contribution

It proposes a novel RL framework to jointly learn the estimator and controller for LPV systems with model uncertainties, addressing inexact scheduling parameters.

Findings

01

RL-based MHE/MPC outperforms traditional methods in uncertain LPV systems.

02

The MHE scheme estimates the unmeasured states and the convex combination vector despite model inaccuracies.

03

The proposed design is demonstrated on an illustrative example.

Abstract

In this paper, we propose a learning-based Model Predictive Control (MPC) approach for polytopic Linear Parameter-Varying (LPV) systems with inexact scheduling parameters (exogenous signals with inexact bounds), where the Linear Time-Invariant (LTI) models (vertices) captured by combinations of the scheduling parameters become inaccurate. We first adopt a Moving Horizon Estimation (MHE) scheme to simultaneously estimate the convex combination vector and the unmeasured states based on the observations and the model matching error. To tackle the inaccurate LTI models used in both the MPC and MHE schemes, we then adopt a Policy Gradient (PG) Reinforcement Learning (RL) method to learn both the estimator (MHE) and the controller (MPC) so that the best closed-loop performance is achieved. The effectiveness of the proposed RL-based MHE/MPC design is demonstrated using an illustrative example.
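
To make the terms "vertices" and "convex combination vector" concrete, the following is a minimal Python sketch, not the paper's MHE or MPC formulation. It assumes a discrete-time polytopic model of the form x+ = sum_i theta_i (A_i x + B_i u), with theta nonnegative and summing to one, and it assumes the full state is measured so that theta can be fitted to a single observed transition by projected gradient descent. The function names (`lpv_step`, `project_to_simplex`, `estimate_theta`) and the vertex matrices are hypothetical and chosen only for illustration.

```python
import numpy as np

def lpv_step(x, u, theta, A_vertices, B_vertices):
    """One step of an assumed polytopic LPV model: x+ = sum_i theta_i (A_i x + B_i u)."""
    theta = np.asarray(theta, dtype=float)
    # theta must be a convex combination vector: nonnegative entries summing to one.
    if np.any(theta < -1e-9) or abs(theta.sum() - 1.0) > 1e-9:
        raise ValueError("theta must lie on the probability simplex")
    A = sum(t * Ai for t, Ai in zip(theta, A_vertices))
    B = sum(t * Bi for t, Bi in zip(theta, B_vertices))
    return A @ x + B @ u

def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based method)."""
    v = np.asarray(v, dtype=float)
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    tau = css[rho] / (rho + 1)
    return np.maximum(v - tau, 0.0)

def estimate_theta(x, u, x_next, A_vertices, B_vertices, iters=500):
    """Fit theta to one observed transition by projected gradient descent.

    Illustration only: a one-step least-squares fit with full state
    measurement, not a moving-horizon estimator.
    """
    # Column i holds the one-step prediction of vertex model i.
    P = np.column_stack([Ai @ x + Bi @ u for Ai, Bi in zip(A_vertices, B_vertices)])
    # Step size 1/L, where L is the largest eigenvalue of P^T P.
    step = 1.0 / max(np.linalg.norm(P, 2) ** 2, 1e-12)
    theta = np.full(P.shape[1], 1.0 / P.shape[1])
    for _ in range(iters):
        grad = P.T @ (P @ theta - x_next)
        theta = project_to_simplex(theta - step * grad)
    return theta

# Hypothetical two-vertex example (matrices invented for illustration).
A_vertices = [np.diag([0.9, 0.2]), np.diag([0.2, 0.9])]
B_vertices = [np.array([[1.0], [0.0]]), np.array([[0.0], [1.0]])]
x0, u0 = np.array([1.0, 1.0]), np.array([1.0])
theta_true = np.array([0.3, 0.7])
x1 = lpv_step(x0, u0, theta_true, A_vertices, B_vertices)
theta_hat = estimate_theta(x0, u0, x1, A_vertices, B_vertices)
```

In this toy setting the vertex predictions are linearly independent, so the fitted vector is unique. The abstract's MHE instead works over a horizon of past observations and also estimates unmeasured states, which this sketch does not attempt.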

Taxonomy

Topics: Advanced Control Systems Optimization · Cardiovascular Function and Risk Factors