Data-Driven Model Predictive Control with Stability and Robustness Guarantees

Julian Berberich; Johannes Köhler; Matthias A. Müller; Frank Allgöwer

arXiv:1906.04679·eess.SY·April 19, 2021


TL;DR

This paper introduces a robust data-driven MPC method for linear systems that guarantees stability and robustness without requiring system identification, using only measured trajectories and behavioral systems theory.

Contribution

It presents the first theoretical analysis of stability and robustness guarantees for a simple, purely data-driven MPC scheme without prior system identification.

Findings

01. Proves exponential stability of the nominal scheme without noise.

02. Develops a robust scheme with practical stability under measurement noise.

03. Provides theoretical guarantees for closed-loop properties of data-driven MPC.

Abstract

We propose a robust data-driven model predictive control (MPC) scheme to control linear time-invariant (LTI) systems. The scheme uses an implicit model description based on behavioral systems theory and past measured trajectories. In particular, it does not require any prior identification step, but only an initially measured input-output trajectory as well as an upper bound on the order of the unknown system. First, we prove exponential stability of a nominal data-driven MPC scheme with terminal equality constraints in the case of no measurement noise. For bounded additive output measurement noise, we propose a robust modification of the scheme, including a slack variable with regularization in the cost. We prove that the application of this robust MPC scheme in a multi-step fashion leads to practical exponential stability of the closed loop w.r.t. the noise level. The presented…

Equations

$H_{L}$

$x_{[a,b]}=\begin{bmatrix}x_{a}\\ \vdots\\ x_{b}\end{bmatrix}.$

$x_{k+1}$

$y_{k}$

$\begin{bmatrix}H_{L}(u^{d})\\ H_{L}(y^{d})\end{bmatrix}\alpha=\begin{bmatrix}\bar{u}\\ \bar{y}\end{bmatrix}.$

$J_{L}(u_{[t-n,t-1]},y_{[t-n,t-1]},\dots$

$\begin{bmatrix}\bar{u}_{[-n,L-1]}(t)\\ \bar{y}_{[-n,L-1]}(t)\end{bmatrix}$

$\begin{bmatrix}\bar{u}_{[-n,-1]}(t)\\ \bar{y}_{[-n,-1]}(t)\end{bmatrix}$

$\ell(\bar{u},\bar{y})=\lVert\bar{u}-u^{s}\rVert_{R}^{2}+\lVert\bar{y}-y^{s}\rVert_{Q}^{2},$

$J_{L}^{*}(u_{[t-n,t-1]},\dots$

$\min_{\alpha(t),\,\bar{u}(t),\,\bar{y}(t)}\ \dots\quad\text{s.t.}$

$\begin{bmatrix}\bar{u}_{[-n,-1]}(t)\\ \bar{y}_{[-n,-1]}(t)\end{bmatrix}=\begin{bmatrix}u_{[t-n,t-1]}\\ y_{[t-n,t-1]}\end{bmatrix},$

$\begin{bmatrix}\bar{u}_{[L-n,L-1]}(t)\\ \bar{y}_{[L-n,L-1]}(t)\end{bmatrix}=\begin{bmatrix}u_{n}^{s}\\ y_{n}^{s}\end{bmatrix},$

$\bar{u}_{k}(t)\in\mathbb{U},\ \bar{y}_{k}(t)\in\mathbb{Y},\ k\in\mathbb{I}_{[0,L-1]}.$

$J_{L}(x_{t+1},\alpha^{\prime}(t+1))=\sum_{k=0}^{L-1}\ell(\bar{u}_{k}^{\prime}(t+1),\bar{y}_{k}^{\prime}(t+1))=\sum_{k=1}^{L-1}\ell(\bar{u}_{k}^{*}(t),\bar{y}_{k}^{*}(t))=J_{L}^{*}(x_{t})-\ell(\bar{u}_{0}^{*}(t),\bar{y}_{0}^{*}(t)).$

$J_{L}^{*}(x_{t+1})\leq J_{L}^{*}(x_{t})-\ell(\bar{u}_{0}^{*}(t),\bar{y}_{0}^{*}(t)).$

$W(Ax+Bu)-W(x)\leq-\frac{1}{2}\lVert x\rVert_{2}^{2}+c_{1}\lVert u\rVert_{2}^{2}+c_{2}\lVert y\rVert_{2}^{2},$

$V(x)=J_{L}^{*}(x)+\gamma W(x)\leq(c_{u}+\gamma\lambda_{\max}(P))\lVert x\rVert_{2}^{2},$

$\gamma=\frac{\lambda_{\min}(Q,R)}{\max\{c_{1},c_{2}\}}>0.$

$V(x_{t+1})-V(x_{t})\leq\ \dots\ -\lVert u_{t}\rVert_{R}^{2}-\lVert y_{t}\rVert_{Q}^{2}\leq\ \dots$

$J_{L}^{*}$

$\min_{\alpha(t),\,\sigma(t),\,\bar{u}(t),\,\bar{y}(t)}\ \dots\quad\text{s.t.}$

$\begin{bmatrix}\bar{u}_{[-n,-1]}(t)\\ \bar{y}_{[-n,-1]}(t)\end{bmatrix}=\begin{bmatrix}u_{[t-n,t-1]}\\ \tilde{y}_{[t-n,t-1]}\end{bmatrix},$

$\begin{bmatrix}\bar{u}_{[L-n,L-1]}(t)\\ \bar{y}_{[L-n,L-1]}(t)\end{bmatrix}=\begin{bmatrix}u_{n}^{s}\\ y_{n}^{s}\end{bmatrix},\quad\bar{u}_{k}(t)\in\mathbb{U},$

$\lVert\sigma_{k}(t)\rVert_{\infty}\leq\bar{\varepsilon}(1+\lVert\alpha(t)\rVert_{1}),\ k\in\mathbb{I}_{[0,L-1]}.$

$H_{ux}=\begin{bmatrix}H_{L+n}(u^{d})\\ H_{1}(x_{[0,N-L-n]}^{d})\end{bmatrix}$

$c_{pe}:=\lVert H_{ux}^{\dagger}\rVert_{2}^{2}.$

$\rho I_{m(L+n)}\preceq UU^{\top}\preceq\nu I_{m(L+n)}$

$c_{pe}^{u}\ \dots\ =\lambda_{\max}(UU^{\top})\cdot\lambda_{\max}\left((UU^{\top})^{-1}(UU^{\top})^{-1}\right)\leq\frac{\lambda_{\max}(UU^{\top})}{\lambda_{\min}(UU^{\top})^{2}}\leq\frac{\nu}{\rho^{2}}.$

$\xi_{t}:=\begin{bmatrix}u_{[t-n,t-1]}\\ y_{[t-n,t-1]}\end{bmatrix}.$

$\tilde{\xi}_{t}:=\begin{bmatrix}u_{[t-n,t-1]}\\ \tilde{y}_{[t-n,t-1]}\end{bmatrix}=\begin{bmatrix}u_{[t-n,t-1]}\\ y_{[t-n,t-1]}+\varepsilon_{[t-n,t-1]}\end{bmatrix}.$

$\gamma\lambda_{\min}(P)\lVert\xi_{t}\rVert_{2}^{2}\leq V_{t}\leq c_{3}\lVert\xi_{t}\rVert_{2}^{2}+c_{4},$


Full text

Data-Driven Model Predictive Control with Stability and Robustness Guarantees

Julian Berberich¹, Johannes Köhler¹, Matthias A. Müller², and Frank Allgöwer¹

This work was funded by Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy - EXC 2075 - 390740016. The authors thank the International Max Planck Research School for Intelligent Systems (IMPRS-IS) for supporting Julian Berberich, and the International Research Training Group Soft Tissue Robotics (GRK 2198/1).

¹ Julian Berberich, Johannes Köhler, and Frank Allgöwer are with the Institute for Systems Theory and Automatic Control, University of Stuttgart, 70550 Stuttgart, Germany (email: {julian.berberich, johannes.koehler, frank.allgower}@ist.uni-stuttgart.de).

² Matthias A. Müller is with the Leibniz University Hannover, Institute of Automatic Control, 30167 Hannover, Germany (e-mail: [email protected]).

Abstract

We propose a robust data-driven model predictive control (MPC) scheme to control linear time-invariant (LTI) systems. The scheme uses an implicit model description based on behavioral systems theory and past measured trajectories. In particular, it does not require any prior identification step, but only an initially measured input-output trajectory as well as an upper bound on the order of the unknown system. First, we prove exponential stability of a nominal data-driven MPC scheme with terminal equality constraints in the case of no measurement noise. For bounded additive output measurement noise, we propose a robust modification of the scheme, including a slack variable with regularization in the cost. We prove that the application of this robust MPC scheme in a multi-step fashion leads to practical exponential stability of the closed loop w.r.t. the noise level. The presented results provide the first (theoretical) analysis of closed-loop properties, resulting from a simple, purely data-driven MPC scheme.

Index Terms:

Predictive control for linear systems, data-driven control, uncertain systems, robust control.


©2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

I Introduction

While data-driven methods for system analysis and control have become increasingly popular in recent years, only a few such methods give theoretical guarantees on, e.g., stability or constraint satisfaction of system variables [1, 2]. A control method that is naturally well-suited for achieving these objectives is model predictive control (MPC), which can handle nonlinear system dynamics and hard constraints on input, state, and output, and which takes performance criteria into account [3]. It centers around the repeated online solution of an optimization problem over predicted future system trajectories. Thus, for the implementation of MPC, a model of the plant is required, which is usually obtained from first principles or from measured data via system identification [4]. An appealing alternative is to implement an MPC controller directly from measured data, without prior knowledge of an accurate model. In various recent works, learning-based or adaptive MPC schemes have been proposed, which improve an inaccurate initial model using online measurements [5, 6, 7, 8, 9], while giving guarantees on the resulting closed loop. Similarly, MPC based on Gaussian Processes has received increasing attention [10], but proving desirable closed-loop properties remains an open issue. A different approach, which uses linear combinations of past trajectories to predict future trajectories, has been presented in [11], but no guarantees on, e.g., stability of the closed loop were given there either. The design of purely data-driven MPC approaches with guarantees on stability and constraint satisfaction thus remains an open problem.

In this paper, we present a novel data-driven MPC scheme to control linear time-invariant (LTI) systems with stability and robustness guarantees for the closed loop. Our approach relies on a result from behavioral systems theory, which shows that the Hankel matrix consisting of a previously measured input-output trajectory spans the vector space of all trajectories of an LTI system, given that the input component is persistently exciting [12]. Although this result has found various applications in the field of system identification [13, 14, 15], it has only recently been used to develop data-driven methods for system analysis and control with theoretical guarantees. An exposition of the main result of [12] in the classical state-space control framework and an extension to certain classes of nonlinear systems are provided in [16]. Further, the result is employed in [17] to design state- and output-feedback controllers and in [18] to verify dissipation inequalities from measured data, whereas [19] investigates data-driven control without requiring persistently exciting data.

Moreover, the recent contributions [20, 21, 22] set up an MPC scheme based on [12], but no guarantees on recursive feasibility or closed-loop stability can be given since neither terminal ingredients are included in the MPC scheme nor sufficient lower bounds on the prediction horizon are derived. In the present paper, we propose a related MPC scheme, which utilizes terminal equality constraints, and we provide a theoretical analysis of various desirable properties of the closed loop. To the best of our knowledge, this is the first analysis regarding recursive feasibility and stability of purely data-driven MPC. The main advantage of the proposed MPC scheme over existing adaptive or learning-based methods such as [5, 6, 7, 8, 9] is that it requires only an initially measured, persistently exciting data trajectory as well as an upper bound on the system order, but no (set-based) model description and no online estimation process. Moreover, since it relies on the data-driven system description from [12], the presented scheme is inherently an output-feedback MPC scheme and does not require online state measurements.

After stating the required definitions and existing results in Section II, we expand the nominal MPC scheme of [20, 21] by terminal equality constraints in Section III. Under the assumption that the output of the plant can be measured exactly, we prove recursive feasibility, constraint satisfaction, and exponential stability of the scheme. In Section IV, we propose a robust data-driven MPC scheme to account for bounded additive noise in both the initial data for prediction and the online measurements. Under suitable assumptions on the system and design parameters, we prove that applying the scheme in a multi-step fashion leads to a practically exponentially stable closed loop. In Section V, we illustrate the advantages of the proposed scheme over the scheme without terminal constraints from [20, 21, 22] by means of a numerical example. The paper is concluded in Section VI.

II Preliminaries

Let $\mathbb{I}_{[a,b]}$ denote the set of integers in the interval $[a,b]$ . For a vector $x$ and a positive definite matrix $P=P^{\top}\succ 0$ , we write $\lVert x\rVert_{P}=\sqrt{x^{\top}Px}$ . Further, we denote the minimal and maximal eigenvalue of $P$ by $\lambda_{\min}(P)$ and $\lambda_{\max}(P)$ , respectively. For two matrices $P_{1}=P_{1}^{\top},P_{2}=P_{2}^{\top}$ , we write $\lambda_{\min}(P_{1},P_{2})=\min\{\lambda_{\min}(P_{1}),\lambda_{\min}(P_{2})\}$ , and similarly for $\lambda_{\max}(P_{1},P_{2})$ . Moreover, $\lVert x\rVert_{2}$ , $\lVert x\rVert_{1}$ , and $\lVert x\rVert_{\infty}$ denote the Euclidean, $\ell_{1}$ -, and $\ell_{\infty}$ -norm of $x$ , respectively. If the argument is matrix-valued, then we mean the corresponding induced norm. For $\delta>0$ , we define $\mathbb{B}_{\delta}=\left\{x\in\mathbb{R}^{n}\mid\lVert x\rVert_{2}\leq\delta\right\}$ . A sequence $\{x_{k}\}_{k=0}^{N-1}$ induces the Hankel matrix

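$$H_{L}(x)=\begin{bmatrix}x_{0}&x_{1}&\dots&x_{N-L}\\ x_{1}&x_{2}&\dots&x_{N-L+1}\\ \vdots&\vdots&\ddots&\vdots\\ x_{L-1}&x_{L}&\dots&x_{N-1}\end{bmatrix}.$$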

For a stacked window of the sequence, we write

$$x_{[a,b]}=\begin{bmatrix}x_{a}\\ \vdots\\ x_{b}\end{bmatrix}.$$

We denote by $x$ either the sequence itself or the stacked vector $x_{[0,N-1]}$ containing all of its components. We consider the following standard definition of persistence of excitation.

Definition 1.

We say that a sequence $\{u_{k}\}_{k=0}^{N-1}$ with $u_{k}\in\mathbb{R}^{m}$ is persistently exciting of order $L$ if $\text{rank}(H_{L}(u))=mL$ .
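As a concrete illustration of the Hankel matrix and of Definition 1, the following minimal NumPy sketch builds $H_{L}(x)$ column by column and checks the rank condition. The function names `hankel` and `is_persistently_exciting` are illustrative choices, not notation from the paper, and the rank test relies on NumPy's default numerical tolerance.

```python
import numpy as np

def hankel(x, L):
    """Return H_L(x) for a sequence x of shape (N,) or (N, d); result is (d*L, N-L+1)."""
    x = np.asarray(x, dtype=float)
    x = x.reshape(x.shape[0], -1)          # treat scalar samples as 1-dimensional vectors
    N = x.shape[0]
    # Column j stacks the window x_j, ..., x_{j+L-1}.
    return np.column_stack([x[j:j + L].ravel() for j in range(N - L + 1)])

def is_persistently_exciting(u, L):
    """Definition 1: rank(H_L(u)) == m * L (numerical rank)."""
    u = np.asarray(u, dtype=float)
    m = u.reshape(u.shape[0], -1).shape[1]
    return bool(np.linalg.matrix_rank(hankel(u, L)) == m * L)

rng = np.random.default_rng(0)
u_random = rng.standard_normal(20)   # generic scalar input sequence
u_const = np.ones(20)                # constant input: all Hankel rows coincide

print(hankel(np.arange(5), 2))
print(is_persistently_exciting(u_random, 4))
print(is_persistently_exciting(u_const, 2))
```

A constant input fails the test already for order 2, since every row of its Hankel matrix is identical, whereas a generic random sequence of sufficient length passes.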

Our goal is to control an unknown LTI system, denoted by $G$ , of order $n$ with $m$ inputs and $p$ outputs, using only measured input-output data.

Definition 2.

We say that an input-output sequence $\{u_{k},y_{k}\}_{k=0}^{N-1}$ is a trajectory of an LTI system $G$ , if there exists an initial condition $\bar{x}\in\mathbb{R}^{n}$ as well as a state sequence $\{x_{k}\}_{k=0}^{N}$ such that

$$x_{k+1}=Ax_{k}+Bu_{k},\quad x_{0}=\bar{x},$$

$$y_{k}=Cx_{k}+Du_{k},$$

for $k=0,\dots,N-1$ , where $(A,B,C,D)$ is a minimal realization of $G$ .

Note that we define a trajectory of an LTI system as an input-output sequence that can be produced by a minimal realization, entailing controllability and observability of the system. Extending the results of this paper to systems whose input-output behavior cannot be explained via a minimal realization is an interesting issue for future research. The following result lays the foundation of the present paper. It shows that a Hankel matrix, involving a single persistently exciting trajectory, spans the vector space of all system trajectories of an LTI system. The result originates from behavioral systems theory [12], but we employ the formulation in the classical state-space control framework [16].

Theorem 1 ([16]).

Suppose $\{u_{k}^{d},y_{k}^{d}\}_{k=0}^{N-1}$ is a trajectory of an LTI system $G$ , where $u^{d}$ is persistently exciting of order $L+n$ . Then, $\{\bar{u}_{k},\bar{y}_{k}\}_{k=0}^{L-1}$ is a trajectory of $G$ if and only if there exists $\alpha\in\mathbb{R}^{N-L+1}$ such that

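$$\begin{bmatrix}H_{L}(u^{d})\\ H_{L}(y^{d})\end{bmatrix}\alpha=\begin{bmatrix}\bar{u}\\ \bar{y}\end{bmatrix}.\tag{1}$$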

Recently, Theorem 1 has received increasing attention as a tool to develop data-driven controllers [17], to verify dissipativity [18], and to design MPC schemes [20, 21, 22]. This is due to the fact that (1) provides an appealing data-driven characterization of all trajectories of the unknown LTI system, without requiring any prior identification step. In this paper, we use Theorem 1 to develop a data-driven MPC scheme with provable stability guarantees despite noisy measurements. Note that, if a sequence is persistently exciting of order $L$, then it is also persistently exciting of order $\tilde{L}$ for any $\tilde{L}\leq L$. Therefore, Theorem 1 and hence all of our results hold true if $n$ is replaced by a (potentially rough) upper bound.
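Theorem 1 can be checked numerically on a small example. The sketch below is illustrative only: the second-order system matrices, the data length, and the helper names `simulate` and `residual` are assumptions made for this example and do not come from the paper. It tests whether a candidate input-output sequence lies in the column space of the stacked Hankel matrices by solving (1) in the least-squares sense.

```python
import numpy as np

def hankel(x, L):
    x = np.asarray(x, dtype=float)
    x = x.reshape(x.shape[0], -1)
    return np.column_stack([x[j:j + L].ravel() for j in range(x.shape[0] - L + 1)])

def simulate(A, B, C, D, x0, u):
    """Outputs y_0..y_{N-1} of x_{k+1} = A x_k + B u_k, y_k = C x_k + D u_k."""
    x, ys = np.asarray(x0, dtype=float), []
    for uk in u:
        ys.append(C @ x + D @ uk)
        x = A @ x + B @ uk
    return np.array(ys)

# Hypothetical controllable and observable system (n = 2, m = p = 1).
A = np.array([[0.9, 0.2], [0.0, 0.7]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.zeros((1, 1))
n, m, L, N = 2, 1, 6, 40

rng = np.random.default_rng(1)
u_d = rng.standard_normal((N, m))             # generic input, PE of order L + n
y_d = simulate(A, B, C, D, np.zeros(n), u_d)
H = np.vstack([hankel(u_d, L), hankel(y_d, L)])

def residual(u, y):
    """Distance of the stacked vector (u, y) from the column space of H, cf. (1)."""
    target = np.concatenate([u.ravel(), y.ravel()])
    alpha = np.linalg.lstsq(H, target, rcond=None)[0]
    return np.linalg.norm(H @ alpha - target)

u_new = rng.standard_normal((L, m))
y_new = simulate(A, B, C, D, np.array([1.0, -2.0]), u_new)   # a true trajectory
y_fake = y_new.copy()
y_fake[-1] += 1.0                                             # no longer a trajectory

res_true, res_fake = residual(u_new, y_new), residual(u_new, y_fake)
print(res_true, res_fake)
```

The residual for the genuine trajectory is at the level of floating-point error, while the perturbed sequence cannot be reproduced by any $\alpha$, which is the "if and only if" statement of the theorem in numerical form.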

Although we assume that only input-output data of the unknown system are available, we make extensive use of the fact that an input-output trajectory of length greater than or equal to $n$ induces a unique internal state in some minimal realization of the unknown system. We employ MPC to stabilize a desired equilibrium of the system. Since a model of this system is not available, we define an equilibrium via input-output pairs.

Definition 3.

We say that an input-output pair $(u^{s},y^{s})\in\mathbb{R}^{m+p}$ is an equilibrium of an LTI system $G$ , if the sequence $\{\bar{u}_{k},\bar{y}_{k}\}_{k=0}^{n}$ with $(\bar{u}_{k},\bar{y}_{k})=(u^{s},y^{s})$ for all $k\in\mathbb{I}_{[0,n]}$ is a trajectory of $G$ .

For an equilibrium $(u^{s},y^{s})$ , we define $u^{s}_{n}$ and $y^{s}_{n}$ as the column vectors containing $n$ times $u^{s}$ and $y^{s}$ , respectively. We assume that the system is subject to pointwise-in-time input and output constraints, i.e., $u_{t}\in\mathbb{U}\subseteq\mathbb{R}^{m}$ , $y_{t}\in\mathbb{Y}\subseteq\mathbb{R}^{p}$ for all $t\geq 0$ , and we assume $(u^{s},y^{s})\in\text{int}(\mathbb{U}\times\mathbb{Y})$ . Throughout this paper, $\left\{u_{k}^{d},y_{k}^{d}\right\}_{k=0}^{N-1}$ denotes an a priori measured data trajectory of length $N$ , which is used for prediction as in (1). The predicted input- and output-trajectories at time $t$ over some prediction horizon $L$ are written as $\left\{\bar{u}_{k}(t),\bar{y}_{k}(t)\right\}_{k=-n}^{L-1}$ . Note that the time indices start at $k=-n$ , since the last $n$ inputs and outputs will be used to invoke a unique initial state at time $t$ . Further, the closed-loop input, the state in some minimal realization, and the output at time $t$ are denoted by $u_{t}$ , $x_{t}$ , and $y_{t}$ , respectively.

III Nominal data-driven MPC

In this section, we propose a simple, nominal data-driven MPC scheme with terminal equality constraints. The scheme relies on noise-free measurements to predict future trajectories using Theorem 1 and is described in Section III-A. Under mild assumptions, we prove recursive feasibility, constraint satisfaction, and exponential stability of the closed loop in Section III-B.

III-A Nominal MPC scheme

Commonly, MPC relies on a model of the plant to predict future trajectories and to optimize over them. Theorem 1 provides an appealing alternative to a model since (1) suffices to capture all system trajectories. Thus, to implement a data-driven MPC scheme, one can simply replace the system dynamics constraint by the constraint that the predicted input-output trajectories satisfy (1). To be more precise, the proposed data-driven MPC scheme minimizes, at time $t$ , given the last $n$ input-output pairs, the following open-loop cost

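$$J_{L}(u_{[t-n,t-1]},y_{[t-n,t-1]},\alpha(t))=\sum_{k=0}^{L-1}\ell\left(\bar{u}_{k}(t),\bar{y}_{k}(t)\right),\tag{2a}$$

$$\begin{bmatrix}\bar{u}_{[-n,L-1]}(t)\\ \bar{y}_{[-n,L-1]}(t)\end{bmatrix}=\begin{bmatrix}H_{L+n}(u^{d})\\ H_{L+n}(y^{d})\end{bmatrix}\alpha(t),\tag{2b}$$

$$\begin{bmatrix}\bar{u}_{[-n,-1]}(t)\\ \bar{y}_{[-n,-1]}(t)\end{bmatrix}=\begin{bmatrix}u_{[t-n,t-1]}\\ y_{[t-n,t-1]}\end{bmatrix}.\tag{2c}$$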

As described above, the constraint (2b) replaces the system dynamics compared to classical model-based MPC schemes. Further, (2c) ensures that the internal state of the true trajectory aligns with the internal state of the predicted trajectory at time $t$ . Note that the overall length of the trajectory $(\bar{u}(t),\bar{y}(t))$ is $L+n$ since the past $n$ elements $\{\bar{u}_{k}(t),\bar{y}_{k}(t)\}_{k=-n}^{-1}$ are used to specify the initial conditions in (2c). These initial conditions are specified until time step $t-1$ , since the input at time $t$ might already influence the output at time $t$ , in case of a feedthrough-element of the plant. The open-loop cost depends only on the decision variable $\alpha(t)$ , since $\bar{u}(t)$ and $\bar{y}(t)$ are fixed implicitly through the dynamic constraint (2b). Throughout the paper, we consider quadratic stage costs, which penalize the distance w.r.t. a desired equilibrium $(u^{s},y^{s})$ , i.e.,

$$\ell(\bar{u},\bar{y})=\lVert\bar{u}-u^{s}\rVert_{R}^{2}+\lVert\bar{y}-y^{s}\rVert_{Q}^{2},$$

where $Q,R\succ 0$ . In [20, 21], it was suggested to directly minimize the above open-loop cost subject to constraints on input and output. It is well-known that MPC without terminal constraints requires a sufficiently long prediction horizon to ensure stability and constraint satisfaction [23, 24]. Without such an assumption, the application of MPC can even destabilize an open-loop stable system. There are two main approaches in the literature to guarantee stability: a) providing bounds on the minimal required prediction horizon [24] and b) including terminal ingredients such as terminal cost functions or terminal region constraints [25]. Both approaches are usually based on model knowledge and thus, it is not straightforward to use them in the present, purely data-driven setting.

In this paper, we consider a simple terminal equality constraint, which can be directly included into the data-driven MPC framework, and which guarantees exponential stability of the closed loop. To this end, we propose the following data-driven MPC scheme with a terminal equality constraint.

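$$J_{L}^{*}(u_{[t-n,t-1]},y_{[t-n,t-1]})=\min_{\alpha(t),\,\bar{u}(t),\,\bar{y}(t)}\ \sum_{k=0}^{L-1}\ell\left(\bar{u}_{k}(t),\bar{y}_{k}(t)\right)\tag{3a}$$

$$\text{s.t.}\quad\begin{bmatrix}\bar{u}_{[-n,L-1]}(t)\\ \bar{y}_{[-n,L-1]}(t)\end{bmatrix}=\begin{bmatrix}H_{L+n}(u^{d})\\ H_{L+n}(y^{d})\end{bmatrix}\alpha(t),\tag{3b}$$

$$\begin{bmatrix}\bar{u}_{[-n,-1]}(t)\\ \bar{y}_{[-n,-1]}(t)\end{bmatrix}=\begin{bmatrix}u_{[t-n,t-1]}\\ y_{[t-n,t-1]}\end{bmatrix},\tag{3c}$$

$$\begin{bmatrix}\bar{u}_{[L-n,L-1]}(t)\\ \bar{y}_{[L-n,L-1]}(t)\end{bmatrix}=\begin{bmatrix}u_{n}^{s}\\ y_{n}^{s}\end{bmatrix},\tag{3d}$$

$$\bar{u}_{k}(t)\in\mathbb{U},\ \bar{y}_{k}(t)\in\mathbb{Y},\ k\in\mathbb{I}_{[0,L-1]}.\tag{3e}$$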

The terminal equality constraint (3d) implies that $\bar{x}_{L}(t)$ , which is the internal state predicted $L$ steps ahead corresponding to the predicted input-output trajectory, aligns with the steady-state $x^{s}$ corresponding to $(u^{s},y^{s})$ , i.e., $\bar{x}_{L}(t)=x^{s}$ in any minimal realization. While Problem (3) requires that $(u^{s},y^{s})$ is an equilibrium of the unknown system in the sense of Definition 3, this requirement can be dropped when $(u^{s},y^{s})$ is replaced by an artificial equilibrium, which is also optimized online (compare [26]). The recent paper [27] extends the above MPC scheme to such a setting, thereby leading to a significantly larger region of attraction for the closed loop without requiring knowledge of a reachable equilibrium of the unknown system. As in standard MPC, Problem (3) is solved in a receding horizon fashion, which is summarized in Algorithm 1.

With slight abuse of notation, we will denote the open-loop cost and the optimal open-loop cost of (3) by $J_{L}(x_{t},\alpha(t))$ and $J_{L}^{*}(x_{t})$ , respectively, where $x_{t}$ is the state in some minimal realization, induced by $u_{[t-n,t-1]}$ , $y_{[t-n,t-1]}$ .
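The receding-horizon loop described above (measure the last $n$ input-output samples, solve Problem (3), apply the first predicted input, repeat) can be sketched in a few lines of NumPy under simplifying assumptions that are not part of the paper: a scalar plant ($m=p=n=1$), $u^{s}=y^{s}=0$, and no inequality constraints ($\mathbb{U}=\mathbb{Y}=\mathbb{R}$), so that (3) reduces to an equality-constrained least-squares problem in $\alpha(t)$. The plant parameters and the null-space solution method are illustrative choices; with polytopic constraints a QP solver would be needed instead.

```python
import numpy as np

def hankel(x, L):
    x = np.asarray(x, dtype=float)
    x = x.reshape(x.shape[0], -1)
    return np.column_stack([x[j:j + L].ravel() for j in range(x.shape[0] - L + 1)])

def solve_nominal(Hu, Hy, u_past, y_past, n, L, Q=1.0, R=1.0):
    """Problem (3) for m = p = 1, u^s = y^s = 0, and no inequality constraints."""
    # Initial condition (3c) and terminal constraint (3d) as A_eq @ alpha = b_eq.
    A_eq = np.vstack([Hu[:n], Hy[:n], Hu[L:], Hy[L:]])
    b_eq = np.concatenate([u_past, y_past, np.zeros(2 * n)])
    # Stage costs over k = 0..L-1 written as ||M @ alpha||^2, using (3b).
    M = np.vstack([np.sqrt(R) * Hu[n:], np.sqrt(Q) * Hy[n:]])
    # Null-space method: alpha = alpha_p + Z @ z with A_eq @ Z = 0.
    alpha_p = np.linalg.lstsq(A_eq, b_eq, rcond=None)[0]
    _, s, Vt = np.linalg.svd(A_eq)
    Z = Vt[int(np.sum(s > 1e-9 * s[0])):].T
    z = np.linalg.lstsq(M @ Z, -M @ alpha_p, rcond=None)[0]
    return alpha_p + Z @ z

# Hypothetical plant, used only to generate data and to close the loop.
a, b = 1.2, 1.0                  # x_{k+1} = a x_k + b u_k, y_k = x_k (open-loop unstable)
n, L, N = 1, 4, 20

rng = np.random.default_rng(0)
u_d = rng.standard_normal(N)     # generic input, PE of order L + 2n
y_d = np.zeros(N)
x = 0.0
for k in range(N):
    y_d[k] = x
    x = a * x + b * u_d[k]
Hu, Hy = hankel(u_d, L + n), hankel(y_d, L + n)

# Receding horizon: use the past n samples, solve (3), apply the first input.
u_past, y_past = np.array([0.0]), np.array([1.0])    # u_{-1}, y_{-1}
x = a * y_past[0] + b * u_past[0]                    # plant state at t = 0
y_log = []
for t in range(20):
    alpha = solve_nominal(Hu, Hy, u_past, y_past, n, L)
    u_t = float(Hu[n] @ alpha)   # first predicted input (row n corresponds to k = 0)
    y_t = x                      # noise-free output measurement
    x = a * x + b * u_t
    u_past, y_past = np.array([u_t]), np.array([y_t])
    y_log.append(y_t)
print(abs(y_log[0]), abs(y_log[-1]))
```

The controller never sees `a` or `b`; they appear only in the simulated plant. Because $\alpha(t)$ is not unique, the sketch picks one minimizer via least squares, which does not affect the predicted trajectory $(\bar{u}(t),\bar{y}(t))$.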

III-B Closed-loop guarantees

Without loss of generality, we assume for the analysis that $u^{s}=0$, $y^{s}=0$, and thus $x^{s}=0$. Further, we define the set of initial states, for which (3) is feasible, by $\mathbb{X}_{L}=\left\{x\in\mathbb{R}^{n}\mid J_{L}^{*}(x)<\infty\right\}$. To prove exponential stability of the proposed scheme, we assume that the optimal value function of (3) is quadratically upper bounded. This is, e.g., satisfied in the present linear-quadratic setting if the constraints are polytopic¹ [28].

¹ While [28] considered model-based linear-quadratic MPC, the result applies similarly to the present data-driven MPC setting since (3b) (together with the initial conditions (3c)) describes the input-output behavior of the system exactly and thus, both settings are equivalent in the nominal case.

Assumption 1.

The optimal value function $J_{L}^{*}(x)$ is quadratically upper bounded on $\mathbb{X}_{L}$ , i.e., there exists $c_{u}>0$ such that $J_{L}^{*}(x)\leq c_{u}\lVert x\rVert_{2}^{2}$ for all $x\in\mathbb{X}_{L}$ .

Moreover, we assume that the input $u^{d}$ generating the data used for prediction is sufficiently rich in the following sense.

Assumption 2.

The input $u^{d}$ of the data trajectory is persistently exciting of order $L+2n$ .

Note that we assume persistence of excitation of order $L+2n$ , although Theorem 1 requires only an order of $L+n$ . This is due to the fact that the reconstructed trajectories in (3) are of length $L+n$ (compared to length $L$ in Theorem 1), since $n$ components are used to fix the initial conditions. Furthermore, due to the terminal constraints (3d), the prediction horizon needs to be at least as long as the system order $n$ .

Assumption 3.

The prediction horizon satisfies $L\geq n$ .

The following result shows that the MPC scheme based on (3) is recursively feasible, ensures constraint satisfaction, and leads to an exponentially stable closed loop.

Theorem 2.

Suppose Assumptions 1, 2 and 3 are satisfied. If the MPC problem (3) is feasible at initial time $t=0$ , then

(i) it is feasible at any $t\in\mathbb{N}$,

(ii) the closed loop satisfies the constraints, i.e., $u_{t}\in\mathbb{U}$ and $y_{t}\in\mathbb{Y}$ for all $t\in\mathbb{N}$,

(iii) the equilibrium $x^{s}=0$ is exponentially stable for the resulting closed loop.

Proof.

Recursive feasibility (i) and constraint satisfaction (ii) follow from standard MPC arguments, i.e., by defining a candidate solution as the shifted, previously optimal solution and appending zero (compare [3]).

(iii) Exponential stability.

Denote the standard candidate solution mentioned above by $\bar{u}^{\prime}(t+1),\bar{y}^{\prime}(t+1),\alpha^{\prime}(t+1)$. The cost of this solution is

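$$J_{L}(x_{t+1},\alpha^{\prime}(t+1))=\sum_{k=0}^{L-1}\ell\left(\bar{u}_{k}^{\prime}(t+1),\bar{y}_{k}^{\prime}(t+1)\right)=\sum_{k=1}^{L-1}\ell\left(\bar{u}_{k}^{*}(t),\bar{y}_{k}^{*}(t)\right)=J_{L}^{*}(x_{t})-\ell\left(\bar{u}_{0}^{*}(t),\bar{y}_{0}^{*}(t)\right).$$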

Hence, it holds that

$$J_{L}^{*}(x_{t+1})\leq J_{L}^{*}(x_{t})-\ell\left(\bar{u}_{0}^{*}(t),\bar{y}_{0}^{*}(t)\right).\tag{4}$$

Since $x$ is the state of an observable (and hence detectable) minimal realization, there exists a matrix $P\succ 0$ such that $W(x)=\lVert x\rVert_{P}^{2}$ is an input-output-to-state stability (IOSS) Lyapunov function², which satisfies

² Note that, in [29, Section 3.2], only strictly proper systems with $y=Cx$ are considered, while we allow for more general systems with $y=Cx+Du$. The result from [29] can be extended to $y=Cx+Du$ by considering a modified $\tilde{B}=B+LD$ in [29, Inequality (12)].

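$$W(Ax+Bu)-W(x)\leq-\frac{1}{2}\lVert x\rVert_{2}^{2}+c_{1}\lVert u\rVert_{2}^{2}+c_{2}\lVert y\rVert_{2}^{2},\tag{5}$$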

for all $x\in\mathbb{R}^{n},u\in\mathbb{R}^{m},y=Cx+Du$ , and for suitable $c_{1},c_{2}>0$ [29]. Define the candidate Lyapunov function $V(x)=\gamma W(x)+J_{L}^{*}(x)$ for some $\gamma>0$ . Note that $V$ is quadratically lower bounded, i.e., $V(x)\geq\gamma W(x)\geq\gamma\lambda_{\min}(P)\lVert x\rVert_{2}^{2}$ for all $x\in\mathbb{X}_{L}$ . Further, $J_{L}^{*}$ is quadratically upper bounded by Assumption 1, i.e., $J_{L}^{*}(x)\leq c_{u}\lVert x\rVert_{2}^{2}$ for all $x\in\mathbb{X}_{L}$ . Hence, we have

$$V(x)=J_{L}^{*}(x)+\gamma W(x)\leq\left(c_{u}+\gamma\lambda_{\max}(P)\right)\lVert x\rVert_{2}^{2},$$

for all $x\in\mathbb{X}_{L}$ , i.e., $V$ is quadratically upper bounded. We consider now

$$\gamma=\frac{\lambda_{\min}(Q,R)}{\max\{c_{1},c_{2}\}}>0.$$

Along the closed-loop trajectories, using both (4) as well as (5), it holds that

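$$V(x_{t+1})-V(x_{t})\leq\gamma\left(-\frac{1}{2}\lVert x_{t}\rVert_{2}^{2}+c_{1}\lVert u_{t}\rVert_{2}^{2}+c_{2}\lVert y_{t}\rVert_{2}^{2}\right)-\lVert u_{t}\rVert_{R}^{2}-\lVert y_{t}\rVert_{Q}^{2}\leq-\frac{\gamma}{2}\lVert x_{t}\rVert_{2}^{2}.$$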

It follows from standard Lyapunov arguments with Lyapunov function $V$ that the equilibrium $x^{s}=0$ is exponentially stable with region of attraction $\mathbb{X}_{L}$ . ∎

The proof of Theorem 2 applies standard arguments from model-based MPC with terminal constraints (compare [3]) to the data-driven system description derived in [12], similar to the approaches of [20, 21], which, however, did not address closed-loop guarantees. To handle the fact that the stage cost $\ell$ is merely positive semi-definite in the state, detectability of the stage cost is exploited via an IOSS Lyapunov function [29], similar to [30]. As we will see in Section IV, this analogy between model-based MPC and the proposed data-driven MPC scheme is only present in the nominal case, where the data is noise-free. For the more realistic case of noisy output measurements, we develop a robust data-driven MPC scheme and we provide a novel theoretical analysis of the closed loop in Section IV, which is the main contribution of this paper.

Remark 1.

We would like to emphasize the simplicity of the proposed MPC scheme. Without any prior identification step, a single measured data trajectory can be used directly to set up an MPC scheme for a linear system. Compared to other learning-based MPC approaches such as [5, 6, 7, 8, 9], which require initial model knowledge as well as an online estimation process, the complexity of (3) is similar to classical MPC schemes, which rely on full model knowledge. To be more precise, the decision variables $\bar{u}(t),\bar{y}(t)$ can be replaced by $\alpha(t)$ via (3b) (using a condensed formulation) and hence, since $\alpha(t)\in\mathbb{R}^{N-L-n+1}$ , Problem (3) contains in total $N-L-n+1$ decision variables. For $u^{d}$ to be persistently exciting of order $L+2n$ , it needs to hold that $N-L-2n+1\geq m(L+2n)$ . Assuming equality, Problem (3) hence has $m(L+2n)+n$ free parameters. On the contrary, a condensed model-based MPC optimization problem contains $mL$ decision variables for the input trajectory (assuming that state measurements are available). Thus, the online complexity of the proposed data-driven MPC approach is slightly larger ( $2mn+n$ additional decision variables) than that of model-based MPC, but it does not require an a priori (offline) identification step. It is worth noting that the difference in complexity is independent of the horizon $L$ . Moreover, the proposed data-driven MPC is inherently an output-feedback controller since no state measurements are required for its implementation. Finally, as in model-based MPC, for convex polytopic (or quadratic) constraints $\mathbb{U},\mathbb{Y}$ , (3) is a convex (quadratically constrained) quadratic program which can be solved efficiently.
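The variable counts in Remark 1 can be reproduced with a few lines of arithmetic. The sketch below is only a worked example; the function names and the chosen values $m=1$, $n=2$, $L=10$ are illustrative and do not appear in the paper.

```python
def min_data_length(m, n, L):
    """Smallest N satisfying N - (L + 2n) + 1 >= m (L + 2n)."""
    return (m + 1) * (L + 2 * n) - 1

def decision_variable_counts(m, n, L):
    N = min_data_length(m, n, L)
    data_driven = N - L - n + 1      # dimension of alpha(t) in Problem (3)
    model_based = m * L              # condensed input sequence in model-based MPC
    return N, data_driven, model_based

N, dd, mb = decision_variable_counts(m=1, n=2, L=10)
print(N, dd, mb, dd - mb)            # prints: 27 16 10 6
```

Here the data-driven problem has $m(L+2n)+n=16$ variables against $mL=10$ for the model-based one, and the gap of $2mn+n=6$ stays the same when $L$ is changed.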

IV Robust data-driven MPC

In this section, we propose a multi-step robust data-driven MPC scheme and we prove practical exponential stability of the closed loop in the presence of bounded additive output measurement noise. The scheme includes a slack variable, which is regularized in the cost and compensates noise both in the initial data $(u^{d},y^{d})$ used for prediction and in the online measurement updates $\left(u_{[t-n,t-1]},y_{[t-n,t-1]}\right)$. Section IV-A contains the scheme, which is essentially a robust modification of the nominal scheme of Section III, as well as detailed explanations of the key ingredients. In Sections IV-B and IV-C, we prove two technical lemmas, which will be required for our main theoretical results. Recursive feasibility of the closed loop is proven in Section IV-D. In Section IV-E, we show that, under suitable assumptions, applying the multi-step MPC scheme leads to a practically exponentially stable closed loop. Moreover, if the noise bound tends to zero, then the region of attraction of the closed loop approaches the set of all initially feasible points. In this section, we do not consider output constraints, i.e., $\mathbb{Y}=\mathbb{R}^{p}$. In [31], we recently extended the results of this section by incorporating tightened output constraints in order to guarantee closed-loop constraint satisfaction despite noisy data.

IV-A Robust MPC scheme

In practice, the output of the unknown LTI system $G$ is usually not available exactly, but might be subject to measurement noise. This implies that the stacked data-dependent Hankel matrices in (1) do not span the system’s trajectory space exactly and thus, the output trajectories cannot be predicted accurately. Moreover, noisy output measurements enter the initial conditions in Problem (3), which deteriorates the prediction accuracy even further. Therefore, a direct application of the MPC scheme of Section III may lead to feasibility issues or it may render the closed loop unstable. In this section, we tackle the issue of noisy measurements with a robust data-driven MPC scheme with terminal constraints. We consider output measurements with bounded additive noise in the initially available data $\tilde{y}_{k}^{d}=y_{k}^{d}+\varepsilon_{k}^{d}$ as well as in the online measurements $\tilde{y}_{k}=y_{k}+\varepsilon_{k}$ . We make no assumptions on the nature of the noise, but we require that it is bounded as $\lVert\varepsilon_{k}^{d}\rVert_{\infty}\leq\bar{\varepsilon}$ and $\lVert\varepsilon_{k}\rVert_{\infty}\leq\bar{\varepsilon}$ for some $\bar{\varepsilon}>0$ . Thus, the present setting includes two types of noise. The data used for the prediction via the Hankel matrices in (1) is perturbed by $\varepsilon^{d}$ , which can thus be interpreted as a multiplicative model uncertainty. On the other hand, $\varepsilon$ perturbs the online measurements and hence, the overall control goal is a noisy output-feedback problem.

The key idea to account for noisy measurements is to relax the equality constraint (3b), where the relaxation parameter is penalized appropriately in the cost function. Given a noisy initial input-output trajectory $\left(u_{[t-n,t-1]},\tilde{y}_{[t-n,t-1]}\right)$ of length $n$ , and noisy data $(u^{d},\tilde{y}^{d})$ , we propose the following robust modification of (3).

[TABLE]

Compared to the nominal MPC problem (3), the output data trajectory $\tilde{y}^{d}$ as well as the initial output $\tilde{y}_{[t-n,t-1]}$ , which is obtained via online measurements, have been replaced by their noisy counterparts. Further, the following ingredients have been added:

a) A slack variable $\sigma$ , bounded by (6d), to account for the noisy online measurements $\tilde{y}_{[t-n,t-1]}$ and for the noisy data $\tilde{y}^{d}$ used for prediction, which can be interpreted as a multiplicative model uncertainty,

b) Quadratic regularization (i.e., ridge regularization) of $\alpha$ and $\sigma$ with weights $\lambda_{\alpha}\bar{\varepsilon},\lambda_{\sigma}>0$ , i.e., the regularization of $\alpha$ depends on the noise level.

The above $\ell_{2}$ -norm regularization for $\alpha(t)$ implies that small values of $\lVert\alpha(t)\rVert_{2}^{2}$ are preferred. Since the noisy Hankel matrix $H_{L+n}\left(\tilde{y}^{d}\right)$ is multiplied by $\alpha(t)$ in (6a), this implicitly reduces the influence of the noise on the prediction accuracy. Intuitively, for increasing $\lambda_{\alpha}$ , the term $\lambda_{\alpha}\bar{\varepsilon}\lVert\alpha(t)\rVert_{2}^{2}$ reduces the “complexity” of the data-driven system description (6a), similar to regularization methods in linear regression, thus allowing for a tradeoff between tracking performance and the avoidance of overfitting. The term $\lambda_{\sigma}\lVert\sigma(t)\rVert_{2}^{2}$ yields small values for the slack variable $\sigma(t)$ , thus improving the prediction accuracy. For our theoretical results, $\lambda_{\sigma}$ can be chosen to be zero since $\sigma(t)$ is already rendered small by the constraint (6d). However, as we discuss in more detail in Remark 3, the constraint (6d) is non-convex but can be neglected if $\lambda_{\sigma}$ is large enough.

An alternative to the present regularization terms are general quadratic regularization kernels, i.e., costs of the form $\lVert\alpha(t)\rVert_{P_{\alpha}}^{2}$ , $\lVert\sigma(t)\rVert_{P_{\sigma}}^{2}$ for suitable matrices $P_{\alpha},P_{\sigma}\succ 0$ . Further, in [21, 22], $\ell_{1}$ -regularizations of $\alpha$ and $\sigma$ were suggested and the resulting MPC scheme, without terminal equality constraints, was successfully applied to a nonlinear stochastic control problem. However, theoretical guarantees on closed-loop stability were not given. Throughout this paper, we consider simple quadratic penalty terms since this simplifies the arguments, but we conjecture that our theoretical results continue to hold for general norms $\lVert\alpha(t)\rVert_{p},\lVert\sigma(t)\rVert_{q}$ with arbitrary $p,q=1,\dots,\infty$ . An interesting open question, which is beyond the scope of this paper, is to investigate the impact of particular choices of regularization norms on the practical performance of the presented MPC approach. The choice of norms in the constraint (6d) is independent of the norms in the cost and essentially follows from the $\ell_{\infty}$ -noise bound and the proofs of the value function upper bound (Lemma 1) and recursive feasibility (Proposition 1).

In this section, we study the closed loop resulting from an application of (6) in an $n$ -step MPC scheme (compare [32, 33]). To be more precise, we consider the scenario that, after solving (6) online, the first $n$ computed inputs are applied to the system. Thereafter, the horizon is shifted by $n$ steps, before the whole scheme is repeated (compare Algorithm 2).
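The $n$ -step receding-horizon pattern just described can be sketched in a few lines. This is only a minimal illustration, not Algorithm 2 verbatim: `solve_ocp` and `plant_step` are hypothetical callbacks (not part of the paper) that stand in for a solver of Problem (6) and for the plant with noisy output measurements.

```python
import numpy as np

def run_n_step_mpc(solve_ocp, plant_step, u_init, y_init, n, num_cycles):
    """Sketch of an n-step MPC loop.

    solve_ocp(u_past, y_past) -> predicted inputs, one row per step k = 0, 1, ...
                                 (hypothetical solver for Problem (6))
    plant_step(u)             -> measured (noisy) output for the applied input
                                 (hypothetical plant interface)
    u_init, y_init            -> the n most recent inputs / measured outputs
    """
    u_hist = [np.atleast_1d(u) for u in u_init]
    y_hist = [np.atleast_1d(y) for y in y_init]
    for _ in range(num_cycles):
        # Initial condition of the optimization: last n inputs and measurements.
        u_past = np.array(u_hist[-n:])
        y_past = np.array(y_hist[-n:])
        u_pred = solve_ocp(u_past, y_past)
        # Apply the first n computed inputs in open loop, then re-solve.
        for k in range(n):
            u_k = np.atleast_1d(u_pred[k])
            y_hist.append(np.atleast_1d(plant_step(u_k)))
            u_hist.append(u_k)
    return np.array(u_hist), np.array(y_hist)
```

The only difference to a $1$ -step loop is the inner `for k in range(n)`: the optimization problem is solved once every $n$ time steps instead of at every step.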

As we will see in the remainder of this section, for the considered setting with output measurement noise, the multi-step MPC scheme described in Algorithm 2 has superior theoretical properties compared to its corresponding $1$ -step version. This is mainly due to the terminal equality constraints (6c), which complicate the proof of recursive feasibility, similarly to model-based robust MPC with terminal equality constraints and model mismatch. In particular, we show in this section that, for an $n$ -step MPC scheme with a terminal equality constraint, practical exponential stability can be proven. On the other hand, we comment on the differences for the corresponding $1$ -step MPC scheme in Section IV-D (Remark 4). In particular, for a $1$ -step MPC scheme relying on (6), recursive feasibility holds only locally around $(u^{s},y^{s})$ and thus, only local stability can be guaranteed. Nevertheless, as we will see in Section V for a numerical example, the practical performance of the $n$ -step scheme is almost indistinguishable from that of the $1$ -step scheme.

Remark 2.

In the nominal case of Section III, i.e., for $\bar{\varepsilon}=0$ , (6d) implies $\sigma=0$ . Further, the regularization of $\alpha$ vanishes for $\bar{\varepsilon}=0$ , and the system dynamics (6a) as well as the initial conditions (6b) approach their nominal counterparts. Thus, for $\bar{\varepsilon}=0$ , Problem (6) reduces to the nominal Problem (3).

Remark 3.

If the constraint (6d) is neglected and the input constraint set $\mathbb{U}$ is a convex polytope, then Problem (6) is a strictly convex quadratic program and can be solved efficiently. However, the constraint on the slack variable $\sigma$ in (6d) is non-convex due to the dependence of the right-hand side on $\lVert\alpha(t)\rVert_{1}$ , making it difficult to implement (6) in an efficient way. As will become clear later in this section, (6d) is required to prove recursive feasibility and practical exponential stability. It may, however, be replaced by the (convex) constraint $\lVert\sigma_{k}(t)\rVert_{\infty}\leq c\cdot\bar{\varepsilon}$ for a sufficiently large constant $c>0$ , retaining the same theoretical guarantees. Generally, a larger choice of $c$ increases the region of attraction, but also the size of the exponentially stable set to which the closed loop converges. Furthermore, the constraint (6d) can be enforced implicitly by choosing $\lambda_{\sigma}$ large enough. In simulation examples, it was observed that the constraint (6d) is usually satisfied (for suitably large choices of $\lambda_{\sigma}$ ) without enforcing it explicitly in the optimization problem and thus, it may in most cases be neglected in the online optimization.
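To illustrate why neglecting (6d) makes the problem easy to solve, the following sketch treats the simplest possible case. The assumptions are ours and not the paper's: no input constraints ( $\mathbb{U}=\mathbb{R}^{m}$ ), setpoint $(u^{s},y^{s})=(0,0)$ , and a stage cost with scalar weights `Q`, `R`. The problem structure (Hankel equality constraint with slack, initial and terminal equality constraints, ridge terms weighted by $\lambda_{\alpha}\bar{\varepsilon}$ and $\lambda_{\sigma}$ ) is assumed from the surrounding text. Under these assumptions the problem is an equality-constrained strictly convex QP, which can be solved through its KKT linear system. The function names are hypothetical.

```python
import numpy as np

def hankel(w, depth):
    """Block-Hankel matrix with `depth` block rows for a signal w of shape (N, m)."""
    w = np.asarray(w, dtype=float).reshape(len(w), -1)
    cols = w.shape[0] - depth + 1
    return np.column_stack([w[j:j + depth].reshape(-1) for j in range(cols)])

def solve_relaxed_problem(u_d, y_d_noisy, u_past, y_past, n, L, eps_bar,
                          lam_alpha, lam_sigma, Q=1.0, R=1.0):
    """Relaxed robust problem without (6d), unconstrained inputs, zero setpoint."""
    u_d = np.asarray(u_d, dtype=float).reshape(len(u_d), -1)
    y_d_noisy = np.asarray(y_d_noisy, dtype=float).reshape(len(y_d_noisy), -1)
    m, p = u_d.shape[1], y_d_noisy.shape[1]
    Hu, Hy = hankel(u_d, L + n), hankel(y_d_noisy, L + n)
    n_alpha, n_sigma = Hu.shape[1], p * (L + n)
    # Predictions as linear maps of z = [alpha; sigma]:
    #   u_bar = Hu alpha,   y_bar = Hy alpha - sigma.
    Au = np.hstack([Hu, np.zeros((m * (L + n), n_sigma))])
    Ay = np.hstack([Hy, -np.eye(n_sigma)])
    # Cost z' P z: stage cost over k = 0..L-1 plus the two ridge terms.
    Au_f, Ay_f = Au[m * n:], Ay[p * n:]
    P = R * Au_f.T @ Au_f + Q * Ay_f.T @ Ay_f
    P += np.diag(np.r_[lam_alpha * eps_bar * np.ones(n_alpha),
                       lam_sigma * np.ones(n_sigma)])
    # Equality constraints: initial conditions and terminal constraint at zero.
    E = np.vstack([Au[:m * n], Ay[:p * n], Au[-m * n:], Ay[-p * n:]])
    d = np.r_[np.asarray(u_past, dtype=float).reshape(-1),
              np.asarray(y_past, dtype=float).reshape(-1),
              np.zeros((m + p) * n)]
    # KKT system of min z' P z  s.t.  E z = d.
    n_z, n_eq = P.shape[0], E.shape[0]
    kkt = np.block([[2.0 * P, E.T], [E, np.zeros((n_eq, n_eq))]])
    z = np.linalg.solve(kkt, np.r_[np.zeros(n_z), d])[:n_z]
    alpha, sigma = z[:n_alpha], z[n_alpha:]
    return alpha, sigma, (Au @ z).reshape(L + n, m), (Ay @ z).reshape(L + n, p)

# Toy data from the scalar system x+ = 0.9 x + u, y = x (used only to generate data).
rng = np.random.default_rng(1)
n, L, N, eps_bar = 1, 4, 40, 0.01
u_d = rng.uniform(-1.0, 1.0, size=(N, 1))
x, y_d = 0.0, np.zeros((N, 1))
for k in range(N):
    y_d[k, 0] = x
    x = 0.9 * x + u_d[k, 0]
y_d_noisy = y_d + rng.uniform(-eps_bar, eps_bar, size=(N, 1))
alpha, sigma, u_bar, y_bar = solve_relaxed_problem(
    u_d, y_d_noisy, np.array([[0.0]]), np.array([[1.0]]),
    n, L, eps_bar, lam_alpha=1.0, lam_sigma=1e3)
```

With polytopic input constraints, the same cost and equality constraints would be passed to a QP solver instead of a single linear solve. Whether the computed `sigma` satisfies (6d) would have to be checked a posteriori, in line with the observation in Remark 3.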

As in the previous section, we require that the measured input $u^{d}$ is persistently exciting of order $L+2n$ (Assumption 2). Further, to establish a local upper bound on the optimal cost of (6) and to prove recursive feasibility, we require that the horizon $L$ is not shorter than twice the system’s order, as captured in the following assumption.

Assumption 4.

The prediction horizon satisfies $L\geq 2n$ .
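Both standing requirements can be checked directly on recorded data. The sketch below assumes the usual rank-based definition of persistence of excitation (the Hankel matrix of depth equal to the order has full row rank). The helper names `hankel` and `is_persistently_exciting` are ours.

```python
import numpy as np

def hankel(w, depth):
    """Block-Hankel matrix H_depth(w) for a signal w of shape (N, m) or (N,)."""
    w = np.asarray(w, dtype=float)
    if w.ndim == 1:
        w = w[:, None]
    cols = w.shape[0] - depth + 1
    if cols < 1:
        raise ValueError("signal too short for the requested depth")
    # Column j stacks w_j, w_{j+1}, ..., w_{j+depth-1}.
    return np.column_stack([w[j:j + depth].reshape(-1) for j in range(cols)])

def is_persistently_exciting(u, order):
    """True if H_order(u) has full row rank m * order."""
    H = hankel(u, order)
    return bool(np.linalg.matrix_rank(H) == H.shape[0])

rng = np.random.default_rng(0)
n, L, N = 2, 6, 60                      # illustrative values
u_d = rng.uniform(-1.0, 1.0, size=(N, 1))
pe_ok = is_persistently_exciting(u_d, L + 2 * n)   # excitation of order L + 2n
horizon_ok = L >= 2 * n                            # Assumption 4
```

A constant input fails the test already for order 2, since all columns of its Hankel matrix coincide.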

In some minimal realization, we denote the state trajectory corresponding to $(u^{d},y^{d})$ by $x^{d}$ . According to [12, Corollary 2], Assumption 2 implies that the matrix

[TABLE]

has full row rank and thus admits a right-inverse $H_{ux}^{\dagger}=H_{ux}^{\top}\left(H_{ux}H_{ux}^{\top}\right)^{-1}$ . Define the quantity

[TABLE]

For our stability results, we will require that $c_{pe}\bar{\varepsilon}$ is bounded from above by a sufficiently small number. Essentially, this corresponds to a quantitative “persistence-of-excitation-to-noise”-bound. To be more precise, abbreviate in the following $U=H_{L+n}(u^{d})$ and suppose that

[TABLE]

for scalar constants $\rho,\nu>0$ . Further, define the quantity $c_{pe}^{u}=\lVert U^{\dagger}\rVert_{2}^{2}=\lVert U^{\top}(UU^{\top})^{-1}\rVert_{2}^{2}$ . Then, it holds that

[TABLE]

Thus, if a persistently exciting input $u^{d}$ is multiplied by a constant $c>1$ , then $c_{pe}^{u}$ decreases proportionally to $\frac{1}{c^{2}}$ . Further, the constant $\rho$ can typically be chosen larger if the data length $N$ increases. The same arguments can be carried out when assuming a bound of the form (9) for the matrix (7), but finding a suitable input which generates data achieving such a bound is less obvious. It is well-known for classical definitions of persistence of excitation that larger excitation of the input implies larger excitation of the state. Therefore, we conjecture (and we have observed for various practical simulation examples) that $c_{pe}$ decreases with increasing data horizons $N$ and with multiplications of a persistently exciting input data trajectory $u^{d}$ by a scalar constant greater than one. This means that, for a given noise level $\bar{\varepsilon}$ , robust stability as guaranteed in the following sections can be obtained by choosing a large enough persistently exciting input $u^{d}$ and/or a sufficiently large data horizon $N$ .
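The scaling behavior of $c_{pe}^{u}$ can be checked numerically. The sketch below computes $c_{pe}^{u}=\lVert U^{\top}(UU^{\top})^{-1}\rVert_{2}^{2}$ for a random input, for the same input multiplied by $c=3$ , and for an extended data record. The random data and the helper names are illustrative assumptions. The analogous quantity $c_{pe}$ for the matrix (7) is not computed here, since it would require state data.

```python
import numpy as np

def hankel(w, depth):
    """Block-Hankel matrix with `depth` block rows for a signal w of shape (N, m)."""
    w = np.asarray(w, dtype=float).reshape(len(w), -1)
    cols = w.shape[0] - depth + 1
    return np.column_stack([w[j:j + depth].reshape(-1) for j in range(cols)])

def c_pe_u(u, depth):
    """Squared spectral norm of the right-inverse of U = H_depth(u)."""
    U = hankel(u, depth)
    U_dagger = U.T @ np.linalg.inv(U @ U.T)   # right-inverse: U @ U_dagger = I
    return np.linalg.norm(U_dagger, 2) ** 2

rng = np.random.default_rng(0)
depth = 8                                      # plays the role of L + n
u_d = rng.uniform(-1.0, 1.0, size=(80, 1))
base = c_pe_u(u_d, depth)
scaled = c_pe_u(3.0 * u_d, depth)              # input multiplied by c = 3
ratio = scaled / base                          # equals 1 / c**2 up to rounding
# Appending further samples adds columns to U and cannot increase c_pe_u.
u_long = np.vstack([u_d, rng.uniform(-1.0, 1.0, size=(120, 1))])
longer = c_pe_u(u_long, depth)
```

The monotonicity in the data length holds here because the longer record extends the shorter one, so that $UU^{\top}$ can only grow in the positive semidefinite sense.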

Similar to Section III, we denote the open-loop cost of the robust MPC problem (6) by $J_{L}\left(u_{[t-n,t-1]},\tilde{y}_{[t-n,t-1]},\alpha(t),\sigma(t)\right)$ , and the optimal cost by $J_{L}^{*}\left(u_{[t-n,t-1]},\tilde{y}_{[t-n,t-1]}\right)$ . Moreover, we assume for the analysis that $(u^{s},y^{s})=(0,0)$ . For the presented robust data-driven MPC scheme, setpoints $(u^{s},y^{s})\neq(0,0)$ change mainly one quantitative constant in Lemma 1. We comment on the main differences in the case $(u^{s},y^{s})\neq(0,0)$ in Section IV-D (Remark 5).

IV-B Local upper bound of Lyapunov function

In this section, we show that the optimal cost of (6) admits a quadratic upper bound, similar to the nominal case (cf. Assumption 1). It is straightforward to see that such an upper bound cannot be quadratic in the state $x$ of some minimal realization: the optimal cost $J_{L}^{*}$ depends explicitly on $\alpha^{*}(t)$ via $\lambda_{\alpha}\bar{\varepsilon}\lVert\alpha^{*}(t)\rVert_{2}^{2}$ , which in turn depends on the past $n$ inputs and outputs $(u_{[t-n,t-1]},y_{[t-n,t-1]})$ through (6a) and (6b). Even if the current state is zero, i.e., $x_{t}=0$ , these may in general be arbitrarily large and hence, $\alpha$ and therefore also $J_{L}^{*}$ may be arbitrarily large. Thus, $J_{L}^{*}$ does not admit an upper bound in the state $x_{t}$ of a minimal realization. To overcome this issue, we consider a different (not minimal) state of the system, defined as

[TABLE]

Further, we define the noisy version of $\xi$ as

[TABLE]

Denote the (not invertible) linear transformation from $\xi$ to an arbitrary but fixed state $x$ in some minimal realization by $T$ , i.e., $x_{t}=T\xi_{t}$ . Clearly, this implies $\lVert x_{t}\rVert_{2}^{2}\leq\lVert T\rVert_{2}^{2}\lVert\xi_{t}\rVert_{2}^{2}\eqqcolon\Gamma_{x}\lVert\xi_{t}\rVert_{2}^{2}$ . Note that $\xi$ is the state of a detectable state-space realization and thus, there exists an IOSS Lyapunov function $W(\xi)=\lVert\xi\rVert_{P}^{2}$ , similar to the proof of Theorem 2. For some $\gamma>0$ , define $V_{t}\coloneqq J_{L}^{*}(\tilde{\xi}_{t})+\gamma W(\xi_{t})$ . The following result shows that, for the state $\xi$ , a meaningful quadratic upper bound on $V$ can be proven.

Lemma 1.

Suppose Assumptions 2 and 4 hold. Then, there exists a constant $c_{3}>0$ as well as a $\delta>0$ such that, for all $\xi_{t}\in\mathbb{B}_{\delta}$ , Problem (6) is feasible and $V$ is bounded as

[TABLE]

where $c_{4}=2np\bar{\varepsilon}^{2}\lambda_{\sigma}$ .

Proof.

The lower bound is trivial. For the upper bound, we construct a feasible candidate solution to Problem (6) which brings the state $x$ in some minimal realization (and thus the output $y$ ) to zero in $L$ steps. Obviously, we have $\bar{u}_{[-n,-1]}(t)=u_{[t-n,t-1]}$ as well as $\bar{y}_{[-n,-1]}(t)=\tilde{y}_{[t-n,t-1]}$ by (6b). By assumption, we have $L\geq 2n$ as well as $0\in\text{int}(\mathbb{U})$ . Thus, by controllability, there exists a $\delta>0$ such that for any $x_{t}$ with $\frac{1}{\Gamma_{x}}\lVert x_{t}\rVert_{2}\leq\lVert\xi_{t}\rVert_{2}\leq\delta$ , there exists an input trajectory $u_{[t,t+L-1]}\in\mathbb{U}^{L}$ , which brings the state $x_{[t,t+L-1]}$ and the corresponding output $y_{[t,t+L-1]}$ to the origin in $L-n$ steps while satisfying

[TABLE]

for a suitable constant $\Gamma_{uy}>0$ . As candidate input-output trajectories for (6), we choose these $u,y$ , i.e., $\bar{u}_{[0,L-1]}(t)=u_{[t,t+L-1]},\bar{y}_{[0,L-1]}(t)=y_{[t,t+L-1]}$ . Moreover, $\alpha(t)$ is chosen as

[TABLE]

where $H_{ux}$ is defined in (7). As is described in more detail in [15, 16], the output of an LTI system is a linear combination of its initial condition and the input, and therefore, the above choice of $\alpha(t)$ implies

[TABLE]

where $\varepsilon_{[t-n,t-1]}$ is the true noise instance. For the slack variable $\sigma$ , we choose

[TABLE]

which implies that (6a)-(6c) are satisfied. Finally, writing $e_{i}$ for a row vector whose $i$ -th component is equal to $1$ and which is zero otherwise, we obtain

[TABLE]

This implies $\lVert\sigma(t)\rVert_{\infty}\leq\bar{\varepsilon}\left(\lVert\alpha(t)\rVert_{1}+1\right)$ , which in turn proves that (6d) is satisfied.
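The bound on the candidate slack variable can be verified numerically. The sketch assumes, consistent with the discussion of the term $c_{4}$ , that the candidate slack has the form of the noise Hankel matrix times $\alpha(t)$ , minus the online noise over the first $n$ steps. Noise samples and $\alpha$ are drawn at random purely for illustration.

```python
import numpy as np

def hankel(w, depth):
    """Block-Hankel matrix with `depth` block rows for a signal w of shape (N, m)."""
    w = np.asarray(w, dtype=float).reshape(len(w), -1)
    cols = w.shape[0] - depth + 1
    return np.column_stack([w[j:j + depth].reshape(-1) for j in range(cols)])

rng = np.random.default_rng(3)
n, L, N, p = 2, 6, 50, 1
eps_bar = 0.05
eps_d = rng.uniform(-eps_bar, eps_bar, size=(N, p))        # noise in the data
eps_online = rng.uniform(-eps_bar, eps_bar, size=(n, p))   # noise in the initial condition
alpha = rng.normal(size=N - L - n + 1)                     # arbitrary weighting vector

# Assumed form of the candidate slack: H_{L+n}(eps^d) alpha - [eps_online; 0].
sigma = hankel(eps_d, L + n) @ alpha
sigma[: n * p] -= eps_online.reshape(-1)

lhs = np.max(np.abs(sigma))                                # ||sigma||_inf
rhs = eps_bar * (np.linalg.norm(alpha, 1) + 1.0)           # eps_bar (||alpha||_1 + 1)
```

Each entry of the Hankel term is an inner product of $\alpha(t)$ with a row whose entries are bounded by $\bar{\varepsilon}$ , which gives the factor $\lVert\alpha(t)\rVert_{1}$ . The additional $1$ stems from the online noise.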

In the following, we employ the above candidate solution to bound the optimal cost and thereby, the function $V$ . Due to observability of the pair $(A,C)$ , corresponding to the minimal realization with state $x$ , it holds that

[TABLE]

where $\Phi_{\dagger}=(\Phi^{\top}\Phi)^{-1}\Phi^{\top}$ is a left-inverse of the observability matrix $\Phi$ . The lower block of (16) follows from observability and the linear system dynamics $x_{k+1}=Ax_{k}+Bu_{k},\>y_{k}=Cx_{k}+Du_{k}$ for $k\in\mathbb{I}_{[t-n,t-1]}$ , which can be used to compute the matrix $M_{1}$ depending on $A,B,C,D$ . Hence, $\alpha(t)$ can be bounded as

[TABLE]

Using standard norm equivalence properties, it holds for arbitrary $k\in\mathbb{N}$ that

[TABLE]

where $c_{5}\coloneqq p(N-L-n+1)$ . Based on the definition of $\sigma(t)$ in (14), and using (18) as well as the inequality $(a+b)^{2}\leq 2(a^{2}+b^{2})$ , we can bound $\sigma(t)$ in terms of $\alpha(t)$ as

[TABLE]

Combining the above inequalities, $V$ is upper bounded as

[TABLE]

Finally, $x_{t}$ is bounded by $\xi_{t}$ as $\lVert x_{t}\rVert_{2}^{2}\leq\Gamma_{x}\lVert\xi_{t}\rVert_{2}^{2}$ , which leads to $V_{t}\leq c_{3}\lVert\xi_{t}\rVert_{2}^{2}+c_{4}$ , where

[TABLE]

∎

In Section III, we assumed that the optimal cost is quadratically upper bounded (cf. Assumption 1), which is not restrictive in the nominal linear-quadratic setting. Lemma 1 proves that, under mild assumptions, the optimal cost of the robust MPC problem (6) admits (locally) a similar upper bound and can thus be seen as the robust counterpart of Assumption 1.

The term $c_{4}$ is solely due to the slack variable $\sigma$ . This can be explained by noting that, for $\xi_{t}=0$ , $\alpha(t)$ , $\bar{u}_{[0,L-1]}(t)$ , $\bar{y}_{[0,L-1]}(t)$ can all be chosen to be zero, as long as $\sigma$ compensates the noise, i.e., $\sigma_{[-n,-1]}(t)=-\varepsilon_{[t-n,t-1]}$ .
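The map $T$ from the extended state $\xi$ to a minimal state $x$ , used before Lemma 1, can be written down explicitly when a state-space model is available. The sketch below is for illustration only: the scheme itself never uses such a model, the matrices are made-up example values, and we assume that $\xi_{t}$ stacks the last $n$ inputs followed by the last $n$ outputs, with $n$ equal to the order of an observable realization.

```python
import numpy as np

def extended_to_minimal_map(A, B, C, D):
    """Matrix T with x_t = T @ xi_t, xi_t = [u_{t-n..t-1}; y_{t-n..t-1}]."""
    n, m, p = A.shape[0], B.shape[1], C.shape[0]
    powers = [np.linalg.matrix_power(A, k) for k in range(n + 1)]
    # Observability matrix: y_{[t-n,t-1]} = Phi x_{t-n} + Gam u_{[t-n,t-1]}.
    Phi = np.vstack([C @ powers[k] for k in range(n)])
    Gam = np.zeros((p * n, m * n))
    for i in range(n):
        Gam[i * p:(i + 1) * p, i * m:(i + 1) * m] = D
        for j in range(i):
            Gam[i * p:(i + 1) * p, j * m:(j + 1) * m] = C @ powers[i - j - 1] @ B
    # State propagation: x_t = A^n x_{t-n} + Ctr u_{[t-n,t-1]}.
    Ctr = np.hstack([powers[n - 1 - j] @ B for j in range(n)])
    Phi_left = np.linalg.pinv(Phi)      # left-inverse, since Phi has full column rank
    return np.hstack([Ctr - powers[n] @ Phi_left @ Gam, powers[n] @ Phi_left])

# Illustrative second-order example (values are arbitrary).
A = np.array([[0.9, 0.2], [0.0, 0.7]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])
T = extended_to_minimal_map(A, B, C, D)

rng = np.random.default_rng(4)
x = rng.normal(size=2)
u_seq, y_seq = rng.normal(size=(2, 1)), []
for u_k in u_seq:                       # simulate n = 2 steps
    y_seq.append(C @ x + D @ u_k)
    x = A @ x + B @ u_k
xi = np.r_[u_seq.reshape(-1), np.array(y_seq).reshape(-1)]
x_true, x_from_xi = x, T @ xi
```

Since $T$ has more columns than rows, it is not invertible, which matches the remark that $\xi$ is a non-minimal state.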

IV-C Prediction error bound

Denote the optimizers of (6) by $\alpha^{*}(t),\sigma^{*}(t),\bar{u}^{*}(t),\bar{y}^{*}(t)$ , and the output trajectory resulting from an open-loop application of $\bar{u}^{*}(t)$ by $\hat{y}$ . One of the reasons why it is difficult to analyze the presented MPC scheme is the non-trivial relation between the predicted output $\bar{y}^{*}(t)$ and the “actual” output $\hat{y}$ . In the following, we derive a bound on the difference between the two quantities, which will play an important role in proving recursive feasibility and practical stability of the proposed scheme. For an integer $k$ , define constants $\rho_{2,k},\rho_{\infty,k}$ such that

[TABLE]

where $\Phi_{\dagger}$ is a left-inverse of the observability matrix $\Phi$ .

Lemma 2.

If (6) is feasible at time $t$ , then the following inequalities hold for all $k\in\mathbb{I}_{[0,L-1]}$

[TABLE]

with $c_{5}$ from (18).

Proof.

We show only (21) and note that (20) can be derived following the same steps, using (18) as well as the inequality $(a+b)^{2}\leq 2a^{2}+2b^{2}$ . As written above, $\hat{y}$ is the trajectory resulting from an open-loop application of $\bar{u}^{*}(t)$ and with initial conditions specified by $\left(u_{[t-n,t-1]},\hat{y}_{[t-n,t-1]}\right)=\left(u_{[t-n,t-1]},y_{[t-n,t-1]}\right)$ . On the other hand, according to (6a), $\bar{y}^{*}(t)$ is composed as

[TABLE]

It follows directly from (6a) and (6b) that the second term on the right-hand side $H_{L+n}\left(y^{d}\right)\alpha^{*}(t)$ is a trajectory of $G$ , resulting from an open-loop application of $\bar{u}^{*}(t)$ and with initial output conditions

[TABLE]

Define

[TABLE]

Since $G$ is LTI and $y^{-}$ contains the difference between two trajectories with the same input, we can assume $\bar{u}^{*}(t)=0$ for the following arguments without loss of generality. Hence, $y^{-}$ is equal to the output component of a trajectory $\left(u^{-},y^{-}\right)$ with zero input and with initial trajectory

[TABLE]

The relation to the internal state $x^{-}$ can be derived as

[TABLE]

with the observability matrix $\Phi$ . This leads to the corresponding output at time $t+k$

[TABLE]

where $\Phi_{\dagger}$ is a left-inverse of $\Phi$ . Using this fact, the expression for $y^{-}_{[t-n,t-1]}$ in (22), and the inequality (15), $\lVert y^{-}_{t+k}\rVert_{\infty}$ can be bounded as

[TABLE]

Note that

[TABLE]

which concludes the proof. ∎

Essentially, Lemma 2 gives a bound on the mismatch between the predicted output and the actual output resulting from the open-loop application of $\bar{u}^{*}(t)$ , depending on the optimal solutions $\alpha^{*},\sigma^{*}$ , and on system parameters. In model-based robust MPC schemes, similar bounds are typically used to propagate uncertainty, where the role of the weighting vector $\alpha$ to account for multiplicative uncertainty is replaced by the state $x$ and a model-based uncertainty description (compare [34] for details). The main difference in the proposed MPC scheme is that the predicted trajectory $\bar{y}^{*}(t)$ is in general not a trajectory of the system in the sense of Definition 2, corresponding to the input $\bar{u}^{*}(t)$ . On the contrary, in model-based robust MPC, the predicted trajectory usually satisfies the dynamics of a (nominal) model of the system.
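The central step in the proof of Lemma 2, namely that the difference of two trajectories with the same input behaves like a zero-input trajectory determined by its first $n$ outputs, can be reproduced numerically. The model below is an arbitrary illustrative example and is used only to generate trajectories. The helper `simulate` is ours.

```python
import numpy as np

def simulate(A, B, C, D, x0, u):
    """Output sequence of x+ = A x + B u, y = C x + D u from initial state x0."""
    x, ys = np.asarray(x0, dtype=float), []
    for u_k in u:
        ys.append(C @ x + D @ u_k)
        x = A @ x + B @ u_k
    return np.array(ys)

A = np.array([[0.9, 0.2], [0.0, 0.7]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])
n, L = 2, 6

rng = np.random.default_rng(2)
u = rng.uniform(-1.0, 1.0, size=(n + L, 1))       # same input for both trajectories
y_a = simulate(A, B, C, D, [1.0, -0.5], u)
y_b = simulate(A, B, C, D, [0.2, 0.3], u)
y_minus = y_a - y_b                                # rows: times t-n, ..., t+L-1

# Reconstruct the difference from its first n samples via the observability matrix.
Phi = np.vstack([C @ np.linalg.matrix_power(A, i) for i in range(n)])
Phi_left = np.linalg.pinv(Phi)
y_init = y_minus[:n].reshape(-1)
predicted = np.array([C @ np.linalg.matrix_power(A, n + k) @ Phi_left @ y_init
                      for k in range(L)])          # y^-_{t+k} = C A^{n+k} Phi_left y_init
```

The input cancels in the difference, so only the mismatch in the initial outputs propagates, amplified by the matrices $CA^{n+k}\Phi_{\dagger}$ .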

IV-D Recursive feasibility

The following result shows that, if the proposed robust MPC scheme is feasible at time $t$ , then it is also feasible at time $t+n$ , assuming that the noise level is sufficiently small.

Proposition 1.

Suppose Assumptions 2 and 4 hold. Then, for any $V_{ROA}>0$ , there exists an $\bar{\varepsilon}_{0}>0$ such that for all $\bar{\varepsilon}\leq\bar{\varepsilon}_{0}$ , if $V_{t}\leq V_{ROA}$ for some $t\geq 0$ , then the optimization problem (6) is feasible at time $t+n$ .

Proof.

Suppose the robust MPC problem (6) is feasible at time $t$ with $V_{t}\leq V_{ROA}$ and denote the optimizers by $\alpha^{*}(t),\sigma^{*}(t),\bar{u}^{*}(t),\bar{y}^{*}(t)$ . As in Lemma 2, the trajectory resulting from an open-loop application of $\bar{u}^{*}(t)$ and with initial conditions specified by $\left(u_{[t-n,t-1]},y_{[t-n,t-1]}\right)$ is denoted by $\hat{y}$ . For $k\in\mathbb{I}_{[-n,L-2n-1]}$ , we choose for the candidate input the shifted previously optimal solution, i.e., $\bar{u}_{k}^{\prime}(t+n)=\bar{u}_{k+n}^{*}(t)$ . Over the first $n$ steps, the candidate output must satisfy $\bar{y}_{[-n,-1]}^{\prime}(t+n)=\tilde{y}_{[t,t+n-1]}$ due to (6b). Further, for $k\in\mathbb{I}_{[0,L-2n-1]}$ , the output is chosen as $\bar{y}_{k}^{\prime}(t+n)=\hat{y}_{t+n+k}$ . Since $\bar{y}_{[L-n,L-1]}^{*}(t)=0$ by (6c), the prediction error bound of Lemma 2 implies that, for any $k\in\mathbb{I}_{[L-n,L-1]}$ , it holds that

[TABLE]

For $\bar{\varepsilon}_{0}$ sufficiently small, $\lVert\sigma^{*}(t)\rVert_{\infty}$ becomes arbitrarily small due to (6d). Further, using that $\lambda_{\alpha}\bar{\varepsilon}\lVert\alpha^{*}(t)\rVert_{2}^{2}\leq J_{L}^{*}(u_{[t-n,t-1]},\tilde{y}_{[t-n,t-1]})\leq V_{ROA}$ , we can bound $\alpha^{*}(t)$ as

[TABLE]

Hence, if $\bar{\varepsilon}_{0}$ is sufficiently small, then $\hat{y}_{t+k}$ becomes arbitrarily small at the above time instants. This implies that the internal state in some minimal realization corresponding to the trajectory $(\bar{u}^{*}(t),\hat{y})$ at time $t+L-n$ , i.e., $\hat{x}_{t+L-n}=\Phi_{\dagger}\hat{y}_{[t+L-n,t+L-1]}$ , approaches zero for $\bar{\varepsilon}\to 0$ . Thus, similar to the proof of Lemma 1, there exists an input trajectory $\bar{u}_{[L-2n,L-n-1]}^{\prime}(t+n)$ , which brings the state and the corresponding output $\bar{y}_{[L-2n,L-n-1]}^{\prime}(t+n)$ to zero in $n$ steps, while satisfying

[TABLE]

Moreover, in the interval $\mathbb{I}_{[L-n,L-1]}$ , we choose $\bar{u}_{[L-n,L-1]}^{\prime}(t+n)=0$ , $\bar{y}_{[L-n,L-1]}^{\prime}(t+n)=0$ , i.e., (6c) is satisfied. The above arguments imply that

[TABLE]

is a trajectory of the unknown LTI system in the sense of Definition 2. Denote the corresponding internal state in some minimal realization by $\bar{x}^{\prime}(t+n)$ . We choose $\alpha^{\prime}(t+n)$ as a corresponding solution to (1), i.e., as

[TABLE]

with $H_{ux}$ from (7). Finally, we fix

[TABLE]

which implies that (6a) holds. It remains to show that the constraint (6d) is satisfied. Over the first $n$ time steps, (6d) holds since

[TABLE]

Further, using the definition of $\sigma^{\prime}(t+n)$ in (25) and the bound (15), we obtain

[TABLE]

and thus, (6d) holds. ∎

Proposition 1 shows that, for any sublevel set of the Lyapunov function $V$ , there exists a sufficiently small noise bound $\bar{\varepsilon}_{0}$ such that, for any $\bar{\varepsilon}\leq\bar{\varepsilon}_{0}$ and any state starting in the sublevel set at time $t$ , the $n$ -step MPC scheme is feasible at time $t+n$ . In particular, the required noise bound decreases if the size of the sublevel set, i.e., $V_{ROA}$ , increases and vice versa. This can be explained by noting that the noise in (6a) corresponds to a multiplicative uncertainty, which affects the prediction accuracy more strongly if the current state is further away from the origin and hence the Lyapunov function $V_{t}$ is larger. We note that this does not imply recursive feasibility of the $n$ -step MPC scheme in the standard sense since it remains to be shown that the sublevel set $V_{t}\leq V_{ROA}$ is invariant, which will be proven in Section IV-E. In our main result, the set of initial states for which $V_{0}\leq V_{ROA}$ will play the role of the guaranteed region of attraction of the closed-loop system.

The input candidate solution used to prove recursive feasibility in Proposition 1 is analogous to a candidate solution one would use to show robust recursive feasibility in model-based robust MPC with terminal equality constraints. The output candidate solution is sketched in Figure 1. Up to time $L-2n-1$ , $\bar{y}^{\prime}(t+n)$ is equal to $\hat{y}$ (shifted by $n$ time steps), which is the output resulting from an open-loop application of $\bar{u}^{*}(t)$ . This choice together with the prediction error bound of Lemma 2 implies that the internal state corresponding to $\bar{y}^{\prime}(t+n)$ at time $L-2n$ is close to zero. Thus, by controllability, there exists an input trajectory satisfying the input constraints, which brings the state and the output to zero in $n$ steps. In the interval $\mathbb{I}_{[L-2n,L-n-1]}$ , the candidate output is chosen as this trajectory. This also implies that the choice $\bar{y}_{[L-n,L-1]}^{\prime}(t+n)=0$ makes the candidate solution between $0$ and $L-1$ , i.e., $\left(\bar{u}_{[0,L-1]}^{\prime}(t+n),\bar{y}_{[0,L-1]}^{\prime}(t+n)\right)$ , a trajectory of the unknown system $G$ in the sense of Definition 2. (Footnote 3: In most practical cases, $(\bar{u}^{*}(t),\bar{y}^{*}(t))$ are not trajectories of the system due to the slack variable $\sigma$ and the noise.) Finally, the suggested candidate input is also similar to [35], where inherent robustness of quasi-infinite horizon (model-based) MPC is shown.
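The index bookkeeping of the candidate input can be made explicit in code. The sketch below only assembles the three segments (shifted previous solution, $n$ steering inputs, terminal zeros). The steering inputs `u_steer` are a hypothetical argument: their existence follows from controllability in the proof, and no method to compute them from data is implied here.

```python
import numpy as np

def candidate_input(u_opt, u_steer, n):
    """Assemble the candidate input at time t+n from the optimal input at time t.

    u_opt   : array (L + n, m), rows correspond to k = -n, ..., L-1 at time t
    u_steer : array (n, m), hypothetical inputs steering the state to zero in n steps
    Returns an array (L + n, m), rows correspond to k = -n, ..., L-1 at time t+n.
    """
    u_opt = np.asarray(u_opt, dtype=float)
    u_steer = np.asarray(u_steer, dtype=float)
    L = u_opt.shape[0] - n
    if L < 2 * n:
        raise ValueError("the construction requires L >= 2n (Assumption 4)")
    # Shifted part: u'_k(t+n) = u*_{k+n}(t) for k = -n, ..., L-2n-1.
    shifted = u_opt[n:L]
    # Terminal part: zeros on k = L-n, ..., L-1, as required by (6c).
    terminal = np.zeros((n, u_opt.shape[1]))
    return np.vstack([shifted, u_steer, terminal])

# Example with n = 2, L = 6 and scalar inputs labeled by their row index.
u_opt = np.arange(8, dtype=float).reshape(8, 1)
u_cand = candidate_input(u_opt, [[9.0], [9.0]], n=2)
```

In the example, the rows of `u_cand` are the old rows 2 to 5, followed by the two steering inputs and two zeros.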

Remark 4.

For a $1$ -step MPC scheme, a similar argument to prove recursive feasibility can be applied, given that $\bar{u}^{*}_{[L-2n,L-n-1]}(t)$ and $\bar{y}^{*}_{[L-2n,L-n-1]}(t)$ (and hence $\hat{y}_{[t+L-2n,t+L-n-1]}$ ) are close to zero. This is required to construct a feasible input which steers the state and the corresponding output to zero, similar to the proof of Proposition 1, and it is, e.g., the case if the initial state $x_{t}$ is close to zero. That is, the result of Proposition 1 holds locally for a $1$ -step MPC scheme, as expected based on model-based MPC with terminal equality constraints under disturbances using inherent robustness properties.

Remark 5.

As mentioned in Section IV-A, all of our theoretical guarantees for the presented robust MPC scheme can be straightforwardly extended to the case $(u^{s},y^{s})\neq 0$ , with the corresponding steady-state $\xi^{s}\neq 0$ . The main difference lies in the bound (11), which becomes $V_{t}\leq\tilde{c}_{3}\lVert\xi_{t}-\xi^{s}\rVert_{2}^{2}+\tilde{c}_{4}$ for constants $\tilde{c}_{3}\neq c_{3},\tilde{c}_{4}\neq c_{4}$ , where $\tilde{c}_{3}$ can be made arbitrarily close to $c_{3}$ . On the other hand, $\tilde{c}_{4}$ changes depending on $\xi^{s}$ , since the right-hand side of (17) would need to be proportional to $\lVert\xi_{t}-\xi^{s}\rVert_{2}^{2}+\lVert\xi^{s}\rVert_{2}^{2}$ . The same phenomenon can be observed in a bound of $\alpha^{\prime}(t+n)$ based on (24), which will be used in the stability proof. As will become clear later in this section, such changes in the bound of $\alpha^{\prime}(t+n)$ as well as in the constant $\tilde{c}_{4}$ do not affect our qualitative theoretical results, but they may potentially (quantitatively) deteriorate the robustness w.r.t. the noise level $\bar{\varepsilon}$ . Intuitively, this can be explained by noting that (6a) corresponds to a multiplicative uncertainty and thus, stabilization of the origin is simpler than stabilization of any other equilibrium. Since equilibria with $(u^{s},y^{s})\neq 0$ require a significantly more involved notation, we omit this extension.

IV-E Practical exponential stability

The following is our main stability result. It shows that, under Assumptions 2 and 4, for a low noise amplitude and large persistence of excitation, and for suitable regularization parameters, the application of the scheme (6) as described in Algorithm 2 leads to a practically exponentially stable closed loop.

Theorem 3.

Suppose Assumptions 2 and 4 hold. Then, for any $V_{ROA}>0$ , there exist constants $\underline{\lambda}_{\alpha},\overline{\lambda}_{\alpha},\underline{\lambda}_{\sigma},\overline{\lambda}_{\sigma}>0$ such that, for all $\lambda_{\alpha},\lambda_{\sigma}$ satisfying

[TABLE]

there exist constants $\bar{\varepsilon}_{0},\bar{c}_{pe}>0$ , as well as a continuous, strictly increasing $\beta:[0,\bar{\varepsilon}_{0}]\to[0,V_{ROA}]$ with $\beta(0)=0$ , such that, for all $\bar{\varepsilon},c_{pe}$ satisfying

[TABLE]

the sublevel set $V_{t}\leq V_{ROA}$ is invariant and $V_{t}$ converges exponentially to $V_{t}\leq\beta(\bar{\varepsilon})$ in closed loop with the $n$ -step MPC scheme for all initial conditions for which $V_{0}\leq V_{ROA}$ .

Proof.

The proof consists of three parts: First, we bound the increase in the Lyapunov function $V$ . Thereafter, we prove that, for suitably chosen bounds on the parameters, there exists a function $\beta$ , which satisfies the above requirements. Finally, we show invariance of the sublevel set $V_{t}\leq V_{ROA}$ and exponential convergence of $V_{t}$ to $V_{t}\leq\beta(\bar{\varepsilon})$ .

**(i). Practical Stability**

Suppose Problem (6) is feasible at time $t$ and let $V_{ROA}>0$ be arbitrary. Further, let $\bar{\varepsilon}_{0}$ be sufficiently small such that Proposition 1 is applicable. The cost of the candidate solution derived in Proposition 1 at time $t+n$ is

[TABLE]

Thus, we obtain for the optimal cost

[TABLE]

In the following key technical part of the proof (Parts (i.i)-(i.iv)), we derive useful bounds for most terms on the right-hand side of (30). This will lead to a decay bound of the optimal cost which is then used to prove practical exponential stability of the closed loop.

**(i.i) Stage Cost Bounds**

We first bound those terms in (30), which involve the stage cost. The above difference can be decomposed as

[TABLE]

where we use that $\bar{u}_{k}^{\prime}(t+n),\bar{y}_{k}^{\prime}(t+n),\bar{u}_{k}^{*}(t),\bar{y}_{k}^{*}(t)$ are all zero for $k\in\mathbb{I}_{[L-n,L-1]}$ due to (6c). To bound the first term on the right-hand side of (31), note that

[TABLE]

with $\hat{x}_{t+L-n}$ as in the proof of Proposition 1. Further, since $\bar{y}_{[L-n,L-1]}^{*}(t)=0$ , $\hat{y}$ can be bounded in the considered time interval as in (20), i.e.,

[TABLE]

Hence, it holds that

[TABLE]

Next, we bound the difference between the third and the fourth term on the right-hand side of (31). The following relations are readily derived:

[TABLE]

By using $2\lVert\bar{y}_{k+n}^{*}(t)\rVert_{Q}\leq 1+\lVert\bar{y}_{k+n}^{*}(t)\rVert_{Q}^{2}$ as well as $\lVert\bar{y}_{k+n}^{*}(t)\rVert_{Q}^{2}\leq V_{ROA}$ , we arrive at

[TABLE]

Therefore, since the inputs coincide over the considered time interval, and due to (33) as well as (34), it holds that

[TABLE]

The difference $\lVert\bar{y}_{k}^{\prime}(t+n)-\bar{y}_{k+n}^{*}(t)\rVert_{Q}$ can be bounded similarly to Lemma 2. Using the constraint (6d) to bound $\lVert\sigma^{*}(t)\rVert_{2}$ , it can be shown that the bound is of the form $\lVert\bar{y}_{k}^{\prime}(t+n)-\bar{y}_{k+n}^{*}(t)\rVert_{Q}\leq\tilde{C}_{1}\lVert\alpha^{*}(t)\rVert_{2}+\tilde{C}_{2}\leq\tilde{C}_{1}\left(1+\lVert\alpha^{*}(t)\rVert_{2}^{2}\right)+\tilde{C}_{2}$ , where both $\tilde{C}_{1}$ and $\tilde{C}_{2}$ are proportional to $\bar{\varepsilon}$ . Hence, applying Lemma 2 to (35), the sum of (32) and (35) can be bounded as $C_{1}\lVert\alpha^{*}(t)\rVert_{2}^{2}+C_{2}\lVert\sigma^{*}(t)\rVert_{2}^{2}+C_{3}$ for suitable $C_{i}>0$ , where $C_{1}$ and $C_{3}$ are quadratic in $\bar{\varepsilon}$ and vanish for $\bar{\varepsilon}=0$ . Therefore, if $\underline{\lambda}_{\alpha}$ and $\underline{\lambda}_{\sigma}$ are sufficiently large, then (30) implies

[TABLE]

for a suitable constant $c_{6}>0$ , which is quadratic in $\bar{\varepsilon}$ and vanishes for $\bar{\varepsilon}=0$ .

**(i.ii) Bound of $\mathbf{\lVert\sigma^{\prime}(t+n)\rVert_{2}^{2}}$**

By applying standard norm bounds to the slack variable candidate $\sigma^{\prime}(t+n)$ as defined in (25) (compare also (26) and (27)), we obtain

[TABLE]

with $c_{5}=p(N-L-n+1)$ as in (18).

**(i.iii) Bound of $\mathbf{\lVert\alpha^{\prime}(t+n)\rVert_{2}^{2}}$**

For the weighting vector $\alpha^{\prime}(t+n)$ , it holds that

[TABLE]

Similar to (32), we can use (23) to bound the last term as

[TABLE]

The bound (38) is of the same form as (32) and (35). Due to this fact, using the bound (37) for $\sigma^{\prime}(t+n)$ , and by potentially choosing $\underline{\lambda}_{\alpha}$ and $\underline{\lambda}_{\sigma}$ larger, (30) implies

[TABLE]

for suitable constants $c_{7},c_{8}>0$ , which vanish for $\bar{\varepsilon}=0$ .

**(i.iv) IOSS Bound**

As in the proof of Theorem 2, we consider now $V_{t}=J_{L}^{*}(\tilde{\xi}_{t})+\gamma W(\xi_{t})$ with the IOSS Lyapunov function $W$ for some $\gamma>0$ . It follows directly from (5), (39), and from $\lVert x_{t}\rVert_{2}^{2}\leq\Gamma_{x}\lVert\xi_{t}\rVert_{2}^{2}$ that

[TABLE]

The inequality $(a+b)^{2}\leq 2(a^{2}+b^{2})$ yields

[TABLE]

where the latter term can again be bounded using Lemma 2. Similar to the earlier steps of this proof, the components of the bound $\lVert y_{[t,t+n-1]}-\bar{y}_{[0,n-1]}^{*}(t)\rVert_{2}^{2}$ vanish in (40) if $\underline{\lambda}_{\sigma},\underline{\lambda}_{\alpha}$ are chosen sufficiently large, except for an additive constant, which depends solely on the noise. Moreover, choosing $\gamma=\frac{\lambda_{\min}(Q,R)}{\max\{c_{1},2c_{2}\}}$ , it holds that

[TABLE]

Combining these facts, we arrive at

[TABLE]

for a suitable constant $c_{9}$ , which vanishes for $\bar{\varepsilon}=0$ . Finally, note that $\lambda_{\min}(R)\lVert\bar{u}_{[0,L-n-1]}^{*}(t)\rVert_{2}^{2}\leq V_{t}$ , which leads to

[TABLE]

**(ii). Construction of $\mathbf{\beta}$**

The local upper bound in Lemma 1, which holds for any $\xi_{t}\in\mathbb{B}_{\delta}$, implies that the following holds for any $V_{ROA}>0$, and any $\xi_{t}$ with $V_{t}\leq V_{ROA}$:

[TABLE]

We first consider $V_{ROA}=\delta^{2}c_{3}+c_{4}$ , which implies $c_{3,V_{ROA}}=c_{3}$ . Further, we define $c_{12}\coloneqq\frac{\gamma}{2}-c_{10}-c_{3}c_{11}$ as well as

[TABLE]

for any $\bar{\varepsilon}$ for which $c_{12}>0$ . Recall that $c_{3}=a_{1}\bar{\varepsilon}^{2}+a_{2}\bar{\varepsilon}+a_{3},c_{4}=a_{4}\bar{\varepsilon}^{2},c_{9}=a_{5}\bar{\varepsilon}^{2}+a_{6}\bar{\varepsilon},c_{10}=a_{7}\bar{\varepsilon}^{2}+a_{8}\bar{\varepsilon},c_{11}=a_{9}\bar{\varepsilon}^{2}+a_{10}\bar{\varepsilon}$ , for suitable constants $a_{i}>0$ . This implies $\beta(0)=0$ . Next, we show the existence of a constant $\bar{\varepsilon}_{0}$ such that $\beta$ is strictly increasing on $[0,\bar{\varepsilon}_{0}]$ . If $c_{12}>0$ , then $\beta$ is strictly increasing since its numerator increases with $\bar{\varepsilon}$ whereas its denominator decreases with $\bar{\varepsilon}$ . In the following, we show that $c_{12}>0$ . By definition, we have

[TABLE]

It can be seen directly from this expression that, if $\lambda_{\alpha}\leq\overline{\lambda}_{\alpha},\lambda_{\sigma}\leq\overline{\lambda}_{\sigma}$ , with arbitrary but fixed upper bounds $\overline{\lambda}_{\alpha},\overline{\lambda}_{\sigma}$ , and $c_{pe}\bar{\varepsilon}$ is sufficiently small, then $c_{12}>0$ . It remains to show that $\beta(\bar{\varepsilon}_{0})\leq V_{ROA}$ , or, equivalently,

[TABLE]

which can be ensured by choosing $\bar{\varepsilon}_{0}$ sufficiently small.

**(iii). Invariance and Exponential Convergence**

Take an arbitrary $\xi_{t}$ with $V_{t}\leq V_{ROA}$ and note that this implies that (6) is feasible and thus, (41) and (42) hold. Moreover, $c_{12}>0$ implies $c_{10}<\frac{\gamma}{2}$. Defining $V_{\beta,t}\coloneqq V_{t}-\beta(\bar{\varepsilon})$, we thus obtain

[TABLE]

where the last inequality follows from elementary computations. This in turn implies the following contraction property

[TABLE]

If the noise bound $\bar{\varepsilon}_{0}$ is sufficiently small, then this implies invariance of the sublevel set $V_{t}\leq V_{ROA}$ and hence, by Proposition 1, recursive feasibility of the $n$-step MPC scheme. Applying the contraction property (43) recursively, we can thus conclude that $V_{t}$ converges exponentially to the sublevel set $V_{t}\leq\beta(\bar{\varepsilon})$.
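As a minimal sketch of why a recursive contraction of $V_{\beta,t}=V_{t}-\beta(\bar{\varepsilon})$ yields exponential convergence to the sublevel set, the following snippet iterates the worst case of a generic bound $V_{\beta,t+n}\leq\rho V_{\beta,t}$. The contraction factor `rho`, the offset `beta`, and the initial value `v0` are hypothetical placeholders chosen for illustration; they are not the constants of the proof.

```python
def iterate_contraction(v0, beta, rho, steps):
    """Iterate V_next - beta = rho * (V - beta), i.e. the contraction with equality."""
    values = [v0]
    for _ in range(steps):
        # Each n-step MPC update shrinks the distance to beta by the factor rho.
        values.append(beta + rho * (values[-1] - beta))
    return values


# Hypothetical numbers: initial Lyapunov value 10, offset 0.5, contraction factor 0.8.
values = iterate_contraction(v0=10.0, beta=0.5, rho=0.8, steps=50)
gap_after_50 = values[-1] - 0.5  # equals 9.5 * 0.8**50 up to rounding
```

After $k$ updates the gap to `beta` equals `rho**k` times the initial gap, so the sequence approaches but never undershoots the offset.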

So far, we have only considered the case $V_{ROA}=\delta^{2}c_{3}+c_{4}$ . It remains to show that, for any $V_{ROA}>0$ , there exist suitable parameter bounds such that

[TABLE]

with $c_{3,V_{ROA}}$ from (42). It is easily seen from the above discussion that, for any fixed $V_{ROA}>0$ and for fixed bounds $\overline{\lambda}_{\alpha},\overline{\lambda}_{\sigma}$ , $c_{12,V_{ROA}}>0$ can always be ensured if $c_{pe}\bar{\varepsilon}$ is sufficiently small, i.e., if the bound $\bar{c}_{pe}$ is sufficiently small. ∎

Theorem 3 shows that the closed loop of the proposed data-driven MPC scheme admits a (practical) Lyapunov function, which converges robustly and exponentially to a set, whose size shrinks with the noise level. Since $\lVert\xi_{t}\rVert_{2}^{2}\leq\frac{1}{\gamma\lambda_{\min}(P)}V_{t}$ due to (11), this implies practical exponential stability of the equilibrium $\xi=0$ . The result requires that the noise level $\bar{\varepsilon}$ is small, the amount of persistence of excitation is large compared to the noise level (i.e., $c_{pe}\bar{\varepsilon}$ is small), and the regularization parameters are chosen suitably. Concerning the latter requirement, $\lambda_{\alpha}$ cannot be chosen arbitrarily large, which can be explained by noting that the optimal $\alpha$ is usually not zero, even in the noise-free case. On the other hand, $\lambda_{\alpha}$ cannot be too close to zero since solutions $\alpha(t)$ of (6a) are not unique and large choices of $\alpha(t)$ amplify the influence of the noise in $\tilde{y}^{d}$ on the prediction accuracy. Further, $\lambda_{\sigma}$ has to be chosen sufficiently large to ensure stability, but not arbitrarily large for a fixed noise level. To be more precise, $\lambda_{\alpha}c_{pe}\bar{\varepsilon}$ and $\lambda_{\sigma}c_{pe}\bar{\varepsilon}^{2}$ have to be small, i.e., for a fixed $c_{pe}$ , choosing the regularization parameters too large deteriorates the robustness of the scheme w.r.t. the noise level. One can show that the theoretical properties in Theorem 3 are also valid without imposing the lower bound in (28) on $\lambda_{\sigma}$ , by using the more conservative constraint (6d) in the proof. However, (6d) is non-convex (cf. Remark 3), but can typically be enforced implicitly if $\lambda_{\sigma}$ is chosen large enough.
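The role of the regularization of $\alpha$ can be illustrated with a generic least-squares sketch. It only mimics the structure of the problem: the matrix `H`, the vector `b`, and the weights are random or hypothetical stand-ins, and the ridge problem below is a simplification, not Problem (6). The sketch shows that an underdetermined linear equation admits solutions of very different norm, and that a larger penalty weight yields a smaller weighting vector.

```python
import numpy as np

rng = np.random.default_rng(1)
H = rng.standard_normal((6, 20))   # stand-in for a wide data matrix
b = rng.standard_normal(6)         # stand-in for the trajectory to be matched


def ridge_solution(H, b, lam):
    """Minimize ||H a - b||^2 + lam * ||a||^2 in closed form."""
    n_cols = H.shape[1]
    return np.linalg.solve(H.T @ H + lam * np.eye(n_cols), H.T @ b)


# Two exact solutions of H a = b: the minimum-norm one and a shifted one.
alpha_min_norm = np.linalg.pinv(H) @ b
null_dir = np.linalg.svd(H)[2][-1]          # unit vector in the null space of H
alpha_other = alpha_min_norm + 5.0 * null_dir

norm_small_lam = np.linalg.norm(ridge_solution(H, b, 0.1))
norm_large_lam = np.linalg.norm(ridge_solution(H, b, 10.0))
```

Here `alpha_other` reproduces `b` exactly but has a larger norm than `alpha_min_norm`, and `norm_large_lam` is smaller than `norm_small_lam`, at the price of a larger fitting residual.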

In the proof of Theorem 3, a close connection between the region of attraction, i.e., the set of initial conditions with $V_{0}\leq V_{ROA}$, and various parameters becomes apparent. First of all, the noise bound $\bar{\varepsilon}$ needs to be sufficiently small depending on $V_{ROA}$ to allow for an application of Proposition 1. Moreover, if $V_{ROA}$ increases, then $c_{3,V_{ROA}}$ also increases and hence, $c_{11}$ must decrease to ensure $c_{12,V_{ROA}}>0$ and thereby exponential stability. To render $c_{11}$ small, $c_{pe}\bar{\varepsilon}$ must decrease, i.e., the amount of persistence of excitation compared to the noise level must increase. Thus, for $c_{pe}\bar{\varepsilon}\to 0$ (and a sufficiently small noise bound $\bar{\varepsilon}$ due to Proposition 1), the region of attraction approaches the set of all initially feasible points. For a fixed $c_{pe}$, the size of the region of attraction increases if the noise level decreases and vice versa. A similar connection between the maximal disturbance and the region of attraction can be found in [35], which studies inherent robustness properties of quasi-infinite horizon MPC (but the result applies similarly to model-based $n$-step MPC with terminal equality constraints). Further, if $c_{pe}$ decreases, then so do $c_{10}$ and $c_{11}$, and hence also $\beta(\bar{\varepsilon})$. This implies that larger persistence of excitation (i.e., a lower $c_{pe}\bar{\varepsilon}$) not only increases the region of attraction but also reduces the tracking error.

Remark 6.

To apply the proposed data-driven MPC scheme in practice, the following ingredients are required. First of all, the design parameters in the cost, i.e., $Q,R,\lambda_{\alpha},\lambda_{\sigma}$ , have to be selected suitably. The proof and discussion of Theorem 3 give a qualitative guideline for choosing the regularization parameters. Further, as in the nominal case (Section III), measured data with a persistently exciting input as well as a (potentially rough) upper bound on the system’s order need to be available. Finally, an upper bound on the noise level $\bar{\varepsilon}$ is required.

While these ingredients suffice to apply the proposed scheme, computing bounds as in (28) and (29) is a difficult task in practice. Theorem 3 should be interpreted as a qualitative result which illustrates a) the influence of the regularization parameters on stability and robustness of the presented MPC scheme and b) that large persistence of excitation (compared to the noise level) increases the region of attraction and reduces the tracking error. Further, many of the employed bounds rely on conservative estimates such as $(a+b)^{2}\leq 2a^{2}+2b^{2}$ . In principle, it is possible to improve some of the quantitative estimates at the price of a more involved notation. Nevertheless, such improved estimates may lead to meaningful, non-conservative, verifiable conditions on the noise level $\bar{\varepsilon}$ for closed-loop stability, and are therefore an interesting issue for future research.
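The conservatism of the estimate $(a+b)^{2}\leq 2a^{2}+2b^{2}$ mentioned in Remark 6 can be seen from a one-line computation, added here as a worked illustration:

$$
2a^{2}+2b^{2}-(a+b)^{2}=a^{2}-2ab+b^{2}=(a-b)^{2}\geq 0 .
$$

The estimate is tight only for $a=b$. For $a=-b$, the left-hand side $(a+b)^{2}$ is zero while the right-hand side equals $4a^{2}$, which indicates how much slack such a bound can introduce each time it is applied.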

Remark 7.

In the nominal MPC scheme (3) as well as in its robust modification (6), the data $(u^{d},y^{d})$ used for prediction is fixed. Alternatively, one may update the data using online measurements, given that the closed loop is persistently exciting. Indeed, we believe that one of the main advantages of the proposed scheme is its ability to cope (locally) with nonlinear components of the unknown system. Nonlinear dynamical systems are in general difficult to identify and thus, the proposed approach may be simpler than a model-based MPC scheme with prior system identification. As illustrated in [21] with an application of a similar MPC scheme to a nonlinear stochastic quadcopter system, the approach is already applicable in practice to time-varying or nonlinear dynamics without updating the data online. Providing theoretical guarantees for the application of the proposed scheme to a nonlinear system is an interesting and relevant problem for future research.

As in the nominal MPC scheme, it is easy to see that the only free decision variables of Problem (6) are $\alpha(t)$ and $\sigma(t)$ with at least $m(L+2n)+n$ and $p(L+n)$ free parameters, respectively (cf. Remark 1). In contrast, to implement a model-based MPC scheme (with state measurements), $mL$ parameters are required. When neglecting the constraint (6d) (cf. Remark 3), the slack variable $\sigma(t)$ can be eliminated from (6) by directly penalizing the norm of the model mismatch $\bar{y}(t)-H_{L+n}(\tilde{y}^{d})\alpha(t)$ in the cost. Hence, considering the minimal amount of data required for persistence of excitation, Problem (6) has roughly the same number of decision variables as a model-based MPC problem. In contrast to the nominal case, however, Theorem 3 implies that larger data horizons $N$ are beneficial for the theoretical properties of the proposed scheme as they typically decrease the constant $c_{pe}$. On the other hand, increasing values for $N$ also lead to an increasing online complexity of (6) since $\alpha(t)\in\mathbb{R}^{N-L+1}$, i.e., the presented MPC approach allows for a tradeoff between computational complexity and desired closed-loop performance by appropriately selecting $N$.
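The dependence of the number of decision variables on the data length can be made concrete with a small sketch of a Hankel matrix construction. The helper name `hankel_matrix` and the toy dimensions are assumptions for illustration only. A Hankel matrix of a given depth built from $N$ samples has $N-\text{depth}+1$ columns, which is the dimension of the weighting vector multiplying it, so longer data sequences directly enlarge the optimization problem.

```python
import numpy as np


def hankel_matrix(seq, depth):
    """Stack all length-`depth` windows of `seq` (shape (N, dim)) as columns."""
    seq = np.asarray(seq, dtype=float)
    n_samples, dim = seq.shape
    n_cols = n_samples - depth + 1
    if n_cols < 1:
        raise ValueError("depth exceeds the number of samples")
    # Column i contains the samples seq[i], ..., seq[i + depth - 1] stacked vertically.
    return np.column_stack([seq[i:i + depth].reshape(-1) for i in range(n_cols)])


# Toy data: N = 20 samples of a 2-dimensional signal, windows of depth 5.
x = np.arange(40, dtype=float).reshape(20, 2)
H = hankel_matrix(x, depth=5)
```

For this toy signal, `H` has `5 * 2 = 10` rows and `20 - 5 + 1 = 16` columns; doubling the number of samples at fixed depth would add 20 further columns.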

In contrast, the performance of identification-based MPC typically improves if larger amounts of data are employed, whereas the online complexity is independent of $N$. However, while the scheme presented in this paper provides end-to-end guarantees for the closed loop using noisy data of finite length, the derivation of non-conservative estimation bounds on system parameters from such data, which would be required for guarantees in model-based MPC, is difficult in general and an active field of research [36, 37]. An extensive quantitative comparison of model-based MPC and the proposed data-driven MPC in theory and for practical examples is an interesting issue for future research.

V Example

In this section, we apply the robust data-driven MPC scheme of Section IV to a four-tank system, which has been considered in [38]. This system is well known as a real-world example that is open-loop stable but can be destabilized by an MPC without terminal constraints if the prediction horizon is too short. Similarly, we show in this section that our proposed scheme is able to track a specified setpoint, whereas a scheme without terminal constraints as suggested in [20, 21, 22] leads to an unstable closed loop, unless it is suitably modified.

We consider a linearized version of the system from [38], which takes the form

[TABLE]

For the following application of the robust data-driven MPC scheme, the system matrices are unknown and only measured input-output data is available. The control goal is tracking of the setpoint of the linearized system

[TABLE]

which is readily shown to satisfy the dynamics. We consider no constraints on the input or the output. In an open-loop experiment, an input-output trajectory of length $N=400$ is measured, where the input is chosen randomly from the box $[-1,1]^{2}$, i.e., $u^{d}_{k}\in[-1,1]^{2}$, and the output is subject to uniformly distributed additive measurement noise with bound $\bar{\varepsilon}=0.002$. The online measurements used to update the initial conditions (6b) in the MPC scheme are subject to the same type of noise.
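The open-loop experiment can be sketched as follows. The system matrices `A`, `B`, `C` below are hypothetical placeholders of matching dimensions (four states, two inputs, two outputs), not the four-tank model of the paper, and the random seed is arbitrary. The final rank test assumes the usual definition of persistence of excitation of order $L+2n$, namely that the input Hankel matrix of depth $L+2n$ has full row rank $m(L+2n)$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stable LTI system (placeholder, NOT the four-tank matrices).
A = np.diag([0.9, 0.85, 0.8, 0.75])
A[0, 2] = 0.05
A[1, 3] = 0.05
B = np.array([[0.1, 0.0], [0.0, 0.1], [0.0, 0.2], [0.2, 0.0]])
C = np.array([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]])

N, eps_bar = 400, 0.002      # data length and noise bound from the example
L, n, m = 30, 4, 2           # prediction horizon, system order, input dimension

# Random input in [-1, 1]^2 and the resulting noise-free output.
u_d = rng.uniform(-1.0, 1.0, size=(N, m))
x = np.zeros(n)
y_d = np.empty((N, C.shape[0]))
for k in range(N):
    y_d[k] = C @ x
    x = A @ x + B @ u_d[k]

# Uniformly distributed additive measurement noise bounded by eps_bar.
noise = rng.uniform(-eps_bar, eps_bar, size=y_d.shape)
y_noisy = y_d + noise


def hankel_matrix(seq, depth):
    """Stack all length-`depth` windows of `seq` (shape (N, dim)) as columns."""
    n_cols = seq.shape[0] - depth + 1
    return np.column_stack([seq[i:i + depth].reshape(-1) for i in range(n_cols)])


# Persistence of excitation check on the input data (assumed order L + 2n).
H_u = hankel_matrix(u_d, L + 2 * n)
is_pe = np.linalg.matrix_rank(H_u) == m * (L + 2 * n)
```

Only `u_d` and `y_noisy` would be available to the controller; the noise-free output `y_d` and the matrices are kept here solely to generate the data.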

We choose $L=30$ for the prediction horizon as well as the following design parameters

[TABLE]

The closed-loop output resulting from the application of Problem (6) in a $1$ -step MPC scheme is displayed in Figure 2. It can be seen that the control goal is fulfilled, with only slight deviations from the desired equilibrium. On the other hand, if the same scheme without terminal constraints is applied to the system, then the closed loop is unstable and diverges with the chosen parameters for both a $1$ -step and an $n$ -step MPC scheme (cf. again Figure 2). This confirms our initial motivation that rigorous guarantees are indeed desirable for data-driven MPC methods, in particular when they are applied to practical systems. Furthermore, it can also be observed in Figure 2 that an $n$ -step version of the proposed MPC scheme with terminal equality constraints yields slightly better tracking accuracy, compared to the $1$ -step scheme. We note that, with the above choice of parameters, the non-convex constraint (6d) is automatically satisfied without enforcing it explicitly (cf. Remark 3).

Theorem 3 gives qualitative guidelines for the tuning of the design parameters to guarantee robust stability. In the following, we analyze the influence of various parameters on the closed-loop behavior. Theorem 3 requires that the regularization parameters lie within specific bounds. This is confirmed for the present example, where the MPC scheme achieves desirable closed-loop performance similar to Figure 2 as long as $0.05\leq\lambda_{\alpha}\bar{\varepsilon}\leq 0.5$. If $\lambda_{\alpha}$ is chosen too low, then the closed loop is unstable since the norm of $\alpha^{*}(t)$ and hence the amplification of the measurement noise in (6a) is too large. Conversely, if $\lambda_{\alpha}$ is chosen too large, then the asymptotic tracking error increases since the cost term $\lambda_{\alpha}\bar{\varepsilon}\lVert\alpha^{*}(t)\rVert_{2}$ dominates over the tracking cost. Similarly, if $\lambda_{\sigma}<500$, then the closed loop may be unstable since we did not consider the constraint (6d) and therefore the slack variable is too large, which has a negative impact on the prediction accuracy. An upper bound on $\lambda_{\sigma}$ beyond which the closed-loop behavior is undesirable could not be observed for the present example. Further, if the input weighting $R$ is chosen too low, then the robustness with respect to the noise deteriorates, which can be explained via the bound (41), which grows with $1/\lambda_{\min}(R)$. If the input weighting is chosen large enough, then an MPC scheme without terminal constraints also stabilizes the desired equilibrium.
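As a worked conversion, with the noise bound $\bar{\varepsilon}=0.002$ of this example, the reported range for the product $\lambda_{\alpha}\bar{\varepsilon}$ corresponds to the following range for $\lambda_{\alpha}$ itself:

$$
\frac{0.05}{0.002}\leq\lambda_{\alpha}\leq\frac{0.5}{0.002}
\quad\Longleftrightarrow\quad
25\leq\lambda_{\alpha}\leq 250 .
$$

Since the range is stated for the product, a different noise bound would shift the admissible values of $\lambda_{\alpha}$ inversely, assuming the same range for $\lambda_{\alpha}\bar{\varepsilon}$ remained appropriate.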

Regarding the knowledge of the system order $n=4$ , it suffices if an upper bound on $n$ is available, i.e., if for instance $n=10$ is used in (6). If the system order is assumed lower than $n=4$ , then the closed loop can be unstable. The prediction horizon $L$ can be chosen (roughly) between $7\leq L\leq 70$ . The upper bound can be explained by noting that a larger $L$ implies that the constant $c_{pe}$ increases (compare the discussion after (9)) and therefore, the asymptotic tracking error increases. On the other hand, the lower bound is due to the terminal equality constraints which require local controllability. Moreover, the steady-state tracking error, which can be seen e.g. in Figure 2 (b), may increase or decrease, depending on the particular noise instance, and generally increases with the noise level $\bar{\varepsilon}$ . This confirms again the analysis of Section IV, which showed exponential stability of a set which grows with the noise level. Finally, if the norm of the data input $u^{d}$ increases (i.e., $c_{pe}$ decreases), then the tracking error decreases.

VI Conclusion

In the present paper, we proposed and analyzed a novel MPC scheme with terminal equality constraints, which uses only past measured data for the prediction, without any prior system identification step. We showed that, for a low noise amplitude, for a large ratio between persistence of excitation and the noise level, and for suitably tuned parameters, the closed loop in an $n$-step MPC scheme is recursively feasible and practically exponentially stable w.r.t. the noise level. To the best of our knowledge, we have provided the first analysis regarding recursive feasibility and stability for a purely data-driven (model-free) MPC scheme. Further, the analysis provides qualitative guidelines to choose the design parameters, and it illustrates the influence of other parameters, such as a persistence of excitation bound, on the region of attraction. While the MPC scheme is simple to implement, its analysis is challenging since we consider two sorts of noise: a) additive output noise and b) noise in the prediction model, which acts similarly to a multiplicative, parametric error in model-based MPC. In an application to a practical example, we showed that the proposed MPC scheme guarantees stability, whereas an existing data-driven MPC scheme without terminal constraints leads to an unstable closed loop.

Several topics for future research are left open. Extensions of the presented data-driven MPC approach to online optimization over artificial equilibria and robust output constraint satisfaction are provided in the recent works [27] and [31], respectively. Another extension, which would be highly interesting but also challenging, is the development of data-driven MPC schemes for nonlinear systems with meaningful closed-loop guarantees. Finally, many of the bounds employed in our proofs are conservative, and improving them may lead to less conservative, verifiable conditions on the admissible noise level for closed-loop stability.

Bibliography

[1] Z.-S. Hou and Z. Wang, “From model-based control to data-driven control: Survey, classification and perspective,” Information Sciences, vol. 235, pp. 3–35, 2013.
[2] B. Recht, “A tour of reinforcement learning: The view from continuous control,” Annual Review of Control, Robotics, and Autonomous Systems, 2018.
[3] J. B. Rawlings, D. Q. Mayne, and M. M. Diehl, Model Predictive Control: Theory, Computation, and Design, 2nd ed. Nob Hill Pub, 2017.
[4] L. Ljung, System Identification: Theory for the User. Prentice-Hall, Englewood Cliffs, NJ, 1987.
[5] V. Adetola and M. Guay, “Robust adaptive MPC for constrained uncertain nonlinear systems,” International Journal of Adaptive Control and Signal Processing, vol. 25, no. 2, pp. 155–167, 2011.
[6] A. Aswani, H. Gonzalez, S. S. Sastry, and C. Tomlin, “Provably safe and robust learning-based model predictive control,” Automatica, vol. 49, no. 5, pp. 1216–1226, 2013.
[7] M. Tanaskovic, L. Fagiano, R. Smith, and M. Morari, “Adaptive receding horizon control for constrained MIMO systems,” Automatica, vol. 50, no. 12, pp. 3019–3029, 2014.
[8] F. Berkenkamp, M. Turchetta, A. Schoellig, and A. Krause, “Safe model-based reinforcement learning with stability guarantees,” in Advances in Neural Information Processing Systems, 2017, pp. 908–918.
