Online Simultaneous State and Parameter Estimation for Second-order Nonlinear Systems

Rushikesh Kamalapurkar

arXiv:1703.07068·eess.SY·December 6, 2024


TL;DR

This paper introduces a novel online adaptive observer for second-order nonlinear systems that estimates states and parameters simultaneously without needing persistent excitation, using a Lyapunov-based approach.

Contribution

It presents a concurrent learning-based method for real-time state and parameter estimation in nonlinear systems, reducing excitation requirements compared to traditional methods.

Findings

01. Estimation errors are uniformly ultimately bounded.

02. The method works with finite excitation intervals.

03. No persistent excitation needed for convergence.

Abstract

In this paper, a concurrent learning based adaptive observer is developed for a class of second-order nonlinear time-invariant systems with uncertain dynamics. The developed technique results in simultaneous online state and parameter estimation. A Lyapunov-based analysis is used to show that the state and parameter estimation errors are uniformly ultimately bounded. As opposed to persistent excitation which is required for parameter estimation in traditional adaptive control methods, the developed technique only requires excitation over a finite time interval.

Tables

Table 1. Sensitivity analysis for the linear system. The nominal values of $\tau_{1}$, $\tau_{2}$, $\tau_{3}$, $\beta_{1}$, and $k_{\theta}$ were selected to be $\tau_{1}=1.5$, $\tau_{2}=1.2$, $\tau_{3}=1.0$, $\beta_{1}=0.4$, and $k_{\theta}=2/N$. Zero-mean Gaussian noise with variance 0.001 was used with a step size $\Delta_{t}=0.001$.


Parameter	Tested Values	RMS Error Variation	Steady-State RMS Error Variation
$\tau_{1}$	1.1 - 2.0	55.91 - 64.21	0.1255 - 0.1548
$\tau_{2}$	0.8 - 1.7	56.22 - 65.61	0.1134 - 0.1339
$\tau_{3}$	0.6 - 1.5	56.98 - 64.98	0.1206 - 0.1337
$\beta_{1}$	0.05 - 0.9	58.50 - 63.04	0.1265 - 0.2509
$k_{\theta}$	$0.5/N$ - $4/N$	58.14 - 62.62	0.1161 - 0.1266

Table 2. Sensitivity analysis for the nonlinear system. The nominal values of $\tau_{1}$, $\tau_{2}$, $\beta_{1}$, and $k_{\theta}$ were selected to be $\tau_{1}=1.2$, $\tau_{2}=0.9$, $\beta_{1}=0.7$, and $k_{\theta}=0.5/N$. Zero-mean Gaussian noise with variance 0.001 was used with a step size $\Delta_{t}=0.001$.


Parameter	Tested Values	RMS Error Variation	Steady-State RMS Error Variation
$\tau_{1}$	0.8 - 1.7	0.998 - 3.848	0.0325 - 0.1339
$\tau_{2}$	0.5 - 1.4	1.011 - 3.546	0.0294 - 0.1270
$\beta_{1}$	0.1 - 1.2	1.224 - 1.763	0.0324 - 0.3273
$k_{\theta}$	$0.01/N$ - $2/N$	1.090 - 1.684	0.0296 - 0.0515

Equations

$$\dot{x}_{1}$$

$$\dot{x}_{2}$$

$$\dot{x}_{3}$$

$$y$$

$$\left\{\dot{x}_{i}=x_{i+1}\right\}_{i=1}^{N-1},\quad\dot{x}_{N}=f(x,u),\quad y=x^{-},$$

$$\left\{\dot{x}_{i}=f_{i}(x^{-},u)+x_{i+1}\right\}_{i=1}^{N-1},\quad\dot{x}_{N}=f(x,u),\quad y=x^{-},$$

$$\dot{x}_{3}=f^{o}(x,u)+\theta^{T}\sigma(x,u)+\epsilon(x,u).$$

$$\int_{t-\tau_{2}}^{t}\left(x_{3}(\zeta_{2})-x_{3}(\zeta_{2}-\tau_{1})\right)d\zeta_{2}=[\mathcal{I}_{2}f^{o}](t)+\theta^{T}[\mathcal{I}_{2}\sigma](t)+[\mathcal{I}_{2}\epsilon](t),\quad\forall t\in\mathbb{R}_{\geq\mathscr{T}},\quad\text{where }\mathscr{T}=T_{0}+\tau_{1}+\tau_{2},$$

$$f\mapsto\int_{t-\tau_{2}}^{t}\int_{\zeta_{2}-\tau_{1}}^{\zeta_{2}}f(x(\zeta_{1}),u(\zeta_{1}))\,d\zeta_{1}\,d\zeta_{2}.$$

$$x_{2}(t-\tau_{2}-\tau_{1})-x_{2}(t-\tau_{1})-x_{2}(t-\tau_{2})+x_{2}(t)=[\mathcal{I}_{1}f_{2}](t)-[\mathcal{I}_{1}f_{2}](t-\tau_{1})+[\mathcal{I}_{2}f^{o}](t)+\theta^{T}[\mathcal{I}_{2}\sigma](t)+[\mathcal{I}_{2}\epsilon](t),\quad\forall t\in\mathbb{R}_{\geq\mathscr{T}},$$

$$f\mapsto\int_{t-\tau_{2}}^{t}f(x_{1}(\zeta_{2}),u(\zeta_{2}))\,d\zeta_{2}.$$

$$X(t)=F(t)+\theta^{T}G(t)+E(t),\quad\forall t\in\mathbb{R}_{\geq T_{0}},$$

$$X(t):=\begin{cases}x_{2}(t-\tau_{2}-\tau_{1})-x_{2}(t-\tau_{1})-x_{2}(t-\tau_{2})+x_{2}(t),&t\in[\mathscr{T},\infty),\\0,&t<\mathscr{T},\end{cases}$$

$$F(t):=\begin{cases}[\mathcal{I}_{1}f_{2}](t)-[\mathcal{I}_{1}f_{2}](t-\tau_{1})+[\mathcal{I}_{2}f^{o}](t),&t\in[\mathscr{T},\infty),\\0,&t<\mathscr{T},\end{cases}$$

$$G(t):=\begin{cases}[\mathcal{I}_{2}\sigma](t),&t\in[\mathscr{T},\infty),\\0,&t<\mathscr{T},\end{cases}$$

$$E(t):=\begin{cases}[\mathcal{I}_{2}\epsilon](t),&t\in[\mathscr{T},\infty),\\0,&t<\mathscr{T}.\end{cases}$$

$$\dot{\hat{x}}_{1}$$

$$\dot{\hat{x}}_{2}$$

$$\dot{\hat{x}}_{3}$$

$$\tilde{x}=x-\hat{x},\quad\tilde{\theta}=\theta-\hat{\theta},$$

$$\dot{\tilde{x}}_{1}=$$

$$\nu=\alpha^{2}\tilde{x}_{2}-(k+\alpha+\beta)\eta,$$

$$\dot{\zeta}$$

$$\eta$$

$$r:=\dot{\tilde{x}}_{2}+\alpha\tilde{x}_{2}-\tilde{f}_{2}(x^{-},u,\hat{x}^{-})+\eta,$$

$$\dot{\eta}=-\beta\eta-kr-\alpha\tilde{x}_{3},\quad\eta(T_{0})=0.$$

$$\max_{(x^{-},u,\tilde{x}^{-})\in\tilde{\chi}}\left\|\tilde{f}_{1}(x^{-},u,\hat{x}^{-})\right\|\leq L\left\|\tilde{x}^{-}\right\|,$$

$$\max_{(x^{-},u,\tilde{x}^{-})\in\tilde{\chi}}\left\|\tilde{f}_{2}(x^{-},u,\hat{x}^{-})\right\|\leq L\left\|\tilde{x}^{-}\right\|.$$

$$X_{i}=\hat{F}_{i}+\theta^{T}\hat{G}_{i}+\mathcal{E}_{i},\quad\forall i\in\{1,\dots,M\},$$

$$X_{i}=X(t_{i}),\quad\hat{F}_{i}=\hat{F}(t_{i}),\quad\hat{G}_{i}=\hat{G}(t_{i}),$$

$$\hat{F}(t):=\begin{cases}[\hat{\mathcal{I}}_{1}f_{2}](t)-[\hat{\mathcal{I}}_{1}f_{2}](t-\tau_{1})+[\hat{\mathcal{I}}_{2}f^{o}](t),&t\in[\mathscr{T},\infty),\\0,&t<\mathscr{T},\end{cases}$$

$$\hat{G}(t):=\begin{cases}[\hat{\mathcal{I}}_{2}\sigma](t),&t\in[\mathscr{T},\infty),\\0,&t<\mathscr{T},\end{cases}$$

$$f\mapsto\int_{t-\tau_{2}}^{t}\int_{\zeta_{2}-\tau_{1}}^{\zeta_{2}}f(\hat{x}(\zeta_{1}),u(\zeta_{1}))\,d\zeta_{1}\,d\zeta_{2},$$


Full text

Online Simultaneous State and Parameter Estimation

Ryan Self, Moad Abudia, S M Nahid Mahmud, and Rushikesh Kamalapurkar

The authors are with the School of Mechanical and Aerospace Engineering, Oklahoma State University, Stillwater, OK, USA (e-mail: {rself, abudia, nahid.mahmud, rushikesh.kamalapurkar}@okstate.edu). This research was supported, in part, by the National Science Foundation (NSF) under award number 1925147, and the Air Force Research Laboratories (AFRL) under award number FA8651-19-2-0009. Any opinions, findings, conclusions, or recommendations detailed in this article are those of the author(s), and do not necessarily reflect the views of the sponsoring agencies.

Abstract

In this paper, a data-driven adaptive observer is developed for a class of linear and nonlinear time-invariant systems with uncertain dynamics. The developed state-observer is utilized to generate estimates of the state from input-output data. Using the estimated state trajectories, in addition to the known system inputs, a novel data-driven parameter estimation scheme is developed to achieve simultaneous state and parameter estimation, online, for both linear and nonlinear systems. The technique results in state and parameter estimation errors that are uniformly ultimately bounded to prescribed bounds near the origin. As opposed to persistent excitation, which is required for parameter convergence in traditional adaptive observers, the developed technique only requires excitation over a finite time interval. A sensitivity analysis and simulation results in both noise-free and noisy environments are presented to validate the design.

I Introduction

Traditional adaptive control methods handle uncertainty in the system dynamics by maintaining a parametric estimate of the model and utilizing it to generate a feedforward control signal (see, e.g., [1, 2, 3]). While the feedforward-feedback architecture guarantees stability of the closed-loop system, the control law is not robust to disturbances and seldom provides information regarding the quality of the estimated model (cf. [1] and [2]). While accurate parameter estimation can improve robustness and transient performance of adaptive controllers (see, e.g., [4, 5, 6]), parameter convergence typically requires restrictive assumptions such as persistence of excitation (PE) [1, 7, 8, 9]. An excitation signal is often added to the controller to ensure PE; however, the added signal can cause mechanical fatigue and compromise the tracking performance. This motivates the development of techniques that facilitate parameter convergence without requiring PE.

Parameter convergence can be achieved under a finite excitation condition using data-driven methods, such as concurrent learning (CL) (see, e.g., [6, 10, 11]), where the parameters are estimated by storing data during time intervals when the system is excited, and then utilizing the stored data to drive adaptation when excitation is unavailable. In addition to parameter estimation, CL adaptive control methods also possess similar robustness to bounded disturbances as $\sigma$-modification, $e$-modification, etc., without the associated drawbacks such as drawing the parameter estimates to arbitrary set-points [6, 12, 10, 11]. CL has been shown to be an effective tool for adaptive control (see, e.g., [6, 12, 10, 11]) and adaptive estimation (see, e.g., [13, 14, 15, 16, 17, 18]).

Adaptation techniques similar to the CL method are utilized to implement reinforcement learning under finite excitation conditions in results such as [13, 14, 15, 16, 17, 18]. CL methods have also been extended to classes of switched systems [19, 20], systems driven by stochastic processes [21], and systems with time-varying parameters [22]. A major drawback of CL methods is that they require numerical differentiation of the state measurements. CL methods that do not require numerical differentiation of the state measurements are developed in results such as [23] and [24]; however, they require full state feedback. Since full state feedback is often not available, the development of an output-feedback CL framework is well-motivated. To achieve parameter convergence using output-feedback CL, access to simultaneous estimates of the unmeasurable states is required.

Due to advantageous properties such as the separation principle, there is a large body of literature on simultaneous state and parameter estimation for linear systems [25, 26, 27]. Estimation methods for linear systems typically use popular techniques, such as Kalman filters, because of their well-documented effectiveness. More recently, researchers have also explored the state and parameter estimation problem for nonlinear systems [28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38]. While tools such as particle filters [30], extended Kalman filters [31], multi-observers [32], and adaptive observers [33, 34, 35, 36, 37, 38] have been examined for nonlinear simultaneous state and parameter estimation, they either do not provide theoretical performance guarantees [30, 31] or require stringent assumptions [32, 33, 34, 35, 36, 37, 38], such as PE, which are generally difficult, if not impossible, to check online. While relaxed PE results are presented in [39, 40, 41], these results still require a persistent excitation condition. Therefore, this paper aims to provide both theoretical guarantees and finite (as opposed to persistent) excitation assumptions that are verifiable online.

In this paper, the preliminary results from [42] and [43] are consolidated and generalized to yield an output feedback concurrent learning method for simultaneous state and parameter estimation in uncertain linear and nonlinear systems. In particular, this paper yields a formal method for simultaneous state and parameter estimation for a broad class of dynamical systems that includes the Brunovsky canonical form studied in [42] and [43] as a special case. An adaptive state-observer is utilized to generate estimates of the state from input-output data. The estimated state trajectories along with the known inputs are then utilized in a novel data-driven parameter estimation scheme to achieve simultaneous state and parameter estimation. Convergence of the state estimates and the parameter estimates to a small neighborhood of the origin is established under a finite (as opposed to persistent) excitation condition.

The paper is organized as follows. In Section II, the class of nonlinear systems that the developed method applies to is described. An integral error system that facilitates parameter estimation is developed in Section II-B. Section II-C is dedicated to the design of a robust state observer. Section II-D details the developed parameter estimator. Section II-E details the algorithm for selection and storage of the data that is used to implement concurrent learning. Section II-F is dedicated to a Lyapunov-based analysis of the developed technique. In Section III, linear systems are considered. A linear error system is developed in Section III-B to facilitate CL-based adaptation. A CL-based parameter estimator is designed in Section III-C. A Lyapunov-based stability analysis of the parameter estimator is presented in Section III-D. Section IV demonstrates the efficacy of the developed method via a numerical simulation and Section V concludes the paper.

II Nonlinear Systems

II-A Problem Formulation

Consider a nonlinear system of the form

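$$\dot{x}_{1}=f_{1}\left(x^{-},u\right),\quad\dot{x}_{2}=f_{2}\left(x^{-},u\right)+x_{3},\quad\dot{x}_{3}=f_{3}\left(x,u\right),\quad y=x^{-},\tag{1}$$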

where $x_{1}\in\mathbb{R}^{n_{1}}$ and $x_{2},x_{3}\in\mathbb{R}^{n_{2}}$ denote the state variables, $x:=\begin{bmatrix}x_{1}^{T}&x_{2}^{T}&x_{3}^{T}\end{bmatrix}^{T}$ is the system state, $f_{1}\colon\mathbb{R}^{n_{1}+n_{2}}\times\mathbb{R}^{m}\to\mathbb{R}^{n_{1}}$ and $f_{2}\colon\mathbb{R}^{n_{1}+n_{2}}\times\mathbb{R}^{m}\to\mathbb{R}^{n_{2}}$ are known and locally Lipschitz continuous, $f_{3}\colon\mathbb{R}^{n_{1}+2n_{2}}\times\mathbb{R}^{m}\to\mathbb{R}^{n_{2}}$ is locally Lipschitz continuous, $u\in\mathbb{R}^{m}$ is the control input, $y\in\mathbb{R}^{n_{1}+n_{2}}$ denotes the output, and $x^{-}:=\begin{bmatrix}x_{1}^{T}&x_{2}^{T}\end{bmatrix}^{T}$ denotes the measurable part of the system state. The model, $f_{3}$, consists of a known nominal part and an unknown part, i.e., $f_{3}=f^{o}+g$, where $f^{o}\colon\mathbb{R}^{n_{1}+2n_{2}}\times\mathbb{R}^{m}\to\mathbb{R}^{n_{2}}$ is known and locally Lipschitz and $g\colon\mathbb{R}^{n_{1}+2n_{2}}\times\mathbb{R}^{m}\to\mathbb{R}^{n_{2}}$ is unknown and locally Lipschitz. The objective is to design an adaptive estimator to identify the state, $x$, and the unknown function, $g$, online, using input-output measurements.

Systems of the form (1) encompass $N^{\text{th}}$ -order linear systems and Euler-Lagrange models with invertible inertia matrices, and hence, represent a wide class of physical plants, including but not limited to robotic manipulators and autonomous ground, aerial, and underwater vehicles.

Assumption 1.

A compact set $\chi\subseteq\mathbb{R}^{n_{1}+2n_{2}}\times\mathbb{R}^{m}$ such that $(x(t),u(t))\in\chi$, $\forall\ t\in\mathbb{R}_{\geq T_{0}}$ and $\forall\ T_{0}\geq 0$, is known, where $T_{0}\in\mathbb{R}_{\geq 0}$ denotes the initial time.

Footnote 1: For $a\in\mathbb{R}$, the notation $\mathbb{R}_{\geq a}$ denotes the interval $\left[a,\infty\right)$ and the notation $\mathbb{R}_{>a}$ denotes the interval $\left(a,\infty\right)$.

Remark 1.

The problem formulation in (1) incorporates commonly occurring dynamical systems described using the Brunovsky canonical form [44]

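$$\left\{\dot{x}_{i}=x_{i+1}\right\}_{i=1}^{N-1},\quad\dot{x}_{N}=f\left(x,u\right),\quad y=x^{-},\tag{2}$$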

and the extended Brunovsky form

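$$\left\{\dot{x}_{i}=f_{i}\left(x^{-},u\right)+x_{i+1}\right\}_{i=1}^{N-1},\quad\dot{x}_{N}=f\left(x,u\right),\quad y=x^{-},\tag{3}$$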

where $x_{1},x_{2},\ldots,x_{N}\in\mathbb{R}^{n}$ denote the state variables, $x:=\begin{bmatrix}x_{1}^{T}&x_{2}^{T}&\ldots&x_{N}^{T}\end{bmatrix}^{T}$ is the system state, $f_{1},f_{2},\ldots,f_{N-1}\colon\mathbb{R}^{(N-1)n}\times\mathbb{R}^{m}\to\mathbb{R}^{n}$ and $f\colon\mathbb{R}^{Nn}\times\mathbb{R}^{m}\to\mathbb{R}^{n}$ are locally Lipschitz continuous, $u\in\mathbb{R}^{m}$ is the control input, $y\in\mathbb{R}^{(N-1)n}$ denotes the output, and $x^{-}:=\begin{bmatrix}x_{1}^{T}&x_{2}^{T}&\ldots&x_{N-1}^{T}\end{bmatrix}^{T}$ denotes the measurable part of the system state.
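One way to see why the Brunovsky forms are special cases is to relabel the states. The sketch below is illustrative only and rests on an assumption: that (1) reads $\dot{x}_{1}=f_{1}(x^{-},u)$, $\dot{x}_{2}=f_{2}(x^{-},u)+x_{3}$, $\dot{x}_{3}=f_{3}(x,u)$, $y=x^{-}$, which is the form suggested by the signatures of $f_{1},f_{2},f_{3}$ and by the identity $x_{3}=\dot{x}_{2}-f_{2}(x^{-},u)$ used in Section II-B. To avoid a clash of symbols, the Brunovsky states are written $z_{1},\ldots,z_{N}$ and the extended-form functions $\phi_{1},\ldots,\phi_{N-1}$ (both hypothetical names), with $N\geq 3$.

```latex
% States of (1) expressed through the (extended) Brunovsky states z_1, ..., z_N
\begin{align*}
x_{1} &:= \begin{bmatrix} z_{1}^{T} & \cdots & z_{N-2}^{T} \end{bmatrix}^{T}, &
x_{2} &:= z_{N-1}, &
x_{3} &:= z_{N}, &
n_{1} &= (N-2)n, \quad n_{2} = n, \\
% Extended Brunovsky form: \dot{z}_i = \phi_i(z^-, u) + z_{i+1}, \dot{z}_N = f(z, u)
f_{1}(x^{-},u) &:= \begin{bmatrix} \phi_{1}(x^{-},u) + z_{2} \\ \vdots \\ \phi_{N-2}(x^{-},u) + z_{N-1} \end{bmatrix}, &
f_{2}(x^{-},u) &:= \phi_{N-1}(x^{-},u), &
f_{3}(x,u) &:= f(x,u).
\end{align*}
% The plain Brunovsky form is recovered with \phi_i \equiv 0 for all i.
```

With this relabeling, $x^{-}=\begin{bmatrix}z_{1}^{T}&\cdots&z_{N-1}^{T}\end{bmatrix}^{T}$ is exactly the measured output, and only $z_{N}$ plays the role of the unmeasured state $x_{3}$.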

II-B Error System for Estimation

Given a constant $\overline{\epsilon}\in\mathbb{R}_{>0}$, there exist $p\in\mathbb{N}$ and $\overline{\sigma},\overline{\theta}\in\mathbb{R}_{>0}$, such that the unknown function $g$ can be approximated, over the compact set $\chi$, using basis functions $\sigma\colon\mathbb{R}^{n_{1}+2n_{2}}\times\mathbb{R}^{m}\to\mathbb{R}^{p}$ as $g\left(x,u\right)=\theta^{T}\sigma\left(x,u\right)+\epsilon\left(x,u\right)$, where $\epsilon\colon\mathbb{R}^{n_{1}+2n_{2}}\times\mathbb{R}^{m}\to\mathbb{R}^{n_{2}}$ denotes the approximation error, $\theta\in\mathbb{R}^{p\times n_{2}}$ is a constant matrix of unknown parameters, and $\max_{\left(x,u\right)\in\chi}\left\|\sigma\left(x,u\right)\right\|<\overline{\sigma}$, $\max_{\left(x,u\right)\in\chi}\left\|\nabla\sigma\left(x,u\right)\right\|<\overline{\sigma}$, $\max_{\left(x,u\right)\in\chi}\left\|\epsilon\left(x,u\right)\right\|<\overline{\epsilon}$, $\max_{\left(x,u\right)\in\chi}\left\|\nabla\epsilon\left(x,u\right)\right\|<\overline{\epsilon}$, and $\left\|\theta\right\|<\overline{\theta}$ [45, 46]. To obtain an error signal for parameter identification, the system in (1) is expressed in the form

$$\dot{x}_{3}=f^{o}\left(x,u\right)+\theta^{T}\sigma\left(x,u\right)+\epsilon\left(x,u\right).\tag{4}$$

Integrating (4) over the interval $\left[t-\tau_{1},t\right]$ for some constant $\tau_{1}\in\mathbb{R}_{>0}$ and then over the interval $\left[t-\tau_{2},t\right]$ for some constant $\tau_{2}\in\mathbb{R}_{>0}$ ,

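$$\int_{t-\tau_{2}}^{t}\left(x_{3}\left(\zeta_{2}\right)-x_{3}\left(\zeta_{2}-\tau_{1}\right)\right)d\zeta_{2}=\left[\mathcal{I}_{2}f^{o}\right]\left(t\right)+\theta^{T}\left[\mathcal{I}_{2}\sigma\right]\left(t\right)+\left[\mathcal{I}_{2}\epsilon\right]\left(t\right),\quad\forall t\in\mathbb{R}_{\geq\mathscr{T}},\tag{5}$$

where $\mathscr{T}=T_{0}+\tau_{1}+\tau_{2}$,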

and $\mathcal{I}_{2}$ denotes the double integral operator

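$$f\mapsto\int_{t-\tau_{2}}^{t}\int_{\zeta_{2}-\tau_{1}}^{\zeta_{2}}f\left(x\left(\zeta_{1}\right),u\left(\zeta_{1}\right)\right)d\zeta_{1}\,d\zeta_{2}.\tag{6}$$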

Using the Fundamental Theorem of Calculus and the fact that $x_{3}(t)=\dot{x}_{2}(t)-f_{2}(x^{-}(t),u(t))$ , for almost all $t\in\mathbb{R}_{\geq T_{0}}$ ,

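$$x_{2}\left(t-\tau_{2}-\tau_{1}\right)-x_{2}\left(t-\tau_{1}\right)-x_{2}\left(t-\tau_{2}\right)+x_{2}\left(t\right)=\left[\mathcal{I}_{1}f_{2}\right]\left(t\right)-\left[\mathcal{I}_{1}f_{2}\right]\left(t-\tau_{1}\right)+\left[\mathcal{I}_{2}f^{o}\right]\left(t\right)+\theta^{T}\left[\mathcal{I}_{2}\sigma\right]\left(t\right)+\left[\mathcal{I}_{2}\epsilon\right]\left(t\right),\quad\forall t\in\mathbb{R}_{\geq\mathscr{T}},\tag{7}$$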

and $\mathcal{I}_{1}$ denotes the single integral operator

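$$f\mapsto\int_{t-\tau_{2}}^{t}f\left(x_{1}\left(\zeta_{2}\right),u\left(\zeta_{2}\right)\right)d\zeta_{2}.$$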
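The step from the doubly integrated dynamics to the relationship in (7) can be spelled out as follows. This is a sketch rather than the authors' derivation; it uses only the identity $x_{3}=\dot{x}_{2}-f_{2}(x^{-},u)$ quoted above and the Fundamental Theorem of Calculus, and it suppresses the arguments of $f_{2}$, which is evaluated along the measured trajectory.

```latex
\begin{align*}
\int_{t-\tau_{2}}^{t} x_{3}(\zeta_{2})\,d\zeta_{2}
  &= \int_{t-\tau_{2}}^{t} \big(\dot{x}_{2}(\zeta_{2}) - f_{2}\big)\,d\zeta_{2}
   = x_{2}(t) - x_{2}(t-\tau_{2}) - [\mathcal{I}_{1} f_{2}](t), \\
\int_{t-\tau_{2}}^{t} x_{3}(\zeta_{2}-\tau_{1})\,d\zeta_{2}
  &= \int_{t-\tau_{1}-\tau_{2}}^{t-\tau_{1}} x_{3}(s)\,ds
   = x_{2}(t-\tau_{1}) - x_{2}(t-\tau_{1}-\tau_{2}) - [\mathcal{I}_{1} f_{2}](t-\tau_{1}).
\end{align*}
% Subtracting the second line from the first gives the left-hand side of the
% doubly integrated dynamics:
\begin{equation*}
x_{2}(t-\tau_{2}-\tau_{1}) - x_{2}(t-\tau_{1}) - x_{2}(t-\tau_{2}) + x_{2}(t)
  - [\mathcal{I}_{1} f_{2}](t) + [\mathcal{I}_{1} f_{2}](t-\tau_{1})
  = [\mathcal{I}_{2} f^{o}](t) + \theta^{T} [\mathcal{I}_{2} \sigma](t) + [\mathcal{I}_{2} \epsilon](t).
\end{equation*}
```

Moving the two $\mathcal{I}_{1}$ terms to the right-hand side leaves only measured quantities ($x_{2}$ at four time instants) on the left, which is what makes the relationship usable without numerical differentiation.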

The expression in (7) can be rearranged to form the affine system

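$$X\left(t\right)=F\left(t\right)+\theta^{T}G\left(t\right)+E\left(t\right),\quad\forall t\in\mathbb{R}_{\geq T_{0}},$$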

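where

$$X\left(t\right):=\begin{cases}x_{2}\left(t-\tau_{2}-\tau_{1}\right)-x_{2}\left(t-\tau_{1}\right)-x_{2}\left(t-\tau_{2}\right)+x_{2}\left(t\right),&t\in\left[\mathscr{T},\infty\right),\\0,&t<\mathscr{T},\end{cases}$$

$$F\left(t\right):=\begin{cases}\left[\mathcal{I}_{1}f_{2}\right]\left(t\right)-\left[\mathcal{I}_{1}f_{2}\right]\left(t-\tau_{1}\right)+\left[\mathcal{I}_{2}f^{o}\right]\left(t\right),&t\in\left[\mathscr{T},\infty\right),\\0,&t<\mathscr{T},\end{cases}$$

$$G\left(t\right):=\begin{cases}\left[\mathcal{I}_{2}\sigma\right]\left(t\right),&t\in\left[\mathscr{T},\infty\right),\\0,&t<\mathscr{T},\end{cases}$$

$$E\left(t\right):=\begin{cases}\left[\mathcal{I}_{2}\epsilon\right]\left(t\right),&t\in\left[\mathscr{T},\infty\right),\\0,&t<\mathscr{T}.\end{cases}$$

Footnote 2: The matrices $X,F,G$, and $E$ are evaluated along the trajectories of (1), and as such, are functions of $T_{0}$, $x(\cdot)$, and $u(\cdot)$. Since the bound on $x(\cdot)$ and $u(\cdot)$ imposed by Assumption 1 is uniform in $T_{0}$, the dependence of $X,F,G$, and $E$ on $T_{0}$ is not relevant to the subsequent analysis, and as such, is not made explicit in the notation.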

The affine relationship in (7) is valid for all $t\in\mathbb{R}_{\geq T_{0}}$; however, it provides useful information about the unknown parameters $\theta$ only for $t\geq\mathscr{T}$, since $X$, $F$, $G$, and $E$ are identically zero before that time. In the following, (7) will be used to solve the simultaneous state and parameter estimation problem.

While (7) can be used to learn the unknown parameters, $\theta$ , knowledge of the state variable $x_{3}$ is required to compute the matrices $F$ and $G$ . A robust adaptive state estimator is developed in the following to generate estimates of $x_{3}$ .
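The affine relationship can be checked numerically on a toy example. The sketch below is not the authors' implementation: it approximates the single and double integral operators with the trapezoidal rule and uses a hypothetical harmonic oscillator, $\dot{x}_{2}=x_{3}$, $\dot{x}_{3}=-x_{2}$, so that $f_{2}=0$, $f^{o}=0$, $\sigma(x,u)=x_{2}$, $\theta=-1$, and $\epsilon=0$. The function names (`window_integral`, `I1`, `I2`) and the step size are assumptions made for illustration, and the true state is used in place of its estimate.

```python
import math

def trapezoid(samples, h):
    """Composite trapezoidal rule for uniformly spaced samples."""
    return h * (sum(samples) - 0.5 * (samples[0] + samples[-1]))

def window_integral(f, t, width, dt):
    """Integral of f over the window [t - width, t], with step close to dt."""
    n = max(1, round(width / dt))
    h = width / n
    return trapezoid([f(t - width + k * h) for k in range(n + 1)], h)

def I1(f, t, tau2, dt=0.01):
    """Single integral operator: integral of f over [t - tau2, t]."""
    return window_integral(f, t, tau2, dt)

def I2(f, t, tau1, tau2, dt=0.01):
    """Double integral operator: outer window tau2, inner window tau1."""
    inner = lambda zeta2: window_integral(f, zeta2, tau1, dt)
    return window_integral(inner, t, tau2, dt)

# Hypothetical toy trajectory: x2(t) = sin(t), x3(t) = cos(t).
theta = -1.0                  # true parameter (scalar in this toy case)
x2 = math.sin                 # measured state as a function of time
sigma_along_traj = math.sin   # sigma(x(t), u(t)) = x2(t)

def X(t, tau1, tau2):
    """Left-hand side of the affine relationship, built from x2 only."""
    return x2(t - tau2 - tau1) - x2(t - tau1) - x2(t - tau2) + x2(t)

tau1, tau2, t = 1.2, 0.9, 5.0
lhs = X(t, tau1, tau2)
rhs = theta * I2(sigma_along_traj, t, tau1, tau2)   # F = 0 and E = 0 here
residual = abs(lhs - rhs)
print(residual)
```

The residual is limited by the quadrature error of the trapezoidal rule, so it shrinks as the step size `dt` is reduced. In the output-feedback setting of the paper, the exact trajectory inside `I2` would be replaced by the estimated one.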

II-C State Estimator Design

To generate estimates of $x_{3}$ , a state estimator inspired by [47] is developed. The estimator is given by

[TABLE]

where $\hat{x}_{1}$ , $\hat{x}_{2}$ , $\hat{x}_{3}$ , $\hat{x}$ , $\hat{x}^{-}$ , and $\hat{\theta}$ are estimates of $x_{1},$ $x_{2}$ , $x_{3}$ , $x$ , $x^{-}$ , and $\theta$ , respectively, and $\nu$ is a feedback term designed in the following.

To facilitate the design of $\nu$ , let the state and parameter estimation errors be defined as

$$\tilde{x}=x-\hat{x},\quad\tilde{\theta}=\theta-\hat{\theta},$$

and define the model error as

[TABLE]

where $\alpha$ is a positive constant, and $\tilde{f}_{1}\left(x^{-},u,\hat{x}^{-}\right):=f_{1}\left(x^{-},u\right)-f_{1}\left(\hat{x}^{-},u\right)$. The feedback component $\nu$ is designed as

$$\nu=\alpha^{2}\tilde{x}_{2}-\left(k+\alpha+\beta\right)\eta,\tag{15}$$

where the signal $\eta$ is added to compensate for the fact that the state variable $x_{3}$ is not measurable. Based on the subsequent stability analysis, the signal $\eta$ is designed as the output of the dynamic filter

[TABLE]

where $k$ and $\beta$ are positive constants and the error signal $r$ is defined as

$$r:=\dot{\tilde{x}}_{2}+\alpha\tilde{x}_{2}-\tilde{f}_{2}\left(x^{-},u,\hat{x}^{-}\right)+\eta,$$

and $\tilde{f}_{2}(x^{-},u,\hat{x}^{-}):={f}_{2}(x^{-},u)-{f}_{2}(\hat{x}^{-},u)$.

Using integration by parts to eliminate the auxiliary variable $\zeta$ , the dynamic filter can be expressed in the equivalent form

$$\dot{\eta}=-\beta\eta-kr-\alpha\tilde{x}_{3},\quad\eta\left(T_{0}\right)=0.\tag{18}$$

In the following, the filter expressed in terms of the auxiliary variable $\zeta$ is used for implementation, and the equivalent filter in (18), which is not implementable due to its dependence on $\tilde{x}_{3}$, is used for analysis.

Since $\left(x^{-},u\right)\mapsto f_{1}\left(x^{-},u\right)$ and $\left(x^{-},u\right)\mapsto f_{2}\left(x^{-},u\right)$ are locally Lipschitz, given a compact set $\tilde{\chi}\subset\chi\times\mathbb{R}^{n_{1}+2n_{2}}$ , Assumption 1 can be used to conclude that there exists an $L>0$ , independent of $T_{0}$ , such that

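$$\max_{\left(x^{-},u,\tilde{x}^{-}\right)\in\tilde{\chi}}\left\|\tilde{f}_{1}\left(x^{-},u,\hat{x}^{-}\right)\right\|\leq L\left\|\tilde{x}^{-}\right\|,$$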

and

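$$\max_{\left(x^{-},u,\tilde{x}^{-}\right)\in\tilde{\chi}}\left\|\tilde{f}_{2}\left(x^{-},u,\hat{x}^{-}\right)\right\|\leq L\left\|\tilde{x}^{-}\right\|.$$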

To generate the estimates $\hat{\theta}$ , a concurrent learning [48] technique that utilizes only the output measurements is developed, motivated by the affine error system in (7).

II-D Parameter Estimator Design

To obtain an output-feedback concurrent learning update law for the parameter estimates, a history stack, denoted by $\mathcal{H}$, is utilized. A history stack is defined as a set of ordered triples $\left\{\left(X_{i},\hat{F}_{i},\hat{G}_{i}\right)\right\}_{i=1}^{M}$ such that

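$$X_{i}=\hat{F}_{i}+\theta^{T}\hat{G}_{i}+\mathcal{E}_{i},\quad\forall i\in\left\{1,\dots,M\right\},\tag{20}$$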

where $\mathcal{E}_{i}$ is a matrix with an induced $2$ -norm that is small enough in a sense that is made precise in the subsequent analysis. Typically, a history stack that satisfies (20) is not available a priori. The history stack is recorded online using the relationship in (7), by selecting an increasing set of time-instances $\left\{t_{i}\right\}_{i=1}^{M}$ (see Fig. 1) and letting

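$$X_{i}=X\left(t_{i}\right),\quad\hat{F}_{i}=\hat{F}\left(t_{i}\right),\quad\hat{G}_{i}=\hat{G}\left(t_{i}\right),$$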

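where

$$\hat{F}\left(t\right):=\begin{cases}\left[\hat{\mathcal{I}}_{1}f_{2}\right]\left(t\right)-\left[\hat{\mathcal{I}}_{1}f_{2}\right]\left(t-\tau_{1}\right)+\left[\hat{\mathcal{I}}_{2}f^{o}\right]\left(t\right),&t\in\left[\mathscr{T},\infty\right),\\0,&t<\mathscr{T},\end{cases}$$

$$\hat{G}\left(t\right):=\begin{cases}\left[\hat{\mathcal{I}}_{2}\sigma\right]\left(t\right),&t\in\left[\mathscr{T},\infty\right),\\0,&t<\mathscr{T},\end{cases}$$

Footnote 3: The matrices $\hat{F}$ and $\hat{G}$ are evaluated along the trajectories of (12), (18), (25), and (26), and as such, depend on $T_{0}$, $u(\cdot)$, $x(\cdot)$, $\hat{x}(T_{0})$, and $\hat{\theta}(T_{0})$. For brevity of notation, the matrices are denoted as functions of time.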

where $\mathcal{\hat{I}}_{2}$ denotes the double integral operator

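$$f\mapsto\int_{t-\tau_{2}}^{t}\int_{\zeta_{2}-\tau_{1}}^{\zeta_{2}}f\left(\hat{x}\left(\zeta_{1}\right),u\left(\zeta_{1}\right)\right)d\zeta_{1}\,d\zeta_{2},$$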

and $\mathcal{\hat{I}}_{1}$ denotes the single integral operator

[TABLE]

In this case, the error term $\mathcal{E}_{i}$ is given by $\mathcal{E}_{i}=E\left(t_{i}\right)+F\left(t_{i}\right)-\hat{F}\left(t_{i}\right)+\theta^{T}\left(G\left(t_{i}\right)-\hat{G}\left(t_{i}\right)\right)$. Let $\left[t_{1},t_{2}\right)\subseteq\mathbb{R}_{\geq\mathscr{T}}$ be an interval over which the history stack was recorded. Provided the states and the state estimation errors remain within the compact sets $\chi|_{x}:=\{x\in\mathbb{R}^{n_{1}+2n_{2}}\mid(x,u)\in\chi\}$ and $\tilde{\chi}|_{\tilde{x}}:=\{\tilde{x}\in\mathbb{R}^{n_{1}+2n_{2}}\mid(x,u,\tilde{x})\in\tilde{\chi}\}$, respectively, over $I:=\left[t_{1}-\tau_{1}-\tau_{2},t_{2}\right)$, the error terms can be bounded as

[TABLE]

where $\overline{\tilde{x}}_{I}:=\max_{i\in\left\{1,\cdots,M\right\}}\sup_{t\in I}\left\|\tilde{x}\left(t\right)\right\|$ and $L_{1},L_{2}>0$ are constants.

The concurrent learning update law to estimate the unknown parameters is designed as

[TABLE]

where $k_{\theta}\in\mathbb{R}_{>0}$ is a constant adaptation gain and $\Gamma\in\mathbb{R}^{p\times p}$ is the least-squares gain updated using the update law

[TABLE]

where the matrix ${\mathscr{G}}\in\mathbb{R}^{p\times p}$ is defined as ${\mathscr{G}}:=\sum_{i=1}^{M}\left(\frac{\hat{G}_{i}}{\sqrt{1+\kappa\|\hat{G}_{i}\|^{2}}}\right)\left(\frac{\hat{G}_{i}}{\sqrt{1+\kappa\|\hat{G}_{i}\|^{2}}}\right)^{T}$ and $\kappa\in\mathbb{R}_{>0}$.

II-E Purging

The update law in (25) is motivated by the fact that if the full state were available for feedback and if the approximation error, $\epsilon$ , were zero, then using

[TABLE]

the parameters could be estimated via the least squares estimate

[TABLE]

However, since the history stack contains the estimated terms $\hat{F}$ and $\hat{G}$ , during the transient period where the state estimation error is large, the history stack does not accurately (within the error bound introduced by $\epsilon$ ) represent the system dynamics. Hence, the history stack needs to be purged whenever better estimates of the state are available.
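The idealized full-state, zero-approximation-error case described above can be illustrated with a small batch least-squares computation. The sketch below is not the paper's update law. It assumes the scalar case $n_{2}=1$ with $p=2$ parameters and takes the least squares estimate to be the standard normal-equation solution $\hat{\theta}=\left(\sum_{i}G_{i}G_{i}^{T}\right)^{-1}\sum_{i}G_{i}\left(X_{i}-F_{i}\right)$. The function name and the numerical values are hypothetical.

```python
def least_squares_from_stack(stack):
    """Solve the 2x2 normal equations built from a history stack.

    stack: list of (X_i, F_i, G_i), with scalar X_i and F_i and
    G_i = (g1, g2), so that X_i - F_i = theta^T G_i when epsilon = 0.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for X, F, G in stack:
        g1, g2 = G
        y = X - F
        a11 += g1 * g1
        a12 += g1 * g2
        a22 += g2 * g2
        b1 += g1 * y
        b2 += g2 * y
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        # Without a full-rank stack the parameters are not identifiable.
        raise ValueError("history stack is rank deficient")
    # Cramer's rule for the symmetric 2x2 system.
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)

# Hypothetical data generated from a known parameter vector.
theta_true = (2.0, -0.5)
regressors = [(1.0, 0.0), (0.5, 1.0), (-1.0, 2.0)]
offsets = [0.3, -0.1, 0.7]
stack = [
    (F + theta_true[0] * G[0] + theta_true[1] * G[1], F, G)
    for F, G in zip(offsets, regressors)
]
theta_hat = least_squares_from_stack(stack)
print(theta_hat)
```

When the stored $\hat{F}_{i}$ and $\hat{G}_{i}$ are computed from poor state estimates, the same computation returns a biased estimate, which is the motivation for purging stale data.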

Since the state estimator exponentially drives the estimation error to a small neighborhood of the origin, a newer estimate of the state can be assumed to be at least as good as an older estimate, apart from the small error introduced by practical stability of the estimator. This fact motivates the dwell time based greedy purging algorithm developed in the following to utilize newer data for estimation while preserving stability of the estimator.

The algorithm maintains two history stacks, a main history stack and a transient history stack, labeled $\mathcal{H}$ and $\mathcal{G}$ , respectively. As soon as the transient history stack is full and sufficient dwell time has passed, the main history stack is emptied and the transient history stack is copied into the main history stack. A lower bound on the required dwell time, denoted by $\mathcal{T}$ , is determined in Section II-F using a Lyapunov-based stability analysis.
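The two-stack bookkeeping can be sketched as follows. This is a simplified illustration, not the algorithm of Fig. 1: the class and attribute names are hypothetical, data points arriving while the transient stack is full are simply dropped (the paper instead swaps points using singular value maximization), and clearing the transient stack after a purge is an assumption.

```python
class PurgingHistoryStacks:
    """Main stack H and transient stack G with dwell-time-based purging."""

    def __init__(self, capacity, dwell_time, start_time=0.0):
        self.capacity = capacity        # M, number of stored data points
        self.dwell_time = dwell_time    # lower bound on time between purges
        self.main = []                  # H: drives the parameter update
        self.transient = []             # G: collects newer data
        self.last_purge = start_time
        self.purge_count = 0

    def record(self, t, data_point):
        """Store a data point in G, then purge if allowed.

        Returns True if the update H <- G was carried out at time t.
        """
        if len(self.transient) < self.capacity:
            self.transient.append(data_point)
        return self._maybe_purge(t)

    def _maybe_purge(self, t):
        full = len(self.transient) == self.capacity
        dwelled = (t - self.last_purge) >= self.dwell_time
        if full and dwelled:
            self.main = list(self.transient)   # empty H and copy G into it
            self.transient = []                # assumption: restart G
            self.last_purge = t
            self.purge_count += 1
            return True
        return False


stacks = PurgingHistoryStacks(capacity=2, dwell_time=1.0)
events = [stacks.record(0.2, "a"), stacks.record(0.4, "b"), stacks.record(1.1, "c")]
print(events, stacks.main)
```

In the example, the transient stack is full at `t = 0.4`, but the purge is deferred until the dwell time has elapsed at `t = 1.1`.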

Parameter identification in the developed framework requires a full rank history stack $\mathcal{H}$ , which is achieved provided the trajectories contain sufficient information, as quantified by the following assumption.

Assumption 2.

There exist $\underline{c},T>0$ such that for all $T_{0}\in\mathbb{R}_{\geq 0}$, $\hat{x}(T_{0})\in\tilde{\chi}|_{\hat{x}}$, $\hat{\theta}(T_{0})\in\mathbb{R}^{p}$, and system trajectories $x\colon\mathbb{R}_{\geq T_{0}}\to\chi|_{x}$ in response to the control inputs $u\colon\mathbb{R}_{\geq T_{0}}\to\chi|_{u}$, there exist $M\in\mathbb{N}$ and time instances $T_{0}\leq t_{1}<t_{2}<\ldots<t_{M}\leq T$, such that a history stack recorded using the algorithm in Fig. 1 satisfies

[TABLE]

where $\lambda_{\min}\left(\cdot\right)$ denotes the minimum singular value of a matrix.

Remark 2.

Uniformity of excitation, with respect to initial conditions and the true state and control trajectories, is required for uniform stability of the estimator (cf. [39]). If uniformity of excitation cannot be guaranteed, then, as long as (29) holds for a specific set of initial conditions and state and control trajectories, the estimation error of the developed state and parameter estimator, starting from the given initial conditions and evaluated along the given true state and control trajectories, can be shown to be ultimately bounded using analysis techniques similar to Section II-F.

Motivated by the observation that the rate of decay of the parameter estimation errors is proportional to the minimum singular value of $\mathscr{G}$ , a singular value maximization algorithm is used to select the time instances $\left\{t_{i}\right\}_{i=1}^{M}$ . That is, a data-point $\left(X_{j},\hat{F}_{j},\hat{G}_{j}\right)$ in the history stack is replaced with a new data-point $\left(X^{*},\hat{F}^{*},\hat{G}^{*}\right)$ , where $\hat{F}^{*}=\hat{F}\left(t\right)$ , $X^{*}=X\left(t\right)$ , and $\hat{G}^{*}=\hat{G}\left(t\right)$ , for some $t$ , only if

[TABLE]

where $\lambda_{\min}\left(\cdot\right)$ denotes the minimum singular value of a matrix, $\psi$ is a tunable constant, $\mu_{i}=\frac{1}{1+\kappa\|G_{i}\|^{2}}$ , $\mu_{j}=\frac{1}{1+\kappa\|G_{j}\|^{2}}$ , and $\mu^{*}=\frac{1}{1+\kappa\|G^{*}\|^{2}}$ . To simplify the analysis, it is assumed that new data points are only collected $\tau_{1}+\tau_{2}$ seconds after a purging event. Since the history stack is updated using a singular value maximization algorithm, the matrix $\mathscr{G}$ is a piece-wise constant function of time with the property that once it satisfies (29), at some $t=T$ , and for some $\underline{c}$ , the condition $\underline{c}<\lambda_{\min}\left(\mathscr{G}\left(t\right)\right)$ holds for all $t\geq T$ . The developed purging algorithm is summarized in Fig. 1.
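The data selection rule can be sketched for the case $p=2$, $n_{2}=1$, where the minimum singular value of the symmetric matrix $\mathscr{G}$ has a closed form. This is an illustration under stated assumptions, not the paper's algorithm: the acceptance test `best_val > (1 + psi) * old` is an assumed form of the replacement condition, the greedy search over the index $j$ is a design choice made here, and all function names are hypothetical.

```python
import math

def normalized_gram(stack_G, kappa):
    """Entries (a11, a12, a22) of sum_i mu_i G_i G_i^T, mu_i = 1/(1 + kappa |G_i|^2)."""
    a11 = a12 = a22 = 0.0
    for g1, g2 in stack_G:
        mu = 1.0 / (1.0 + kappa * (g1 * g1 + g2 * g2))
        a11 += mu * g1 * g1
        a12 += mu * g1 * g2
        a22 += mu * g2 * g2
    return a11, a12, a22

def min_eig_sym2(a11, a12, a22):
    """Smallest eigenvalue of [[a11, a12], [a12, a22]].

    For a symmetric positive semidefinite matrix this equals the
    minimum singular value.
    """
    mean = 0.5 * (a11 + a22)
    radius = math.hypot(0.5 * (a11 - a22), a12)
    return mean - radius

def try_replace(stack_G, new_G, kappa, psi):
    """Swap in new_G at the index that most increases the minimum singular value.

    The swap is accepted only if the improvement exceeds the factor (1 + psi).
    Returns the replaced index, or None if the stack is left unchanged.
    """
    old = min_eig_sym2(*normalized_gram(stack_G, kappa))
    best_j, best_val = None, old
    for j in range(len(stack_G)):
        candidate = stack_G[:j] + [new_G] + stack_G[j + 1:]
        val = min_eig_sym2(*normalized_gram(candidate, kappa))
        if val > best_val:
            best_j, best_val = j, val
    if best_j is not None and best_val > (1.0 + psi) * old:
        stack_G[best_j] = new_G
        return best_j
    return None

# Two nearly collinear regressors carry little information about theta.
stack = [(1.0, 0.0), (1.0, 0.01)]
first = try_replace(stack, (0.0, 1.0), kappa=1.0, psi=0.1)
second = try_replace(stack, (1.0, 0.0), kappa=1.0, psi=0.1)
print(first, second, stack)
```

Because a swap is accepted only when it increases the minimum singular value, that value never decreases between purges, which is consistent with the property stated above that the lower bound $\underline{c}$ continues to hold once it is reached.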

A Lyapunov-based analysis showing uniform ultimate boundedness of the parameter and the state estimation errors is presented in the following section.

II-F Analysis

Each purging event represents a discontinuous change in the system dynamics; hence, the resulting closed-loop system is a switched system. To facilitate the analysis of the switched system, let $\rho\colon\mathbb{R}_{\geq T_{0}}\to\mathbb{N}$ denote a switching signal such that $\rho\left(T_{0}\right)=1$ and $\rho\left(t\right)=j+1$, where $j$ denotes the number of times the update $\mathcal{H}\leftarrow\mathcal{G}$ was carried out over the time interval $\left(T_{0},t\right)$. For a given $s\in\mathbb{N}$, let $\mathcal{H}_{s}$ denote the history stack active during the time interval $\left\{t\mid\rho\left(t\right)=s\right\}$, containing the elements $\left\{\left(X_{si},\hat{F}_{si},\hat{G}_{si}\right)\right\}_{i=1,\cdots,M}$, and let $\mathcal{E}_{si}^{T}$ be the corresponding error term. To simplify the notation, let $\mathscr{G}_{s}:=\sum_{i=1}^{M}\frac{\hat{G}_{si}\hat{G}_{si}^{T}}{1+\kappa\|G_{si}\|^{2}}$ and $Q_{s}:=\sum_{i=1}^{M}\frac{\hat{G}_{si}\mathcal{E}_{si}^{T}}{1+\kappa\|G_{si}\|^{2}}$.

Using (20) and (25), the dynamics of the parameter estimation error can be expressed as

[TABLE]

Since the functions $\mathscr{G}_{s}\mathrel{\mathop{\mathchar 58\relax}}\mathbb{R}_{\geq T_{0}}\to\mathbb{R}^{p\times p}$ and $Q_{s}\mathrel{\mathop{\mathchar 58\relax}}\mathbb{R}_{\geq T_{0}}\to\mathbb{R}^{p\times n}$ are piece-wise continuous, the trajectories of (31), and of all the subsequent error systems involving $\mathscr{G}_{s}$ and $Q_{s}$ , are defined in the sense of Carathéodory [49]. The algorithm in Fig. 1 ensures that there exists a constant $\underline{c}>0$ such that $\lambda_{\min}\left\{\mathscr{G}_{s}\right\}\geq\underline{c},\>\forall s\in\mathbb{N}$ .

Using the dynamics in (1), (12)-(18), and the design of the feedback component in (15), the evolution of the error signal $r$ is described by

[TABLE]

where $\tilde{\sigma}\left(x,u,\hat{x}\right)=\sigma\left(x,u\right)-\sigma\left(\hat{x},u\right)$ and $\tilde{f^{o}}\left(x,u,\hat{x}\right)=f\left(x,u\right)-f\left(\hat{x},u\right)$ . Since $\left(x,u\right)\mapsto f\left(x,u\right)$ and $\left(x,u\right)\mapsto\sigma\left(x,u\right)$ are locally Lipschitz, given a compact set $\tilde{\chi}\subset\chi\times\mathbb{R}^{n_{1}+2n_{2}}$ , Assumption 1 can be used to conclude that there exist $L_{f},L_{\sigma}>0$ , independent of $T_{0}$ , such that

[TABLE]

and

[TABLE]

To facilitate the analysis, let $\left\{T_{s}\in\mathbb{R}_{\geq 0}\mid s\in\mathbb{N}\right\}$ be a set of switching time instances defined as $T_{s}=\left\{t\!\mid\!\rho\left(\tau\right)\!<s+1,\forall\tau\!\in\!\left[T_{0},t\right)\land\rho\left(\tau\right)\!\geq\!s+1,\forall\tau\!\in\!\left[t,\infty\right)\right\}.$ That is, for a given switching index $s,$ $T_{s}$ denotes the time instance when the $\left(s+1\right)$th subsystem is switched on. The analysis is carried out separately over the time intervals $\left[T_{s-1},T_{s}\right)$ , $s\in\mathbb{N}$ , where $T_{1}\geq T_{0}+\tau_{1}+\tau_{2}+t_{M}$ . Since the history stack $\mathcal{H}$ is not updated over the intervals $\left[T_{s-1},T_{s}\right)$ , $s\in\mathbb{N}$ , the matrices $\mathscr{G}_{s}$ and $Q_{s}$ are constant over each interval. The history stack that is active over the interval $\left[T_{s},T_{s+1}\right)$ is denoted by $\mathcal{H}_{s+1}$ . To ensure boundedness of the trajectories in the interval $t\in\left[T_{0},T_{1}\right)$ , the history stack $\mathcal{H}_{1}$ is computed using arbitrarily selected trajectories $x(\cdot),\hat{x}(\cdot),u(\cdot)$ that are confined within $\tilde{\chi}$ and make $\mathcal{H}_{1}$ full rank. (Footnote 5: Arbitrary selection of $\mathcal{H}_{1}$ results in potentially large initial error $\mathcal{E}_{1}$ in (20). While large $\mathcal{E}_{1}$ could potentially result in large parameter estimation errors, $\tilde{\theta}$ , during $[T_{0},T_{1})$ , as long as $\mathcal{H}_{1}$ is full rank, the first term in (31) ensures that $\tilde{\theta}$ remains bounded over $[T_{0},T_{1})$ .)
The analysis is carried out over the aforementioned intervals using the state vectors $Z\mathrel{\mathop{\mathchar 58\relax}}=\left[\begin{array}[]{cccccc}\tilde{x}_{1}^{T}&\tilde{x}_{2}^{T}&r^{T}&\eta^{T}&\mbox{vec}\left(\tilde{\theta}\right)^{T}\end{array}\right]^{T}\in\mathbb{R}^{n_{1}+3n_{2}+p}$ and $Y\mathrel{\mathop{\mathchar 58\relax}}=\begin{bmatrix}\tilde{x}_{1}^{T}&\tilde{x}_{2}^{T}&r^{T}&\eta^{T}\end{bmatrix}^{T}\in\mathbb{R}^{n_{1}+3n_{2}}$ .

A summary of the stability analysis is provided in the following, along with a graphical representation in Fig. 2.

Interval 1: First, it is established that $Z$ is bounded over $\left[T_{0},T_{1}\right)$ , where the bound is $O\left(\left\|Z\left(T_{0}\right)\right\|+\left\|\sum_{i=1}^{M}\mathcal{E}_{1i}\right\|+\overline{\epsilon}\right)$ . (Footnote 6: $f\in O\left(g\right)$ denotes that there exist $c,M\in\mathbb{R}_{>0}$ such that $|f(x)|\leq c|g(x)|\ \forall\ x>M$ .) Then, for a given $\varepsilon\in\mathbb{R}_{>0}$ , the bound on $Z$ is utilized to select state estimator gains such that $\left\|Y\left(T_{1}\right)\right\|<\varepsilon$ .

Interval 2: The history stack $\mathcal{H}_{2}$ , which is active over $\left[T_{1},T_{2}\right)$ , is recorded over $\left[T_{0},T_{1}\right)$ . Without loss of generality, it can be ensured that $\mathcal{H}_{2}$ represents the system better than $\mathcal{H}_{1}$ (which is arbitrarily selected), that is, $\left\|\sum_{i=1}^{M}\mathcal{E}_{1i}\right\|\geq\left\|\sum_{i=1}^{M}\mathcal{E}_{2i}\right\|$ . The bound on $Z$ over $\left[T_{1},T_{2}\right)$ is then shown to be smaller than that over $\left[T_{0},T_{1}\right)$ , which is utilized to show that $\left\|Y\left(t\right)\right\|\leq\varepsilon,$ for all $t\in\left[T_{1},T_{2}\right)$ .

Interval 3: Using (24), the errors $\mathcal{E}_{3i}$ are shown to be $O\left(\left\|Y_{3i}\right\|+\overline{\epsilon}\right)$ where $Y_{3i}$ denotes the value of $Y$ at the time when the point $\left(X_{3i},\hat{F}_{3i},\hat{G}_{3i}\right)$ was recorded. Using the facts that the history stack $\mathcal{H}_{3}$ , which is active over $\left[T_{2},T_{3}\right)$ , is recorded over $\left[T_{1},T_{2}\right)$ and $\left\|Y\left(t\right)\right\|\leq\varepsilon,$ for all $t\in\left[T_{1},T_{2}\right)$ , the error $\left\|\sum_{i=1}^{M}\mathcal{E}_{3i}\right\|$ is shown to be $O\left(\varepsilon+\overline{\epsilon}\right)$ . If $T_{3}=\infty$ then it is established that $\lim\sup_{t\to\infty}\left\|Z\left(t\right)\right\|=O\left(\varepsilon+\overline{\epsilon}\right)$ . If $T_{3}<\infty$ then the fact that the bound on $Z$ over $\left[T_{2},T_{3}\right)$ is smaller than that over $\left[T_{1},T_{2}\right)$ is utilized to show that $\left\|Y\left(t\right)\right\|\leq\varepsilon,$ for all $t\in\left[T_{2},T_{3}\right)$ . The analysis is then continued in an inductive argument to show that $\lim\sup_{t\to\infty}\left\|Z\left(t\right)\right\|=O\left(\varepsilon+\overline{\epsilon}\right)$ and $\left\|Y\left(t\right)\right\|\leq\varepsilon,$ for all $t\in\left[T_{2},\infty\right)$ .

The stability result is summarized in the following theorem.

Theorem 1.

Let $\varepsilon>0$ be given. If Assumptions 1 and 2 hold, the history stacks $\mathcal{H}$ and $\mathcal{G}$ are populated using the algorithm detailed in Fig. 1, the learning gains are selected to satisfy the sufficient gain conditions in (38), (39), (44), and (48), there exists a time instance $T\in\mathbb{R}_{>0}$ such that the system states are informative over $\left[T_{0},T\right]$ , that is, the history stack can be replenished if purged at any time $t\in\left[T_{0},T\right]$ , the dwell time, $\mathcal{T}$ , is selected such that $\mathcal{T}\left(t\right)=\mathcal{T}_{s}$ over each switching interval $\left\{t\mid\rho\left(t\right)=s\right\}$ , where $\mathcal{T}_{s}$ is large enough to satisfy (47), and the excitation interval is large enough so that $T_{2}<T$ , then $\lim\sup_{t\to\infty}\left\|Z\left(t\right)\right\|=O\left(\varepsilon+\overline{\epsilon}\right)$ . (Footnote 7: A minimum of two purges is required to remove the randomly initialized data, and the data recorded during the transient phase of the derivative estimator, from the history stack.)

Proof.

Consider the candidate Lyapunov function

[TABLE]

Using arguments similar to [1, Corollary 4.3.2], it can be shown that provided $\lambda_{\min}\left\{\Gamma^{-1}\left(T_{0}\right)\right\}>0$ and Assumption 2 holds, the least squares gain matrix satisfies

[TABLE]

where $\underline{\Gamma}$ and $\overline{\Gamma}$ are positive constants, and $\text{I}_{n}$ denotes an $n\times n$ identity matrix.

The bound in (35) implies that the candidate Lyapunov function satisfies

[TABLE]

where $\overline{v}\mathrel{\mathop{\mathchar 58\relax}}=\frac{1}{2}\max\left\{1,\alpha^{2},\nicefrac{{1}}{{\underline{\Gamma}}}\right\}$ and $\underline{v}\mathrel{\mathop{\mathchar 58\relax}}=\frac{1}{2}\min\left\{1,\alpha^{2},\nicefrac{{1}}{{\overline{\Gamma}}}\right\}$ . For brevity, function dependencies will be omitted over the rest of the analysis.

Over the time interval $[T_{s-1},T_{s})$ , the orbital derivative of $V$ (footnote 8: the orbital derivative is defined as $\dot{V}\left(Z,t\right)\mathrel{\mathop{\mathchar 58\relax}}=\frac{\partial V}{\partial Z}\left(Z,t\right)h_{Z}\left(Z,t\right)+\frac{\partial V}{\partial t}\left(Z,t\right)$ , where $h_{Z}\mathrel{\mathop{\mathchar 58\relax}}\mathbb{R}^{n_{1}+3n_{2}+p}\times\mathbb{R}_{\geq T_{0}}\to\mathbb{R}^{n_{1}+3n_{2}+p}$ is constructed using (18), (17), (26), (31), and (32) so that $\dot{Z}=h_{Z}\left(Z,t\right)$ ) is given by

[TABLE]

Assuming that $\mathcal{H}_{s}$ was computed using values of $\hat{x}$ that correspond to trajectories that stay inside $\tilde{\chi}$ , the orbital derivative can be bounded by

[TABLE]

where $\underline{a}=k_{\theta}\underline{c}+\frac{\beta_{1}}{\overline{\Gamma}}$ , $\overline{Q}_{s}$ is a positive constant such that $\overline{Q}_{s}\geq\left\|Q_{s}\right\|$ , and the bounds $L,L_{f},L_{\sigma},\overline{\epsilon},$ and $\overline{\sigma}$ depend on the compact set $\tilde{\chi}$ . Provided

[TABLE]

then (37) simplifies to

[TABLE]

Since $\left\|\tilde{x}\right\|^{2}\leq\left\|Z\right\|^{2},$ $\dot{V}_{s}\leq-v\left(\left\|Z\right\|^{2}-\frac{\iota_{s}}{v}\right)$ in the domain

[TABLE]

That is, $\dot{V}_{s}$ is negative definite on $\mathcal{D}$ provided $\mathcal{H}_{s}$ was computed using values of $\hat{x}$ that correspond to trajectories that stay inside $\tilde{\chi}$ , and provided $\left\|Z\right\|>\sqrt{\frac{\iota_{s}}{v}}>0$ , where

[TABLE]

and $\iota_{s}\mathrel{\mathop{\mathchar 58\relax}}=\frac{\overline{\epsilon}^{2}}{2k}+\frac{4k_{\theta}^{2}}{\underline{a}}\overline{Q}_{s}^{2}$ . Theorem 4.18 from [50] can be invoked to conclude that provided the gain condition

[TABLE]

holds, where $\overline{V}_{s}\geq\left\|V\left(Z\left(T_{s-1}\right),T_{s-1}\right)\right\|$ is a constant, then $\dot{V}_{s}\left(Z\left(t\right),t\right)\leq-\frac{v}{\overline{v}}V_{s}\left(Z\left(t\right),t\right)+\iota_{s},$ $\forall t\in\left[T_{s-1},T_{s}\right)$ .

In particular, by initializing $\mathcal{H}_{1}$ using arbitrary values of $\hat{x}$ that satisfy $x-\hat{x}\in\tilde{\chi}|_{\tilde{x}}$ for all $x\in\chi|_{x}$ , it can be concluded that $\forall t\in\left[T_{0},T_{1}\right),$

[TABLE]

where $\overline{V}_{1}>0$ is a constant such that $\left|V\left(Z\left(T_{0}\right),T_{0}\right)\right|\leq\overline{V}_{1}$ . Using the relationships in (36) and (40), it can further be concluded $\forall t\in\left[T_{0},T_{1}\right),$

[TABLE]

If it were possible to use the inequality in (40) to conclude that $V\left(Z\left(t\right),t\right)\leq V\left(Z\left(T_{0}\right),T_{0}\right)$ , then an inductive argument could be used to show that the trajectories decay to a neighborhood of the origin. However, unless the history stack can be selected to have arbitrarily large minimum singular value (which is generally not possible), the constant $\frac{\overline{v}}{v}\iota_{1}$ cannot be made arbitrarily small using the learning gains.

Since $\iota_{s}$ depends on $Q_{s}$ , it can be made smaller by reducing the estimation errors and thereby reducing the errors associated with the data stored in the history stack. To that end, consider the candidate Lyapunov function

[TABLE]

The candidate Lyapunov function satisfies

[TABLE]

where $\overline{w}\mathrel{\mathop{\mathchar 58\relax}}=\frac{1}{2}\max\left\{1,\alpha^{2}\right\}$ , $\underline{w}\mathrel{\mathop{\mathchar 58\relax}}=\frac{1}{2}\min\left\{1,\alpha^{2}\right\}$ .

The orbital derivative of $W$ (footnote 9: the orbital derivative is defined as $\dot{W}\left(Y,t\right)\mathrel{\mathop{\mathchar 58\relax}}=\frac{\partial W}{\partial Y}\left(Y,t\right)h_{Y}\left(Y,t\right)+\frac{\partial W}{\partial t}\left(Y,t\right)$ , where $h_{Y}\mathrel{\mathop{\mathchar 58\relax}}\mathbb{R}^{n_{1}+3n_{2}}\times\mathbb{R}_{\geq T_{0}}\to\mathbb{R}^{n_{1}+3n_{2}}$ is constructed using (18), (17), and (32) so that $\dot{Y}=h_{Y}\left(Y,t\right)$ ) is given by

[TABLE]

If $\tilde{\theta}(t)$ is bounded over $[T_{s-1},T_{s})$ , then using the Cauchy-Schwarz inequality, the orbital derivative can be simplified and bounded over $[T_{s-1},T_{s})$ as

[TABLE]

where $\theta_{s}>0$ is a constant such that

[TABLE]

In particular, consider the time interval $\left[T_{0},T_{1}\right)$ . Using the fact that $\tilde{\theta}(t)$ is bounded over $t\in[T_{0},T_{1})$ , provided

[TABLE]

then the time-derivative of $W$ over $[T_{0},T_{1})$ can be bounded as

[TABLE]

and $\gamma=\frac{(\theta_{1}\overline{\sigma}+\overline{\epsilon})^{2}}{2k}$ . That is, for all $t\in\left[T_{0},T_{1}\right),$

[TABLE]

where $\overline{W}_{1}>0$ is a constant such that $\left|W\left(Y\left(T_{0}\right)\right)\right|\leq\overline{W}_{1}$ . In particular, $\forall t\in\left[T_{0},T_{1}\right),$

[TABLE]

Provided the dwell time $\mathcal{T}_{1}$ is large enough so that

[TABLE]

then from (40) and (45), $W\left(Y\left(T_{1}\right)\right)\leq\frac{2\overline{w}\gamma}{w}$ and $V\left(Z\left(T_{1}\right),T_{1}\right)\leq\frac{2\overline{v}\iota_{1}}{v}$ . In particular, $\left\|Y\left(T_{1}\right)\right\|\leq\sqrt{\frac{2\overline{w}^{2}\gamma}{\underline{w}w}}$ and $\left\|Z\left(T_{1}\right)\right\|\leq\sqrt{\frac{2\overline{v}^{2}\iota_{1}}{\underline{v}v}}$ . Note that the bound on $Y\left(T_{1}\right)$ can be made arbitrarily small by increasing $k,$ $\alpha,$ and $\beta$ .
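The role of the dwell time can be checked numerically. The following minimal sketch assumes that the bound in (40) has the exponential form $V\left(t\right)\leq\left(\overline{V}_{s}-\frac{\overline{v}}{v}\iota_{s}\right)\text{e}^{-\frac{v}{\overline{v}}\left(t-T_{s-1}\right)}+\frac{\overline{v}}{v}\iota_{s}$ that the proof states for the later intervals, and computes the shortest interval length after which that bound falls to $\frac{2\overline{v}}{v}\iota_{s}$ . The function names and all numerical values are hypothetical and are not taken from the paper.

```python
import math

def lyapunov_bound(t, V_bar, iota, v, v_bar):
    """Upper bound on V, t seconds into an interval, assuming
    dV/dt <= -(v / v_bar) * V + iota and V <= V_bar at the interval start."""
    c = (v_bar / v) * iota                      # level the bound decays toward
    return (V_bar - c) * math.exp(-(v / v_bar) * t) + c

def min_dwell_time(V_bar, iota, v, v_bar):
    """Smallest t >= 0 for which lyapunov_bound(t) <= 2 * (v_bar / v) * iota."""
    c = (v_bar / v) * iota
    if V_bar <= 2.0 * c:                        # bound already below the target
        return 0.0
    return (v_bar / v) * math.log((V_bar - c) / c)

# Hypothetical constants, chosen only for illustration.
V_bar, iota, v, v_bar = 50.0, 0.2, 0.5, 2.0
dwell = min_dwell_time(V_bar, iota, v, v_bar)
bound_at_dwell = lyapunov_bound(dwell, V_bar, iota, v, v_bar)
```

Under these assumptions, a larger initial bound $\overline{V}_{s}$ or a slower decay rate $\frac{v}{\overline{v}}$ lengthens the required dwell time only logarithmically and linearly, respectively.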

Now the interval $\left[T_{1},T_{2}\right)$ is considered. Given any arbitrary bound $\overline{W}_{1}$ , a compact set $\tilde{\chi}$ and learning gains that satisfy the resulting gain conditions in (38), (39), and (44) can be selected such that $\overline{\mathrm{B}}(0,\overline{Y}_{1})\subseteq\tilde{\chi}$ (footnote 10: $\overline{\mathrm{B}}(0,\overline{Y})$ denotes the closed ball of radius $\overline{Y}$ around the origin), and as a result, from (46) it follows that $\tilde{x}(t)\in\tilde{\chi}|_{\tilde{x}}$ for all $t\in[T_{0},T_{1})$ . Since the history stack $\mathcal{H}_{2}$ , which is active during $\left[T_{1},T_{2}\right)$ , is recorded during $\left[T_{0},T_{1}\right)$ , the bound in (24) can be used to show that $\overline{Q}_{2}=O\left(\overline{Y}_{1}+\overline{\epsilon}\right)$ .

Since $\mathcal{H}_{1}$ is independent of the system trajectories, $\overline{Q}_{1}$ can be selected, without loss of generality, such that $\overline{Q}_{2}<\overline{Q}_{1}$ , and hence, $\iota_{2}<\iota_{1}$ . Thus, provided the constant $\overline{V}_{1}$ (and as a result, the gain $k$ ) is selected large enough so that

[TABLE]

the gain condition in (39) holds over $\left[T_{1},T_{2}\right)$ , and hence, a similar Lyapunov-based analysis, along with the bound $\overline{V}_{2}=\frac{2\overline{v}\iota_{1}}{v}$ can be utilized to conclude that $\forall t\in\left[T_{1},T_{2}\right)$ ,

[TABLE]

The sufficient condition in (48) implies that $\overline{V}_{2}<\overline{V}_{1}$ and hence, (41) and $\iota_{2}<\iota_{1}$ imply that $\theta_{2}<\theta_{1}$ .

Since $\theta_{2}<\theta_{1}$ , the gain conditions in (44) hold over the interval $\left[T_{1},T_{2}\right)$ . A Lyapunov-based analysis similar to (42)-(46) yields $\left\|Y\left(t\right)\right\|\leq\sqrt{\frac{\overline{w}}{\underline{w}}\max\left(\overline{W}_{2},\frac{\overline{w}}{w}\gamma\right)}.$ From (47), $\overline{W}_{2}=\frac{2\overline{w}\gamma}{w}$ , and hence, $\forall t\in\left[T_{1},T_{2}\right)$ ,

[TABLE]

Now, the interval $\left[T_{2},T_{3}\right)$ is considered. By selecting $\overline{W}_{1}$ large enough, it can be ensured that $\overline{Y}_{2}<\overline{Y}_{1}$ , and as a result, $\tilde{x}(t)\in\tilde{\chi}|_{\tilde{x}},\forall t\in[T_{1},T_{2})$ . Since the history stack $\mathcal{H}_{3}$ , which is active during $\left[T_{2},T_{3}\right)$ , is recorded during $\left[T_{1},T_{2}\right)$ , the bounds in (24) and (50) can be used to show that $\overline{Q}_{3}=O\left(\overline{Y}_{2}+\overline{\epsilon}\right).$ Since $\overline{Y}_{2}<\overline{Y}_{1}$ , it follows that $\overline{Q}_{3}<\overline{Q}_{2}$ , which implies $\iota_{3}<\iota_{2}$ . Provided $\mathcal{T}_{2}$ satisfies (47), then $\left(\overline{V}_{2}-\frac{\overline{v}}{v}\iota_{2}\right)\text{e}^{-\frac{v}{\overline{v}}\left(T_{2}-T_{1}\right)}\leq\frac{\overline{v}}{v}\iota_{2}$ , which implies $\overline{V}_{3}=\frac{2\overline{v}}{v}\iota_{2}$ , and hence, $\overline{V}_{3}<\overline{V}_{2}$ and $\theta_{3}<\theta_{2}$ . Therefore, the gain conditions in (38), (39), and (44) are satisfied over $\left[T_{2},T_{3}\right)$ .

Since the gain conditions are satisfied, a Lyapunov-based analysis similar to (42)-(46) yields $\left\|Y\left(t\right)\right\|\leq\sqrt{\frac{2\overline{w}^{2}\gamma}{\underline{w}w}},\forall t\in\left[T_{2},T_{3}\right)$ . Given any $\varepsilon>0,$ the gains $\alpha,$ $\beta,$ and $k$ can be selected large enough to satisfy $\overline{Y}_{2}\leq\varepsilon,$ and hence, $\left\|Y\left(t\right)\right\|\leq\varepsilon,\forall t\in\left[T_{2},T_{3}\right).$ Furthermore, a Lyapunov-based analysis similar to (34)-(40) yields $V\left(Z\left(t\right),t\right)\leq\left(\overline{V}_{3}-\frac{\overline{v}}{v}\iota_{3}\right)\text{e}^{-\frac{v}{\overline{v}}\left(t-T_{2}\right)}+\frac{\overline{v}}{v}\iota_{3},\forall t\in\left[T_{2},T_{3}\right)$ . If $T_{3}=\infty$ then $\lim\sup_{t\to\infty}V\left(Z\left(t\right),t\right)\leq\frac{2\overline{v}}{v}\iota_{3}$ , which, from $\overline{Q}_{3}=O\left(\overline{Y}_{2}+\overline{\epsilon}\right)$ and $\iota_{3}=\frac{\overline{\epsilon}^{2}}{2k}+\frac{2k_{\theta}^{2}}{\underline{a}}\overline{Q}_{3}^{2}$ implies that $\lim\sup_{t\to\infty}\left\|Z\left(t\right)\right\|=O\left(\varepsilon+\overline{\epsilon}\right)$ .

If $T_{3}\neq\infty$ then an inductive continuation of the Lyapunov-based analysis to the time intervals $\left[T_{s-1},T_{s}\right)$ shows that provided the dwell time $\mathcal{T}_{s}$ satisfies (47), then the gain conditions in (38), (39), and (44) are satisfied for all $t>T_{3}$ , the state $Y$ satisfies

[TABLE]

$\tilde{x}(t)\in\tilde{\chi}|_{\tilde{x}},\forall\ t\geq T_{0}$ , and $Q_{s}\leq Q_{s-1}$ , $\iota_{s}\leq\iota_{s-1}$ , $\overline{V}_{s}\leq\overline{V}_{s-1}$ , and ${\theta}_{s}\leq{\theta}_{s-1}$ , for all $s>3$ .

The bound in (51) and the fact that $\overline{Q}_{s}\!=\!O\!\left(\overline{Y}_{s-1}+\overline{\epsilon}\right)$ indicate that $\overline{Q}_{s}=O\left(\varepsilon+\overline{\epsilon}\right),\forall s\in\mathbb{N}$ . Furthermore, $V\left(Z\left(t\right),t\right)\leq\left(\overline{V}_{s}-\frac{\overline{v}}{v}\iota_{s}\right)\text{e}^{-\frac{v}{\overline{v}}\left(t-T_{s-1}\right)}+\frac{\overline{v}}{v}\iota_{s}$ , $\forall t\in\left[T_{s-1},T_{s}\right)$ , $\forall s\in\mathbb{N}$ , which, along with the dwell time requirement, implies that $\lim\sup_{t\to\infty}V\left(Z\left(t\right),t\right)\leq\frac{2\overline{v}}{v}\iota_{s}$ , and hence, $\lim\sup_{t\to\infty}\left\|Z\left(t\right)\right\|=O\left(\varepsilon+\overline{\epsilon}\right)$ .

∎

III Linear Systems

When the system under consideration is linear, parameter estimation can be directly achieved using measurements of $x_{1}$ and without using state estimation. The following section details an output-feedback parameter estimator using $x_{1}$ as the output. The accompanying state estimator for linear systems is a trivial application of the estimator in Section II-C, and has been omitted.

III-A Problem Formulation

Consider a linear system of the form

[TABLE]

where $x_{1},x_{2},\ldots,x_{N}\in\mathbb{R}^{n}$ denote the state variables, $x\mathrel{\mathop{\mathchar 58\relax}}=\begin{bmatrix}x_{1}^{T}&x_{2}^{T}&\ldots&x_{N}^{T}\end{bmatrix}^{T}$ is the system state, $u\in\mathbb{R}^{m}$ is the controller, $A\in\mathbb{R}^{n\times Nn}$ and $B\in\mathbb{R}^{n\times m}$ denote the system matrices, and $y\in\mathbb{R}^{n}$ denotes the output. The objective is to design an adaptive estimator to identify the unknown matrices $A$ and $B$ , online, using input-output measurements.

III-B Error System for Estimation

To obtain an error signal for parameter identification, the system in (52) is expressed in the form

[TABLE]

where $A_{1}\in\mathbb{R}^{n\times n}$ , $A_{2}\in\mathbb{R}^{n\times n}$ , $\ldots$ , and $A_{N}\in\mathbb{R}^{n\times n}$ are constant matrices such that $A=\begin{bmatrix}A_{1}&A_{2}&\ldots&A_{N}\end{bmatrix}$ . Integrating (53) over the interval $\left[t-\tau_{1},t\right]$ for some constant $\tau_{1}\in\mathbb{R}_{>0}$ ,

[TABLE]

Integrating again over the interval $\left[t-\tau_{2},t\right]$ for some constant $\tau_{2}\in\mathbb{R}_{>0}$ ,

[TABLE]

Using the Fundamental Theorem of Calculus and the fact that $x_{N}\left(t\right)=\dot{x}_{N-1}\left(t\right)$ ,

[TABLE]

Repeating this process $N-1$ more times results in

[TABLE]

where

[TABLE]

$\vdots$

[TABLE]

and $\mathscr{T}=T_{0}+\tau_{1}+\ldots+\tau_{N}$ . As opposed to nonlinear systems in Section II-B, where measurements of all states but the final state are required for parameter estimation, the integral form in (56) is independent of the state variables $x_{2},\ldots,x_{N}$ , and depends only on the output, $y=x_{1}$ . The expression in (56) can be rearranged to form the linear error system

[TABLE]

In (61), $\theta$ is a vector of unknown parameters, defined as $\theta\mathrel{\mathop{\mathchar 58\relax}}=\Big{[}\text{vec}\left(A_{1}\right)^{T}\ \text{vec}\left(A_{2}\right)^{T}$ $\ldots\ \text{vec}\left(A_{N}\right)^{T}\ \text{vec}\left(B\right)^{T}\Big{]}^{T}\in\mathbb{R}^{Nn^{2}+mn}$ , where $\text{vec}\left(\cdot\right)$ denotes the vectorization operator and the matrices $\mathcal{F}\mathrel{\mathop{\mathchar 58\relax}}\mathbb{R}_{\geq 0}\to\mathbb{R}^{n}$ and $\mathcal{G}\mathrel{\mathop{\mathchar 58\relax}}\mathbb{R}_{\geq 0}\to\mathbb{R}^{n\times\left(Nn^{2}+mn\right)}$ are defined as

[TABLE]

where $\text{I}_{n}$ denotes an $n\times n$ identity matrix, and $\otimes$ denotes the Kronecker product. Note that even though the linear relationship in (61) is valid for all $t\in\mathbb{R}_{\geq T_{0}},$ it provides useful information about the vector $\theta$ only after $t\geq\mathscr{T}$ .
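The fact that repeated integration removes the unmeasured states can be checked numerically on a scalar special case. The following minimal sketch assumes a second-order system $\ddot{x}=a_{1}x+a_{2}\dot{x}+bu$ with output $y=x$ (so two integrations, over windows $\tau_{1}$ and $\tau_{2}$ , suffice) and verifies that the doubly integrated relation $\mathcal{F}=\mathcal{G}\theta$ holds using only samples of $x$ and $u$ . The parameter values, the test signal $x=\sin t$ , and all variable names are illustrative assumptions, and the sign and ordering conventions may differ from those in (61).

```python
import numpy as np

def trapezoid(samples, dt):
    """Trapezoidal rule for uniformly spaced samples."""
    return dt * (np.sum(samples) - 0.5 * (samples[0] + samples[-1]))

a1, a2, b = -2.0, -0.5, 1.5                 # hypothetical true parameters
tau1, tau2, dt, t_end = 0.6, 0.4, 1e-3, 3.0
n1, n2 = round(tau1 / dt), round(tau2 / dt)

t = np.arange(0.0, t_end + dt / 2, dt)
x = np.sin(t)                               # measured output
# Input chosen so that x'' = a1*x + a2*x' + b*u holds exactly for x = sin(t).
u = (-np.sin(t) - a1 * np.sin(t) - a2 * np.cos(t)) / b

k = len(t) - 1                              # index of the current time

def inner(sig, j):
    """Integral of sig over [t_j - tau1, t_j]."""
    return trapezoid(sig[j - n1:j + 1], dt)

outer = range(k - n2, k + 1)                # samples spanning [t - tau2, t]
Ix = trapezoid(np.array([inner(x, j) for j in outer]), dt)   # double integral of x
Iu = trapezoid(np.array([inner(u, j) for j in outer]), dt)   # double integral of u
# Double integral of x' reduces to a single integral of x(s) - x(s - tau1).
Idx = trapezoid(x[k - n2:k + 1] - x[k - n2 - n1:k - n1 + 1], dt)

# Double integral of x'' reduces to four output samples (no derivatives needed).
F = x[k] - x[k - n2] - x[k - n1] + x[k - n1 - n2]
G = np.array([Ix, Idx, Iu])
theta = np.array([a1, a2, b])
residual = F - G @ theta                    # small, limited by quadrature error
```

Neither $\dot{x}$ nor $\ddot{x}$ appears in `F` or `G`; the regressor is built entirely from buffered values of the output and the input, which is the property the linear estimator relies on.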

The linear error system in (61) motivates the adaptive estimation scheme that follows.

III-C Parameter Estimator Design

To obtain an output-feedback concurrent learning update law for the parameter estimates, a history stack denoted by $\mathcal{H}$ is utilized. The history stack is a set of ordered pairs $\left\{\left(\mathcal{F}_{i},\mathcal{G}_{i}\right)\right\}_{i=1}^{M}$ such that

[TABLE]

Note that $\mathcal{E}_{i}$ from (20) is absent from (62), since there are no estimated state variables in $\mathcal{F}_{i}$ or $\mathcal{G}_{i}$ .

If a history stack that satisfies (63) is not available a priori, it can be recorded online, using the relationship in (61), by selecting a set of time-instances $\left\{t_{i}\right\}_{i=1}^{M}$ and letting

[TABLE]

Furthermore, a singular value maximization algorithm is used to select the time instances $\left\{t_{i}\right\}_{i=1}^{M}$ . That is, a data-point $\left\{\left(\mathcal{F}_{j},\mathcal{G}_{j}\right)\right\}$ in the history stack is replaced by a new data-point $\left\{\left(\mathcal{F}^{*},\mathcal{G}^{*}\right)\right\}$ , where $\mathcal{F}^{*}=\mathcal{F}\left(t\right)$ and $\mathcal{G}^{*}=\mathcal{G}\left(t\right)$ , for some $t$ , only if

[TABLE]

where $\lambda_{\min}\left\{\cdot\right\}$ denotes the minimum eigenvalue of a matrix.
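The replacement rule can be sketched as follows. This is a minimal illustration that assumes the acceptance test is a strict increase in $\lambda_{\min}\left\{\sum_{i}\mathcal{G}_{i}^{T}\mathcal{G}_{i}\right\}$ and that the slot to overwrite is the one giving the largest increase; the exact condition used in the paper may include additional thresholds. The function names `min_eig` and `try_replace` and the small example data are hypothetical.

```python
import numpy as np

def min_eig(G_stack):
    """Minimum eigenvalue of sum_i G_i^T G_i for a list of regressors."""
    gram = sum(G.T @ G for G in G_stack)
    return np.linalg.eigvalsh(gram)[0]      # eigvalsh returns ascending eigenvalues

def try_replace(F_stack, G_stack, F_new, G_new):
    """Overwrite the slot whose replacement most increases the minimum
    eigenvalue; leave the stack unchanged if no slot gives a strict increase."""
    best_val, best_j = min_eig(G_stack), None
    for j in range(len(G_stack)):
        candidate = G_stack[:j] + [G_new] + G_stack[j + 1:]
        val = min_eig(candidate)
        if val > best_val:
            best_val, best_j = val, j
    if best_j is not None:
        F_stack[best_j], G_stack[best_j] = F_new, G_new
    return best_j is not None

# Hypothetical two-parameter example with a poorly conditioned initial stack.
theta = np.array([2.0, -1.0])
G_stack = [np.array([[1.0, 0.0]]), np.array([[1.0, 0.1]])]
F_stack = [G @ theta for G in G_stack]
before = min_eig(G_stack)
G_new = np.array([[0.0, 1.0]])              # excites the weakly observed direction
accepted = try_replace(F_stack, G_stack, G_new @ theta, G_new)
after = min_eig(G_stack)
```

Because a new point is accepted only when it improves the minimum eigenvalue, the conditioning of the stack never degrades between updates, which is what makes the piece-wise constant lower bound used in the analysis plausible.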

Since the time instances, $\{t_{i}\}_{i=1}^{M}$ , vary according to the minimum singular value maximization algorithm, the history stacks, $\mathcal{F}(t)$ and $\mathcal{G}(t)$ , are time-varying and piece-wise constant. The following definition establishes a uniform lower bound for the time-varying history stacks to facilitate the analysis that directly follows.

Definition 1.

A history stack $\left\{\left(\mathcal{F}_{i},\mathcal{G}_{i}\right)\right\}_{i=1}^{M}$ is called *uniformly full rank* if there exists a constant $\underline{c}\in\mathbb{R}$ such that

[TABLE]

where the matrix $\mathscr{G}\in\mathbb{R}^{\left(Nn^{2}+mn\right)\times\left(Nn^{2}+mn\right)}$ is defined as $\mathscr{G}\mathrel{\mathop{\mathchar 58\relax}}=\sum_{i=1}^{M}\mathcal{G}_{i}^{T}\mathcal{G}_{i}$ .

The concurrent learning update law to estimate the unknown parameters is then given by

[TABLE]

and the least-squares update law is

[TABLE]

Remark 3.

To facilitate the following Lyapunov analysis, using (61) and (65), the parameter estimation error can be expressed as

[TABLE]

Since the function $\mathscr{G}\mathrel{\mathop{\mathchar 58\relax}}\mathbb{R}_{\geq T_{0}}\to\mathbb{R}^{\left(Nn^{2}+mn\right)\times\left(Nn^{2}+mn\right)}$ is piece-wise continuous, the trajectories of (67), and of all the subsequent functions involving $\mathscr{G}$ , are defined in the sense of Carathéodory [49].
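The behavior of a concurrent learning update with a least-squares gain can be illustrated with a short simulation. The sketch below assumes the commonly used forms $\dot{\hat{\theta}}=k_{\theta}\Gamma\sum_{i=1}^{M}\mathcal{G}_{i}^{T}\left(\mathcal{F}_{i}-\mathcal{G}_{i}\hat{\theta}\right)$ and $\dot{\Gamma}=\beta_{1}\Gamma-k_{\theta}\Gamma\mathscr{G}\Gamma$ ; these are assumptions standing in for (65) and (66), whose exact expressions may differ. The history stack is held fixed and full rank, and the true parameters and gains are hypothetical.

```python
import numpy as np

# Fixed, full-rank history stack built from a hypothetical true parameter vector.
theta_true = np.array([1.0, -2.0, 0.5])
G_stack = [np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]),
           np.array([[0.0, 0.0, 1.0], [1.0, 1.0, 0.0]]),
           np.array([[0.0, 1.0, 1.0], [1.0, 0.0, 1.0]])]
F_stack = [G @ theta_true for G in G_stack]       # noise-free data, F_i = G_i theta

k_theta, beta1, dt, t_end = 0.5, 0.4, 1e-3, 20.0  # illustrative gains and horizon
theta_hat = np.zeros(3)                           # initial parameter estimate
Gamma = np.eye(3)                                 # initial least-squares gain
gram = sum(G.T @ G for G in G_stack)              # sum_i G_i^T G_i

err0 = np.linalg.norm(theta_true - theta_hat)
for _ in range(int(round(t_end / dt))):
    # Sum of stored prediction errors mapped back to parameter space.
    innovation = sum(G.T @ (F - G @ theta_hat) for F, G in zip(F_stack, G_stack))
    theta_dot = k_theta * Gamma @ innovation
    Gamma_dot = beta1 * Gamma - k_theta * Gamma @ gram @ Gamma
    theta_hat = theta_hat + dt * theta_dot        # forward Euler step
    Gamma = Gamma + dt * Gamma_dot
err_final = np.linalg.norm(theta_true - theta_hat)
```

No new data arrive during the run, so the decay of the error comes entirely from the stored pairs; this is the sense in which a full-rank history stack substitutes for persistent excitation.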

III-D Analysis

The following theorem establishes exponential convergence of the parameter estimates.

Theorem 2.

If there exists a time $T$ such that the history stack $\left\{\left(\mathcal{F}_{i}(T),\mathcal{G}_{i}(T)\right)\right\}_{i=1}^{M}$ is uniformly full rank, then the parameter estimates, $\hat{\theta}$ , updated using the parameter estimator in (65), converge to $\theta^{*}$ , exponentially over the interval $[T,\infty)$ .

Proof.

Consider the following positive definite candidate Lyapunov function

[TABLE]

Using arguments similar to [1, Corollary 4.3.2], it can be shown that provided $\lambda_{\min}\left\{\Gamma^{-1}\left(T_{0}\right)\right\}>0$ and Assumption 2 holds, the least squares gain matrix satisfies

[TABLE]

The candidate Lyapunov function satisfies

[TABLE]

where (69) implies that the bounds, $\overline{\Gamma}$ and $\underline{\Gamma}$ , in (70) are established independent of $T_{0}$ .

Using (65) and (66), along with the identity $\dot{\Gamma}^{-1}=-\Gamma^{-1}\dot{\Gamma}\Gamma^{-1}$ , the time-derivative of (68) (footnote 11: the time-derivative is defined as $\dot{V}\left(\tilde{\theta},t\right)\mathrel{\mathop{\mathchar 58\relax}}=\frac{\partial V}{\partial\tilde{\theta}}\left(\tilde{\theta},t\right)h_{\theta}\left(\tilde{\theta},t\right)+\frac{\partial V}{\partial t}\left(\tilde{\theta},t\right)$ , where $h_{\theta}\mathrel{\mathop{\mathchar 58\relax}}\mathbb{R}^{Nn^{2}+mn}\times\mathbb{R}_{\geq T_{0}}\to\mathbb{R}^{Nn^{2}+mn}$ is constructed using (61) and (65) so that $\dot{\tilde{\theta}}=h_{\theta}\left(\tilde{\theta},t\right)$ ) results in

[TABLE]

Simplifying (71), $\dot{V}(\tilde{\theta},t)$ becomes

[TABLE]

During the time interval $\left[T_{0},T\right]$ , when $\mathscr{G}$ is not full rank, Theorem 4.8 from [50] can be used to show uniform boundedness of $\tilde{\theta}$ . Once the history stack becomes full rank in the sense of Def. 1, using (68) and (72), along with the bounds in (64) and (69), Theorem 4.10 from [50] can be invoked to conclude that $\tilde{\theta}$ converges to the origin, exponentially over the interval $[T,\infty)$ . ∎

IV Simulation

IV-A Linear System

The linear system selected for the simulation study is given by

[TABLE]

To satisfy Assumption 1, a controller that results in a uniformly bounded system response is needed. In this simulation study, the controller, $u$ , is selected to be a PD controller of the form $u=-k_{p}\left(x_{1}-x_{d1}\right)-k_{d1}\left({x}_{2}-\dot{x}_{d1}\right)-k_{d2}\left({x}_{3}-\ddot{x}_{d1}\right)$ so that the system tracks the trajectory $x_{d1_{1}}\left(t\right)=x_{d1_{2}}\left(t\right)=-\frac{1}{3}\cos(3t)-\frac{1}{2}\cos(2t)-\cos(t)-\frac{1}{5}\cos(5t)-\frac{1}{7}\cos(7t)-\frac{1}{11}\cos(11t)$ , uniformly in $T_{0}$ , where the notation of $x_{i_{j}}$ represents the $j^{th}$ element of state $x_{i}$ . Since there are fourteen unknown parameters, and the desired trajectory contains six distinct frequencies, the closed-loop system is not guaranteed to be persistently excited.

The simulation utilizes Euler forward numerical integration using a sample time of $\Delta_{t}=0.001$ seconds. Past $\frac{\tau_{1}+\tau_{2}+\tau_{3}}{\Delta_{t}}$ values of the state, $x_{1}$ , and the control input, $u$ , are stored in a buffer. The matrices $\mathcal{F}$ and $\mathcal{G}$ for the parameter update law in (65) are computed using trapezoidal integration of the data stored in the aforementioned buffer. Values of $\mathcal{F}$ and $\mathcal{G}$ are stored in the history stack and are updated so as to maximize the minimum eigenvalue of $\mathscr{G}$ .

The initial estimates of the unknown parameters are selected to be zero, and the history stack is initialized so that all the elements of the history stack are zero. Data is added to the history stack using a singular value maximization algorithm. To demonstrate the utility of the developed method, three simulation runs are performed. In the first run, the parameter estimator has access to noise-free measurements of the output, $x_{1}$ . In the second and the third runs, zero-mean Gaussian noise with variance 0.001 and 0.01, respectively, is added to the output signal to simulate measurement noise. The values of various simulation parameters selected for the three runs are $\tau_{1}=1.5$ , $\tau_{2}=1.2$ , $\tau_{3}=1.0$ , $N=350$ , $\Gamma\left(T_{0}\right)=\text{I}_{14}$ , $\beta_{1}=0.4$ , $\alpha=0.5$ , $k=10$ , $\beta=2$ , $\alpha_{1}=1$ , and $k_{\theta}=\nicefrac{{2}}{{N}}$ . Figure 3(a) demonstrates that in the absence of noise, the developed parameter estimator drives the parameter estimation error, $\tilde{\theta}$ , to the origin. Figures 3(b) and 3(c) indicate that the developed method is robust to measurement noise, and results in convergence rates that are similar to the noise-free case, with a small increase in the steady state error due to measurement noise.
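When reproducing the noisy runs, note that the noise level is specified as a variance, whereas most random number generators take a standard deviation. The following minimal sketch shows one way to generate such measurement noise with NumPy; the placeholder output array, the sample count, and the seed are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)              # fixed seed, for repeatability only
variance = 0.001                            # noise variance used in the second run
x1_clean = np.zeros((100_000, 2))           # placeholder for noise-free output samples
# Generator.normal takes the standard deviation, so pass sqrt(variance).
noise = rng.normal(loc=0.0, scale=np.sqrt(variance), size=x1_clean.shape)
y_measured = x1_clean + noise
```

Passing the variance directly as `scale` would inject noise with variance $10^{-6}$ instead of $10^{-3}$ , which would make the estimator appear far more robust than it is.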

A one-at-a-time sensitivity analysis was performed on the parameters $\tau_{1},\tau_{2},\tau_{3},\beta_{1},$ and $k_{\theta}$ to gauge robustness of the developed technique. As demonstrated by the results in Table I, the developed method is robust to small changes in the integration intervals and learning gains.

IV-B Nonlinear System

The developed state and parameter estimator is validated using a simulation study involving a two-link robot manipulator arm, where $x_{1}\in\mathbb{R}^{2}$ denotes the angular position of the two links, $x_{2}\in\mathbb{R}^{2}$ denotes the angular velocities of the two links, and $x=\begin{bmatrix}x^{T}_{1}&x^{T}_{2}\end{bmatrix}^{T}.$

The selected model belongs to a sub-class of systems in (1), where the function approximation error, $\varepsilon$ , is zero. The model is selected because the ideal parameters, $\theta,$ are known, and as a result, the model facilitates direct quantitative analysis of the parameter estimation error.

The nonlinear dynamics of the system are described by (1), where

[TABLE]

In (73), $u\in\mathbb{R}^{2}$ is the control input,

[TABLE]

$M\left(x_{1}\right)\mathrel{\mathop{\mathchar 58\relax}}=\begin{bmatrix}a_{1}+2a_{3}\textnormal{c}_{2}\left(x_{1}\right),&a_{2}+a_{3}\textnormal{c}_{2}\left(x_{1}\right)\\ a_{2}+a_{3}\textnormal{c}_{2}\left(x_{1}\right),&a_{2}\end{bmatrix},$ and $V_{m}\left(x_{1},x_{2}\right)\mathrel{\mathop{\mathchar 58\relax}}=\begin{bmatrix}-a_{3}\textnormal{s}_{2}\left(x_{1}\right)x_{2_{2}},&-a_{3}\textnormal{s}_{2}\left(x_{1}\right)\left(x_{2_{1}}+x_{2_{2}}\right)\\ a_{3}\textnormal{s}_{2}\left(x_{1}\right)x_{2_{1}},&0\end{bmatrix},$ where $\textnormal{c}_{2}\left(x_{1}\right)=\cos\left(x_{1_{2}}\right),$ $\textnormal{s}_{2}\left(x_{1}\right)=\sin\left(x_{1_{2}}\right)$ , and $a_{1}=3.473$ , $a_{2}=0.196$ , and $a_{3}=0.242$ are known constants. The system has four unknown parameters. The ideal values of the unknown parameters are $\theta=\begin{bmatrix}5.3&1.1&8.45&2.35\end{bmatrix}^{T}.$
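For readers reproducing the simulation, the two model matrices can be evaluated as in the following minimal sketch. It transcribes $M\left(x_{1}\right)$ and $V_{m}\left(x_{1},x_{2}\right)$ as written above with the stated constants $a_{1}$ , $a_{2}$ , and $a_{3}$ ; the function names and the sample configuration are hypothetical, and the terms involving the unknown parameters $\theta$ are not included.

```python
import numpy as np

a1, a2, a3 = 3.473, 0.196, 0.242            # known constants from the text

def inertia_matrix(x1):
    """M(x1) for joint angles x1 = [q1, q2]."""
    c2 = np.cos(x1[1])
    return np.array([[a1 + 2.0 * a3 * c2, a2 + a3 * c2],
                     [a2 + a3 * c2,       a2]])

def coriolis_matrix(x1, x2):
    """V_m(x1, x2) for joint angles x1 and joint velocities x2."""
    s2 = np.sin(x1[1])
    return np.array([[-a3 * s2 * x2[1], -a3 * s2 * (x2[0] + x2[1])],
                     [ a3 * s2 * x2[0],  0.0]])

# Hypothetical configuration used only to exercise the functions.
x1 = np.array([0.3, 1.1])
x2 = np.array([0.7, -0.4])
M = inertia_matrix(x1)
Vm = coriolis_matrix(x1, x2)
```

As a sanity check on the transcription, $M$ is symmetric and positive definite for every configuration, since its determinant equals $a_{1}a_{2}-a_{2}^{2}-a_{3}^{2}\cos^{2}\left(x_{1_{2}}\right)>0$ for the stated constants.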

To satisfy Assumption 1, a controller that results in a uniformly bounded system response is needed. In this simulation study, the controller, $u$ , is selected to be a PD controller of the form $u=-k_{p}\left(x_{1}-x_{d1}\right)-k_{d}\left({x}_{2}-\dot{x}_{d1}\right)$ so that the system tracks the trajectory $x_{d1_{1}}\left(t\right)=x_{d1_{2}}\left(t\right)=-\frac{1}{3}\cos\left(3t\right)-\frac{1}{2}\cos\left(2t\right)$ , uniformly in $T_{0}$ .

The simulation uses forward Euler numerical integration with a sample time of $\Delta_{t}=0.001$ seconds. The past $\frac{\tau_{1}+\tau_{2}}{\Delta_{t}}$ values of the output, $x_{1}$, the state estimates, $\hat{x}$, and the control input, $u$, are stored in a buffer. The matrices $P$, $\hat{G}$, and $\hat{F}$ for the parameter update law in (25) are computed using trapezoidal integration of the data stored in this buffer. Values of $P$, $\hat{G}$, and $\hat{F}$ are stored in the history stack and are updated according to the algorithm detailed in Fig. 1.
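The buffering and quadrature steps can be illustrated with a short Python sketch. It covers only the fixed-length buffer and the trapezoidal rule for a scalar signal; the integrands that actually define $P$, $\hat{G}$, and $\hat{F}$ are given elsewhere in the paper and are not reproduced here. The names `make_buffer` and `trapezoid` are hypothetical.

```python
from collections import deque


def make_buffer(tau1, tau2, dt):
    """Fixed-length buffer that keeps the most recent (tau1 + tau2) / dt samples."""
    return deque(maxlen=round((tau1 + tau2) / dt))


def trapezoid(samples, dt):
    """Trapezoidal-rule integral of uniformly spaced scalar samples."""
    if len(samples) < 2:
        return 0.0
    # Interior samples get weight dt, the two end samples get weight dt / 2.
    return dt * (sum(samples) - 0.5 * (samples[0] + samples[-1]))


# Example with the values used in this section: 2100 samples at dt = 0.001.
buffer = make_buffer(1.2, 0.9, 0.001)
for k in range(3000):
    buffer.append(k * 0.001)  # stand-in signal; old samples are dropped automatically
window_integral = trapezoid(buffer, 0.001)
```

A `deque` with `maxlen` discards the oldest sample on each append once it is full, which matches the sliding-window behavior described above without explicit index bookkeeping.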

The initial estimates of the unknown parameters are selected to be zero, and the history stack is initialized so that all of its elements are zero. (Footnote 12: It is clear from the simulation results that full rank initialization of the history stack and the normalization terms in (25) and (26) are sufficient, but not necessary, conditions for the analysis in Section II-F.) Data is added to the history stack using a singular value maximization algorithm.

To demonstrate the utility of the developed method, three simulation runs are performed. In the first run, the observer is assumed to have access to noise-free measurements of the output, $x_{1}$. In the second and third runs, zero-mean Gaussian noise with variance 0.001 and 0.01, respectively, is added to the output signal to simulate measurement noise. The values of the simulation parameters selected for the three runs are $\tau_{1}=1.2$, $\tau_{2}=0.9$, $N=150$, $\Gamma\left(T_{0}\right)=\text{I}_{4}$, $\beta_{1}=0.7$ ($0.9$ for variance $0.01$), $\alpha=2$, $k=10$, $\beta=2$, $\alpha_{1}=1$, $\kappa=0$, and $k_{\theta}=\nicefrac{{0.5}}{{N}}$.

Figures 4(a) - 5(a) demonstrate that in the absence of noise, the developed method drives the state estimation errors, $\tilde{x}$, and the parameter estimation errors, $\tilde{\theta}$, to a neighborhood of the origin. Figures 4(b) - 5(c) indicate that the developed technique can be utilized in the presence of measurement noise, with the expected degradation of performance.
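The idea behind singular value maximization can be sketched in Python as follows. This is not the algorithm of Fig. 1: it is a generic illustration restricted to two-dimensional regressor vectors, so that the smallest eigenvalue of $\sum_{i}g_{i}g_{i}^{T}$ has a closed form, and, unlike the zero-initialized stack described above, it starts from an empty stack and fills it before swapping. All names are hypothetical.

```python
import math


def min_eig_2x2(stack):
    """Smallest eigenvalue of sum_i g_i g_i^T for 2-D regressors g_i."""
    a = sum(g[0] * g[0] for g in stack)
    b = sum(g[0] * g[1] for g in stack)
    c = sum(g[1] * g[1] for g in stack)
    return 0.5 * (a + c) - math.sqrt(0.25 * (a - c) ** 2 + b * b)


def try_insert(stack, candidate, capacity):
    """Add candidate if there is room; otherwise swap it into the slot that most
    increases the smallest eigenvalue. Returns True if the stack changed."""
    if len(stack) < capacity:
        stack.append(candidate)
        return True
    best_value, best_slot = min_eig_2x2(stack), None
    for slot in range(len(stack)):
        trial = stack[:slot] + [candidate] + stack[slot + 1:]
        value = min_eig_2x2(trial)
        if value > best_value:  # accept only strict improvements
            best_value, best_slot = value, slot
    if best_slot is None:
        return False
    stack[best_slot] = candidate
    return True
```

A candidate that is nearly parallel to data already in the stack is rejected, while one that adds a new direction replaces a redundant entry. This is why the stored data can become sufficiently rich after excitation over only a finite interval.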

One-at-a-time sensitivity analysis was performed on the parameters $\tau_{1},\tau_{2},\beta_{1},$ and $k_{\theta}$ to gauge robustness of the developed technique. As demonstrated by the results in Table II, the developed method is robust to small changes in the integration intervals and learning gains.
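A one-at-a-time sweep of this kind can be organized as in the sketch below. The harness is generic: `run` stands for any function that executes one simulation and returns a scalar performance metric, and the `toy_metric` used in the example is a made-up placeholder, not the estimator from this paper. The nominal values are the ones listed for the nonlinear simulation; the swept values are arbitrary.

```python
def one_at_a_time(run, nominal, sweeps):
    """Vary one parameter at a time, holding all others at their nominal values.

    run     -- callable taking the parameters as keyword arguments
    nominal -- dict of nominal parameter values
    sweeps  -- dict mapping a parameter name to the values to try
    """
    results = {}
    for name, values in sweeps.items():
        for value in values:
            params = dict(nominal, **{name: value})
            results[(name, value)] = run(**params)
    return results


def toy_metric(tau1, tau2, beta1, k_theta):
    """Placeholder metric for demonstration only; it has no physical meaning."""
    return tau1 + tau2 + beta1 + k_theta


nominal = {"tau1": 1.2, "tau2": 0.9, "beta1": 0.7, "k_theta": 0.5 / 150}
sweeps = {"tau1": [1.0, 1.4], "beta1": [0.5, 0.9]}  # arbitrary example values
table = one_at_a_time(toy_metric, nominal, sweeps)
```

Because only one entry of `nominal` is overridden per run, any change in the metric can be attributed to that parameter alone, at the cost of not capturing interactions between parameters.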

V Conclusion

This paper develops a concurrent learning based adaptive observer and parameter estimator to simultaneously estimate the unknown parameters and the states of linear and nonlinear systems using output measurements. The developed technique utilizes a dynamic state observer to generate state estimates necessary for data-driven adaptation. A purging algorithm is developed to improve the quality of the stored data as the state estimates converge to the true states.

The developed state and parameter estimation method allows for simultaneous estimation of the system states and the uncertain parameters in the system model without the need for full state feedback, and facilitates parameter convergence without requiring PE. Theoretical guarantees for uniform ultimate boundedness of the estimation errors are established in the absence of measurement noise. Simulation results indicate that the developed method is robust to measurement noise and to small changes in the design parameters. For the class of linear systems presented, parameter estimation can be performed independently of state estimation, which facilitates exponential convergence of the parameter estimation errors. Future work will involve analyzing the applicability of feedback linearization, along with a theoretical analysis of the developed method under measurement noise and process noise. A theoretical analysis of the effect of the integration intervals, $\tau_{i}$, on the performance of the developed estimator will also be pursued.

