Online Simultaneous State and Parameter Estimation for Second-order Nonlinear Systems
Rushikesh Kamalapurkar

TL;DR
This paper introduces a novel online adaptive observer for second-order nonlinear systems that estimates states and parameters simultaneously without needing persistent excitation, using a Lyapunov-based approach.
Contribution
It presents a concurrent learning-based method for real-time state and parameter estimation in nonlinear systems, reducing excitation requirements compared to traditional methods.
Findings
Estimation errors are uniformly ultimately bounded.
The method works with finite excitation intervals.
No persistent excitation needed for convergence.
Abstract
In this paper, a concurrent learning based adaptive observer is developed for a class of second-order nonlinear time-invariant systems with uncertain dynamics. The developed technique results in simultaneous online state and parameter estimation. A Lyapunov-based analysis is used to show that the state and parameter estimation errors are uniformly ultimately bounded. As opposed to persistent excitation which is required for parameter estimation in traditional adaptive control methods, the developed technique only requires excitation over a finite time interval.
| Parameter | Tested Values | RMS Error Variation | Steady-State RMS Error Variation |
|---|---|---|---|
| | 1.1 - 2.0 | 55.91 - 64.21 | 0.1255 - 0.1548 |
| | 0.8 - 1.7 | 56.22 - 65.61 | 0.1134 - 0.1339 |
| | 0.6 - 1.5 | 56.98 - 64.98 | 0.1206 - 0.1337 |
| | 0.05 - 0.9 | 58.50 - 63.04 | 0.1265 - 0.2509 |
| | - | 58.14 - 62.62 | 0.1161 - 0.1266 |

| Parameter | Tested Values | RMS Error Variation | Steady-State RMS Error Variation |
|---|---|---|---|
| | 0.8 - 1.7 | 0.998 - 3.848 | 0.0325 - 0.1339 |
| | 0.5 - 1.4 | 1.011 - 3.546 | 0.0294 - 0.1270 |
| | 0.1 - 1.2 | 1.224 - 1.763 | 0.0324 - 0.3273 |
| | - | 1.090 - 1.684 | 0.0296 - 0.0515 |
Online Simultaneous State and Parameter Estimation
Ryan Self, Moad Abudia, S M Nahid Mahmud, and Rushikesh Kamalapurkar

The authors are with the School of Mechanical and Aerospace Engineering, Oklahoma State University, Stillwater, OK, USA. {rself, abudia, nahid.mahmud, rushikesh.kamalapurkar}@okstate.edu. This research was supported, in part, by the National Science Foundation (NSF) under award number 1925147, and the Air Force Research Laboratories (AFRL) under award number FA8651-19-2-0009. Any opinions, findings, conclusions, or recommendations detailed in this article are those of the author(s), and do not necessarily reflect the views of the sponsoring agencies.
Abstract
In this paper, a data-driven adaptive observer is developed for a class of linear and nonlinear time-invariant systems with uncertain dynamics. The developed state-observer is utilized to generate estimates of the state from input-output data. Using the estimated state trajectories, in addition to the known system inputs, a novel data-driven parameter estimation scheme is developed to achieve simultaneous state and parameter estimation, online, for both linear and nonlinear systems. The technique results in state and parameter estimation errors that are uniformly ultimately bounded to prescribed bounds near the origin. As opposed to persistent excitation, which is required for parameter convergence in traditional adaptive observers, the developed technique only requires excitation over a finite time interval. A sensitivity analysis and simulation results in both noise-free and noisy environments are presented to validate the design.
I Introduction
Traditional adaptive control methods handle uncertainty in the system dynamics by maintaining a parametric estimate of the model and utilizing it to generate a feedforward control signal (see, e.g., [1, 2, 3]). While the feedforward-feedback architecture guarantees stability of the closed loop, the control law is not robust to disturbances, and seldom provides information regarding the quality of the estimated model (cf. [1] and [2]). While accurate parameter estimation can improve robustness and transient performance of adaptive controllers (see, e.g., [4, 5, 6]), parameter convergence typically requires restrictive assumptions such as persistence of excitation (PE) [1, 7, 8, 9]. An excitation signal is often added to the controller to ensure PE; however, the added signal can cause mechanical fatigue and compromise the tracking performance. This motivates the development of techniques that facilitate parameter convergence without requiring PE.
Parameter convergence can be achieved under a finite excitation condition using data-driven methods, such as concurrent learning (CL) (see, e.g., [6, 10, 11]), where the parameters are estimated by storing data during time-intervals when the system is excited, and then utilizing the stored data to drive adaptation when excitation is unavailable. In addition to parameter estimation, CL adaptive control methods also possess robustness to bounded disturbances similar to that of σ-modification, e-modification, etc., without the associated drawbacks such as drawing the parameter estimates to arbitrary set-points [6, 12, 10, 11]. CL has been shown to be an effective tool for adaptive control (see, e.g., [6, 12, 10, 11]) and adaptive estimation (see, e.g., [13, 14, 15, 16, 17, 18]).
Adaptation techniques similar to the CL method are utilized to implement reinforcement learning under finite excitation conditions in results such as [13, 14, 15, 16, 17, 18]. CL methods have also been extended to classes of switched systems [19, 20], systems driven by stochastic processes [21], and systems with time-varying parameters [22]. A major drawback of CL methods is that they require numerical differentiation of the state measurements. CL methods that do not require numerical differentiation of the state measurements are developed in results such as [23] and [24]; however, they require full state feedback. Since full state feedback is often not available, the development of an output-feedback CL framework is well-motivated. In order to achieve parameter convergence using output-feedback CL, simultaneous estimates of the unmeasurable states are required.
Due to advantageous properties such as the separation principle, there is a large body of literature on simultaneous state and parameter estimation for linear systems [25, 26, 27]. Estimation methods for linear systems typically use popular techniques, such as Kalman filters, because of their well-documented effectiveness. More recently, researchers have also explored the state and parameter estimation problem for nonlinear systems [28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38]. While tools such as particle filters [30], extended Kalman filters [31], multi-observers [32], and adaptive observers [33, 34, 35, 36, 37, 38] have been examined for nonlinear simultaneous state and parameter estimation, they either do not provide theoretical performance guarantees [30, 31] or require stringent assumptions [32, 33, 34, 35, 36, 37, 38], such as PE, which are generally difficult, if not impossible, to check online. While relaxed PE results are presented in [39, 40, 41], these results still require a persistent excitation condition. Therefore, this paper aims to provide both theoretical guarantees and finite (as opposed to persistent) excitation assumptions that are verifiable online.
In this paper, the preliminary results from [42] and [43] are consolidated and generalized to yield an output feedback concurrent learning method for simultaneous state and parameter estimation in uncertain linear and nonlinear systems. In particular, this paper yields a formal method for simultaneous state and parameter estimation for a broad class of dynamical systems that includes the Brunovsky canonical form studied in [42] and [43] as a special case. An adaptive state-observer is utilized to generate estimates of the state from input-output data. The estimated state trajectories along with the known inputs are then utilized in a novel data-driven parameter estimation scheme to achieve simultaneous state and parameter estimation. Convergence of the state estimates and the parameter estimates to a small neighborhood of the origin is established under a finite (as opposed to persistent) excitation condition.
The paper is organized as follows. In Section II, the class of nonlinear systems that the developed method applies to is described. An integral error system that facilitates parameter estimation is developed in Section II-B. Section II-C is dedicated to the design of a robust state observer. Section II-D details the developed parameter estimator. Section II-E details the algorithm for selection and storage of the data that is used to implement concurrent learning. Section II-F is dedicated to a Lyapunov-based analysis of the developed technique. In Section III, linear systems are considered. A linear error system is developed in Section III-B to facilitate CL-based adaptation. A CL-based parameter estimator is designed in Section III-C. A Lyapunov-based stability analysis of the parameter estimator is presented in Section III-D. Section IV demonstrates the efficacy of the developed method via a numerical simulation and Section V concludes the paper.
II Nonlinear Systems
II-A Problem Formulation
Consider a nonlinear system of the form
[TABLE]
where and denote the state variables, is the system state, and are known and locally Lipschitz continuous, is locally Lipschitz continuous, is the controller, denotes the output, and denotes the measurable part of the system state. The model, , is comprised of a known nominal part and an unknown part, i.e., , where is known and locally Lipschitz and is unknown and locally Lipschitz. The objective is to design an adaptive estimator to identify the state, , and the unknown function, , online, using input-output measurements.
Systems of the form (1) encompass -order linear systems and Euler-Lagrange models with invertible inertia matrices, and hence, represent a wide class of physical plants, including but not limited to robotic manipulators and autonomous ground, aerial, and underwater vehicles.
Assumption 1.
A compact set such that and is known, where denotes the initial time. [Footnote 1: For the notation denotes the interval and the notation denotes the interval .]
Remark 1.
The problem formulation in (1) incorporates commonly occurring dynamical systems described using the Brunovsky canonical form [44]
[TABLE]
and the extended Brunovsky form
[TABLE]
where denote the state variables, is the system state, and are locally Lipschitz continuous, is the controller, denotes the output, and denotes the measurable part of the system state.
II-B Error System for Estimation
Given a constant , there exist and , such that the unknown function can be approximated, over the compact set , using basis functions as , where denotes the approximation error, is a constant matrix of unknown parameters, and , , , , and [45, 46]. To obtain an error signal for parameter identification, the system in (1) is expressed in the form
[TABLE]
Integrating (4) over the interval for some constant and then over the interval for some constant ,
[TABLE]
and denotes the double integral operator
[TABLE]
Using the Fundamental Theorem of Calculus and the fact that , for almost all ,
[TABLE]
and denotes the single integral operator
[TABLE]
The expression in (7) can be rearranged to form the affine system
[TABLE]
where [Footnote 2: The matrices , and are evaluated along the trajectories of (1), and as such, are functions of , and . Since the bound on and imposed by Assumption 1 is uniform in , the dependence of , and on is not relevant to the subsequent analysis, and as such, is not made explicit in the notation.]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
The affine relationship in (7) is valid for all ; however, it provides useful information about the vector only after . In the following, (7) will be used to solve the simultaneous state and parameter estimation problem.
While (7) can be used to learn the unknown parameters, , knowledge of the state variable is required to compute the matrices and . A robust adaptive state estimator is developed in the following to generate estimates of .
II-C State Estimator Design
To generate estimates of , a state estimator inspired by [47] is developed. The estimator is given by
[TABLE]
where , , , , , and are estimates of , , , , and , respectively, and is a feedback term designed in the following.
To facilitate the design of , let the state and parameter estimation errors be defined as
[TABLE]
and define the model error as
[TABLE]
where is a positive constant, and . The feedback component is designed as
[TABLE]
where the signal is added to compensate for the fact that the state variable is not measurable. Based on the subsequent stability analysis, the signal is designed as the output of the dynamic filter
[TABLE]
where and are positive constants and the error signal is defined as
[TABLE]
and .
Using integration by parts to eliminate the auxiliary variable , the dynamic filter can be expressed in the equivalent form
[TABLE]
In the following, the filter in (II-C) is used for implementation and the filter in (18), which is not implementable due to its dependence on , is used for analysis.
Since and are locally Lipschitz, given a compact set , Assumption 1 can be used to conclude that there exists an , independent of , such that
[TABLE]
and
[TABLE]
To generate the estimates , a concurrent learning [48] technique that utilizes only the output measurements is developed, motivated by the affine error system in (7).
II-D Parameter Estimator Design
To obtain an output-feedback concurrent learning update law for the parameter estimates, a history stack, denoted by , is utilized. A history stack is defined as a set of ordered pairs such that
[TABLE]
where is a matrix with an induced -norm that is small enough in a sense that is made precise in the subsequent analysis. Typically, a history stack that satisfies (20) is not available a priori. The history stack is recorded online using the relationship in (7), by selecting an increasing set of time-instances (see Fig. 1) and letting
[TABLE]
where [Footnote 3: The matrices and are evaluated along the trajectories of (12), (18), (25), and (26), and as such, depend on , , , , and . For brevity of notation, the matrices are denoted as functions of time.]
[TABLE]
[TABLE]
where denotes the double integral operator
[TABLE]
and denotes the single integral operator
[TABLE]
In this case, the error term is given by Let be an interval over which the history stack was recorded. Provided the states and the state estimation errors remain within the compact sets and , respectively, [Footnote 4: and .] over , the error terms can be bounded as
[TABLE]
where and are constants.
The concurrent learning update law to estimate the unknown parameters is designed as
[TABLE]
where is a constant adaptation gain and is the least-squares gain updated using the update law
[TABLE]
where the matrix is defined as and .
II-E Purging
The update law in (25) is motivated by the fact that if the full state were available for feedback and if the approximation error, , were zero, then using
[TABLE]
the parameters could be estimated via the least squares estimate
[TABLE]
However, since the history stack contains the estimated terms and , during the transient period where the state estimation error is large, the history stack does not accurately (within the error bound introduced by ) represent the system dynamics. Hence, the history stack needs to be purged whenever better estimates of the state are available.
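The effect described above can be seen in a small batch least-squares experiment. The sketch assumes a vector-valued parameter and stacked pairs `(Y_i, F_i)` with `F_i = Y_i @ theta`; the dimensions, noise level, and variable names are hypothetical. Perturbing the stored regressors stands in for state-estimation error during the transient.

```python
import numpy as np

def least_squares_from_stack(Y_list, F_list):
    """Minimize sum_i ||F_i - Y_i theta||^2 over the stored data."""
    Y = np.vstack(Y_list)           # stack regressor blocks row-wise
    F = np.concatenate(F_list)      # stack the matching targets
    theta, *_ = np.linalg.lstsq(Y, F, rcond=None)
    return theta

rng = np.random.default_rng(0)
theta_true = np.array([1.0, -2.0, 0.5])            # hypothetical parameters
Y_list = [rng.normal(size=(2, 3)) for _ in range(5)]
F_list = [Y @ theta_true for Y in Y_list]

# Regressors corrupted by a stand-in for state-estimation error.
Y_noisy = [Y + 0.3 * rng.normal(size=Y.shape) for Y in Y_list]

err_exact = np.linalg.norm(least_squares_from_stack(Y_list, F_list) - theta_true)
err_noisy = np.linalg.norm(least_squares_from_stack(Y_noisy, F_list) - theta_true)
```

With exact regressors the estimate matches the true parameters to numerical precision, while the corrupted stack yields a biased estimate no matter how long adaptation runs, which is why stale data has to be replaced.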
Since the state estimator exponentially drives the estimation error to a small neighborhood of the origin, a newer estimate of the state can be assumed to be at least as good as an older estimate, apart from the small error introduced by practical stability of the estimator. This fact motivates the dwell time based greedy purging algorithm developed in the following to utilize newer data for estimation while preserving stability of the estimator.
The algorithm maintains two history stacks, a main history stack and a transient history stack, labeled and , respectively. As soon as the transient history stack is full and sufficient dwell time has passed, the main history stack is emptied and the transient history stack is copied into the main history stack. A lower bound on the required dwell time, denoted by , is determined in Section II-F using a Lyapunov-based stability analysis.
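The two-stack bookkeeping can be sketched as follows. This is a simplified illustration and not the algorithm of Fig. 1: the class name, capacity, dwell time, and the choice to drop new points while the transient stack is full are assumptions (the paper instead keeps refining a full stack through singular value maximization).

```python
from dataclasses import dataclass, field

@dataclass
class PurgingStacks:
    capacity: int              # number of data points in a full stack
    dwell_time: float          # minimum time between purges
    main: list = field(default_factory=list)
    transient: list = field(default_factory=list)
    last_purge: float = 0.0
    purges: int = 0

    def record(self, t, point):
        """Store a point in the transient stack, then purge if permitted."""
        if len(self.transient) < self.capacity:
            self.transient.append(point)
        full = len(self.transient) == self.capacity
        if full and t - self.last_purge >= self.dwell_time:
            self.main = self.transient      # newer data replaces older data
            self.transient = []
            self.last_purge = t
            self.purges += 1

stacks = PurgingStacks(capacity=3, dwell_time=1.0)
stacks.main = ["init0", "init1", "init2"]   # arbitrary initial main stack
for k in range(10):
    stacks.record(t=0.25 * k, point=f"p{k}")
```

In this run the transient stack fills at `t = 0.5`, but the first purge waits until the dwell time has elapsed at `t = 1.0`; the main stack is therefore never empty and is only ever replaced by a complete newer stack.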
Parameter identification in the developed framework requires a full rank history stack , which is achieved provided the trajectories contain sufficient information, as quantified by the following assumption.
Assumption 2.
There exist such that for all , , , and system trajectories in response to the controllers , there exist and time instances , such that a history stack recorded using Fig. 1 satisfies
[TABLE]
where denotes the minimum singular value of a matrix.
Remark 2.
Uniformity of excitation, with respect to initial conditions and the true state and control trajectories, is required for uniform stability of the estimator (cf. [39]). If uniformity of excitation cannot be guaranteed, then, as long as (29) holds for a specific set of initial conditions and state and control trajectories, the estimation error of the developed state and parameter estimator, starting from the given initial conditions and evaluated along the given true state and control trajectories, can be shown to be ultimately bounded using analysis techniques similar to Section II-F.
Motivated by the observation that the rate of decay of the parameter estimation errors is proportional to the minimum singular value of , a singular value maximization algorithm is used to select the time instances . That is, a data-point in the history stack is replaced with a new data-point , where , , and , for some , only if
[TABLE]
where denotes the minimum singular value of a matrix, is a tunable constant, , , and . To simplify the analysis, it is assumed that new data points are only collected seconds after a purging event. Since the history stack is updated using a singular value maximization algorithm, the matrix is a piece-wise constant function of time with the property that once it satisfies (29), at some , and for some , the condition holds for all . The developed purging algorithm is summarized in Fig. 1.
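A minimal sketch of the selection rule follows. The exact inequality in (30) is not reproduced here; the code assumes the replacement is accepted only when the minimum singular value of the sum of `Y_i^T Y_i` grows by at least a factor `1 + xi`, and the function names and the value of `xi` are hypothetical.

```python
import numpy as np

def min_singular_value(Y_list):
    """Minimum singular value of sum_i Y_i^T Y_i."""
    S = sum(Y.T @ Y for Y in Y_list)
    return np.linalg.svd(S, compute_uv=False).min()

def try_replace(Y_list, Y_new, xi=0.1):
    """Swap Y_new into the slot that most increases the minimum singular
    value, but only if the increase is at least a factor (1 + xi)."""
    base = min_singular_value(Y_list)
    best_val, best_idx = base, None
    for i in range(len(Y_list)):
        candidate = Y_list[:i] + [Y_new] + Y_list[i + 1:]
        val = min_singular_value(candidate)
        if val > best_val:
            best_val, best_idx = val, i
    if best_idx is not None and best_val >= (1.0 + xi) * base:
        return Y_list[:best_idx] + [Y_new] + Y_list[best_idx + 1:], True
    return Y_list, False

# Rank-deficient stack: both points excite only the first direction.
stack = [np.array([[1.0, 0.0]]), np.array([[1.0, 0.0]])]
stack, accepted_first = try_replace(stack, np.array([[0.0, 1.0]]))
sigma_after = min_singular_value(stack)

# A redundant point does not improve the stack and is rejected.
stack, accepted_second = try_replace(stack, np.array([[1.0, 0.0]]))
```

Because a swap is accepted only when it raises the minimum singular value, that value never decreases between purges, which is the monotonicity property the analysis relies on.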
A Lyapunov-based analysis showing uniform ultimate boundedness of the parameter and the state estimation errors is presented in the following section.
II-F Analysis
Each purging event represents a discontinuous change in the system dynamics; hence, the resulting closed-loop system is a switched system. To facilitate the analysis of the switched system, let denote a switching signal such that , and , where denotes the number of times the update was carried out over the time interval . For a given , let denote the history stack active during the time interval , containing the elements , and let be the corresponding error term. To simplify the notation, let , and .
Using (20) and (25), the dynamics of the parameter estimation error can be expressed as
[TABLE]
Since the functions and are piece-wise continuous, the trajectories of (31), and of all the subsequent error systems involving and and , are defined in the sense of Carathéodory [49]. The algorithm in Fig. 1 ensures that there exists a constant such that .
Using the dynamics in (1), (12)-(18), and the design of the feedback component in (15), the evolution of the error signal is described by
[TABLE]
where and . Since and are locally Lipschitz, given a compact set , Assumption 1 can be used to conclude that there exist , independent of , such that
[TABLE]
and
[TABLE]
To facilitate the analysis, let be a set of switching time instances defined as That is, for a given switching index denotes the time instance when the th subsystem is switched on. The analysis is carried out separately over the time intervals , , where . Since the history stack is not updated over the intervals , , the matrices and are constant over each interval. The history stack that is active over the interval is denoted by . To ensure boundedness of the trajectories in the interval , the history stack is computed using arbitrarily selected trajectories that are confined within and make full rank. [Footnote 5: Arbitrary selection of results in potentially large initial error in (20). While large could potentially result in large parameter estimation errors, , during , as long as is full rank, the first term in (31) ensures that remains bounded over .] The analysis is carried out over the aforementioned intervals using the state vectors $Z := \left[\tilde{x}_{1}^{T}\ \ \tilde{x}_{2}^{T}\ \ r^{T}\ \ \eta^{T}\ \ \mathrm{vec}\left(\tilde{\theta}\right)^{T}\right]^{T} \in \mathbb{R}^{n_{1}+3n_{2}+p}$ and .
A summary of the stability analysis is provided in the following, along with a graphical representation in Fig. 2.
Interval 1: First, it is established that is bounded over , where the bound is [Footnote 6: denotes that there exists such that .]. Then, for a given , the bound on is utilized to select state estimator gains such that .
Interval 2: The history stack , which is active over , is recorded over . Without loss of generality, it can be ensured that represents the system better than (which is arbitrarily selected), that is, . The bound on over is then shown to be smaller than that over , which is utilized to show that for all .
Interval 3: Using (24), the errors are shown to be where denotes the value of at the time when the point was recorded. Using the facts that the history stack , which is active over , is recorded over and for all , the error is shown to be . If then it is established that . If then the fact that the bound on over is smaller than that over is utilized to show that for all . The analysis is then continued in an inductive argument to show that and for all .
The stability result is summarized in the following theorem.
Theorem 1.
Let be given. If Assumptions 1 and 2 hold, the history stacks and are populated using the algorithm detailed in Fig. 1, the learning gains are selected to satisfy the sufficient gain conditions in (38), (39), (44), and (48), there exists a time instance such that the system states are informative over , that is, the history stack can be replenished if purged at any time , over each switching interval , the dwell-time, , is selected such that , where is selected to be large enough to satisfy (47), and the excitation interval is large enough so that , then . [Footnote 7: A minimum of two purges is required to remove the randomly initialized data, and the data recorded during the transient phase of the derivative estimator, from the history stack.]
Proof.
Consider the candidate Lyapunov function
[TABLE]
Using arguments similar to [1, Corollary 4.3.2], it can be shown that provided and Assumption 2 holds, the least squares gain matrix satisfies
[TABLE]
where and are positive constants, and denotes an identity matrix.
The bound in (35) implies that the candidate Lyapunov function satisfies
[TABLE]
where and . For brevity, function dependencies will be omitted over the rest of the analysis.
Over the time interval , the orbital derivative of is given by [Footnote 8: where is constructed using (18), (17), (26), (31), and (32) so that .]
[TABLE]
Assuming that was computed using values of that correspond to trajectories that stay inside , the orbital derivative can be bounded by
[TABLE]
where , is a positive constant such that , and the bounds and depend on the compact set . Provided
[TABLE]
then (37) simplifies to
[TABLE]
Since in the domain
[TABLE]
That is, is negative definite on provided was computed using values of that correspond to trajectories that stay inside , and provided , where
[TABLE]
and . Theorem 4.18 from [50] can be invoked to conclude that provided the gain condition
[TABLE]
holds, where is a constant, then .
In particular, by initializing using arbitrary values of that satisfy for all , it can be concluded that
[TABLE]
where is a constant such that . Using the relationships in (36) and (40), it can further be concluded
[TABLE]
If it were possible to use the inequality in (40) to conclude that , then an inductive argument could be used to show that the trajectories decay to a neighborhood of the origin. However, unless the history stack can be selected to have arbitrarily large minimum singular value (which is generally not possible), the constant cannot be made arbitrarily small using the learning gains.
Since depends on , it can be made smaller by reducing the estimation errors and thereby reducing the errors associated with the data stored in the history stack. To that end, consider the candidate Lyapunov function
[TABLE]
The candidate Lyapunov function satisfies
[TABLE]
where , .
The orbital derivative of is given by [Footnote 9: where is constructed using (18), (17), and (32) so that .]
[TABLE]
If is bounded over , then using the Cauchy-Schwarz inequality, the orbital derivative can be simplified and bounded over as
[TABLE]
where is a constant such that
[TABLE]
In particular, consider the time interval . Using the fact that is bounded over , provided
[TABLE]
then the time-derivative of over can be bounded as
[TABLE]
and . That is, for all
[TABLE]
where is a constant such that . In particular,
[TABLE]
Provided the dwell time is large enough so that
[TABLE]
then from (40) and (45), and . In particular, and . Note that the bound on can be made arbitrarily small by increasing and .
Now the interval is considered. Given any arbitrary bound , a compact set , and the learning gains that satisfy the resulting gain conditions in (38), (39), and (44), can be selected such that [Footnote 10: denotes the closed ball of radius around the origin.], and as a result from (46) it follows that for all . Since the history stack , which is active during , is recorded during , the bound in (24) can be used to show that .
Since is independent of the system trajectories, can be selected, without loss of generality, such that , and hence, . Thus, provided the constant (and as a result, the gain ) is selected large enough so that
[TABLE]
the gain condition in (39) holds over , and hence, a similar Lyapunov-based analysis, along with the bound can be utilized to conclude that ,
[TABLE]
The sufficient condition in (48) implies that and hence, (41) and imply that .
Since , the gain conditions in (44) hold over the interval . A Lyapunov-based analysis similar to (42)-(46) yields From (47), , and hence, ,
[TABLE]
Now, the interval is considered. By selecting large enough, it can be ensured that , and as a result, . Since the history stack , which is active during , is recorded during , the bounds in (24) and (50) can be used to show that Since , it follows that , which implies . Provided satisfies (47), then , which implies , and hence, and . Therefore, the gain conditions in (38), (39), and (44) are satisfied over .
Since the gain conditions are satisfied, a Lyapunov-based analysis similar to (42)-(46) yields . Given any the gains and can be selected large enough to satisfy and hence, Furthermore, a similar Lyapunov-based analysis as (34)-(40) yields . If then , which, from and implies that .
If then an inductive continuation of the Lyapunov-based analysis to the time intervals shows that provided the dwell time satisfies (47), then the gain conditions in (38), (39), and (44) are satisfied for all , the state satisfies
[TABLE]
, and , , , and , for all .
The bound in (51) and the fact that indicate that . Furthermore, , , , which, along with the dwell time requirement, implies that , and hence, .
∎
III Linear Systems
When the system under consideration is linear, parameter estimation can be directly achieved using measurements of and without using state estimation. The following section details an output-feedback parameter estimator using as the output. The accompanying state estimator for linear systems is a trivial application of the estimator in Section II-C, and has been omitted.
III-A Problem Formulation
Consider a linear system of the form
[TABLE]
where denote the state variables, is the system state, is the controller, and denote the system matrices, and denotes the output. The objective is to design an adaptive estimator to identify the unknown matrices and , online, using input-output measurements.
III-B Error System for Estimation
To obtain an error signal for parameter identification, the system in (52) is expressed in the form
[TABLE]
where , , , and are constant matrices such that . Integrating (53) over the interval for some constant ,
[TABLE]
Integrating again over the interval for some constant ,
[TABLE]
Using the Fundamental Theorem of Calculus and the fact that ,
[TABLE]
Repeating this process more times results in
[TABLE]
where
[TABLE]
[TABLE]
[TABLE]
[TABLE]
and . As opposed to nonlinear systems in Section II-B, where measurements of all states but the final state are required for parameter estimation, the integral form in (56) is independent of the state variables , and depends only on the output, . The expression in (56) can be rearranged to form the linear error system
[TABLE]
In (61), is a vector of unknown parameters, defined as $\theta := \left[\mathrm{vec}\left(A_{1}\right)^{T}\ \ \mathrm{vec}\left(A_{2}\right)^{T}\ \ \ldots\ \ \mathrm{vec}\left(A_{N}\right)^{T}\ \ \mathrm{vec}\left(B\right)^{T}\right]^{T} \in \mathbb{R}^{Nn^{2}+mn}$, where denotes the vectorization operator and the matrices and are defined as
[TABLE]
where denotes an identity matrix, and denotes the Kronecker product. Note that even though the linear relationship in (61) is valid for all it provides useful information about the vector only after .
The linear error system in (61) motivates the adaptive estimation scheme that follows.
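The repeated-integration step can be checked numerically. The sketch below is not the paper's system: it assumes a scalar second-order chain-of-integrators model, `xdd = a1*x + a2*xd + b*u`, with hypothetical parameter values, window lengths, and test signal. Integrating twice over windows of lengths `T1` and `T2` leaves an identity that involves only the output `x` and the input `u`, so the three parameters can be recovered by least squares without measuring `xd`.

```python
import numpy as np

def cumtrapz(g, dt):
    """Cumulative trapezoidal integral of samples g, starting at zero."""
    out = np.zeros_like(g)
    out[1:] = np.cumsum(0.5 * (g[1:] + g[:-1])) * dt
    return out

def window(g, n, dt):
    """w[k] ~ integral of g over [t_k - n*dt, t_k]; entries k < n are left at 0."""
    c = cumtrapz(g, dt)
    w = np.zeros_like(g)
    w[n:] = c[n:] - c[:-n]
    return w

def shifted_diff(g, n):
    """d[k] = g[k] - g[k - n]; entries k < n are left at 0."""
    d = np.zeros_like(g)
    d[n:] = g[n:] - g[:-n]
    return d

# Hypothetical parameters and windows (assumptions, not values from the paper).
a1, a2, b = -2.0, -0.5, 1.5
dt, n1, n2 = 1e-3, 300, 500            # T1 = 0.3 s, T2 = 0.5 s
t = np.arange(0.0, 6.0 + dt / 2, dt)

# Pick a smooth output and back out the input that produces it.
x = np.sin(t) + 0.5 * np.cos(2 * t)
xd = np.cos(t) - np.sin(2 * t)          # used only to construct u
xdd = -np.sin(t) - 2.0 * np.cos(2 * t)
u = (xdd - a1 * x - a2 * xd) / b

# Double integral of xdd: x(t) - x(t-T1) - x(t-T2) + x(t-T1-T2).
dx = shifted_diff(x, n1)
lhs = shifted_diff(dx, n2)

# Regressor built from the output and input only (no derivative of x).
Y_full = np.column_stack([
    window(window(x, n1, dt), n2, dt),  # multiplies a1
    window(dx, n2, dt),                 # multiplies a2
    window(window(u, n1, dt), n2, dt),  # multiplies b
])

idx = np.arange(n1 + n2, len(t), 250)   # samples where both windows are filled
theta_hat, *_ = np.linalg.lstsq(Y_full[idx], lhs[idx], rcond=None)
```

Only samples with index at least `n1 + n2` are used, which mirrors the remark that the relationship carries useful information only after both integration windows have filled.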
III-C Parameter Estimator Design
To obtain an output-feedback concurrent learning update law for the parameter estimates, a history stack, denoted by , is utilized. The history stack is a set of ordered pairs such that
[TABLE]
Note that from (20) is absent from (62), since there are no estimated state variables in or .
If a history stack that satisfies (63) is not available a priori, it can be recorded online, using the relationship in (61), by selecting a set of time-instances and letting
[TABLE]
Furthermore, a singular value maximization algorithm is used to select the time instances . That is, a data-point in the history stack is replaced by a new data-point , where and , for some , only if
[TABLE]
where denotes the minimum eigenvalue of a matrix.
Since the time instances, , vary according to the minimum singular value maximization algorithm, the history stacks, and , are time-varying and piece-wise constant. The following definition establishes a uniform lower bound for the time-varying history stacks to facilitate the analysis that directly follows.
Definition 1.
A history stack is called uniformly full rank if there exists a constant such that
[TABLE]
where the matrix is defined as .
The concurrent learning update law to estimate the unknown parameters is then given by
[TABLE]
and the least-squares update law is
[TABLE]
Remark 3.
To facilitate the following Lyapunov analysis, using (61) and (65), the parameter estimation error can be expressed as
[TABLE]
Since the function is piece-wise continuous, the trajectories of (67) and all the subsequent functions involving , are defined in the sense of Carathéodory [49].
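To illustrate how a concurrent learning update driven by a fixed, full-rank history stack behaves, the sketch below integrates one commonly used form of such a law with forward Euler. The specific form (`theta_hat' = k*Gamma*sum(Y_i^T (F_i - Y_i theta_hat))` with `Gamma' = beta*Gamma - k*Gamma*(sum Y_i^T Y_i)*Gamma`), the gains `k` and `beta`, and the random stack are all assumptions made for illustration; they are not taken from (65) and (66).

```python
import numpy as np

def cl_step(theta_hat, Gamma, stack, k=1.0, beta=0.5, dt=1e-3):
    """One forward-Euler step of an assumed concurrent learning update."""
    S = sum(Y.T @ Y for Y, _ in stack)                       # stack Gram matrix
    drive = sum(Y.T @ (F - Y @ theta_hat) for Y, F in stack)  # stored-data error
    theta_next = theta_hat + dt * k * (Gamma @ drive)
    Gamma_next = Gamma + dt * (beta * Gamma - k * Gamma @ S @ Gamma)
    return theta_next, Gamma_next

rng = np.random.default_rng(1)
theta_true = np.array([-2.0, -0.5, 1.5])       # hypothetical parameters
# Error-free stack: each pair satisfies F_i = Y_i @ theta_true exactly.
stack = [(Y, Y @ theta_true) for Y in rng.normal(size=(6, 2, 3))]

theta_hat, Gamma = np.zeros(3), np.eye(3)
errors = []
for _ in range(20000):                         # 20 s of simulated time
    theta_hat, Gamma = cl_step(theta_hat, Gamma, stack)
    errors.append(np.linalg.norm(theta_hat - theta_true))
```

Because the stack is held constant, adaptation continues even though no new data arrives, which is the practical meaning of replacing persistent excitation with excitation over a finite interval.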
III-D Analysis
The following theorem establishes exponential convergence of the parameter estimates.
**Theorem 2.**
If there exists a time such that the history stack is uniformly full rank, then the parameter estimates, , updated using the parameter estimator in (65), converge to , exponentially over the interval .
Proof.
Consider the following positive definite candidate Lyapunov function
[TABLE]
Using arguments similar to [1, Corollary 4.3.2], it can be shown that provided and Assumption 2 holds, the least squares gain matrix satisfies
[TABLE]
The candidate Lyapunov function satisfies
[TABLE]
where (69) implies that the bounds, and , in (70) are established independent of .
Using (65) and (66), along with the identity , the time-derivative of (68) results in the following expression. (Footnote 11: where is constructed using (61) and (65) so that .)
[TABLE]
Simplifying (71), becomes
[TABLE]
During the time interval , when is not full rank, Theorem 4.8 from [50] can be used to show uniform boundedness of . Once the history stack becomes full rank in the sense of Def. 1, using (68) and (72), along with the bounds in (64) and (69), Theorem 4.10 from [50] can be invoked to conclude that converges to the origin, exponentially over the interval . ∎
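The final step of the proof follows a standard comparison argument, which can be written out under assumed notation. Here $V$ is the candidate Lyapunov function, $\tilde{\theta}$ is the parameter estimation error, $\underline{v}$ and $\overline{v}$ stand for the bounds in (70), $T$ is the time at which the history stack becomes full rank, and $c > 0$ is a hypothetical constant standing in for the combination of gains and bounds obtained from (64) and (69).

```latex
\underline{v}\,\|\tilde{\theta}\|^{2} \le V(\tilde{\theta},t) \le \overline{v}\,\|\tilde{\theta}\|^{2},
\qquad
\dot{V} \le -c\,\|\tilde{\theta}\|^{2} \le -\frac{c}{\overline{v}}\,V, \quad \forall t \ge T
\\[4pt]
\Longrightarrow\quad
V(t) \le V(T)\,e^{-\frac{c}{\overline{v}}\,(t-T)}
\quad\Longrightarrow\quad
\|\tilde{\theta}(t)\| \le \sqrt{\frac{\overline{v}}{\underline{v}}}\;\|\tilde{\theta}(T)\|\,e^{-\frac{c}{2\overline{v}}\,(t-T)}, \quad \forall t \ge T .
```

The boundedness result on the earlier interval guarantees that $\|\tilde{\theta}(T)\|$ is finite, so the exponential bound is well defined from $T$ onward.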
IV Simulation
IV-A Linear System
The linear system selected for the simulation study is given by
[TABLE]
To satisfy Assumption 1, a controller that results in a uniformly bounded system response is needed. In this simulation study, the controller, , is selected to be a PD controller of the form so that the system tracks the trajectory , uniformly in , where the notation of represents the element of state . Since there are fourteen unknown parameters, and the desired trajectory contains six distinct frequencies, the closed-loop system is not guaranteed to be persistently excited.
The simulation utilizes Euler forward numerical integration using a sample time of seconds. Past values of the state, , and the control input, , are stored in a buffer. The matrices and for the parameter update law in (65) are computed using trapezoidal integration of the data stored in the aforementioned buffer. Values of and are stored in the history stack and are updated so as to maximize the minimum eigenvalue of .
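The buffering and trapezoidal integration described above can be sketched as follows. This is an illustrative example under stated assumptions, not the simulation code used in the paper: it handles a scalar signal sampled at a fixed step `dt`, and the class name `SlidingIntegrator` and its `window` argument are hypothetical.

```python
from collections import deque


def trapezoid(samples, dt):
    """Composite trapezoidal rule for uniformly spaced samples."""
    samples = list(samples)
    if len(samples) < 2:
        return 0.0
    return dt * (sum(samples) - 0.5 * (samples[0] + samples[-1]))


class SlidingIntegrator:
    """Keeps the most recent `window` seconds of a sampled signal and integrates it."""

    def __init__(self, window, dt):
        self.dt = dt
        # A window of length `window` spans window/dt intervals, i.e. one more sample.
        self.buffer = deque(maxlen=int(round(window / dt)) + 1)

    def push(self, value):
        # The oldest sample is dropped automatically once the buffer is full.
        self.buffer.append(value)

    def integral(self):
        return trapezoid(self.buffer, self.dt)
```

A fixed-length `deque` keeps memory bounded, and recomputing the integral from the buffer avoids the drift that a running sum with subtraction would accumulate over long runs.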
The initial estimates of the unknown parameters are selected to be zero, and the history stack is initialized so that all the elements of the history stack are zero. Data is added to the history stack using a singular value maximization algorithm. To demonstrate the utility of the developed method, three simulation runs are performed. In the first run, the parameter estimator has access to noise-free measurements of the output, . In the second and the third runs, zero-mean Gaussian noise with variance 0.001 and 0.01, respectively, is added to the output signal to simulate measurement noise. The values of various simulation parameters selected for the three runs are , , , , , , , , , , and . Figure 3(a) demonstrates that in the absence of noise, the developed parameter estimator drives the parameter estimation error, , to the origin. Figures 3(b) and 3(c) indicate that the developed method is robust to measurement noise, and results in convergence rates that are similar to the noise-free case, with a small increase in the steady state error due to measurement noise.
A one-at-a-time sensitivity analysis was performed on the parameters and to gauge robustness of the developed technique. As demonstrated by the results in Table I, the developed method is robust to small changes in the integration intervals and learning gains.
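A one-at-a-time sweep of this kind can be organized as in the sketch below, which varies a single parameter while holding the rest at nominal values and records the range of the resulting RMS error. The function `simulate` is a hypothetical stand-in for a full simulation run that returns a sequence of estimation errors, and the parameter names used with it are placeholders.

```python
import math


def rms(errors):
    """Root-mean-square value of a sequence of scalar errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def one_at_a_time(simulate, nominal, sweeps):
    """Return {parameter: (min RMS, max RMS)} over each one-at-a-time sweep.

    simulate : callable taking the parameters as keyword arguments (assumed)
    nominal  : dict of nominal parameter values
    sweeps   : dict mapping a parameter name to the values tested for it
    """
    results = {}
    for name, values in sweeps.items():
        scores = []
        for value in values:
            params = dict(nominal)   # all other parameters stay at nominal
            params[name] = value
            scores.append(rms(simulate(**params)))
        results[name] = (min(scores), max(scores))
    return results
```

Reporting the minimum and maximum RMS error per parameter matches the "variation" ranges listed in the sensitivity tables, and a narrow range indicates low sensitivity to that parameter over the tested values.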
IV-B Nonlinear System
The developed state and parameter estimator is validated using a simulation study involving a two-link robot manipulator arm, where denotes the angular position of the two links, denotes the angular velocities of the two links, and
The selected model belongs to a sub-class of systems in (1), where the function approximation error, , is zero. The model is selected because the ideal parameters, are known, and as a result, the model facilitates direct quantitative analysis of the parameter estimation error.
The nonlinear dynamics of the system are described by (1), where
[TABLE]
In (73), is the control input,
[TABLE]
and where , and , , and are known constants. The system has four unknown parameters. The ideal values of the unknown parameters are
To satisfy Assumption 1, a controller that results in a uniformly bounded system response is needed. In this simulation study, the controller, , is selected to be a PD controller of the form so that the system tracks the trajectory , uniformly in .
The simulation utilizes Euler forward numerical integration using a sample time of seconds. Past values of the output, , state estimates, , and the control input, , are stored in a buffer. The matrices , , and for the parameter update law in (25) are computed using trapezoidal integration of the data stored in the aforementioned buffer. Values of , , and are stored in the history stack and are updated according to the algorithm detailed in Fig. 1.
The initial estimates of the unknown parameters are selected to be zero, and the history stack is initialized so that all the elements of the history stack are zero. (Footnote 12: It is clear from the simulation results that full rank initialization of the history stack and the normalization terms in (25) and (26) are sufficient, but not necessary, conditions for the analysis in Section II-F.) Data is added to the history stack using a singular value maximization algorithm. To demonstrate the utility of the developed method, three simulation runs are performed. In the first run, the observer is assumed to have access to noise-free measurements of the output, . In the second and third runs, zero-mean Gaussian noise with variance 0.001 and 0.01, respectively, is added to the output signal to simulate measurement noise. The values of various simulation parameters selected for the three runs are , , , , ( for variance ), , , , , , and . Figures 4(a) - 5(a) demonstrate that in the absence of noise, the developed method drives the state estimation errors, , and the parameter estimation errors, , to a neighborhood of the origin. Figures 4(b) - 5(c) indicate that the developed technique can be utilized in the presence of measurement noise, with expected degradation of performance.
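The noisy runs are specified by the variance of the measurement noise, whereas most random number generators take a standard deviation. The sketch below shows that conversion for a scalar output sequence; the function name `add_measurement_noise` and the use of a seeded generator are assumptions made for illustration.

```python
import math
import random


def add_measurement_noise(y, variance, rng):
    """Return a copy of the samples in y with zero-mean Gaussian noise added.

    variance : noise variance (e.g. 0.001 or 0.01); gauss() expects the
               standard deviation, hence the square root
    rng      : a random.Random instance, seeded by the caller for repeatability
    """
    sigma = math.sqrt(variance)
    return [yi + rng.gauss(0.0, sigma) for yi in y]
```

For example, a variance of 0.01 corresponds to a standard deviation of 0.1, so passing the variance directly as the spread would understate the noise level by a factor of ten.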
A one-at-a-time sensitivity analysis was performed on the parameters and to gauge robustness of the developed technique. As demonstrated by the results in Table II, the developed method is robust to small changes in the integration intervals and learning gains.
V Conclusion
This paper develops a concurrent learning based adaptive observer and parameter estimator to simultaneously estimate the unknown parameters and the states of linear and nonlinear systems using output measurements. The developed technique utilizes a dynamic state observer to generate state estimates necessary for data-driven adaptation. A purging algorithm is developed to improve the quality of the stored data as the state estimates converge to the true states.
The developed state and parameter estimation method allows for simultaneous estimation of the system states and uncertain parameters in the system model without the need for full state feedback, and facilitates parameter convergence without the requirement of PE. Theoretical guarantees for uniform ultimate boundedness of the estimation errors are established in the absence of measurement noise. Simulation results indicate that the developed method is robust to measurement noise and not sensitive to design parameters. For the class of linear systems presented, the parameter estimation can be performed independent of state estimation which facilitates exponential convergence of the parameter estimation errors. Future work will involve analyzing applicability of feedback linearization, along with a theoretical analysis of the developed method under measurement noise and process noise. A theoretical analysis of the effect of the integration intervals, , on the performance of the developed estimator will also be pursued.
References
- [1] P. Ioannou and J. Sun, Robust adaptive control. Prentice Hall, 1996.
- [2] S. S. Sastry and M. Bodson, Adaptive control: stability, convergence, and robustness. Upper Saddle River, NJ: Prentice-Hall, 1989.
- [3] M. Krstic, I. Kanellakopoulos, and P. V. Kokotovic, Nonlinear and adaptive control design. New York, NY, USA: John Wiley & Sons, 1995.
- [4] M. A. Duarte and K. S. Narendra, “Combined direct and indirect approach to adaptive control,” IEEE Trans. Autom. Control, vol. 34, no. 10, pp. 1071–1075, Oct. 1989.
- [5] M. Krstić, P. V. Kokotović, and I. Kanellakopoulos, “Transient-performance improvement with a new class of adaptive controllers,” Syst. Control Lett., vol. 21, no. 6, pp. 451–461, 1993.
- [6] G. Chowdhary and E. Johnson, “A singular value maximizing data recording algorithm for concurrent learning,” in Proc. Am. Control Conf., 2011, pp. 3547–3552.
- [7] B. Anderson, “Exponential stability of linear equations arising in adaptive identification,” IEEE Trans. Autom. Control, vol. 22, no. 1, pp. 83–88, Feb. 1977.
- [8] M. Green and J. B. Moore, “Persistence of excitation in linear systems,” Syst. Control Lett., vol. 7, no. 5, pp. 351–360, 1986.
