Moving Horizon Estimation for ARMAX process with t-Distribution Noise
Dexiang Zhou, Keck Voon Ling, Weng Khuen Ho, Jan M., Maciejowski

TL;DR
This paper introduces a robust moving horizon estimator for ARMAX processes with t-distribution noise, improving outlier resistance and computational efficiency over traditional methods.
Contribution
It proposes a novel recursive estimator based on influence functions for ARMAX models with t-distribution noise, enhancing robustness and speed.
Findings
Estimator has smaller variance than least-squares
More robust to outliers than traditional methods
Significantly faster than particle filters
Abstract
In this paper, instead of the usual Gaussian noise assumption, -distribution noise is assumed. A Maximum Likelihood Estimator using the most recent N measurements is proposed for the Auto-Regressive-Moving-Average with eXogenous input (ARMAX) process with this assumption. The proposed estimator is robust to outliers because the `thick tail' of the t-distribution reduces the effect of large errors in the likelihood function. Instead of solving the resulting nonlinear estimator numerically, the Influence Function is used to formulate a computationally efficient recursive solution, which reduces to the traditional Moving Horizon Estimator when the noise is Gaussian. The formula for the variance of the estimate is derived. This formula shows explicitly how the variance of the estimate is affected by the number of measurements and noise variance. The simulation results show that the…
| Example | 1 | 2 | 3 |
|---|---|---|---|
| 6,9,12 | 3 | ||
| 6 | 9 | 12 | 40 | 60 | |||
| Proposed estimator Eq. (20) | Theoretical results from (24) | N | |||||
| 0.6124 | 0.7055 | 0.7675 | 0.8617 | 0.8620 | |||
| - | 0.6879 | 0.7499 | 0.8441 | 0.8444 | |||
| - | - | 0.7237 | 0.8180 | 0.8183 | |||
| Simulation results | 0.5946 | 0.6834 | 0.7416 | 0.8486 | 0.8729 | ||
| - | 0.6723 | 0.7543 | 0.8338 | 0.8546 | |||
| - | - | 0.7065 | 0.7881 | 0.8343 | |||
| MWLSE Eq. (22) | Simulation results | 0.9773 | 1.0705 | 1.1320 | 1.2270 | 1.2270 | |
| - | 1.0360 | 1.0980 | 1.1920 | 1.1920 | |||
| - | - | 1.0460 | 1.1400 | 1.1400 | |||
| ARMAX filter[7] | Simulation results | - | 0.5946 | 0.6723 | 0.7065 | 0.4883 | 0.4889 |
| Kalman filter | Simulation results | - | 0.9773 | 1.0360 | 1.0460 | 0.4881 | 0.4889 |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTarget Tracking and Data Fusion in Sensor Networks · Fault Detection and Control Systems · Structural Health Monitoring Techniques
Moving Horizon Estimation for ARMAX process with t -Distribution Noise
Dexiang Zhou [email protected]
Keck Voon Ling [email protected]
Weng Khuen Ho [email protected]
Jan M. Maciejowski [email protected] School of Electrical Electronic Engineering, Nanyang Technological University, Singapore
Department of Electrical Computer Engineering, National University of Singapore, Singapore
Department of Engineering, University of Cambridge, Trumpington Street, Cambridge CB2 1PZ, UK
Abstract
In this paper, instead of the usual Gaussian noise assumption, -distribution noise is assumed. A Maximum Likelihood Estimator using the most recent N measurements is proposed for the Auto-Regressive-Moving-Average with eXogenous input (ARMAX) process with this assumption. The proposed estimator is robust to outliers because the ‘thick tail’ of the t-distribution reduces the effect of large errors in the likelihood function. Instead of solving the resulting nonlinear estimator numerically, the Influence Function is used to formulate a computationally efficient recursive solution, which reduces to the traditional Moving Horizon Estimator when the noise is Gaussian.The formula for the variance of the estimate is derived. This formula shows explicitly how the variance of the estimate is affected by the number of measurements and noise variance. The simulation results show that the proposed estimator has smaller variance and is more robust to outliers than the Moving Window Least-Squares Estimator. For the same accuracy, the proposed estimator is an order of magnitude faster than the particle filter.
keywords:
Robust Estimation; Moving Horizon Estimation; ARMAX Process; t -Distribution Noise; Influence Function
††thanks: The authors acknowledge support by the Singapore National Research Foundation (NRF) under its Campus for Research Excellence And Technological Enterprise (CREATE) programme, specifically the Cambridge Centre for Advanced Research and Education in Singapore (Cambridge CARES, http://www.cares.cam.ac.uk), project C4T. This paper was not presented at any IFAC meeting. Corresponding author Dexiang Zhou. Tel. +65 8376 0221.
, , ,
1 Introduction
Gaussian noise is often assumed in state estimation. However, the Gaussian noise assumption is an approximation to reality. The occurrence of outliers, transient data in steady-state measurements, instrument failure, model nonlinearity, etc. can all induce non-Gaussian data [16].
It is well-known that the -distribution has the property of ‘ thick tail’ to better model the occurence of outliers [10]. In addition, as a special case, the -distribution reduces to the Gaussian distribution when its degree of freedom tends to infinity. Thus the -distribution has the flexibility to characterize noise with Gaussian or non-Gaussian statistical properties. In the literature the heavy tail property of the -distribution has been used to improve the performance of estimators in [13, 2, 1, 15]. These papers are based on the Bayesian method and the heavy tail property of -distribution was used to reduce the effect of outliers. Our method differs from these estimators in that the probability density function (pdf) of the -distribution is used directly in the likelihood function for a Maximum Likelihood Estimator (MLE).
The sensitivity of an estimator when the underlying noise assumption (such as Gaussian noise) is violated has been extensively studied in the robust statistics literature [9, 6, 8, 12]. In particular, Huber [9] studied the effect of outliers by contaminating the underlying noise distribution with data from an arbitrary unknown distribution. Hampel [6] proposed the Influence Function (IF) approach to describe the effect of an infinitesimal contamination of the underlying noise distribution. If the IF of an estimator is bounded and/or decreasing, or increasing slowly for large magnitude of noise, then the estimator is robust to outliers. It can be shown that the IF of the least-squares estimator increases linearly with the magnitude of noise and is unbounded [6]. This confirms the well-known fact that standard least-squares estimation is not robust to outliers. Other approaches for handling non-Gaussian noise include the particle filter [3] which is based on the Monte Carlo method, but at the expense of a heavier computational load.
IF has been mainly used as an analysis tool. The main contribution of this paper, Theorem 6, is to employ the IF for state estimation. We formulate a computationally efficient recursive algorithm that gives an approximate solution to the moving horizon MLE for Auto-Regressive-Moving-Average with eXogenous inputs (ARMAX) process with -distribution noise.
ARMAX processes are popularly used in the statistical analysis of time series [4]. The Kalman filter for an ARMAX process is well-known, see Example 11.6 in [4]. In [7] a MLE of an ARMAX process with generalized -distribution noise was developed. However, in [7] the IF was used to approximate the MLE by using all the measurements. This makes the estimator insensitive to plant parameter changes [11]. In this paper we derive the moving horizon version of the ARMAX filter. We derive a computationally efficient recursive algorithm in Section 3. We also analyze the statistical properties of the proposed estimator in Section 4. More specifically, the formula for the variance of the estimate is derived.
2 Maximum Likelihood Estimation and its approximation for ARMAX process with distribution noise
Consider the following single-input single-output ARMAX process:
[TABLE]
where is the sampling instance, and are the input and output respectively. The polynomials are assumed to be co-prime and given as
[TABLE]
where and is the backward shift operator, i.e., . The zeros of the polynomials and are inside the unit disc. is an independent and identically distributed random variable associated with the zero mean -distribution pdf [10]
[TABLE]
The degree of freedom and scale parameters are given by and respectively, and is the gamma function.
For convenience, we will derive the estimator in state-space form. The ARMAX process (1) in minimal state-space form is given as [4]
[TABLE]
where
[TABLE]
2.1 Maximum Likelihood Estimation
Given measurements , we denote the MLE of for the ARMAX process (3) with -distribution noise as
[TABLE]
where
[TABLE]
Then the solution to (4) can be found by solving the equation
[TABLE]
where .
Definition 1**.**
Influence Function [6].
The IF of an estimator at distribution is
[TABLE]
where and denotes the probability measure which puts mass 1 at the point .
The IF describes the effect of an infinitesimal contamination at the point on the estimate, standardized by the mass of the contamination. In this paper the noise is assumed to be distributed with pdf . Taking the expectation of (6) gives
[TABLE]
Now we wish to obtain the IF of our estimator. As a first step, we replace in (7) with
[TABLE]
Equation (8) implicitly defines our estimator as a function of and we denote it as . In general, is a nonlinear state estimator. The Taylor series expansion of at is given by
[TABLE]
Definition 1 suggests that the IF of our estimator can be obtained by differentiating equation (8) wrt. and setting
[TABLE]
where we denote as .
The derivation of (10) is given in Appendix A.
2.2 IF Approximation
Lemma 2**.**
The IF of MLE for ARMAX process (3) with -distribution noise (2) is
[TABLE]
When the noise is Gaussian, (11) becomes
[TABLE]
{pf}
See Appendix B.
Since (8) reduces to (6) when , is the solution that we are looking for, namely the solution to (6). We can get an approximate solution to this by retaining only the first two terms of (9). Using (10),
[TABLE]
Remark 3**.**
When the noise is Gaussian, substituting (12) into (13) gives
[TABLE]
which is the least-squares estimator. It is re-assuring that our estimator reduces to the well-known result when the noise is Gaussian. Note that in this case since under the Gaussian noise assumption, the MLE and least-squares estimator coincide.
In the sequel we shall assume that , for simplicity. Then, since , (11) becomes
[TABLE]
and (13) becomes
[TABLE]
Remark 4**.**
*We consider only cases when is asymptotically stable. In this case, is a simplifying assumption since error due to a wrong initial state estimate will reduce to zero after an initial transient. *
3 The Proposed Estimator
Instead of using all the measurements, we adopt the moving horizon principle and use the most recent measurements .
Therefore, the IF of MHE-TD can be obtained by replacing with in (15)
[TABLE]
We denote the proposed estimator as Moving Horizon Estimation of ARMAX process with -distribution noise (MHE-TD). By recursively applying the ARMAX model, we can obtain an expression for the estimate of the current state , and this is given in the next theorem.
Theorem 5**.**
IF approximation of Batch MHE-TD**
Given the ARMAX process (1) where is asymptotically stable and noise with -distribution pdf of (2), the IF approximation of MHE-TD is
[TABLE]
where is given by (5d) and
[TABLE]
{pf}
See Appendix C.
In many cases the observations are obtained sequentially. It is desirable to derive a recursive solution for the IF approximation of MHE-TD as follows.
Theorem 6**.**
IF Approximation of Recursive MHE-TD* The IF approximation of MHE-TD satisfies the following recursive algorithm*
[TABLE]
where
[TABLE]
{pf}
See Appendix D.
Remark 7**.**
When the noise in ARMAX process (1) is Gaussian (i.e., ), in (21b) reduces to . Then reduces to . Thus the recursive MHE-TD reduces to the following recursive solution of the standard Moving Window Least-Squares Estimator (MWLSE).
[TABLE]
Comparing (20a) and (22a) the recursive MHE-TD has the same structure as the standard MWLSE except for a nonlinear transformation, i.e., (21b). When , Equation (22a) reduces to (10) of [11].
The MHE uses only a finite number of past measurement samples; the oldest measurement is discarded as a new sample becomes available. The ARMAX filter [7] uses the so-called growing memory strategy where the past measurement is not discarded as new measurement becomes available. Therefore, the ARMAX filter can be obtained if the last term of (20a) is discarded and and become
[TABLE]
4 Variance of the Proposed Estimator
Let the actual noise have pdf which is different from used in the design of MHE-TD. It is well-known that the mean and variance of an estimator can be analysed by its IF [6, 14, 5]. The statistical properties of the IF approximation of MHE-TD are derived by using IF as follows.
Theorem 8**.**
Variance of IF Approximation of MHE-TD * Consider the estimator (18). Assume that the data are generated from the ARMAX process (1) where the -distribution pdf of is . The variance of the estimate is given by*
[TABLE]
where
[TABLE]
With ,
[TABLE]
{pf}
See Appendix E.
The third term of (24) which depends on is due to the dynamic of the ARAMX model. For a fixed , when increases decreases. Thus the variance decreases with increasing . This is also shown in Table 2.
Corollary 9**.**
The expectation of Moving Window Least-Squares Estimation is given by
[TABLE]
and its variance is
[TABLE]
where
[TABLE]
With ,
[TABLE]
{pf}
By setting in (23).
The assumption that is the same at each sampling instant is commonly made. In the following we extend to the case where could be different at different denoted by . This is useful for the analysis of outliers (see Example 2).
Theorem 10**.**
Property of IF approximation of MHE-TD with outlier* Consider the estimator (18). Assume that the data are generated from the ARMAX process (1) where and the noise distribution contains an outlier at , i.e.,*
[TABLE]
then the expectation of the estimator (18) is
[TABLE]
where
[TABLE]
{pf}
See Appendix F.
Remark 11**.**
The expectation of Moving Window Least-Squares Estimation with outlier can be obtained by setting in (29)
[TABLE]
where is given in (10).
5 Examples
The models and parameters used in the examples are summarized in Table 1. The statistical results in Example 1,2 and 3 are obtained from 1000 Monte Carlo simulation runs.
5.1 Example 1:Estimation Variance
Table 2 shows the variances from 1000 simulation runs. The proposed estimator (Equation (20)), ARMAX filter [7], Kalman filter and MWLSE (Equation (22)) are compared for different horizon length, , and different instance, . An example used to illustrate the calculation can be found in Appendix G.
Firstly, it can be seen that for the proposed estimator the theoretical and simulation variance matched approximately. Secondly, when increases, the variance decreases and therefore can be used as a tuning parameter for estimator performance. Thirdly, the variance of the proposed estimator is less than that of the MWLSE. Fourthly, for small () our proposed estimator gives smaller variances than the Kalman Filter because of more accurate noise modeling. However, for large (), the estimators with growing memory (ARMAX filter and Kalman filter) give smaller estimate variance than the estimators with fixed memory (the proposed estimator and MWLSE). Finally, since the proposed estimator is the moving horizon version of the ARMAX filter, the results for both are the same when . Likewise, the MWLSE is the moving horizon version of the Kalman filter, the results for both are also the same for .
Figure 1 shows that the estimation error of the proposed estimator is smaller than that of MWLSE.
5.2 Example 2: Performance with outlier
An outlier of is introduced at .
Figure 2 shows 1000 simulation runs of the proposed estimator (Equation (20)) and MWLSE (Equation (22)). The yellow curves are the mean values of the 1000 runs. The proposed estimator is less affected by the outlier than the MWLSE. These yellow curves in the top and bottom figures can be calculated from (29) and (31) respectively.
5.3 Example 3: Comparison with particle filter
The proposed estimator is compared with a particle filter (bootstrap implementation)[3]. Taking the numerical solution of MLE (4) as the ground truth, we compared the estimators using the index
[TABLE]
where, for the -th run, is the numerical solution of the MLE (4) and is the estimate of . The indexes, at , are , and for the proposed estimator, and particle filters with 100 and 1000 particles respectively. The computational times per run using MATLAB R2015a with i7-5500U processor @2.4GHz, 16 GB RAM, are of the order of millisecond, one millisecond and ten milliseconds respectively. For about the same performances ( and ) as given by index , the computational time of the proposed estimator is an order of magnitude faster than the particle filter.
6 Conclusion
The IF is employed to give an approximate solution to the moving horizon MLE for ARMAX process with -distribution noise. The approximate solution can be formulated as a recursive MHE-TD algorithm which makes it suitable for on-line and real-time implementation at high sampling rates. We also used the IF to derive a formula for the estimates. The examples show that the proposed estimator gives an estimate with a smaller variance than the MWLSE. It is also less affected by outliers than the MWLSE. For about the same performance index, the computational time of the proposed estimator is an order of magnitude faster than the particle filter.
Appendix A Appendix: Derivation of (10)
Differentiating Equation (8) wrt. gives
[TABLE]
Substituting (7) and
[TABLE]
into (32) gives
[TABLE]
Setting in (33) gives
[TABLE]
We denote as .
Thus we have
[TABLE]
Replacing the integration variable by in (34) gives
[TABLE]
which is (10).
Appendix B Appendix: Proof of Lemma 2
[TABLE]
Differentiating with respect to gives
[TABLE]
Differentiating with respect to gives
[TABLE]
Substituting (35) and (36) into (10) gives
[TABLE]
which is (11).
When we have
[TABLE]
Substituting (37a) and (37b) into (11) and setting e_{k}=y_{k}-H\Phi^{k-1}{\color[rgb]{0,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{0,0,0}\pgfsys@color@gray@stroke{0}\pgfsys@color@gray@fill{0}\bar{x}_{1}}-H{\color[rgb]{0,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{0,0,0}\pgfsys@color@gray@stroke{0}\pgfsys@color@gray@fill{0}s}_{k} give
[TABLE]
which is (12).
Appendix C Appendix: Proof of Theorem 5
Rewriting the state space model (3) of ARMAX as
[TABLE]
where . Then using (16) the current state estimate is obtained by recursively applying the model (38)
[TABLE]
where is given by (5d).
Multiplying (17) by and simplifying give
[TABLE]
Rewriting (40) gives
[TABLE]
where
[TABLE]
Equation (18a) is obtained by substituting (41) into (39).
Appendix D Appendix: Proof of Theorem 6
Define
[TABLE]
Thus the estimator (18a) can be written as
[TABLE]
Notice that Equation (42) gives from the minimization of the Moving Window Least-Squares loss function
[TABLE]
The recursive solution of the above Moving Window Least-Squares [11] is
[TABLE]
where
[TABLE]
In the next time instant
[TABLE]
Expressing (5d) in recursive form we have
[TABLE]
and
[TABLE]
Define z_{k}=w_{k}+H{\color[rgb]{0,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{0,0,0}\pgfsys@color@gray@stroke{0}\pgfsys@color@gray@fill{0}s}_{k}. Substituting w_{{}_{T+1}}=z_{{}_{T+1}}-H{\color[rgb]{0,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{0,0,0}\pgfsys@color@gray@stroke{0}\pgfsys@color@gray@fill{0}s}_{{}_{T+1}} and w_{{}_{T-N+1}}=z_{{}_{T-N+1}}-H{\color[rgb]{0,0,0}\definecolor[named]{pgfstrokecolor}{rgb}{0,0,0}\pgfsys@color@gray@stroke{0}\pgfsys@color@gray@fill{0}s}_{{}_{T-N+1}} into (43) gives
[TABLE]
Substituting (44) and (45) into (LABEL:xhatTT1) gives
[TABLE]
which is Equation (20a), and (20b) follows.
Appendix E Appendix: Proof of Theorem 8
By substituting into (5d) and assuming for simplicity, can be rewritten as
[TABLE]
Substituting (47) into (39) gives
[TABLE]
Taking expectation of (48), recognizing that is zero mean and from (17), we get
[TABLE]
Therefore the variance of is given by
[TABLE]
Substituting (17) into (LABEL:varyhatapp1), noticing that is i.i.d and simplifying, give
[TABLE]
where
[TABLE]
which is (23), and (24) follows.
Appendix F Appendix: Proof of Theorem 10
Recall (47) and
[TABLE]
Then
[TABLE]
which is (10). Let .
Substituting (50) into (40) gives
[TABLE]
From the estimator (39) we have
[TABLE]
[TABLE]
Substituting (51) and (28) into the second term of (53) and simplifying give
[TABLE]
With
[TABLE]
equation (LABEL:intIF2) becomes
[TABLE]
Equation (29) is obtained by substituting (LABEL:intIF3) into (53).
Appendix G Appendix: Numerical Illustration Example
For example, for and , substituting the parameters for Example 1 into (24) gives
[TABLE]
and
[TABLE]
. Substituting the parameters for Example 1 into (LABEL:varyhatgaussian) gives
[TABLE]
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] Gabriel Agamennoni, Juan I Nieto, and Eduardo M Nebot. Approximate inference in state-space models with heavy-tailed noise. Signal Processing, IEEE Transactions on , 60(10):5024–5037, 2012.
- 2[2] Gabriel Agamennoni, Juan I Nieto, and Eduardo Mario Nebot. An outlier-robust Kalman filter. In Robotics and Automation (ICRA), 2011 IEEE International Conference on , pages 1551–1558. IEEE, 2011.
- 3[3] M Sanjeev Arulampalam, Simon Maskell, Neil Gordon, and Tim Clapp. A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. Signal Processing, IEEE Transactions on , 50(2):174–188, 2002.
- 4[4] Karl Johan Åström and Björn Wittenmark. Computer-controlled systems: theory and design . Courier Dover Publications, 2011.
- 5[5] Luisa Turrin Fernholz. On multivariate higher order von mises expansions. Metrika , 53(2):123–140, 2001.
- 6[6] Frank R Hampel, Elvezio M Ronchetti, Peter J Rousseeuw, and Werner A Stahel. Robust statistics: the approach based on influence functions , volume 114. John Wiley & Sons, 2011.
- 7[7] Weng Khuen Ho, Keck Voon Ling, Hoang Dung Vu, and Xiaoqiong Wang. Filtering of the ARMAX process with generalized t-distribution noise: The influence function approach. Industrial & \& Engineering Chemistry Research , 53(17):7019–7028, 2014.
- 8[8] Peter J Huber. Robust statistics . John Wiley & Sons, 1981.
