Convergence analysis of a family of robust Kalman filters based on the contraction principle
Mattia Zorzi

TL;DR
This paper investigates the convergence properties of a family of robust Kalman filters, demonstrating that under certain conditions, the filters reliably converge when model uncertainty is appropriately controlled.
Contribution
It provides a convergence analysis for robust Kalman filters using the contraction principle, linking filter stability to the tolerance parameter and system properties.
Findings
Filters converge when the tolerance parameter is sufficiently small.
The Riccati-like mapping is strictly contractive under the given conditions.
Convergence is guaranteed for reachable and observable models.
Abstract
In this paper we analyze the convergence of a family of robust Kalman filters. For each filter of this family the model uncertainty is tuned according to the so-called tolerance parameter. Assuming that the corresponding state-space model is reachable and observable, we show that the corresponding Riccati-like mapping is strictly contractive provided that the tolerance is sufficiently small; accordingly, the filter converges.
This work has been partially supported by the FIRB project “Learning meets time” (RBFR12M3AC) funded by MIUR.
M. Zorzi is with the Dipartimento di Ingegneria dell’Informazione, Università di Padova, via Gradenigo 6/B, 35131 Padova, Italy (email: [email protected]).
keywords:
Block update, contraction mapping, Kalman filter, Riccati equation, Thompson’s part metric, risk-sensitive filtering
AMS:
60G35, 93B35, 93E11
1 Introduction
Robust Kalman filtering is a computational tool with widespread applications in many fields, e.g. [22]. In this paper we consider the parametric family of robust Kalman filters introduced in [28], see also the former works [17],[16],[9]. The parameter describing this family is denoted by $\tau$. Once $\tau$ is fixed, the model uncertainty is represented by a ball centered at the nominal model and formed by placing a bound on the $\tau$-divergence, [27],[25],[26], between the actual and the nominal model. The bound is fixed by the user and represents the tolerance of the mismatch between the actual and the nominal model. Then, the robust filter is obtained by minimizing the mean square error according to the least favorable model in this ball. Interestingly, relaxing the assumption that the actual model belongs to the ball, we obtain a family of risk sensitive filters parametrized by $\tau$ wherein the tolerance parameter is replaced by the so-called risk sensitivity parameter. In particular, for $\tau=0$ we obtain the usual risk sensitive filter, see [3, 19, 21, 12].
In this paper we analyze the convergence of this family of discrete-time robust Kalman filters. More precisely, we prove that the error covariance, which obeys a Riccati-like iteration, converges to a unique positive definite solution.
The convergence of Riccati-like iterations can be established using classical arguments, [6, 7, 5]. Alternatively, the convergence analysis can be performed using the contraction principle as in the earlier paper by Bougerol [4]. More precisely, under reachability and observability assumptions, he proved that the discrete-time Riccati iteration is a strict contraction for the Riemann metric associated to the cone of positive definite matrices. Interestingly, the same result holds using Thompson’s part metric [15, 8]. The latter metric is more effective than the former in the sense that it gives a tighter bound on the convergence rate of the iteration. It is also worth noting that the contraction principle has also been used to prove the convergence of different kinds of nonlinear iterations [13, 14, 15].
The convergence analysis that we present here is based on the contraction principle. This analysis is rooted in the paper [18], which studies the convergence of the risk sensitive Riccati iteration corresponding to the usual risk sensitive filter. In particular, by placing an upper bound on the risk sensitivity parameter it is possible to prove that the $N$-fold composition of the risk sensitive Riccati mapping is strictly contractive for Thompson’s part metric. Since the robust Kalman filter with $\tau=0$ can be understood as the usual risk sensitive filter where the risk sensitivity parameter is now time-varying, it is possible to characterize an upper bound on the tolerance of this robust filter in such a way that the time-varying risk sensitivity parameter is sufficiently small. In this way, the $N$-fold composition of the mapping is strictly contractive and thus the robust filter converges, [29]. In this paper we extend these results to the entire family of robust Kalman filters.
The outline of the paper is as follows. In Section 2 we recall Thompson’s part metric for positive definite matrices and the properties of contraction mappings. In Section 3 we review the robust Kalman filter, and we derive the downsampled version and the corresponding $N$-fold Riccati iteration. In this way we are able to derive a condition under which the iteration is strictly contractive. In Section 4 we translate this condition into an upper bound on the tolerance of the robust filter. In Section 5 an illustrative example is provided. Section 6 deals with the convergence analysis of the family of $\tau$-risk sensitive filters. Finally, we draw the conclusions in Section 7.
Notation. Given , denotes the Euclidean norm of , and denotes the weighted Euclidean norm with weight matrix positive definite. The -th singular value of is denoted by and . denotes the spectral norm of , i.e. . denotes the vector space of symmetric matrices of dimension . The cone of positive definite matrices in is denoted by , and its closure by . denotes the diagonal matrix with elements in the main diagonal ; similarly denotes the block-diagonal matrix with matrices in the main block-diagonal . Given with eigendecomposition such that is an orthogonal matrix and , the exponentiation of to a real number is defined as with . Similarly, we define with and with .
2 Thompson’s part metric and contraction mappings
Let and belong to . The Thompson’s part metric [2] between and is defined as
[TABLE]
Besides all the traditional properties of a distance, the Thompson’s part metric has the feature that it is invariant under matrix inversion and congruence transformations.
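The invariance properties just stated can be checked numerically. The sketch below is only illustrative: it assumes the standard form of the Thompson part metric on positive definite matrices, namely the largest absolute logarithm of the eigenvalues of $P^{-1}Q$ (the displayed definition is missing from this copy), and the helper names `thompson_distance` and `random_spd` as well as the test matrices are hypothetical.

```python
import numpy as np

def thompson_distance(P, Q):
    """Thompson part metric between symmetric positive definite P and Q.

    Assumed form: max_i |log lambda_i(P^{-1} Q)|.
    """
    L = np.linalg.cholesky(P)            # P = L L^T
    X = np.linalg.solve(L, Q)            # L^{-1} Q
    M = np.linalg.solve(L, X.T)          # L^{-1} Q L^{-T}, same spectrum as P^{-1} Q
    M = 0.5 * (M + M.T)                  # remove round-off asymmetry
    eigenvalues = np.linalg.eigvalsh(M)
    return float(np.max(np.abs(np.log(eigenvalues))))

def random_spd(rng, n):
    """Hypothetical helper: a random symmetric positive definite matrix."""
    G = rng.standard_normal((n, n))
    return G @ G.T + n * np.eye(n)

rng = np.random.default_rng(0)
P, Q = random_spd(rng, 3), random_spd(rng, 3)

# A fixed invertible (non-symmetric) matrix for the congruence test, det = 5.
T = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, -1.0],
              [1.0, 0.0, 3.0]])

d = thompson_distance(P, Q)
d_inverse = thompson_distance(np.linalg.inv(P), np.linalg.inv(Q))
d_congruence = thompson_distance(T @ P @ T.T, T @ Q @ T.T)
print(d, d_inverse, d_congruence)        # the three values should agree
```

The Cholesky-based whitening keeps the eigenvalue problem symmetric, which avoids complex round-off in the eigenvalues.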
Let be an arbitrary mapping in . We say that is strictly contractive if its contraction coefficient (or Lipschitz constant)
[TABLE]
is less than one. Since the metric space is complete [20], if is a strict contraction of for the distance , by the Banach fixed point theorem, [1, p. 244], there exists a unique fixed point of in satisfying . Moreover, this fixed point is given by performing the iteration starting with any . Consider the downsampled iteration where and is an integer. Here, is the -fold composition of at step . If is strictly contractive for with fixed, then has a unique fixed point given as before. In this paper we will need the next Lemma [15, Th. 5.3].
Lemma 2.1.
Let . Then, the mapping
[TABLE]
is strictly contractive with
[TABLE]
It is worth noting that the results outlined in this Section also hold using the Riemann metric [4]. On the other hand, the Thompson’s part metric is more effective than the Riemann one because it provides a tighter bound on the convergence rate of the previous iteration.
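The fixed-point argument of this Section can be illustrated on the standard Riccati iteration. The following sketch rests on several assumptions that are not taken from the paper: a toy two-state model (`A`, `C`, `Qn` are hypothetical values), unit measurement noise variance, the Riccati map written in information form, and the standard eigenvalue form of the Thompson part metric. It runs the iteration from two very different initial conditions and records the distance between the two trajectories.

```python
import numpy as np

def thompson_distance(P, Q):
    """Assumed form: max_i |log lambda_i(P^{-1} Q)| for positive definite P, Q."""
    L = np.linalg.cholesky(P)
    M = np.linalg.solve(L, np.linalg.solve(L, Q).T)
    M = 0.5 * (M + M.T)
    return float(np.max(np.abs(np.log(np.linalg.eigvalsh(M)))))

def riccati_map(P, A, C, Qn):
    """Standard Riccati map r(P) = A (P^{-1} + C^T C)^{-1} A^T + Qn."""
    return A @ np.linalg.inv(np.linalg.inv(P) + C.T @ C) @ A.T + Qn

# Hypothetical reachable and observable toy model.
A = np.array([[0.1, 1.0],
              [0.0, 1.2]])
C = np.array([[1.0, -1.0]])
Qn = np.eye(2)                     # process noise covariance B B^T

P = 0.01 * np.eye(2)               # first initial condition
S = 100.0 * np.eye(2)              # second, very different, initial condition
distances = []
for _ in range(200):
    distances.append(thompson_distance(P, S))
    P, S = riccati_map(P, A, C, Qn), riccati_map(S, A, C, Qn)

print(distances[0], distances[10], distances[-1])
```

Under these assumptions the recorded distances do not increase from one step to the next and shrink towards zero, which is the behaviour the Banach fixed point theorem predicts for a strict contraction: both trajectories approach the same fixed point.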
3 Contraction property of the robust Kalman filters
Consider the state-space model
[TABLE]
where is the state process, is the observation process and is white Gaussian noise with unit variance, i.e. . The initial state is assumed to be independent of . Moreover, its nominal probability density is . Model (3) is completely described by the nominal joint Gaussian probability density of and conditioned on . We consider the family of robust Kalman filters [28],[17] parametrized by :
[TABLE]
where is the conditional expectation taken with respect to which is the least-favorable joint Gaussian probability density of and conditioned on . is a ball about the nominal density with radius :
[TABLE]
where is the -divergence family with parameter and defined as follows. Let and be two -dimensional Gaussian probability densities with mean vector and covariance matrix , respectively. Then, the -divergence family is defined as
[TABLE]
where and is such that . Note that, coincides with the Kullback-Leibler divergence for , [24]. To understand the role of parameter in consider the ball . In [27], it has been shown that, increasing and choosing in such a way that the measure of remains constant, then the uncertainty described by increases for the covariance matrix while it decreases for the mean vector. Accordingly, tunes how to allocate the mismodeling budget between the mean vector and the covariance matrix. is referred to as tolerance and measures the model uncertainty. is the class of estimators with finite second-order moments with respect to all densities . The resulting estimator obeys the recursion:
[TABLE]
where is the gain matrix
[TABLE]
If denotes the state prediction error at time , its pseudo-nominal and least-favorable covariance matrices are denoted by and , respectively. Then, the latter obey the Riccati-like iteration:
[TABLE]
where is such that and is the unique solution to
[TABLE]
where is defined as:
[TABLE]
is called risk sensitivity parameter and it is time-varying. In the case that , i.e. no uncertainty in the nominal model, we obtain the usual Kalman filter. Regarding the performance analysis of this family of robust Kalman filters with respect to parameter we refer to [28]. It is worth noting, in view of (11), we have that . To study the asymptotic behavior of this robust Kalman filter, the matrices , , , and the tolerance are assumed to be constant. Without loss of generality we assume that . Otherwise, we can rewrite the filter (6)-(17) with , such that , and . In this way . Substituting (7) in (8) and using the Woodbury formula, we obtain the Riccati-like iteration
[TABLE]
Defining the positive definite matrix
[TABLE]
we have
[TABLE]
The mapping in (19) has the same structure as the risk sensitive Riccati mapping, [21]. Accordingly, the robust filter (6)-(17) can be interpreted as solving a standard least-squares filtering problem with time-varying parameters in Krein space, [10, 11]. The Krein state-space model consists of dynamics and observations in (3), to which we must adjoin the new observations . The components of noise vectors and now belong to a Krein space and have the inner product
[TABLE]
where denotes the Kronecker delta function. Since is Gauss-Markov, the downsampled process , with integer, is also Gauss-Markov with state-space model
[TABLE]
where
[TABLE]
In model (27) we have
[TABLE]
Note that, and denote, respectively, the -block reachability and observability matrices of model (3), where the blocks forming are written from bottom to top instead of the usual top to bottom convention. In (27), if
[TABLE]
and are block Toeplitz matrices defined as follows
[TABLE]
We define
[TABLE]
Along lines similar to those used in [29], it is not difficult to see that the time-varying Riccati iteration associated to the downsampled model (27) takes the form where
[TABLE]
with
[TABLE]
where we exploited the fact that and because .
Proposition 3.1.
Let
[TABLE]
Assume that the pairs and are reachable and observable, respectively. Then, there exists , with and , such that if then and are positive definite.
Proof.
It is not difficult to see that is positive definite and negative definite for . The mapping is nondecreasing with respect to the partial order of symmetric matrices over because its first variation along a direction is
[TABLE]
Note that, which is positive definite for because the pair is reachable and thus has full row rank. Accordingly, is positive definite for . The mapping is nonincreasing for because its first variation along is
[TABLE]
Moreover, which is positive definite for because the pair is observable. Accordingly, there exists a constant such that and both and are positive definite for .
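The rank conditions used in the proof rely on reachability and observability of the model, which are straightforward to check numerically. The sketch below is a minimal illustration using the usual block reachability and observability matrices with as many blocks as states; the function names and the toy matrices are hypothetical and not taken from the paper.

```python
import numpy as np

def reachability_matrix(A, B):
    """Stack [B, AB, ..., A^{n-1}B] side by side."""
    blocks = [B]
    for _ in range(A.shape[0] - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

def observability_matrix(A, C):
    """Stack [C; CA; ...; CA^{n-1}] on top of each other."""
    blocks = [C]
    for _ in range(A.shape[0] - 1):
        blocks.append(blocks[-1] @ A)
    return np.vstack(blocks)

def is_reachable(A, B):
    # Reachable iff the reachability matrix has full row rank.
    return np.linalg.matrix_rank(reachability_matrix(A, B)) == A.shape[0]

def is_observable(A, C):
    # Observable iff the observability matrix has full column rank.
    return np.linalg.matrix_rank(observability_matrix(A, C)) == A.shape[0]

# Hypothetical toy model.
A = np.array([[0.1, 1.0],
              [0.0, 1.2]])
B = np.eye(2)
C = np.array([[1.0, -1.0]])

print(is_reachable(A, B), is_observable(A, C))
# Measuring only the second state loses the first one.
print(is_observable(A, np.array([[0.0, 1.0]])))
```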
Remark 3.1.
By the proof of Proposition 3.1, one can see that can be computed as follows: set and check whether is positive definite or not. If not, we decrease until becomes positive definite.
By Lemma 2.1, the mapping is strictly contractive provided that the matrices and are positive definite. In view of Proposition 3.1, if for some fixed the following condition holds
[TABLE]
then the -fold composition is strictly contractive for and thus is strictly contractive as well.
4 Characterization of the range of the tolerance
In this Section, we characterize a range of $c$ for which condition (64) holds. The proofs of this Section only consider the case $0<\tau<1$ because the results for the case $\tau=1$ can be proved along similar lines, and the case $\tau=0$ has already been proved in [29]. Condition (64) is equivalent to the condition
[TABLE]
for some fixed. Through the next two Lemmas we will be able to derive a condition on which implies condition (65).
Lemma 4.1.
Let , with , be the convergent iteration generated by the usual Riccati mapping
[TABLE]
Consider the sequence generated by (18). Then, , with for any .
Proof.
It is well known that the sequence is nondecreasing with respect to the partial order of the symmetric matrices. Accordingly, it is sufficient to prove that . To this end, we define the risk sensitive Riccati mapping, [21],
[TABLE]
where is a positive semidefinite matrix. For , we have . Assume that , then
[TABLE]
where we exploited the fact that for any positive semidefinite and such that , [21], and the fact that is a nondecreasing mapping with respect to the partial order of the symmetric matrices.
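The ordering between the risk sensitive and the usual Riccati mapping used in this proof can be checked on a small example. The sketch assumes, as in the risk-sensitive literature, that the risk sensitive mapping is obtained from the usual one by subtracting a positive semidefinite matrix `S` in the information-form inverse; since the displayed definitions are missing from this copy, this form, the function names and the toy matrices are all assumptions.

```python
import numpy as np

def riccati_map(P, A, C, Qn):
    """Usual Riccati map r(P) = A (P^{-1} + C^T C)^{-1} A^T + Qn."""
    return A @ np.linalg.inv(np.linalg.inv(P) + C.T @ C) @ A.T + Qn

def risk_sensitive_map(P, S, A, C, Qn):
    """Assumed form: A (P^{-1} + C^T C - S)^{-1} A^T + Qn with S >= 0."""
    information = np.linalg.inv(P) + C.T @ C - S
    if np.linalg.eigvalsh(information)[0] <= 0.0:
        raise ValueError("P^{-1} + C^T C - S must be positive definite")
    return A @ np.linalg.inv(information) @ A.T + Qn

# Hypothetical toy data.
A = np.array([[0.1, 1.0],
              [0.0, 1.2]])
C = np.array([[1.0, -1.0]])
Qn = np.eye(2)
P = np.array([[2.0, 0.5],
              [0.5, 1.0]])
S = 0.2 * np.eye(2)

gap = risk_sensitive_map(P, S, A, C, Qn) - riccati_map(P, A, C, Qn)
# The gap should be positive semidefinite: its smallest eigenvalue is >= 0.
print(np.linalg.eigvalsh(0.5 * (gap + gap.T)))
```

The inequality follows because matrix inversion reverses the partial order on positive definite matrices, so removing `S` from the information matrix can only enlarge its inverse.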
Lemma 4.2.
Let be such that , then
[TABLE]
Proof.
Consider the function
[TABLE]
defined over the set and . Then,
[TABLE]
where
[TABLE]
It is not difficult to see that
[TABLE]
which is nonpositive for . Accordingly, is a nonincreasing function over and
[TABLE]
Accordingly, the first derivative of in (69) is nonpositive over , i.e. is nonincreasing over .
Let be the singular value decomposition of , hence , and positive definite. Therefore, we have
[TABLE]
Since the singular value decomposition of is , we have
[TABLE]
By assumption, , , therefore we have , . Accordingly, which concludes the proof.
Fixed , by Lemma 4.1, for the sequence generated by (18) we have , , and by Lemma 4.2 we have
[TABLE]
Therefore, the condition
[TABLE]
or equivalently
[TABLE]
implies (65). In particular, for we obtain
[TABLE]
The next Lemma is needed to derive a condition on which implies condition (70), and thus also condition (65).
Lemma 4.3.
Assuming that , the following facts hold:
1. is monotone increasing over .
2. If then .
3. for any with .
Proof.
1. The statement has been proved in [27].
2. First, note that
[TABLE]
To prove the statement, we show that the first variation of with respect to in any direction is nonnegative:
[TABLE]
where we exploited the fact that and commute.
3. is equal to the -divergence between the covariance matrices and , [23]. Since , we get .
We know that , which is equivalent to saying . Then, by Lemma 4.3, condition implies that
[TABLE]
Figure 1 shows this situation. Thus, (65) holds if we choose in such a way that .
Theorem 4.1.
Let model (3) be such that and are reachable and observable, respectively. Let be such that with
[TABLE]
and are fixed. Then, for any , the sequence generated by iteration (18) converges to a unique solution . Moreover, the limit of the filtering gain as has the property that is stable.
Proof.
Since
[TABLE]
by Lemma 4.3 we have that (70) holds for and therefore for . Accordingly, the mapping is strictly contractive for . Since is the -fold composition of , it follows that the sequence generated by (18) converges. By (12) the convergence of implies the convergence of to a unique value . Thus, (11) implies the convergence of to a unique solution . Finally, the stability of can be proved by applying the Lyapunov stability theory to the algebraic Riccati-like equation
[TABLE]
Finally, it is not difficult to show that the mapping
[TABLE]
is nondecreasing. Thus, we have to choose sufficiently large in order to find a bigger .
5 Example
We consider the constant state space model (3) used in [18],
[TABLE]
The error covariance matrix at time is chosen as . We study the convergence of filter (6)-(17) with three different values for : , and . Fixing , we found that
[TABLE]
Moreover, the robust Kalman filter (6)-(17) converges with tolerance in the range where
[TABLE]
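Now, we compare the performance of the following four filters:
- KF: the standard Kalman filter
- RKF0: the robust Kalman filter with $\tau=0$ and $c$ set to the corresponding maximum tolerance
- RKF05: the robust Kalman filter with $\tau=0.5$ and $c$ set to the corresponding maximum tolerance
- RKF1: the robust Kalman filter with $\tau=1$ and $c$ set to the corresponding maximum tolerance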
that is we consider the robust Kalman filter with , , with the corresponding maximum tolerance for which we know that it converges. In Figure 2 we show the pseudo-nominal variance of the state estimation error of the first component of the state, that is the entry of in position (1,1).
In Figure 3 we show the pseudo-nominal variance of the state estimation error of the second component of , that is the entry in position (2,2) of .
Roughly speaking, these quantities represent the error variance computed using the nominal density but propagating the previous least-favorable density. The previous figures show that the Riccati-like iteration converges after 20 steps for $\tau=0$, $\tau=0.5$ and $\tau=1$. In Figure 4, we show the time-varying risk sensitivity parameter, which after 20 steps is already constant.
In Figure 5 and Figure 6 we consider the corresponding least-favorable error variance, i.e. the error variance is computed by using the least-favorable density and propagating the previous least-favorable density.
It is clear that RKF0, RKF05 and RKF1 are very conservative with respect to the KF, i.e. their error variances are larger than the ones given by KF. This means that, although the upper bound we found is not tight, the range contains a sufficiently large class of robust estimators. In other words, with $c$ close to zero we have robust Kalman filters with performance similar to KF, while with $c$ close to the upper bound we have robust Kalman filters very different from KF.
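The qualitative behaviour described in this Section can be reproduced with a rough simulation of the covariance recursion. The sketch below is not the paper's experiment: it covers the case $\tau=0$ only, uses a hypothetical toy model and tolerance, and assumes the Kullback-Leibler forms $V=(P^{-1}-\theta I)^{-1}$ and $\gamma(P,\theta)=\tfrac12[\mathrm{tr}((I-\theta P)^{-1}-I)+\log\det(I-\theta P)]$ together with unit measurement noise variance.

```python
import numpy as np

def kl_gamma(P, theta):
    """Assumed KL-case constraint function (tau = 0)."""
    n = P.shape[0]
    M = np.eye(n) - theta * P
    return 0.5 * (np.trace(np.linalg.inv(M)) - n + np.linalg.slogdet(M)[1])

def solve_theta(P, c, iterations=200):
    """Bisection for gamma(P, theta) = c on (0, 1 / lambda_max(P))."""
    lo, hi = 0.0, (1.0 - 1e-9) / np.linalg.eigvalsh(P)[-1]
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if kl_gamma(P, mid) < c else (lo, mid)
    return 0.5 * (lo + hi)

def riccati_map(P, A, C, Qn):
    return A @ np.linalg.inv(np.linalg.inv(P) + C.T @ C) @ A.T + Qn

def robust_step(V, A, C, Qn, c):
    """One step: pseudo-nominal P, risk sensitivity theta, least-favorable V."""
    P = riccati_map(V, A, C, Qn)
    theta = solve_theta(P, c)
    V_next = np.linalg.inv(np.linalg.inv(P) - theta * np.eye(P.shape[0]))
    return P, V_next, theta

# Hypothetical toy model and tolerance.
A = np.array([[0.1, 1.0], [0.0, 1.2]])
C = np.array([[1.0, -1.0]])
Qn = np.eye(2)
c = 1e-3

V, P_kf = np.eye(2), np.eye(2)
P_hist, thetas, min_gap = [], [], np.inf
for _ in range(300):
    P_rob, V, theta = robust_step(V, A, C, Qn, c)
    P_kf = riccati_map(P_kf, A, C, Qn)
    gap = P_rob - P_kf
    min_gap = min(min_gap, np.linalg.eigvalsh(0.5 * (gap + gap.T))[0])
    P_hist.append(P_rob)
    thetas.append(theta)

print(P_hist[-1], P_kf, thetas[-1])
```

In this sketch the robust pseudo-nominal covariance stays above the standard Kalman one at every step, and both the covariance and the risk sensitivity parameter settle to constant values, in line with the discussion of the figures.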
6 Convergence analysis of the $\tau$-risk sensitive filters
Consider the state-space model (3) and the corresponding nominal joint Gaussian probability density . The family of risk sensitive filters [28] parametrized by is given by
[TABLE]
where is Gaussian, and is the set of estimators for which the objective function in (85) is finite. is the risk sensitivity parameter. The second term in the objective function in (85) is always nonpositive because . Therefore, for large values of the maximizer has the possibility to take a probability density far from the nominal one. The -risk sensitive filter (85) thus represents a relaxed version of the robust Kalman filter (6)-(17) where now is constant and fixed by the user. For the case we obtain the usual risk sensitive filter [3]. The resulting estimator obeys the recursion (6)-(8) with
[TABLE]
The study of the asymptotic behavior of the $\tau$-risk sensitive filter requires considering two different cases: the case $0\le\tau<1$ and the case $\tau=1$.
In the former case, the Riccati-like iteration has the same form as (18) but the image of under this mapping is not entirely contained in . The reason is that condition holds only if is such that and this condition may not be satisfied. Following arguments similar to those used in [18] for the case $\tau=0$, it is possible to find conditions on and for which the trajectory of iteration (18) satisfies for any . However, these conditions on and are rather intricate and require designing a gain matrix and a scaling factor .
For the case $\tau=1$, is positive definite, and thus well defined, whenever is positive definite. Accordingly, the image of under the corresponding mapping, denoted by , is . Thus, the convergence of the iteration is guaranteed by only imposing conditions on the risk sensitivity parameter .
Theorem 6.1.
Let model (3) be such that and are reachable and observable, respectively. Let be such that
[TABLE]
and are fixed. Then, for any , the sequence generated by the risk-sensitive filter with converges to a unique solution . Moreover, the limit of the filtering gain as has the property that is stable.
Proof.
We consider the downsampled process with and the corresponding time-varying Riccati-iteration is where has the same structure of (60). Let . Proposition 3.1 still holds. In particular, there exists such that if (65) holds then the matrices and are positive definite. Accordingly, by Lemma 2.1 the -fold mapping is strictly contractive and thus also is strictly contractive. Lemma 4.1 and Lemma 4.2 still hold, in particular
[TABLE]
Finally, by imposing
[TABLE]
which coincides with (89), then condition (65) holds. Thus, the sequence converges to a unique as . The stability of follows as before.
It is clear that condition (89) on the risk sensitivity parameter is easy to check. Accordingly, this filter is preferable to the risk-sensitive filter with $0\le\tau<1$.
7 Conclusions
A convergence analysis of a family of robust Kalman filters has been presented. This analysis exploited the fact that the $N$-fold Riccati mapping, which is given by downsampling these filters, is strictly contractive provided that the time-varying risk sensitivity parameter is sufficiently small. This condition is then guaranteed by placing an upper bound on the tolerance parameter of the robust filters. Finally, we have studied the convergence property of a family of risk-sensitive filters which can be understood as a relaxed version of the previous robust Kalman filters.
References
[1] J. Aubin and I. Ekeland, Applied Nonlinear Analysis, J. Wiley, New York, 1984.
[2] R. Bhatia, On the exponential metric increasing property, Linear Algebra and its Applications, 375 (2003), pp. 211–220.
[3] R. Boel, M. James, and I. Petersen, Robustness and risk-sensitive filtering, IEEE Trans. Automat. Control, 47 (2002), pp. 451–461.
[4] P. Bougerol, Kalman filtering with random coefficients and contractions, SIAM J. Control and Optimiz., 31 (1993), pp. 942–959.
[5] A. Ferrante and B. Levy, Hermitian solutions of the equation $x=q+nx^{-1}n^{*}$, Linear Algebra and its Applications, 247 (1996), pp. 359–373.
[6] A. Ferrante and L. Ntogramatzidis, The generalised discrete algebraic Riccati equation in linear-quadratic optimal control, Automatica, 49 (2013), pp. 471–478.
[7] A. Ferrante and L. Ntogramatzidis, The generalized continuous algebraic Riccati equation and impulse-free continuous-time LQ optimal control, Automatica, 50 (2014), pp. 1176–1180.
[8] S. Gaubert and Z. Qu, The contraction rate in Thompson’s part metric of order-preserving flows on a cone - application to generalized Riccati equations, Journal of Differential Equations, 256 (2014), pp. 2902–2948.
