Convergence analysis of a family of robust Kalman filters based on the contraction principle

Mattia Zorzi

arXiv:1705.05286·math.OC·May 16, 2017·SIAM J. Control. Optim.


Open Access

TL;DR

This paper investigates the convergence properties of a family of robust Kalman filters, demonstrating that under certain conditions, the filters reliably converge when model uncertainty is appropriately controlled.

Contribution

It provides a convergence analysis for robust Kalman filters using the contraction principle, linking filter stability to the tolerance parameter and system properties.

Findings

1. The filters converge when the tolerance parameter is sufficiently small.

2. The Riccati-like mapping is strictly contractive (in its $N$-fold composition) under the given conditions.

3. For reachable and observable models, convergence is guaranteed once the tolerance is below an explicit upper bound.

Abstract

In this paper we analyze the convergence of a family of robust Kalman filters. For each filter of this family the model uncertainty is tuned according to the so-called tolerance parameter. Assuming that the corresponding state-space model is reachable and observable, we show that the corresponding Riccati-like mapping is strictly contractive provided that the tolerance is sufficiently small; accordingly, the filter converges.


Equations

$$d_{T}(P,Q)=\max\{\log(\sigma_{1}(P^{-1}Q)),\,\log(\sigma_{1}(Q^{-1}P))\}.$$

$$\xi(h)=\sup_{P,Q\in\mathcal{Q}_{+}^{n},\,P\neq Q}\frac{d_{T}(h(P),h(Q))}{d_{T}(P,Q)}$$

$$h(P)=M(P^{-1}+W_{1})^{-1}M^{T}+W_{2}$$

$$\xi(h)\leq\left(\frac{\sqrt{\sigma_{1}(W_{1}^{-1}M^{T}W_{2}^{-1}M)}}{1+\sqrt{1+\sigma_{1}(W_{1}^{-1}M^{T}W_{2}^{-1}M)}}\right)^{2}.$$

$$\hat{x}_{k+1}=\underset{g_{k}\in\mathcal{G}_{k}}{\mathrm{argmin}}\;\max_{\tilde{f}_{k}\in\mathcal{B}_{k,\tau}^{c}}\mathbb{E}_{\tilde{f}_{k}}\left[\|x_{k+1}-g_{k}(y_{k})\|^{2}\mid Y_{k-1}\right]$$

$$\mathcal{B}_{k,\tau}^{c}=\{\tilde{f}_{k}(z_{k}\mid Y_{k-1})\ \text{s.t.}\ \mathcal{D}_{\tau}(\tilde{f}_{k}\|f_{k})\leq c\}$$

$$\mathcal{D}_{\tau}(\tilde{f}\|f)=\begin{cases}\|\Delta m_{z}\|^{2}_{K_{z}^{-1}}+\mathrm{tr}\left(-\log(\tilde{K}_{z}K_{z}^{-1})+\tilde{K}_{z}K_{z}^{-1}-I_{q}\right),&\tau=0\\ \frac{1}{1-\tau}\|\Delta m_{z}\|^{2}_{K_{z}^{-1}}+\mathrm{tr}\left(\frac{1}{\tau(\tau-1)}(L_{z}^{-1}\tilde{K}_{z}L_{z}^{-T})^{\tau}+\frac{1}{1-\tau}\tilde{K}_{z}K_{z}^{-1}+\frac{1}{\tau}I_{q}\right),&0<\tau<1\\ \delta(\Delta m_{z})+\mathrm{tr}\left(L_{z}^{-1}\tilde{K}_{z}L_{z}^{-T}\log(L_{z}^{-1}\tilde{K}_{z}L_{z}^{-T})-\tilde{K}_{z}K_{z}^{-1}+I_{q}\right),&\tau=1\end{cases}$$

$$\hat{x}_{k+1}=A\hat{x}_{k}+G_{k}(y_{k}-C\hat{x}_{k})$$

$$c=\gamma_{\tau}(P_{k+1},\theta_{k})$$

$$\gamma_{\tau}(P,\theta)=\begin{cases}-\log\det(I_{n}-\theta P)^{-1}+\mathrm{tr}\left((I_{n}-\theta P)^{-1}-I_{n}\right),&\tau=0\\ \mathrm{tr}\left(-\frac{1}{\tau(1-\tau)}(I_{n}-\theta(1-\tau)L_{P}^{T}L_{P})^{\frac{\tau}{\tau-1}}+\frac{1}{1-\tau}(I_{n}-\theta(1-\tau)L_{P}^{T}L_{P})^{\frac{1}{\tau-1}}+\frac{1}{\tau}I_{n}\right),&0<\tau<1\\ \mathrm{tr}\left(\exp(\theta L_{P}^{T}L_{P})(\theta L_{P}^{T}L_{P}-I_{n})+I_{n}\right),&\tau=1.\end{cases}$$

$$P_{k+1}=r_{\tau,c}(P_{k}):=A(V_{k}^{-1}+C^{T}(DD^{T})^{-1}C)^{-1}A^{T}+BB^{T}.$$

$$\Phi_{k}=P_{k+1}^{-1}-V_{k+1}^{-1}$$

$$\left\langle\begin{bmatrix}v_{k}\\ u_{k}\end{bmatrix},\begin{bmatrix}v_{s}\\ u_{s}\end{bmatrix}\right\rangle=\begin{bmatrix}I_{m}&0\\ 0&-\Phi_{k-1}^{-1}\end{bmatrix}\delta_{k-s}$$


Taxonomy

Topics: Stability and Control of Uncertain Systems · Optimization and Variational Analysis · Target Tracking and Data Fusion in Sensor Networks

Full text

Convergence analysis of a family of robust Kalman filters based on the contraction principle

This work has been partially supported by the FIRB project “Learning meets time” (RBFR12M3AC) funded by MIUR.

Mattia Zorzi

M. Zorzi is with the Dipartimento di Ingegneria dell’Informazione, Università di Padova, via Gradenigo 6/B, 35131 Padova, Italy (email: [email protected]).


keywords:

Block update, contraction mapping, Kalman filter, Riccati equation, Thompson’s part metric, risk-sensitive filtering

AMS:

60G35, 93B35, 93E11

1 Introduction

Robust Kalman filtering is a computational tool with widespread applications in many fields, e.g. [22]. In this paper we consider the parametric family of robust Kalman filters introduced in [28], see also the earlier works [17],[16],[9]. The parameter describing this family is denoted by $\tau$ . Once $\tau$ is fixed, the model uncertainty is represented by a ball centered on the nominal model, formed by placing a bound on the $\tau$ -divergence, [27],[25],[26], between the actual and the nominal model. The bound is fixed by the user and represents the tolerated mismatch between the actual and the nominal model. Then, the robust filter is obtained by minimizing the mean square error according to the least favorable model in this ball. Interestingly, relaxing the assumption that the actual model belongs to the ball, we obtain a family of risk-sensitive filters parametrized by $\tau$ wherein the tolerance parameter is replaced by the so-called risk sensitivity parameter. In particular, for $\tau=0$ we obtain the usual risk-sensitive filter, see [3, 19, 21, 12].

In this paper we analyze the convergence of this family of discrete-time robust Kalman filters. More precisely, we prove that the error covariance, which obeys a Riccati-like iteration, converges to a unique positive definite solution.

The convergence of Riccati-like iterations can be established using classical arguments, [6, 7, 5]. Alternatively, the convergence analysis can be performed using the contraction principle, as in the earlier paper by Bougerol [4]. More precisely, under reachability and observability assumptions, he proved that the discrete-time Riccati iteration is a strict contraction for the Riemann metric associated to the cone of positive definite matrices. Interestingly, the same result holds using the Thompson’s part metric [15, 8]. The latter metric is more effective than the former in the sense that it gives a tighter bound on the convergence rate of the iteration. It is also worth noting that the contraction principle has also been used to prove the convergence of different kinds of nonlinear iterations [13, 14, 15].

The convergence analysis that we present here is based on the contraction principle. This analysis is rooted in the paper [18], which studies the convergence of the risk-sensitive Riccati iteration corresponding to the usual risk-sensitive filter. In particular, by placing an upper bound on the risk sensitivity parameter it is possible to prove that the $N$ -fold composition of the risk-sensitive Riccati mapping is strictly contractive for the Thompson’s part metric. Since the robust Kalman filter with $\tau=0$ can be understood as the usual risk-sensitive filter where the risk sensitivity parameter is now time-varying, it is possible to characterize an upper bound on the tolerance of this robust filter in such a way that the time-varying risk sensitivity parameter is sufficiently small. In this way, the $N$ -fold composition of the mapping is strictly contractive and thus the robust filter converges, [29]. In this paper we extend these results to the entire family of robust Kalman filters.

The outline of the paper is as follows. In Section 2 we recall the Thompson’s part metric for positive definite matrices and the properties of contraction mappings. In Section 3 we review the robust Kalman filter, and we derive its downsampled version and the corresponding $N$ -fold Riccati iteration. In this way we are able to derive a condition under which the iteration is strictly contractive. In Section 4 we translate this condition into an upper bound on the tolerance of the robust filter. In Section 5 an illustrative example is provided. Section 6 deals with the convergence analysis of the family of $\tau$ -risk sensitive filters. Finally, we draw the conclusions in Section 7.

Notation. Given $x\in\mathbb{R}^{n}$ , $\|x\|$ denotes the Euclidean norm of $x$ , and $\|x\|_{K}$ denotes the weighted Euclidean norm with weight matrix $K$ positive definite. The $i$ -th singular value of $P\in\mathbb{R}^{n\times n}$ is denoted by $\sigma_{i}(P)$ and $\sigma_{1}(P)\geq\sigma_{2}(P)\geq\ldots\geq\sigma_{n}(P)$ . $\|P\|$ denotes the spectral norm of $P$ , i.e. $\|P\|=\sigma_{1}(P)$ . $\mathcal{Q}^{n}$ denotes the vector space of symmetric matrices of dimension $n\times n$ . The cone of positive definite matrices in $\mathcal{Q}^{n}$ is denoted by $\mathcal{Q}_{+}^{n}$ , and its closure by $\bar{\mathcal{Q}}_{+}^{n}$ . $\mathrm{diag}(d_{1},\ldots,d_{n})$ denotes the diagonal matrix with elements in the main diagonal $d_{1},\ldots,d_{n}$ ; similarly $\mathrm{blkdiag}(P_{1},\ldots,P_{n})$ denotes the block-diagonal matrix with matrices in the main block-diagonal $P_{1},\ldots,P_{n}$ . Given $P\in\mathcal{Q}_{+}^{n}$ with eigendecomposition $P=UDU^{T}$ such that $U$ is an orthogonal matrix and $D=\mathrm{diag}(\sigma_{1}(P),\ldots,\sigma_{n}(P))$ , the exponentiation of $P$ to a real number $\tau$ is defined as $P^{\tau}=UD^{\tau}U^{T}$ with $D^{\tau}=\mathrm{diag}(\sigma_{1}(P)^{\tau},\ldots,\sigma_{n}(P)^{\tau})$ . Similarly, we define $\exp(P)=U\exp(D)U^{T}$ with $\exp(D)=\mathrm{diag}(e^{\sigma_{1}(P)},\ldots,e^{\sigma_{n}(P)})$ and $\log(P)=U\log(D)U^{T}$ with $\log(D)=\mathrm{diag}(\log(\sigma_{1}(P)),\ldots,\log(\sigma_{n}(P)))$ .

2 Thompson’s part metric and contraction mappings

Let $P$ and $Q$ belong to $\mathcal{Q}_{+}^{n}$ . The Thompson’s part metric [2] between $P$ and $Q$ is defined as

$$d_{T}(P,Q)=\max\{\log(\sigma_{1}(P^{-1}Q)),\,\log(\sigma_{1}(Q^{-1}P))\}.$$

Besides the usual properties of a distance, $d_{T}$ is invariant under matrix inversion and congruence transformations.
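The two invariance properties can be checked numerically. The following is a minimal sketch, not taken from the paper: it reads $\sigma_{1}(P^{-1}Q)$ as the largest eigenvalue of $P^{-1}Q$ (an assumption, matching the standard definition of the Thompson's part metric), and the helper names `thompson_metric` and `random_spd` are hypothetical.

```python
import numpy as np

def thompson_metric(P, Q):
    """Thompson's part metric between symmetric positive definite P and Q."""
    # The eigenvalues of P^{-1} Q coincide with those of the symmetric matrix
    # L^{-1} Q L^{-T}, where P = L L^T is the Cholesky factorization of P.
    L = np.linalg.cholesky(P)
    X = np.linalg.solve(L, Q)        # L^{-1} Q
    S = np.linalg.solve(L, X.T).T    # L^{-1} Q L^{-T}
    lam = np.linalg.eigvalsh((S + S.T) / 2.0)
    # max{log lam_max(P^{-1} Q), log lam_max(Q^{-1} P)} = max_i |log lam_i|
    return float(np.max(np.abs(np.log(lam))))

def random_spd(rng, n):
    """Random symmetric positive definite matrix (illustration only)."""
    G = rng.standard_normal((n, n))
    return G @ G.T + n * np.eye(n)
```

For any invertible $T$, `thompson_metric(P, Q)` should agree, up to rounding, with `thompson_metric(np.linalg.inv(P), np.linalg.inv(Q))` and with `thompson_metric(T @ P @ T.T, T @ Q @ T.T)`.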

Let $h(\cdot)$ be an arbitrary mapping in $\mathcal{Q}_{+}^{n}$ . We say that $h$ is strictly contractive if its contraction coefficient (or Lipschitz constant)

$$\xi(h)=\sup_{P,Q\in\mathcal{Q}_{+}^{n},\,P\neq Q}\frac{d_{T}(h(P),h(Q))}{d_{T}(P,Q)}$$

is less than one. Since the metric space $(\mathcal{Q}_{+}^{n},d_{T})$ is complete [20], if $h$ is a strict contraction of $\mathcal{Q}_{+}^{n}$ for the distance $d_{T}$ , by the Banach fixed point theorem, [1, p. 244], there exists a unique fixed point $P$ of $h$ in $\mathcal{Q}_{+}^{n}$ satisfying $P=h(P)$ . Moreover, this fixed point is given by performing the iteration $P_{k+1}=h(P_{k})$ starting with any $P_{0}\in\mathcal{Q}_{+}^{n}$ . Consider the downsampled iteration $P_{k+1}^{d}=h_{k}^{N}(P_{k}^{d})$ where $P_{k}^{d}=P_{kN}$ and $N$ is an integer. Here, $h_{k}^{N}$ is the $N$ -fold composition of $h$ at step $kN$ . If $h_{k}^{N}$ is strictly contractive for $k\geq\tilde{q}$ with $\tilde{q}$ fixed, then $h$ has a unique fixed point given as before. In this paper we will need the next Lemma [15, Th. 5.3].

Lemma 2.1.

Let $W_{1},W_{2}\in\mathcal{Q}_{+}^{n}$ . Then, the mapping

$$h(P)=M(P^{-1}+W_{1})^{-1}M^{T}+W_{2}$$

is strictly contractive with

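$$\xi(h)\leq\left(\frac{\sqrt{\sigma_{1}(W_{1}^{-1}M^{T}W_{2}^{-1}M)}}{1+\sqrt{1+\sigma_{1}(W_{1}^{-1}M^{T}W_{2}^{-1}M)}}\right)^{2}.$$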

It is worth noting that the results outlined in this Section also hold using the Riemann metric [4]. On the other hand, the Thompson’s part metric is more effective than the Riemann one because it provides a tighter bound on the convergence rate of the previous iteration.
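To make Lemma 2.1 and the fixed-point argument of this Section concrete, the sketch below evaluates the bound on $\xi(h)$, compares it with an observed contraction ratio, and runs the iteration $P_{k+1}=h(P_{k})$ from two different starting points. It is an illustration under stated assumptions: the bound is evaluated in the form $\left(\sqrt{s}/(1+\sqrt{1+s})\right)^{2}$ with $s=\sigma_{1}(W_{1}^{-1}M^{T}W_{2}^{-1}M)$ taken as the largest singular value, and all function names are hypothetical.

```python
import numpy as np

def thompson_metric(P, Q):
    """Thompson's part metric via the (real, positive) eigenvalues of P^{-1} Q."""
    lam = np.linalg.eigvals(np.linalg.solve(P, Q)).real
    return float(np.max(np.abs(np.log(lam))))

def h(P, M, W1, W2):
    """Mapping of Lemma 2.1: h(P) = M (P^{-1} + W1)^{-1} M^T + W2."""
    return M @ np.linalg.inv(np.linalg.inv(P) + W1) @ M.T + W2

def contraction_bound(M, W1, W2):
    """Upper bound on the contraction coefficient of h (assumed form, see lead-in)."""
    X = np.linalg.solve(W1, M.T) @ np.linalg.solve(W2, M)  # W1^{-1} M^T W2^{-1} M
    s = np.linalg.norm(X, 2)                               # largest singular value
    return (np.sqrt(s) / (1.0 + np.sqrt(1.0 + s))) ** 2

def fixed_point(P0, M, W1, W2, steps=200):
    """Run P_{k+1} = h(P_k) for a fixed number of steps."""
    P = P0
    for _ in range(steps):
        P = h(P, M, W1, W2)
    return P

def random_spd(rng, n):
    """Random symmetric positive definite matrix (illustration only)."""
    G = rng.standard_normal((n, n))
    return G @ G.T + n * np.eye(n)
```

Since the bound is strictly below one for any finite $s$, iterating from `np.eye(n)` and from `10 * np.eye(n)` should reach the same limit, in line with the Banach fixed point theorem.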

3 Contraction property of the robust Kalman filters

Consider the state-space model

$$\begin{aligned}x_{k+1}&=Ax_{k}+Bv_{k}\\ y_{k}&=Cx_{k}+Dv_{k}\end{aligned}\tag{3}$$

where $x_{k}\in\mathbb{R}^{n}$ is the state process, $y_{k}\in\mathbb{R}^{p}$ is the observation process and $v_{k}\in\mathbb{R}^{m}$ is white Gaussian noise with unit variance, i.e. $\mathbb{E}[v_{k}v_{k}^{T}]=I_{m}$ . The initial state $x_{0}$ is assumed to be independent of $v_{k}$ . Moreover, its nominal probability density is $f_{0}(x_{0})\sim\mathcal{N}(\hat{x}_{0},V_{0})$ . Model (3) is completely described by the nominal joint Gaussian probability density $f_{k}(x_{k+1},y_{k}|Y_{k-1})$ of $x_{k+1}$ and $y_{k}$ conditioned on $Y_{k-1}:=[\,y_{0}\,\ldots y_{k-1}\,]^{T}$ . We consider the family of robust Kalman filters [28],[17] parametrized by $\tau\in[0,1]$ :

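$$\hat{x}_{k+1}=\underset{g_{k}\in\mathcal{G}_{k}}{\mathrm{argmin}}\;\max_{\tilde{f}_{k}\in\mathcal{B}_{k,\tau}^{c}}\mathbb{E}_{\tilde{f}_{k}}\left[\|x_{k+1}-g_{k}(y_{k})\|^{2}\mid Y_{k-1}\right]$$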

where $\mathbb{E}_{\tilde{f}_{k}}[\cdot|Y_{k-1}]$ is the conditional expectation taken with respect to $\tilde{f}_{k}(x_{k+1},y_{k}|Y_{k-1})$ which is the least-favorable joint Gaussian probability density of $x_{k+1}$ and $y_{k}$ conditioned on $Y_{k-1}$ . $\mathcal{B}^{c}_{k,\tau}$ is a ball about the nominal density $f_{k}(x_{k+1},y_{k}|Y_{k-1})$ with radius $c$ :

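$$\mathcal{B}_{k,\tau}^{c}=\{\tilde{f}_{k}(z_{k}\mid Y_{k-1})\ \text{s.t.}\ \mathcal{D}_{\tau}(\tilde{f}_{k}\|f_{k})\leq c\}$$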

where $\mathcal{D}_{\tau}$ is the $\tau$ -divergence family with parameter $\tau\in[0,1]$ and defined as follows. Let $\tilde{f}$ and $f$ be two $q$ -dimensional Gaussian probability densities with mean vector $\tilde{m}_{z},m_{z}$ and covariance matrix $\tilde{K}_{z},K_{z}$ , respectively. Then, the $\tau$ -divergence family is defined as

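$$\mathcal{D}_{\tau}(\tilde{f}\|f)=\begin{cases}\|\Delta m_{z}\|^{2}_{K_{z}^{-1}}+\mathrm{tr}\left(-\log(\tilde{K}_{z}K_{z}^{-1})+\tilde{K}_{z}K_{z}^{-1}-I_{q}\right),&\tau=0\\ \frac{1}{1-\tau}\|\Delta m_{z}\|^{2}_{K_{z}^{-1}}+\mathrm{tr}\left(\frac{1}{\tau(\tau-1)}(L_{z}^{-1}\tilde{K}_{z}L_{z}^{-T})^{\tau}+\frac{1}{1-\tau}\tilde{K}_{z}K_{z}^{-1}+\frac{1}{\tau}I_{q}\right),&0<\tau<1\\ \delta(\Delta m_{z})+\mathrm{tr}\left(L_{z}^{-1}\tilde{K}_{z}L_{z}^{-T}\log(L_{z}^{-1}\tilde{K}_{z}L_{z}^{-T})-\tilde{K}_{z}K_{z}^{-1}+I_{q}\right),&\tau=1\end{cases}$$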

where $\Delta m_{z}=m_{z}-\tilde{m}_{z}$ and $L_{z}$ is such that $K_{z}=L_{z}L^{T}_{z}$ . Note that, $\mathcal{D}_{\tau}(\tilde{f}\|f)$ coincides with the Kullback-Leibler divergence for $\tau=0$ , [24]. To understand the role of parameter $\tau$ in $\mathcal{B}_{k,\tau}^{c}$ consider the ball $\mathcal{B}_{\tau}^{c}:=\{\tilde{f}\,:\,\mathcal{D}_{\tau}(\tilde{f}\|f)\leq c\}$ . In [27], it has been shown that, increasing $\tau$ and choosing $c$ in such a way that the measure of $\mathcal{B}_{\tau}^{c}$ remains constant, then the uncertainty described by $\mathcal{B}_{\tau}^{c}$ increases for the covariance matrix while it decreases for the mean vector. Accordingly, $\tau$ tunes how to allocate the mismodeling budget between the mean vector and the covariance matrix. $c$ is referred to as tolerance and measures the model uncertainty. $\mathcal{G}_{k}$ is the class of estimators with finite second-order moments with respect to all densities $\tilde{f}_{k}(x_{k+1},y_{k}|Y_{k-1})\in\mathcal{B}^{c}_{k,\tau}$ . The resulting estimator obeys the recursion:
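A scalar special case helps to see how the three branches of the $\tau$ -divergence fit together. The sketch below is an illustration only: it assumes one-dimensional Gaussians ( $q=1$ ) with equal means ( $\Delta m_{z}=0$ ), so that only the covariance terms remain and they depend on the variance ratio $x=\tilde{K}_{z}/K_{z}$ alone. The function name is hypothetical.

```python
import math

def tau_divergence_scalar(var_tilde, var, tau):
    """Covariance part of the tau-divergence for scalar Gaussians with equal means."""
    x = var_tilde / var  # variance ratio, must be positive
    if tau == 0.0:
        # tau = 0 branch: -log(x) + x - 1
        return x - math.log(x) - 1.0
    if tau == 1.0:
        # tau = 1 branch: x log(x) - x + 1
        return x * math.log(x) - x + 1.0
    # 0 < tau < 1 branch
    return x ** tau / (tau * (tau - 1.0)) + x / (1.0 - tau) + 1.0 / tau
```

Evaluating the middle branch for $\tau$ close to $0$ or close to $1$ should reproduce the two boundary branches, and the divergence vanishes when the two variances coincide.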

$$\hat{x}_{k+1}=A\hat{x}_{k}+G_{k}(y_{k}-C\hat{x}_{k})\tag{6}$$

where $G_{k}$ is the gain matrix

[TABLE]

If $x_{k}-\hat{x}_{k}$ denotes the state prediction error at time $k$ , its pseudo-nominal and least-favorable covariance matrices are denoted by $P_{k}$ and $V_{k}$ , respectively. These obey the Riccati-like iteration:

[TABLE]

where $L_{P_{k+1}}$ is such that $P_{k+1}=L_{P_{k+1}}L_{P_{k+1}}^{T}$ and $\theta_{k}^{-1}>(1-\tau)\|P_{k+1}\|$ is the unique solution to

$$c=\gamma_{\tau}(P_{k+1},\theta_{k})\tag{12}$$

where $\gamma_{\tau}$ is defined as:

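$$\gamma_{\tau}(P,\theta)=\begin{cases}-\log\det(I_{n}-\theta P)^{-1}+\mathrm{tr}\left((I_{n}-\theta P)^{-1}-I_{n}\right),&\tau=0\\ \mathrm{tr}\left(-\frac{1}{\tau(1-\tau)}(I_{n}-\theta(1-\tau)L_{P}^{T}L_{P})^{\frac{\tau}{\tau-1}}+\frac{1}{1-\tau}(I_{n}-\theta(1-\tau)L_{P}^{T}L_{P})^{\frac{1}{\tau-1}}+\frac{1}{\tau}I_{n}\right),&0<\tau<1\\ \mathrm{tr}\left(\exp(\theta L_{P}^{T}L_{P})(\theta L_{P}^{T}L_{P}-I_{n})+I_{n}\right),&\tau=1.\end{cases}$$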

$\theta_{k}$ is called the risk sensitivity parameter and it is time-varying. In the case that $c=0$ , i.e. no uncertainty in the nominal model, we obtain the usual Kalman filter. Regarding the performance analysis of this family of robust Kalman filters with respect to parameter $\tau$ we refer to [28]. It is worth noting that, in view of (11), $P_{k+1}<V_{k+1}$ . To study the asymptotic behavior of this robust Kalman filter, the matrices $A$ , $B$ , $C$ , $D$ and the tolerance $c$ are assumed to be constant. Without loss of generality we assume that $BD^{T}=0$ . Otherwise, we can rewrite the filter (6)-(17) with $\tilde{A}=A-BD^{T}(DD^{T})^{-1}C$ , $\tilde{B}$ such that $\tilde{B}\tilde{B}^{T}=B(I-D^{T}(DD^{T})^{-1}D)B^{T}$ , $\tilde{C}=C$ and $\tilde{D}=D$ . In this way $\tilde{B}\tilde{D}^{T}=0$ . Substituting (7) in (8) and using the Woodbury formula, we obtain the Riccati-like iteration
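The risk sensitivity parameter is defined only implicitly, as the solution of $c=\gamma_{\tau}(P_{k+1},\theta_{k})$ . One way to compute it is sketched below, under the following assumptions: $\gamma_{\tau}$ is evaluated through the eigenvalues of $P$ (which coincide with those of $L_{P}^{T}L_{P}$ ), and the equation is solved by bisection, relying on the fact that $\gamma_{\tau}(P,\cdot)$ is monotone increasing. Bisection is a choice made here for illustration, not a method prescribed by the paper, and the function names are hypothetical.

```python
import numpy as np

def gamma_tau(P, theta, tau):
    """gamma_tau(P, theta), computed from the eigenvalues of P."""
    lam = np.linalg.eigvalsh(P)
    if tau == 0.0:
        a = 1.0 - theta * lam
        # -log det (I - theta P)^{-1} + tr((I - theta P)^{-1} - I)
        return float(np.sum(np.log(a) + 1.0 / a - 1.0))
    if tau == 1.0:
        return float(np.sum(np.exp(theta * lam) * (theta * lam - 1.0) + 1.0))
    a = 1.0 - theta * (1.0 - tau) * lam
    return float(np.sum(-a ** (tau / (tau - 1.0)) / (tau * (1.0 - tau))
                        + a ** (1.0 / (tau - 1.0)) / (1.0 - tau)
                        + 1.0 / tau))

def solve_theta(P, c, tau, iters=200):
    """Bisection for theta such that gamma_tau(P, theta) = c, with c > 0."""
    lam_max = np.linalg.eigvalsh(P)[-1]
    lo = 0.0
    if tau < 1.0:
        # Admissible range: theta^{-1} > (1 - tau) ||P||
        hi = (1.0 - 1e-12) / ((1.0 - tau) * lam_max)
    else:
        # For tau = 1 there is no finite upper limit: grow the bracket.
        hi = 1.0
        while gamma_tau(P, hi, tau) < c:
            hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gamma_tau(P, mid, tau) < c:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For instance, `solve_theta(P, 0.1, 0.5)` with a hypothetical covariance `P` returns a value of $\theta$ inside the admissible range for which `gamma_tau` reproduces the tolerance $c=0.1$ up to numerical precision.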

$$P_{k+1}=r_{\tau,c}(P_{k}):=A(V_{k}^{-1}+C^{T}(DD^{T})^{-1}C)^{-1}A^{T}+BB^{T}.\tag{18}$$

Defining the positive definite matrix

$$\Phi_{k}=P_{k+1}^{-1}-V_{k+1}^{-1}$$

we have

$$r_{\tau,c}(P_{k})=A(P_{k}^{-1}-\Phi_{k-1}+C^{T}(DD^{T})^{-1}C)^{-1}A^{T}+BB^{T}.\tag{19}$$

The mapping in (19) has the same structure as the risk-sensitive Riccati mapping, [21]. Accordingly, the robust filter (6)-(17) can be interpreted as solving a standard least-squares filtering problem with time-varying parameters in Krein space, [10, 11]. The Krein state-space model consists of the dynamics and observations in (3), to which we must adjoin the new observations $0=x_{k}+u_{k}$ . The components of the noise vectors $v_{k}$ and $u_{k}$ now belong to a Krein space and have the inner product

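$$\left\langle\begin{bmatrix}v_{k}\\ u_{k}\end{bmatrix},\begin{bmatrix}v_{s}\\ u_{s}\end{bmatrix}\right\rangle=\begin{bmatrix}I_{m}&0\\ 0&-\Phi_{k-1}^{-1}\end{bmatrix}\delta_{k-s}$$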

where $\delta_{k}$ denotes the Kronecker delta function. Since $x_{k}$ is Gauss-Markov, the downsampled process $x_{k}^{d}:=x_{kN}$ , with $N$ integer, is also Gauss-Markov with state-space model

[TABLE]

where

[TABLE]

In model (27) we have

[TABLE]

Note that, ${\cal R}_{N}$ and ${\cal O}_{N}$ denote, respectively, the $N$ -block reachability and observability matrices of model (3), where the blocks forming ${\cal O}_{N}$ are written from bottom to top instead of the usual top to bottom convention. In (27), if

[TABLE]

${\cal H}_{N}$ and ${\cal L}_{N}$ are block Toeplitz matrices defined as follows

[TABLE]

We define

[TABLE]

Along similar lines used in [29], it is not difficult to see that the time-varying Riccati iteration associated to the downsampled model (27) takes the form $P_{k+1}^{d}=r_{\tau,c,k}^{d}(P_{k}^{d})$ where

[TABLE]

with

[TABLE]

where we exploited the fact that $\mathcal{D}_{N}\mathcal{H}_{N}^{T}=0$ and $\mathcal{D}_{N}\mathcal{L}_{N}^{T}=0$ because $BD^{T}=0$ .

Proposition 3.1.

Let

[TABLE]

Assume that the pairs $(A,B)$ and $(A,C)$ are reachable and observable, respectively. Then, there exists $\phi_{N}$ , with $0<\phi_{N}<\tilde{\phi}_{N}$ and $N\geq n$ , such that if $0\leq\bar{\Phi}\leq\phi_{N}I_{nN}$ then $\Omega_{\bar{\Phi}}$ and $W_{\bar{\Phi}}$ are positive definite.

Proof.

It is not difficult to see that $Q_{\bar{\Phi}}$ is positive definite and $S_{\bar{\Phi}}$ negative definite for $0\leq\bar{\Phi}<\tilde{\phi}_{N}I_{nN}$ . The mapping $\bar{\Phi}\mapsto W_{\bar{\Phi}}$ is nondecreasing with respect to the partial order of symmetric matrices over $0<\bar{\Phi}<\tilde{\phi}_{N}I_{nN}$ because its first variation along a direction $\delta\bar{\Phi}\in\bar{\mathcal{Q}}_{+}^{Nn}$ is

[TABLE]

Note that, $W_{\bar{\Phi}=0}={\cal R}_{N}(I_{Nm}+{\cal H}_{N}^{T}({\cal D}_{N}{\cal D}_{N}^{T})^{-1}{\cal H}_{N})^{-1}{\cal R}_{N}^{T}$ which is positive definite for $N\geq n$ because the pair $(A,B)$ is reachable and thus $\mathcal{R}_{N}$ has full row rank. Accordingly, $W_{\bar{\Phi}}$ is positive definite for $0<\bar{\Phi}<\tilde{\phi}_{N}I_{nN}$ . The mapping $\bar{\Phi}\mapsto\Omega_{\bar{\Phi}}$ is nonincreasing for $0<\bar{\Phi}<\tilde{\phi}_{N}I_{nN}$ because its first variation along $\delta\bar{\Phi}\in\bar{\mathcal{Q}}_{+}^{Nn}$ is

[TABLE]

Moreover, $\Omega_{\bar{\Phi}=0}=\Omega_{N}$ which is positive definite for $N\geq n$ because the pair $(A,C)$ is observable. Accordingly, there exists a constant $\phi_{N}$ such that $0<\phi_{N}<\tilde{\phi}_{N}$ and both $W_{\bar{\Phi}}$ and $\Omega_{\bar{\Phi}}$ are positive definite for $0<\bar{\Phi}\leq\phi_{N}I_{Nn}$ .
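The rank conditions used in the proof are easy to verify numerically. The sketch below builds $N$ -block reachability and observability matrices and checks that they have rank $n$ . It rests on assumptions: the block ordering of $\mathcal{R}_{N}$ is taken as $[A^{N-1}B\;\ldots\;AB\;B]$ (the ordering does not affect the rank), $\mathcal{O}_{N}$ stacks $C$ at the bottom and $CA^{N-1}$ at the top, and the model matrices are hypothetical, not those of Section 5.

```python
import numpy as np

def reachability_matrix(A, B, N):
    """N-block reachability matrix [A^{N-1} B, ..., A B, B] (assumed ordering)."""
    blocks = [np.linalg.matrix_power(A, i) @ B for i in range(N)]
    return np.hstack(blocks[::-1])

def observability_matrix(A, C, N):
    """N-block observability matrix with blocks stacked from bottom (C) to top."""
    blocks = [C @ np.linalg.matrix_power(A, i) for i in range(N)]
    return np.vstack(blocks[::-1])

def rank_conditions(A, B, C, N):
    """Return (R_N has full row rank, O_N has full column rank)."""
    n = A.shape[0]
    r_ok = np.linalg.matrix_rank(reachability_matrix(A, B, N)) == n
    o_ok = np.linalg.matrix_rank(observability_matrix(A, C, N)) == n
    return bool(r_ok), bool(o_ok)

# Hypothetical model with B D^T = 0 (for D = [0 0 1]); illustration only.
A = np.array([[0.9, 1.0], [0.0, 0.8]])
B = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
C = np.array([[1.0, 0.0]])
```

With these matrices, `rank_conditions(A, B, C, N)` holds for every $N\geq n=2$ , which is the situation assumed in Proposition 3.1.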

Remark 3.1.

By the proof of Proposition 3.1, one can see that $\phi_{N}$ can be computed as follows: set $\phi_{N}=\tilde{\phi}_{N}$ and check whether $\Omega_{\phi_{N}I_{Nn}}$ is positive definite or not. If not, we decrease $\phi_{N}$ until $\Omega_{\phi_{N}I_{Nn}}$ becomes positive definite.
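The procedure of Remark 3.1 is a simple backtracking search, sketched below. Several ingredients are assumptions made for illustration: `omega_of` is a hypothetical user-supplied callable returning the matrix $\Omega_{\phi I_{Nn}}$ for a given scalar $\phi$ , the geometric shrink factor is arbitrary (the Remark only says to decrease $\phi_{N}$ ), and the search starts at `shrink * phi_tilde` rather than at $\tilde{\phi}_{N}$ itself so that the strict inequality $\phi_{N}<\tilde{\phi}_{N}$ of Proposition 3.1 is respected.

```python
import numpy as np

def is_positive_definite(S):
    """Positive definiteness test via Cholesky on the symmetric part of S."""
    try:
        np.linalg.cholesky((S + S.T) / 2.0)
        return True
    except np.linalg.LinAlgError:
        return False

def find_phi(omega_of, phi_tilde, shrink=0.9, max_iter=1000):
    """Decrease phi geometrically until omega_of(phi) is positive definite."""
    phi = shrink * phi_tilde
    for _ in range(max_iter):
        if is_positive_definite(omega_of(phi)):
            return phi
        phi *= shrink
    raise RuntimeError("no admissible phi found within max_iter reductions")
```

Because $\bar{\Phi}\mapsto\Omega_{\bar{\Phi}}$ is nonincreasing and $\Omega_{\bar{\Phi}=0}$ is positive definite, the loop terminates for a sufficiently small $\phi$ ; a shrink factor closer to one yields a less conservative $\phi_{N}$ at the cost of more evaluations.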

By Lemma 2.1, the mapping $r_{\tau,c,k}^{d}(\cdot)$ is strictly contractive provided that the matrices $\Omega_{\bar{\Phi}_{N,k}}$ and $W_{\bar{\Phi}_{N,k}}$ are positive definite. In view of Proposition 3.1, if for some fixed $\tilde{q}>0$ the following condition holds

[TABLE]

then the $N$ -fold composition $r^{d}_{\tau,c,k}(\cdot)$ is strictly contractive for $k\geq\tilde{q}$ and thus $r_{\tau,c}(\cdot)$ is strictly contractive as well.

4 Characterization of the range of the tolerance

In this Section, we characterize a range of $c$ for which condition (64) holds. The proofs of this Section only consider the case $0<\tau<1$ because the results for the case $\tau=1$ can be proved along similar lines, and the case $\tau=0$ has already been proved in [29]. Condition (64) is equivalent to the condition

[TABLE]

for some $q>0$ fixed. Through the next two Lemmas we will be able to derive a condition on $\theta_{k}$ which implies condition (65).

Lemma 4.1.

Let $\bar{P}_{k+1}=r(\bar{P}_{k})$ , with $\bar{P}_{0}=BB^{T}$ , be the convergent iteration generated by the usual Riccati mapping

[TABLE]

Consider the sequence generated by (18). Then, $P_{k}\geq\bar{P}_{q}$ , with $k\geq q+1$ for any $q\geq 0$ .

Proof.

It is well known that the sequence $\{\bar{P}_{k}\}$ is nondecreasing with respect to the partial order of the symmetric matrices. Accordingly, it is sufficient to prove that $P_{k+1}\geq\bar{P}_{k},\;\;k\geq 0$ . To this end, we define the risk-sensitive Riccati mapping, [21],

[TABLE]

where $\Phi$ is a positive semidefinite matrix. For $k=0$ , we have $P_{1}=r_{\tau,c}(P_{0})\geq BB^{T}=\bar{P}_{0}$ . Assume that $P_{k}\geq\bar{P}_{k-1}$ , then

[TABLE]

where we exploited the fact that $r^{RS}_{\Phi}(P)\geq r(P)$ for any $\Phi$ positive semidefinite and $P$ such that $0<P<\Phi^{-1}$ , [21], and the fact that $r(\cdot)$ is a nondecreasing mapping with respect to the partial order of the symmetric matrices.
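The monotonicity of $\{\bar{P}_{k}\}$ invoked at the start of the proof can be observed numerically. The sketch below iterates the usual Riccati mapping from $\bar{P}_{0}=BB^{T}$ . It makes two assumptions: the mapping is written in gain form, which for invertible $P$ equals $A(P^{-1}+C^{T}(DD^{T})^{-1}C)^{-1}A^{T}+BB^{T}$ by the Woodbury formula (i.e. (18) with $V_{k}=P_{k}$ ), and the model matrices are hypothetical, not those of Section 5.

```python
import numpy as np

def riccati_map(P, A, B, C, D):
    """Usual Riccati mapping r(P) for a model with B D^T = 0 (gain form)."""
    S = C @ P @ C.T + D @ D.T          # innovation covariance
    K = P @ C.T @ np.linalg.inv(S)     # P C^T (C P C^T + D D^T)^{-1}
    return A @ (P - K @ C @ P) @ A.T + B @ B.T

def riccati_sequence(A, B, C, D, steps):
    """Return [P_bar_0, ..., P_bar_steps] with P_bar_0 = B B^T."""
    seq = [B @ B.T]
    for _ in range(steps):
        seq.append(riccati_map(seq[-1], A, B, C, D))
    return seq

# Hypothetical reachable and observable model with B D^T = 0.
A = np.array([[0.9, 1.0], [0.0, 0.8]])
B = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0, 0.0, 1.0]])

seq = riccati_sequence(A, B, C, D, 200)
# Smallest eigenvalue of P_bar_{k+1} - P_bar_k over the whole run.
min_increment = min(np.linalg.eigvalsh(Pn - Pc).min() for Pc, Pn in zip(seq, seq[1:]))
```

A value of `min_increment` that is nonnegative up to rounding indicates that every increment $\bar{P}_{k+1}-\bar{P}_{k}$ is positive semidefinite, and the last element of `seq` approximates the fixed point of $r(\cdot)$ .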

Lemma 4.2.

Let $\bar{d}$ be such that $P_{k+1}\geq\bar{d}I_{n}>0$ , then

[TABLE]

Proof.

Consider the function

[TABLE]

defined over the set $\mathcal{S}=\{\bar{d}\hbox{ s.t. }0<\bar{d}<(\theta(1-\tau))^{-1}\}$ with $\theta>0$ . Then,

[TABLE]

where

[TABLE]

It is not difficult to see that

[TABLE]

which is nonpositive for $\bar{d}\in\mathcal{S}$ . Accordingly, $g$ is a nonincreasing function over $\mathcal{S}$ and

[TABLE]

Accordingly, the first derivative of $f_{\theta}$ in (69) is nonpositive over $\mathcal{S}$ , i.e. $f_{\theta}$ is nonincreasing over $\mathcal{S}$ .

Let $L_{P_{k+1}}=\tilde{U}_{k+1}D_{k+1}^{\frac{1}{2}}U_{k+1}^{T}$ be the singular value decomposition of $L_{P_{k+1}}$ , hence $\tilde{U}_{k+1}\tilde{U}_{k+1}^{T}=I_{n}$ , $U_{k+1}U_{k+1}^{T}=I_{n}$ and $D_{k+1}^{\frac{1}{2}}=\mathrm{diag}(\ldots\,d_{i,k+1}^{\frac{1}{2}}\,\ldots)$ positive definite. Therefore, we have

[TABLE]

Since the singular value decomposition of $P_{k+1}$ is $P_{k+1}=\tilde{U}_{k+1}\mathrm{diag}\left(\ldots,d_{i,k+1},\ldots\right)\tilde{U}_{k+1}^{T}$ , we have

[TABLE]

By assumption, $\bar{d}\leq d_{i,k+1}$ , $i=1\ldots n$ , therefore we have $f_{\theta_{k}}(d_{i,k+1})\leq f_{\theta_{k}}(\bar{d})$ , $i=1\ldots n$ . Accordingly, $\Phi_{k}\leq\tilde{U}_{k+1}\mathrm{diag}(\ldots,f_{\theta_{k}}(\bar{d}),\ldots)\tilde{U}_{k+1}^{T}=f_{\theta_{k}}(\bar{d})I_{n}$ which concludes the proof.

For fixed $q>0$ , by Lemma 4.1, for the sequence generated by (18) we have $P_{k}\geq\bar{P}_{q}\geq\sigma_{n}(\bar{P}_{q})I_{n}$ , $\forall\,k\geq q+1$ , and by Lemma 4.2 we have

[TABLE]

Therefore, the condition

[TABLE]

or equivalently

[TABLE]

implies (65). In particular, for $\tau=1$ we obtain

[TABLE]

The next Lemma is needed to derive a condition on $c$ which implies condition (70), and thus also condition (65).

Lemma 4.3.

Assuming that $0<\theta<((1-\tau)\|P\|)^{-1}$ , the following facts hold:

1. $\gamma_{\tau}(P,\cdot)$ is monotone increasing over $\mathbb{R}_{+}$ .

2. If $P\geq Q$ then $\gamma_{\tau}(P,\theta)\geq\gamma_{\tau}(Q,\theta)$ .

3. $\gamma_{\tau}(P,\theta)>0$ for any $P\in\bar{\mathcal{Q}}_{+}^{n}$ with $P\neq 0$ .

Proof.

1. The statement has been proved in [27].

2. First, note that

[TABLE]

To prove the statement, we show that the first variation of $\gamma_{\tau}(P,\theta)$ with respect to $P$ in any direction $Q\in\bar{\mathcal{Q}}_{+}^{n}$ is nonnegative:

[TABLE]

where we exploited the fact that $(I-\theta(1-\tau)P)^{\frac{2-\tau}{\tau-1}}$ and $P$ commute.

3. $\gamma_{\tau}(P,\theta)$ is equal to the $\beta$ -divergence between the covariance matrices $(I_{n}-\theta(1-\tau)P)^{\frac{1}{\tau-1}}$ and $I_{n}$ , [23]. Since $(I_{n}-\theta(1-\tau)P)^{\frac{1}{\tau-1}}\neq I_{n}$ , we get $\gamma_{\tau}(P,\theta)>0$ .

We know that $P_{k+1}\geq\bar{P}_{q}$ $\forall\,k\geq q$ , which is equivalent to saying that $P_{k}\geq\bar{P}_{q}$ $\forall\,k\geq q+1$ . Then, by Lemma 4.3, the condition $\gamma_{\tau}(P_{k+1},\theta_{k})=\gamma_{\tau}(\bar{P}_{q},\bar{\theta})$ implies that

[TABLE]

Figure 1 shows this situation. Thus, (65) holds if we choose $c$ in such a way that $\bar{\theta}\leq\frac{1-(1-\sigma_{n}(\bar{P}_{q})\phi_{N})^{1-\tau}}{(1-\tau)\sigma_{n}(\bar{P}_{q})}$ .
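The upper bound on $\bar{\theta}$ is cheap to evaluate once $\sigma_{n}(\bar{P}_{q})$ and $\phi_{N}$ are available. The sketch below is illustrative and rests on assumptions: the branch for $\tau=1$ is obtained here as the limit of the displayed expression as $\tau\rightarrow 1$ , namely $-\log(1-\sigma_{n}(\bar{P}_{q})\phi_{N})/\sigma_{n}(\bar{P}_{q})$ , the inputs are required to satisfy $0<\sigma_{n}(\bar{P}_{q})\phi_{N}<1$ so that the real power is well defined, and the function name is hypothetical.

```python
import math

def theta_bar_max(tau, sigma_n, phi_N):
    """Largest theta_bar allowed by the bound, for tau in [0, 1]."""
    x = sigma_n * phi_N
    if not 0.0 < x < 1.0:
        raise ValueError("this sketch assumes 0 < sigma_n * phi_N < 1")
    if tau == 1.0:
        # Limit of the general expression as tau -> 1 (assumption, see lead-in).
        return -math.log1p(-x) / sigma_n
    return (1.0 - (1.0 - x) ** (1.0 - tau)) / ((1.0 - tau) * sigma_n)
```

For $\tau=0$ the bound reduces to $\bar{\theta}\leq\phi_{N}$ , and for fixed $\sigma_{n}(\bar{P}_{q})$ and $\phi_{N}$ it grows with $\tau$ .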

Theorem 4.1.

Let model (3) be such that $(A,B)$ and $(A,C)$ are reachable and observable, respectively. Let $c$ be such that $0<c\leq c_{MAX}$ with

[TABLE]

Here $N\geq n$ and $q>0$ are fixed. Then, for any $V_{0}\in\mathcal{Q}_{+}^{n}$ , the sequence $P_{k}$ generated by iteration (18) converges to a unique solution $P$ . Moreover, the limit $G$ of the filtering gain $G_{k}$ as $k\rightarrow\infty$ has the property that $A-GC$ is stable.

Proof.

Since

[TABLE]

by Lemma 4.3 we have that (70) holds for $k\geq q$ and therefore $\bar{\Phi}_{N,k}\leq\phi_{N}I_{nN}$ for $k\geq\tilde{q}=\lceil\frac{q}{N}\rceil$ . Accordingly, the mapping $r^{d}_{\tau,c,k}(\cdot)$ is strictly contractive for $k\geq\tilde{q}$ . Since $r^{d}_{\tau,c,k}(\cdot)$ is the $N$ -fold composition of $r_{\tau,c}(\cdot)$ , it follows that the sequence $P_{k}$ generated by (18) converges. By (12) the convergence of $P_{k}$ implies the convergence of $\theta_{k}$ to a unique value $\theta$ . Thus, (11) implies the convergence of $V_{k}$ to a unique solution $V$ . Finally, the stability of $A-GC$ can be proved by applying the Lyapunov stability theory to the algebraic Riccati-like equation

[TABLE]

Finally, it is not difficult to show that the mapping

[TABLE]

is nondecreasing. Thus, $q$ should be chosen sufficiently large in order to obtain a larger $c_{MAX}$ .

5 Example

We consider the constant state space model (3) used in [18],

[TABLE]

The error covariance matrix at time $k=0$ is chosen as $V_{0}=I_{2}$ . We study the convergence of filter (6)-(17) with three different values for $\tau$ : $\tau=0$ , $\tau=0.5$ and $\tau=1$ . Fixing $q=40$ , $N=50$ we found that

[TABLE]

Moreover, the robust Kalman filter (6)-(17) converges with tolerance in the range $[0,c_{MAX}]$ where

[TABLE]

Now, we compare the performance of the following four filters:

• KF: the standard Kalman filter

• RKF0: the robust Kalman filter with $\tau=0$ and $c=1.22\cdot 10^{-1}$

• RKF05: the robust Kalman filter with $\tau=0.5$ and $c=1.01\cdot 10^{-1}$

• RKF1: the robust Kalman filter with $\tau=1$ and $c=8.62\cdot 10^{-2}$

that is, we consider the robust Kalman filter with $\tau=0$ , $\tau=0.5$ and $\tau=1$ , each with the corresponding maximum tolerance for which convergence is guaranteed. In Figure 2 we show the pseudo-nominal variance of the state estimation error of the first component of the state, that is the entry of $P_{k}$ in position (1,1).

In Figure 3 we show the pseudo-nominal variance of the state estimation error of the second component of $x_{k}$ , that is the entry in position (2,2) of $P_{k}$ .

Roughly speaking these quantities represent the error variance computed using the nominal density $f_{k}$ but propagating the previous least favorable density $\tilde{f}_{k-1}$ . The previous figures show that the Riccati-like iteration converges after 20 steps for $\tau=0$ , $\tau=0.5$ and $\tau=1$ . In Figure 4, we show the time-varying risk-sensitivity parameter $\theta_{k}$ which after 20 steps is already constant.

In Figure 5 and Figure 6 we consider the corresponding least-favorable error variance, i.e. the error variance is computed by using the least-favorable density $\tilde{f}_{k}$ and propagating the previous least favorable density $\tilde{f}_{k-1}$ .

It is clear that RKF0, RKF05 and RKF1 are very conservative with respect to the KF, i.e. their error variances are larger than the ones given by KF. This means that, although the upper bound $c_{MAX}$ we found is not tight, the range $[0,c_{MAX}]$ contains a sufficiently large class of robust estimators. In other words, with $c$ close to zero we have robust Kalman filters with performance similar to KF, while with $c$ close to $c_{MAX}$ we have robust Kalman filters very different from KF.

6 Convergence analysis of the $\tau$ -risk sensitive filters

Consider the state-space model (3) and the corresponding nominal joint Gaussian probability density ${f}_{k}(x_{k+1},y_{k}|Y_{k-1})$ . The family of risk sensitive filters [28] parametrized by $\tau\in[0,1]$ is given by

[TABLE]

where $\tilde{f}_{k}$ is Gaussian, $\mathcal{B}_{k,\tau}=\{\tilde{f}_{k}\hbox{ s.t. }\mathcal{D}_{\tau}(\tilde{f}_{k}\|f_{k})<\infty\}$ and $\mathcal{G}_{k}$ is the set of estimators for which the objective function in (85) is finite. $\theta>0$ is the risk sensitivity parameter. The second term in the objective function in (85) is always nonpositive because $\mathcal{D}_{\tau}(\tilde{f}_{k}\|f_{k})\geq 0$ . Therefore, for large values of $\theta$ the maximizer has the possibility to take a probability density far from the nominal one. The $\tau$ -risk sensitive filter (85) thus represents a relaxed version of the robust Kalman filter (6)-(17) where $\theta$ is now constant and fixed by the user. For the case $\tau=0$ we obtain the usual risk-sensitive filter [3]. The resulting estimator obeys the recursion (6)-(8) with

[TABLE]

The study of the asymptotic behavior of the $\tau$ -risk sensitive filter requires considering two different cases: the case $0<\tau<1$ and the case $\tau=1$ .

In the former case, the Riccati-like iteration has the same form as (18) but the image of $\mathcal{Q}_{+}^{n}$ under this mapping is not entirely contained in $\mathcal{Q}_{+}^{n}$ . The reason is that condition $V_{k}>0$ holds only if $P_{k}$ is such that $0<P_{k}<(\theta(1-\tau))^{-1}I_{n}$ and this condition may not be satisfied. Following arguments similar to those used in [18] for the case $\tau=0$ , it is possible to find conditions on $V_{0}$ and $\theta$ for which the trajectory of iteration (18) satisfies $V_{k}>0$ for any $k>0$ . However, these conditions on $V_{0}$ and $\theta$ are rather intricate and require designing a gain matrix and a scaling factor $\rho^{2}$ .

For the case $\tau=1$ , $V_{k}$ is positive definite, and thus well defined, whenever $P_{k}$ is positive definite. Accordingly, the image of $\mathcal{Q}_{+}^{n}$ under the corresponding mapping, denoted by $r_{\tau=1,\theta}(\cdot)$ , is $\mathcal{Q}_{+}^{n}$ . Thus, the convergence of the iteration is guaranteed by only imposing conditions on the risk sensitivity parameter $\theta$ .

Theorem 6.1.

Let model (3) be such that $(A,B)$ and $(A,C)$ are reachable and observable, respectively. Let $\theta$ be such that

[TABLE]

where $N\geq n$ and $q>0$ are fixed. Then, for any $V_{0}\in\mathcal{Q}_{+}^{n}$ , the sequence $P_{k}$ generated by the risk-sensitive filter with $\tau=1$ converges to a unique solution $P$ . Moreover, the limit $G$ of the filtering gain $G_{k}$ as $k\rightarrow\infty$ has the property that $A-GC$ is stable.

Proof.

We consider the downsampled process $x_{k}^{d}$ with $x_{k}^{d}=x_{kN}$ ; the corresponding time-varying Riccati iteration is $P_{k+1}^{d}=r^{d}_{\tau,\theta,k}(P_{k}^{d})$ , where $r^{d}_{\tau,\theta,k}(\cdot)$ has the same structure as (60). Let $\Phi_{k}=P_{k+1}^{-1}-V_{k+1}^{-1}$ . Proposition 3.1 still holds. In particular, there exists $\phi_{N}$ such that if (65) holds then the matrices $\Omega_{\bar{\Phi}_{N,k}}$ and $W_{\bar{\Phi}_{N,k}}$ are positive definite. Accordingly, by Lemma 2.1 the $N$ -fold mapping $r^{d}_{\tau,\theta,k}(\cdot)$ is strictly contractive and thus $r_{\tau=1,\theta}(\cdot)$ is also strictly contractive. Lemma 4.1 and Lemma 4.2 still hold; in particular,

[TABLE]

Finally, by imposing

[TABLE]

which coincides with (89), condition (65) holds. Thus, the sequence $P_{k}$ converges to a unique $P$ as $k\rightarrow\infty$ . The stability of $A-GC$ follows as before.
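The contraction argument invoked through Lemma 2.1 can be illustrated numerically. The sketch below evaluates Thompson's part metric $d_{T}(P,Q)=\max\{\log\sigma_{1}(P^{-1}Q),\log\sigma_{1}(Q^{-1}P)\}$ and applies it to a generic map of the form $h(P)=M(P^{-1}+W_{1})^{-1}M^{T}+W_{2}$ with $W_{1},W_{2}$ positive definite. It is only an illustration under stated assumptions: the matrices $M$, $W_{1}$, $W_{2}$ are arbitrary test data rather than the quantities arising from the $\tau=1$ filter, and all function names are hypothetical.

```python
import numpy as np


def thompson_metric(P, Q):
    """Thompson's part metric between symmetric positive definite P and Q."""
    L = np.linalg.cholesky(P)          # P = L L^T
    X = np.linalg.solve(L, Q)          # L^{-1} Q
    S = np.linalg.solve(L, X.T).T      # L^{-1} Q L^{-T}, same spectrum as P^{-1} Q
    eig = np.linalg.eigvalsh((S + S.T) / 2)
    # max{log lambda_max, -log lambda_min} = max |log lambda_i|
    return float(np.max(np.abs(np.log(eig))))


def h(P, M, W1, W2):
    """Generic map h(P) = M (P^{-1} + W1)^{-1} M^T + W2."""
    H = M @ np.linalg.inv(np.linalg.inv(P) + W1) @ M.T + W2
    return (H + H.T) / 2               # remove round-off asymmetry


def random_spd(rng, n):
    """Random symmetric positive definite test matrix (arbitrary choice)."""
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)


def contraction_ratios(n=3, trials=20, seed=0):
    """Ratios d_T(h(P), h(Q)) / d_T(P, Q) over random pairs (P, Q)."""
    rng = np.random.default_rng(seed)
    M = rng.standard_normal((n, n))
    W1, W2 = np.eye(n), np.eye(n)
    ratios = []
    for _ in range(trials):
        P, Q = random_spd(rng, n), random_spd(rng, n)
        num = thompson_metric(h(P, M, W1, W2), h(Q, M, W1, W2))
        ratios.append(num / thompson_metric(P, Q))
    return ratios


def iterate(P0, M, W1, W2, steps):
    """Apply h repeatedly starting from P0."""
    P = P0
    for _ in range(steps):
        P = h(P, M, W1, W2)
    return P
```

Under these assumptions every sampled ratio should lie below one, and iterating `h` from two different initial conditions should drive their Thompson distance to zero, which is the mechanism behind the uniqueness of the limit $P$.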

It is clear that condition (89) on the risk sensitivity parameter is easy to check. Accordingly, this filter is preferable to the risk-sensitive filter with $0\leq\tau<1$ .

7 Conclusions

A convergence analysis of a family of robust Kalman filters has been presented. This analysis exploited the fact that the $N$ -fold Riccati mapping, which is given by downsampling these filters, is strictly contractive provided that the time-varying risk-sensitive parameter is sufficiently small. This condition is then guaranteed by placing an upper bound on the tolerance parameter of the robust filters. Finally, we have studied the convergence property of a family of risk-sensitive filters which can be understood as a relaxed version of the previous robust Kalman filters.

Bibliography


[1] J. Aubin and I. Ekeland, Applied Nonlinear Analysis, J. Wiley, New York, 1984.
[2] R. Bhatia, On the exponential metric increasing property, Linear Algebra and its Applications, 375 (2003), pp. 211–220.
[3] R. Boel, M. James, and I. Petersen, Robustness and risk-sensitive filtering, IEEE Trans. Automat. Control, 47 (2002), pp. 451–461.
[4] P. Bougerol, Kalman filtering with random coefficients and contractions, SIAM J. Control and Optimiz., 31 (1993), pp. 942–959.
[5] A. Ferrante and B. Levy, Hermitian solutions of the equation $x=q+nx^{-1}n^{*}$, Linear Algebra and its Applications, 247 (1996), pp. 359–373.
[6] A. Ferrante and L. Ntogramatzidis, The generalised discrete algebraic Riccati equation in linear-quadratic optimal control, Automatica, 49 (2013), pp. 471–478.
[7] A. Ferrante and L. Ntogramatzidis, The generalized continuous algebraic Riccati equation and impulse-free continuous-time LQ optimal control, Automatica, 50 (2014), pp. 1176–1180.
[8] S. Gaubert and Z. Qu, The contraction rate in Thompson’s part metric of order-preserving flows on a cone - application to generalized Riccati equations, Journal of Differential Equations, 256 (2014), pp. 2902–2948.


Convergence analysis of a family of robust Kalman filters based on the contraction principle††thanks: This work has been partially supported by the FIRB project “Learning
