Diffusion map-based algorithm for Gain function approximation in the Feedback Particle Filter
Amirhossein Taghvaei, Prashant G. Mehta, Sean P. Meyn

Financial support from the NSF CMMI grants 1334987 and 1462773 is gratefully acknowledged.
Amirhossein Taghvaei and Prashant G. Mehta: Department of Mechanical Science and Engineering, University of Illinois at Urbana-Champaign, Urbana, IL.
Sean P. Meyn: Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL.
Abstract
Feedback particle filter (FPF) is a numerical algorithm to approximate the solution of the nonlinear filtering problem in continuous-time settings. In any numerical implementation of the FPF algorithm, the main challenge is to numerically approximate the so-called gain function. A numerical algorithm for gain function approximation is the subject of this paper. The exact gain function is the solution of a Poisson equation involving a probability-weighted Laplacian. The numerical problem is to approximate this solution using only finitely many particles sampled from the probability distribution. A diffusion map-based algorithm was proposed by the authors in prior work [60, 62] to solve this problem. The algorithm is named as such because it involves, as an intermediate step, a diffusion map approximation of the exact semigroup. The original contribution of this paper is to carry out a rigorous error analysis of the diffusion map-based algorithm. The error is shown to include two components: bias and variance. The bias results from the diffusion map approximation of the exact semigroup. The variance arises because of the finite sample size. Scalings and upper bounds are derived for bias and variance. These bounds are then illustrated with numerical experiments that serve to emphasize the effects of problem dimension and sample size. The proposed algorithm is applied to two filtering examples, and comparisons are provided with the sequential importance resampling (SIR) particle filter.
Keywords: stochastic processes, nonlinear filtering, Poisson equation
AMS subject classifications: 93E11, 65N75, 65N15
1 Introduction
This paper is concerned with the numerical solution of a certain linear partial differential equation (PDE) that arises in the nonlinear filtering problem in continuous-time settings.
Nonlinear filtering problem: The standard model of the nonlinear filtering problem is given by the following stochastic differential equations (SDE) [67]:
[TABLE]
where is the (hidden) state at time , is the observation, and , are two mutually independent standard Wiener processes taking values in and , respectively. The mappings and are known functions, and is the density of the prior probability distribution.
The objective of the filtering problem is to compute the posterior distribution of the state given the time history of observations (filtration) .
The problem is linear Gaussian if , and are linear functions and is a Gaussian density. We use and to denote the matrices that define these linear functions, i.e, and . The background on the linear Gaussian problem, along with its solution given by the Kalman-Bucy filter [35], appears in [40].
Feedback particle filter (FPF) is a numerical algorithm to approximate the posterior distribution in nonlinear non-Gaussian settings [69, 68]. The FPF algorithm is an alternative to the sequential importance resampling (SIR) particle filters [30, 25, 3, 22]. The distinguishing feature of the FPF is that the importance sampling step is replaced with feedback control. Steps such as resampling, reproduction, death or birth of particles are altogether avoided. The particles in FPF have uniform importance weights by construction. Therefore, the FPF does not suffer from the particle degeneracy issue that is commonly observed in implementations of the SIR particle filters [25]. In independent numerical evaluations and comparisons, it has been observed that FPF exhibits smaller simulation variance and better scaling properties with the problem dimension [9, 55, 57].
The construction of FPF is based on the following two steps:
Construct a stochastic process, denoted by , whose conditional distribution (given ) is equal to the conditional distribution of ;
Simulate stochastic processes, denoted by , to empirically approximate the distribution of .
[TABLE]
The process is referred to as the mean-field process and the processes are referred to as particles. The construction ensures that the filter is exact in the mean-field ($N \to \infty$) limit.
The details of the two steps are as follows:
Mean-field process: In the FPF, the mean-field process evolves according to the SDE given by
[TABLE]
where is a standard Wiener process independent of and . The symbol $\circ$ indicates that the SDE is expressed in its Stratonovich form. The gain function is the gradient of the solution of the Poisson equation:
[TABLE]
where and denote the gradient and the divergence operators, respectively, and denotes the conditional density of given . The operator on the left-hand side of the Poisson equation (3) is referred to as the probability-weighted Laplacian. It is denoted as where the probability density is the conditional density .
Particles: The particles evolve according to:
[TABLE]
for , where are mutually independent Wiener processes, , and is the output of an algorithm that approximates the solution to the Poisson equation Eq. 3
[TABLE]
The notation is suggestive of the fact that the algorithm is adapted to the ensemble and the function ; the density is not known in an explicit manner.
Development and error analysis of one such gain function approximation algorithm is the subject of the present paper. Before describing the general case, it is useful to review the filter for the linear Gaussian case where the solution of the Poisson equation is explicitly known.
FPF for Linear Gaussian setting: Suppose and is a Gaussian density with mean and variance . Then the solution of the Poisson equation is known in an explicit form [68, Sec. D]. The resulting gain function is constant and equal to the Kalman gain:
[TABLE]
Therefore, the mean-field process Eq. 2 for the linear Gaussian problem is given by:
[TABLE]
Given the explicit form of the gain function Eq. 6, the empirical approximation of the gain is simply where is the empirical covariance of the particles. Therefore, the evolution of the particles is:
[TABLE]
for , where is the empirical mean of the particles. The empirical quantities are computed as:
[TABLE]
The linear Gaussian FPF Eq. 7 is identical to the square-root form of the ensemble Kalman filter (EnKF) [8, Eq. 3.3].
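As a rough illustration of the linear Gaussian particle update, the following sketch performs one Euler step for a scalar state. It assumes the update has the form $X^i \leftarrow X^i + aX^i\,\Delta t + \sigma_B\,\Delta B^i + \mathsf{K}^{(N)}\big(\Delta Z - \tfrac{1}{2}(hX^i + h\,m^{(N)})\Delta t\big)$ with $\mathsf{K}^{(N)} = \Sigma^{(N)} h$ and a $1/(N-1)$ normalization for the empirical variance. These forms, the function name `linear_fpf_step`, and all parameter names are assumptions of the sketch, since the displayed equations are not reproduced here. Because the gain does not depend on the state in this setting, the Stratonovich and Itô forms coincide and a plain Euler step suffices.

```python
import math
import random


def linear_fpf_step(particles, dz, dt, a, h, sigma_b, rng=None):
    """One Euler step of a scalar linear Gaussian FPF (illustrative sketch).

    Drift a*x, observation function h*x, process noise intensity sigma_b.
    dz is the observation increment over a step of length dt.
    If rng is None the process noise is switched off.
    """
    n = len(particles)
    mean = sum(particles) / n
    # Empirical variance; the 1/(N-1) normalization is an assumption.
    var = sum((x - mean) ** 2 for x in particles) / (n - 1)
    gain = var * h  # empirical Kalman gain for a scalar state
    new_particles = []
    for x in particles:
        noise = rng.gauss(0.0, math.sqrt(dt)) if rng is not None else 0.0
        innovation = dz - 0.5 * (h * x + h * mean) * dt
        new_particles.append(x + a * x * dt + sigma_b * noise + gain * innovation)
    return new_particles


# Deterministic check with the process noise switched off.
updated = linear_fpf_step([0.0, 2.0], dz=0.1, dt=0.1, a=0.0, h=1.0, sigma_b=0.0)

# A noisy step, using a seeded generator for reproducibility.
noisy = linear_fpf_step(updated, dz=0.05, dt=0.1, a=-0.5, h=1.0, sigma_b=0.3,
                        rng=random.Random(0))
```

In the deterministic call the two particles are pulled toward each other, which reflects the reduction in ensemble spread produced by the feedback term.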
One extension of the Kalman gain is the so-called constant gain approximation formula, whereby the gain is approximated by its expected value (which represents the best least-squares approximation of the gain by a constant). Remarkably, the expected value admits a closed-form expression which is then readily approximated empirically using the particles (see Remark 2.3 for the derivation):
[TABLE]
The constant gain approximation formula has been used in nonlinear extensions of the EnKF algorithm [21]. The connection to the Poisson equation provides a justification for this formula. The formula is attractive because it provides a consistent (as the number of particles $N \to \infty$) approximation of the Kalman gain in the linear Gaussian setting.
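A minimal sketch of the constant gain approximation for a scalar observation function is given below. It assumes the empirical form $\frac{1}{N}\sum_{i}(h(X^i) - \hat{h}^{(N)})X^i$, with $\hat{h}^{(N)}$ the particle average of $h$; this form and the function name `constant_gain` are assumptions of the sketch, since the displayed formula is not reproduced here.

```python
def constant_gain(particles, h):
    """Empirical constant-gain approximation for a scalar observation function.

    particles: list of d-dimensional points (each a list of floats).
    h: callable mapping a point to a float.
    Returns a list of length d.
    """
    n = len(particles)
    d = len(particles[0])
    h_vals = [h(x) for x in particles]
    h_hat = sum(h_vals) / n  # particle average of h
    # Ensemble average of (h(X^i) - h_hat) * X^i, one entry per coordinate.
    return [sum((h_vals[i] - h_hat) * particles[i][k] for i in range(n)) / n
            for k in range(d)]


# Example: three scalar particles and a linear observation h(x) = x.
gain = constant_gain([[-1.0], [0.0], [1.0]], lambda x: x[0])
```

For a linear observation function the result is the empirical cross-covariance between the state and the observation function, which is why the formula is consistent with the Kalman gain in the linear Gaussian setting.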
Design and analysis of the gain function approximation algorithm (5) in the general case is a challenging problem for two reasons: (i) apart from the Gaussian case, there are no known closed-form solutions of Eq. 9; (ii) the density is not explicitly known. At each time-step, one only has samples . For the purpose of this paper, these samples are assumed to be drawn i.i.d. from . The assumption is justified because in the limit of large $N$, the particles are approximately i.i.d. (by the propagation of chaos); cf. [58].
1.1 Contributions of this paper
The paper presents a diffusion map-based algorithm for the gain function approximation problem. The algorithm is named as such because it involves, as an intermediate step, a diffusion map approximation of the exact semigroup. The following is a summary of specific original contributions made in this paper:
(i) Error estimates that relate the exact semigroup to its diffusion map approximation. The error estimates are derived by employing a Feynman-Kac representation of the semigroup (Proposition 3.3);
(ii) A uniform spectral gap for the diffusion map based on the use of the Foster-Lyapunov function method from the theory of stochastic stability of Markov processes (Proposition 4.2); and
(iii) Error estimates for the empirical approximation of the diffusion map (Proposition 3.4).
The results from (i) and (ii) are used to derive estimates for the bias and to show that the bias converges to zero in a certain limit (Theorem 4.3). Results from (iii) are used to prove the convergence of the variance error term to zero in the infinite-$N$ limit (Theorem 4.4). The paper contains numerical experiments that serve to illustrate the effects of problem dimension and sample size. The algorithm is applied to two filtering examples, and comparisons are provided with the sequential importance resampling (SIR) particle filter.
1.2 Relationship to prior work
The gain function algorithm first appeared in the conference version of this paper [60]. Its preliminary error analysis was reported in the conference paper [62]. The important distinction is that the results in these conference papers were preliminary in nature. The proofs were either altogether omitted or based on formal arguments. The main techniques employed in this paper, namely, (i) the use of the Feynman-Kac representation to quantify the error due to the diffusion map approximation of the exact semigroup, and (ii) the use of stochastic stability theory to derive a uniform spectral gap for the diffusion map, are original and do not appear in the conference papers. These techniques are needed to obtain the precise estimates enumerated above in the list of contributions. Since the main technical tools are new, all the proofs based on these techniques are new and original contributions of this paper. The diffusion map was introduced in [15], in the context of spectral clustering [6, 65]. Results on its convergence analysis appear in [32, 53, 15, 28, 31, 66, 7]. The use of diffusion map approximations for filtering problems is originally due to the authors.
1.3 Literature survey
Apart from its direct relevance to numerical approximation of the FPF, there are three topics of current research interest that are relevant to the subject of this paper: (i) ensemble Kalman filter; (ii) particle flow algorithms for nonlinear filtering; and (iii) optimal transport. Specifically, the algorithms for gain function approximation described in this paper are also directly applicable to these other topics. These relationships are briefly discussed next:
Ensemble Kalman filter: The EnKF algorithm was first developed in the discrete-time setting [27]. In the continuous-time setting, two formulations of the EnKF have been developed: stochastic EnKF, and the more recent deterministic EnKF [8, 51]. As has already been noted, the deterministic EnKF is in fact identical to the FPF algorithm Eq. 7 in the linear Gaussian setting [8, 59].
The EnKF algorithm provides a consistent approximation in the linear Gaussian setting. Compared to the Kalman filter, the main utility of EnKF is that it does not require propagation of the covariance matrix. This reduces the computational complexity from for the Kalman filter to . This is clearly advantageous in high dimensional problems when . This property has made EnKF popular in applications such as weather prediction in high dimensional settings [36, 47]. The disadvantage of the EnKF algorithm, of course, is that it does not provide a consistent approximation for nonlinear problems.
FPF represents a generalization of the EnKF to the nonlinear non-Gaussian setting [59]: with the constant gain approximation, the algorithms are identical. Given this parallel, the problem of improving the EnKF algorithm in more general nonlinear non-Gaussian settings is directly related to the problem of better approximating the gain function in the FPF. In application software based on EnKF, it is a relatively simple matter to replace the constant gain formula for the gain by the more sophisticated approximations described in this paper. Certain empirical evaluations of the performance of FPF in high-dimensional settings are reported in [57, 55, 54, 9].
Error analysis and stability of EnKF is an active area of research; see [43, 41, 24] for linear models and [21, 23, 37] for nonlinear models. The error analysis for the gain function approximation reported in this paper is a step towards error analysis of the FPF along these lines.
Particle flow algorithms: The following first-order (and hence underdetermined) form of the Poisson equation appears in most types of particle flow algorithms:
[TABLE]
where the right-hand side (rhs) is given and the unknown is a vector field that must be obtained to implement the particle flow. The PDE appears in the first interacting particle representation of continuous-time filtering in [17, 18] and of discrete-time filtering in [19]. Stochastic extensions of these have also recently appeared in [20], where approximate solutions based on a Gaussian assumption on the density are also described. The algorithm described here represents an approximation of a particular gradient-form solution of the first-order PDE.
Optimal transport: The mean-field SDE Eq. 2 represents a transport that maps the prior distribution at the initial time to the posterior distribution at an (arbitrary) future time. Synthesis of optimal transport maps for implementing the Bayes formula appears in [50, 14, 26, 61, 33, 13]. The relationship with the Poisson equation is through the ensemble transform filter, which relies on a linear programming construction to approximate the optimal transport map [14]. As discussed in [59, Sec. 5.5], the solution of the Poisson equation yields an infinitesimal optimal transport map from the “prior” to the “posterior”. Another closely related approach is transportation through Gibbs flow [33].
Directly related to the FPF, the Galerkin method for the numerical solution of the Poisson equation appeared in the original papers [68, 69]. The Galerkin algorithm represents the “direct” PDE approach to constructing a numerical approximation. The constant gain approximation is a particular example of a Galerkin solution. In general, the main problem with the Galerkin approximation is that it requires a selection of basis functions. This becomes intractable in high dimensions. To mitigate this issue, a proper orthogonal decomposition (POD)-based procedure to select basis functions is introduced in [11]. Other existing approaches are a continuation scheme for approximation [44], a probabilistic approach based on dynamic programming [48], and a procedure based on expressing the gain function in a reproducing kernel Hilbert space [49]. A comparison of different gain function approximation methods appears in [10].
1.4 Paper outline
The outline of the remainder of this paper is as follows: The mathematical problem of the gain function approximation together with a summary of known results on this topic appears in Section 2. The diffusion-map based algorithm is described in a self-contained fashion in Section 3. The main theoretical results of this paper including the bias and variance estimates appear in Section 4. Some numerical experiments for the same appear in Section 5. All the proofs appear in the Appendix.
1.5 Notation
For vectors , the dot product is denoted as and . The space of positive definite matrices is denoted as . The Borel $\sigma$-algebra on is denoted by . The indicator function, for a measurable set , is denoted as . The space of measurable functions such that is denoted as . The inner product on is defined by $\langle f,g\rangle := \int f(x)g(x)\rho(x)\,\mathrm{d}x$. The space is the space of functions whose derivative (defined in the weak sense) is in . For a (weakly) differentiable function , . For an integrable function , denotes the mean. and denote the co-dimension $1$ subspace of functions whose mean is zero. denotes the space of bounded functions on with the sup-norm denoted as . The space of continuous and bounded functions on and the space of continuous and smooth functions on are denoted as and respectively. For a linear operator , on a Banach space with norm , the operator norm is denoted as . The Gaussian distribution with mean and covariance is denoted as . The variance of the random variable is denoted as .
2 Gain function approximation
2.1 Problem formulation
The mathematical problem is to numerically approximate the solution of the Poisson equation Eq. 3 introduced in Section 1 and repeated below:
$$-\Delta_\rho \phi = h - \hat{h}, \qquad (9)$$
where the weighted Laplacian is $\Delta_\rho \phi := \frac{1}{\rho}\nabla\cdot(\rho\nabla\phi)$; $\rho$ is an everywhere positive probability density on $\mathbb{R}^d$; $h$ is a real-valued function defined on $\mathbb{R}^d$ and $\hat{h} := \int h(x)\rho(x)\,\mathrm{d}x$. The function $\phi$ is referred to as the solution. Its gradient is referred to as the gain function and denoted as $\mathsf{K} := \nabla\phi$. The PDE Eq. 9 is referred to as the Poisson equation.
The numerical approximation problem is as follows:
Problem statement: Given samples , drawn i.i.d. from , approximate the gains , where . The density is not known in an explicit form.
2.2 Mathematical preliminaries
Assumptions: The following assumptions are made throughout the paper:
Assumption A1: The probability density is of the form where the function for some , , and ;
Assumption A2: The function is (weakly) differentiable with .
Remark 2.1.
Assumption A1 is used to prove the approximation result (Proposition 3.3) and to derive the spectral gap (Proposition 4.2) for the diffusion map approximation first introduced in Section 3. In prior literature, a similar assumption has been used for studying functional inequalities to obtain a Poincaré inequality with a constant that does not depend on the dimension [64, Ch. 8]. Assumption A1 is restrictive, e.g., a mixture of Gaussians does not satisfy the assumption. Based on numerical experiments, it is conjectured that Assumption A1 can be relaxed. A weaker assumption would be to assume , the convolution of a Gaussian density with a density that has a compact support. Proving the theoretical results under this weaker assumption is the subject of future work.
2.2.1 Spectral representation
Under Assumption (A1), the weighted Laplacian has a discrete spectrum with an ordered sequence of eigenvalues and associated eigenfunctions that form a complete orthonormal basis of [5, Cor. 4.10.9]. The trivial eigenfunction , and for , the spectral representation yields:
[TABLE]
The positivity of the smallest non-trivial eigenvalue is referred to as the Poincaré inequality (or the spectral gap condition) [4]. The inequality is equivalently expressed as
[TABLE]
where .
The Poincaré inequality is important to show that the Poisson equation is well-posed and a unique solution exists. The solution to the Poisson equation is defined using the weak formulation.
2.2.2 Weak formulation
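A function $\phi \in H_0^1(\rho)$ is said to be a weak solution of Eq. 9 if
$$\int \nabla\phi(x)\cdot\nabla\psi(x)\,\rho(x)\,\mathrm{d}x = \int (h(x)-\hat{h})\,\psi(x)\,\rho(x)\,\mathrm{d}x \quad \text{for all } \psi \in H^1(\rho). \qquad (11)$$
Equation Eq. 11 is referred to as the weak form of the Poisson equation. The weak form is expressed succinctly as $\langle\nabla\phi,\nabla\psi\rangle = \langle h-\hat{h},\psi\rangle$, where $\langle\cdot,\cdot\rangle$ is the inner product in $L^2(\rho)$. The existence and uniqueness of the solution to the weak form of the Poisson equation is stated in the following proposition.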
Proposition 2.2. [42, Thm. 2.2]
Suppose $\rho$ satisfies Assumption (A1) and $h$ satisfies Assumption (A2). Then there exists a unique function $\phi$ that satisfies the weak form of the Poisson equation Eq. 11. The solution satisfies the bound:
[TABLE]
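Remark 2.3 (Constant gain approximation).
The weak formulation Eq. 11 has led to the Galerkin algorithm presented in the original FPF papers [68]. A special case of the Galerkin solution is the constant gain approximation formula Eq. 8. The formula is obtained upon choosing the test functions in Eq. 11 to be the coordinate functions: $\psi_k(x) = x_k$ for $k = 1,\ldots,d$. Then,
$$\int \frac{\partial\phi}{\partial x_k}(x)\,\rho(x)\,\mathrm{d}x = \langle\nabla\phi,\nabla\psi_k\rangle = \langle h-\hat{h},\psi_k\rangle = \int (h(x)-\hat{h})\,x_k\,\rho(x)\,\mathrm{d}x,$$
which yields the formula Eq. 8.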
The diffusion map-based algorithm presented in this paper is based on the semigroup formulation of the Poisson equation.
2.2.3 Semigroup
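Let $\{P_t\}_{t\ge 0}$ be the semigroup associated with the weighted Laplacian $\Delta_\rho$. The semigroup allows for a probabilistic interpretation which is described next. Consider the following reversible Markov process $\{S_t\}_{t\ge 0}$ evolving in $\mathbb{R}^d$:
$$\mathrm{d}S_t = -\nabla U(S_t)\,\mathrm{d}t + \sqrt{2}\,\mathrm{d}B_t,$$
where $U := -\log\rho$ and $\{B_t\}_{t\ge 0}$ is a standard Wiener process in $\mathbb{R}^d$. Then
$$P_t f(x) = \mathsf{E}[f(S_t)\mid S_0 = x].$$
It is straightforward to verify that $P_t$ is symmetric, i.e., $\langle P_t f, g\rangle = \langle f, P_t g\rangle$ for all $f, g \in L^2(\rho)$, and $\rho$ is its invariant density. The semigroup also admits a kernel representation:
$$P_t f(x) = \int \bar{k}_t(x,y)\,f(y)\,\rho(y)\,\mathrm{d}y,$$
where $\bar{k}_t(x,y) := \sum_{m} e^{-t\lambda_m} e_m(x)e_m(y)$, the sum running over all eigenpairs $(\lambda_m, e_m)$ of $-\Delta_\rho$ (including the trivial one).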
The spectral gap implies that . Hence, is a strict contraction on . For the special case of Gaussian density, the eigenfunctions are given by the Hermite polynomials. This leads to an explicit formula for the kernel in the Gaussian case, as described in Appendix A.
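The probabilistic interpretation of the semigroup can be checked numerically in a simple case. The sketch below estimates $\mathsf{E}[f(S_t)\mid S_0 = x]$ by Euler-Maruyama simulation of a scalar diffusion with drift $\nabla\log\rho$ and noise $\sqrt{2}\,\mathrm{d}B_t$, which is the form implied by the generator being the weighted Laplacian. For the standard Gaussian density and $f(x) = x$ the exact value is $e^{-t}x$. The function name `semigroup_mc`, the step counts, and the seed are assumptions of the sketch.

```python
import math
import random


def semigroup_mc(f, grad_log_rho, x0, t, n_paths=2000, n_steps=100, seed=0):
    """Monte Carlo estimate of E[f(S_t) | S_0 = x0] for the scalar diffusion
    dS = grad_log_rho(S) dt + sqrt(2) dB, discretized by Euler-Maruyama."""
    rng = random.Random(seed)
    dt = t / n_steps
    total = 0.0
    for _ in range(n_paths):
        s = x0
        for _ in range(n_steps):
            s += grad_log_rho(s) * dt + math.sqrt(2.0 * dt) * rng.gauss(0.0, 1.0)
        total += f(s)
    return total / n_paths


# Standard Gaussian density: grad log rho(x) = -x, so the conditional mean
# of the state decays as exp(-t) times the initial condition.
estimate = semigroup_mc(lambda s: s, lambda s: -s, x0=2.0, t=1.0)
exact = 2.0 * math.exp(-1.0)
```

The estimate carries both a discretization error (from the Euler step) and a sampling error (from the finite number of paths), so agreement with the exact value is only approximate.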
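Consider the heat equation
$$\frac{\partial u}{\partial t}(t,x) = \Delta_\rho u(t,x) + h(x) - \hat{h}, \qquad u(0,x) = f(x).$$
Its solution is given in terms of the semigroup $P_t$ generated by $\Delta_\rho$ as follows:
$$u(t,x) = P_t f(x) + \int_0^t P_{t-s}(h-\hat{h})(x)\,\mathrm{d}s.$$
Letting $f = \phi$, where $\phi$ solves the Poisson equation Eq. 9, makes the solution stationary, $u(t,\cdot) = \phi$, because $\Delta_\rho\phi + h - \hat{h} = 0$. This yields the following fixed-point equation for $t = \epsilon > 0$:
$$\phi = P_\epsilon\phi + \int_0^\epsilon P_s(h-\hat{h})\,\mathrm{d}s. \qquad (12)$$
Equation Eq. 12 is referred to as the semigroup form of the Poisson equation Eq. 9.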
The following Proposition shows that the weak form Eq. 11 and the semigroup form Eq. 12 are equivalent. The proof appears in the Appendix B.
Proposition 2.4.
Suppose $\rho$ satisfies Assumption (A1) and $h$ satisfies Assumption (A2). Then the unique solution to the weak form Eq. 11 is also the unique solution to the fixed-point equation Eq. 12.
The semigroup formulation has led to the diffusion-map based algorithm which is the main focus of the remainder of this paper.
3 Diffusion map-based Algorithm
The diffusion map-based algorithm is based on a numerical approximation of the fixed-point equation Eq. 12. The main technique is to approximate the semigroup in the following three steps:
Diffusion map approximation: A family of Markov operators are defined as follows:
[TABLE]
where is the normalization factor,
[TABLE]
and is the Gaussian kernel in . For small positive values of , the Markov operator is referred to as the diffusion map approximation of the exact semigroup [15, 32]. The precise statement of this approximation is contained in Proposition 3.3. For the special case of Gaussian density, an explicit formula for the diffusion map appears in Appendix A.
Empirical approximation: The operator is approximated empirically by defined as follows:
[TABLE]
where is the normalization factor and
[TABLE]
Recall that for . So, by the law of large numbers (LLN), represents an empirical approximation of the diffusion map . The precise statement of the empirical approximation is contained in Proposition 3.4.
Approximation as Markov matrix: An Markov matrix is defined with -th element given by
[TABLE]
Finite-dimensional fixed-point equation: Using the three steps above, the original infinite-dimensional fixed-point equation Eq. 12 is approximated as a finite dimensional fixed-point equation
[TABLE]
where is a column vector, and where the probability vector is the unique stationary distribution of the Markov matrix . The solution is used to define an approximation to the solution of the Poisson equation as follows:
[TABLE]
The approximation for the gain function is as follows:
[TABLE]
Upon evaluating the gradient in closed-form, the following linear formula results for the gain function evaluated at particle locations:
[TABLE]
where
[TABLE]
The details of the calculation leading to the linear formula appear in the Appendix C.
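The steps of this section can be sketched end to end as follows. This is an illustrative reading of the construction rather than the authors' implementation: the kernel scaling $\exp(-|x-y|^2/(4\epsilon))$, the order of the two normalizations, the forms $\Phi \leftarrow \mathsf{T}\Phi + \epsilon(h - \hat{h})$ and $s_{ij} = \frac{1}{2\epsilon}\mathsf{T}_{ij}(r_j - \sum_k \mathsf{T}_{ik}r_k)$ with $r = \Phi + \epsilon h$, and the function name `diffusion_map_gain` are all assumptions, since the displayed equations are not reproduced here. The constant prefactor of the Gaussian kernel is omitted because it cancels in the normalization.

```python
import math
from statistics import NormalDist


def diffusion_map_gain(particles, h_vals, eps, n_iter=200):
    """Sketch of a diffusion-map gain approximation for a scalar function h.

    particles: list of d-dimensional points; h_vals: h at each particle.
    Returns (gains, markov_matrix, stationary), gains[i] a list of length d.
    """
    n, d = len(particles), len(particles[0])
    # Unnormalized Gaussian kernel (the 4*eps scaling is an assumption).
    g = [[math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / (4.0 * eps))
          for y in particles] for x in particles]
    row = [sum(r) for r in g]
    # Symmetric rescaling followed by row normalization.
    k = [[g[i][j] / math.sqrt(row[i] * row[j]) for j in range(n)] for i in range(n)]
    deg = [sum(r) for r in k]
    T = [[k[i][j] / deg[i] for j in range(n)] for i in range(n)]  # Markov matrix
    total = sum(deg)
    pi = [v / total for v in deg]                                 # stationary distribution
    h_hat = sum(p * hv for p, hv in zip(pi, h_vals))
    # Fixed-point iteration Phi <- T Phi + eps * (h - h_hat), started from zero.
    phi = [0.0] * n
    for _ in range(n_iter):
        phi = [sum(T[i][j] * phi[j] for j in range(n)) + eps * (h_vals[i] - h_hat)
               for i in range(n)]
    # Gain at each particle from the gradient of the kernel-weighted average.
    r = [phi[i] + eps * h_vals[i] for i in range(n)]
    gains = []
    for i in range(n):
        tr_i = sum(T[i][j] * r[j] for j in range(n))
        s_i = [T[i][j] * (r[j] - tr_i) / (2.0 * eps) for j in range(n)]
        gains.append([sum(s_i[j] * particles[j][c] for j in range(n))
                      for c in range(d)])
    return gains, T, pi


# Deterministic stand-in for particles: quantiles of a standard Gaussian, h(x) = x.
N = 100
xs = [[NormalDist().inv_cdf((i + 0.5) / N)] for i in range(N)]
gains, T, pi = diffusion_map_gain(xs, [x[0] for x in xs], eps=0.25, n_iter=100)
mean_gain = sum(gn[0] for gn in gains) / N
```

For this Gaussian example with a linear function the exact gain equals the variance (here one), so `mean_gain` should be of order one; the residual gap reflects the bias from the positive value of `eps` and boundary effects at the extreme particles.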
Remark 3.1 (Numerical procedure).
The fixed-point problem (16) is solved in an iterative manner. The vector is initialized to and updated according to
[TABLE]
for for a finite number of iterations. The procedure is guaranteed to converge, with a geometric convergence rate, because is a strict contraction on (Proposition 4.1-(ii)). The overall algorithm is presented in Algorithm 1.
The proposed iterative procedure (21) is preferred to other numerical procedures because (i) it is straightforward to implement and does not require matrix inversion; (ii) it may be numerically more efficient than solving a system of linear equations; and (iii) it allows one to use the solution obtained from the previous filter step as initialization for the iterative procedure (21), resulting in quick convergence, typically in a few iterations. The reason for quick convergence is that the change in the solution of the fixed point equation (16) is (typically) small from one filtering step to the next. This is because the change in particle locations is (typically) small for a small choice of time increment.
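The benefit of warm-starting the iteration can be illustrated on a toy problem. The sketch below iterates $\Phi \leftarrow \mathsf{T}\Phi + \epsilon(h - \hat{h})$ on a hypothetical 3-state symmetric Markov matrix, which stands in for the matrix assembled from particles, and compares the iteration counts from a zero initialization and from the solution of a nearby problem. The matrix, the values of `h`, the stopping rule, and the function names are assumptions chosen for illustration.

```python
def fixed_point_iterate(T, forcing, phi0, tol=1e-10, max_iter=10000):
    """Iterate phi <- T phi + forcing until successive iterates differ by
    less than tol in the max norm. Returns (phi, number_of_iterations)."""
    n = len(phi0)
    phi = list(phi0)
    for it in range(1, max_iter + 1):
        new = [sum(T[i][j] * phi[j] for j in range(n)) + forcing[i]
               for i in range(n)]
        change = max(abs(a - b) for a, b in zip(new, phi))
        phi = new
        if change < tol:
            return phi, it
    return phi, max_iter


# Hypothetical reversible Markov matrix (symmetric, so the stationary
# distribution is uniform).
T = [[0.5, 0.3, 0.2],
     [0.3, 0.4, 0.3],
     [0.2, 0.3, 0.5]]
eps = 0.1


def forcing(h):
    h_hat = sum(h) / len(h)  # mean under the uniform stationary distribution
    return [eps * (v - h_hat) for v in h]


h_old = [1.0, 2.0, 4.0]
h_new = [1.01 * v for v in h_old]  # small change between two "filter steps"

phi_old, _ = fixed_point_iterate(T, forcing(h_old), [0.0, 0.0, 0.0])
phi_cold, n_cold = fixed_point_iterate(T, forcing(h_new), [0.0, 0.0, 0.0])
phi_warm, n_warm = fixed_point_iterate(T, forcing(h_new), phi_old)
```

Both runs reach the same solution, but the warm start begins closer to it and so meets the stopping rule in fewer iterations; the size of the saving depends on the contraction rate and on how much the problem changes between steps.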
Remark 3.2.
The computational complexity of the diffusion-map based algorithm is because of the need to assemble the matrix . The computational complexity may be reduced using the sparsity structure of the matrix and sub-sampling techniques. Compared to the Galerkin algorithm with computational complexity of , the diffusion-map algorithm is advantageous in high-dimensional problems where .
3.1 Approximation results
The notation is used to denote the heat semigroup with a Gaussian kernel , and
[TABLE]
The proof of the following proposition appears in Appendix E.
Proposition 3.3.
Consider the family of Markov operators defined according to Eq. 13. Let , with , and . Then,
The semigroup and the operator admit the following representations:
[TABLE]
for all where is the Brownian motion with initial condition .
In the asymptotic limit as :
[TABLE]
where and as .
For all functions such that :
[TABLE]
where the constant only depends on and .
The proof of the following proposition appears in Appendix H.
Proposition 3.4.
Consider the diffusion map kernel , and its empirical approximation . Then for any bounded continuous function :
(Almost sure convergence) For all
[TABLE]
(Convergence rate) For any , in the asymptotic limit as ,
[TABLE]
with probability higher than .
Remark 3.5 (Related work).
The key idea in the proof of Proposition 3.3 is the Feynman-Kac representation of the semigroup Eq. 23. To the best of our knowledge, this representation has not been used before in the analysis of the diffusion map approximation. Most of the existing results concerning the convergence of the diffusion map are based on a Taylor series expansion that would lead to a convergence of the form for each [32, 15, 28]. Convergence results of the form appear in [15, 63], based on functional analytic arguments. The Taylor series type arguments typically require the distribution to be supported on a compact manifold, which is not assumed here.
4 Convergence and error analysis
The analysis of the diffusion-map algorithm involves the consideration of the following four fixed point problems:
[TABLE]
where and is the density of the invariant probability distribution associated with the Markov operator .
In practice, the finite-dimensional problem Eq. 31 is solved. The existence and uniqueness of the solution for this problem is the subject of the following proposition whose proof appears in Appendix D.
Proposition 4.1.
Consider the finite-dimensional fixed point equation Eq. 31.
Then almost surely:
The matrix $\mathsf{T}$ is a reversible Markov matrix with a unique stationary distribution
[TABLE]
for .
The matrix $\mathsf{T}$ is a strict contraction on . Hence the fixed point equation Eq. 31 has a unique solution .
The (empirical approx.) fixed point equation Eq. 30 has a unique solution given by (see Eq. 17)
[TABLE]
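The reversibility and stationarity claims of Proposition 4.1 can be checked numerically on a small example. The sketch below assumes the Markov matrix is obtained from a Gaussian kernel by a symmetric rescaling followed by row normalization, and that the stationary distribution is proportional to the row sums of the rescaled kernel; these forms, the kernel scaling, the particle values, and the function name `markov_matrix` are assumptions of the sketch.

```python
import math


def markov_matrix(points, eps):
    """Row-normalized, symmetrically rescaled Gaussian kernel matrix for
    scalar points, with the candidate stationary distribution."""
    n = len(points)
    g = [[math.exp(-(x - y) ** 2 / (4.0 * eps)) for y in points] for x in points]
    row = [sum(r) for r in g]
    k = [[g[i][j] / math.sqrt(row[i] * row[j]) for j in range(n)] for i in range(n)]
    deg = [sum(r) for r in k]
    T = [[k[i][j] / deg[i] for j in range(n)] for i in range(n)]
    total = sum(deg)
    pi = [v / total for v in deg]
    return T, pi


points = [-1.3, -0.4, 0.1, 0.9, 2.0]  # hypothetical scalar particles
T, pi = markov_matrix(points, eps=0.5)
n = len(points)

# Detailed balance: pi_i T_ij == pi_j T_ji for all pairs (reversibility).
balance_gap = max(abs(pi[i] * T[i][j] - pi[j] * T[j][i])
                  for i in range(n) for j in range(n))
# Stationarity: (pi T)_j == pi_j for every j.
stationarity_gap = max(abs(sum(pi[i] * T[i][j] for i in range(n)) - pi[j])
                       for j in range(n))
```

Detailed balance holds here because the rescaled kernel is symmetric, so the product of a stationary weight and a transition probability is symmetric in the two indices; stationarity then follows by summing over one index.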
Based on the results in Proposition 2.4 and Proposition 4.1, the exact solution and the numerical solution are both well-defined. The remaining task is to show the convergence of as and . We break the convergence analysis into two parts, bias and variance:
[TABLE]
Before describing the general result, it is useful to first introduce an example that helps illustrate the bias-variance trade-off in this problem.
4.1 Example - the scalar case
In the scalar case (where ), the Poisson equation is:
[TABLE]
Integrating twice yields the solution explicitly
[TABLE]
For the choice of as the sum of two Gaussians and with and , the solution obtained using Eq. 33 is depicted in Fig. 1 (a). Also depicted is the approximate solution obtained using the diffusion-map algorithm with , for different values of . The constant gain approximation is evaluated according to the explicit integral formula (8). As the approximate gain converges to the constant gain approximation. As becomes smaller, the approximation becomes more accurate. However, for very small values of the approximation is poor due to the variance error.
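The explicit scalar solution can be evaluated by numerical quadrature. The sketch below assumes the sign convention $-\frac{1}{\rho}(\rho\phi')' = h - \hat{h}$, under which one integration gives the gain $\mathsf{K}(x) = \phi'(x) = -\frac{1}{\rho(x)}\int_{-\infty}^{x}(h(z) - \hat{h})\rho(z)\,\mathrm{d}z$. The function names, the truncation of the integration range, and the parameters of the bimodal density (means $\pm 1$, variance $0.2$, equal weights) are assumptions chosen for illustration.

```python
import math


def exact_gain_1d(rho, h, x, lower=-12.0, n_grid=4000):
    """Gain K(x) = -(1/rho(x)) * integral_{-inf}^{x} (h(z) - h_hat) rho(z) dz
    for a scalar density, via the trapezoidal rule on a truncated range."""
    def trapz(f, a, b, n):
        step = (b - a) / n
        inner = sum(f(a + i * step) for i in range(1, n))
        return step * (0.5 * f(a) + inner + 0.5 * f(b))

    upper = -lower
    h_hat = trapz(lambda z: h(z) * rho(z), lower, upper, n_grid)
    partial = trapz(lambda z: (h(z) - h_hat) * rho(z), lower, x, n_grid)
    return -partial / rho(x)


def gaussian_pdf(z, mean, var):
    return math.exp(-(z - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


# Sanity check: for a standard Gaussian and h(x) = x the exact gain is 1.
k_gauss = exact_gain_1d(lambda z: gaussian_pdf(z, 0.0, 1.0), lambda z: z, 0.7)


# Bimodal illustration; the means and the variance are assumed values.
def bimodal(z):
    return 0.5 * gaussian_pdf(z, -1.0, 0.2) + 0.5 * gaussian_pdf(z, 1.0, 0.2)


k_bimodal = exact_gain_1d(bimodal, lambda z: z, 0.0)
```

In the bimodal case the gain is large between the two modes, where the density is small, which is the feature a constant gain cannot capture.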
The bias-variance trade-off while varying the parameter is depicted in Fig. 1 (b). The error is computed as a Monte-Carlo average:
[TABLE]
Fig. 1 (b) depicts the error obtained from averaging over simulations as a function of the parameter . It is observed that for a fixed number of particles , there is an optimal value of that minimizes the error.
The vector counterpart of this example appears in Section 5.1.
4.2 Bias
The analysis of bias has two parts:
To show that the (diffusion-map) fixed-point equation Eq. 29 admits a unique solution for all positive choices of ;
To show that as .
For , iterate the fixed-point equation Eq. 29 times to obtain:
[TABLE]
We let for some and study the solution of this fixed-point equation as . Note that the solution to the iterated fixed-point equation (35) is identical to the solution to the fixed-point equation Eq. 29.
The fixed-point equation Eq. 35 is the (discrete) Poisson equation that appears in the theory of Markov chain simulation [29, 46] and stochastic control [45, Ch. 9]. Theory presented in these references illustrates how bounds on the solution are obtained under a Foster-Lyapunov drift condition. A similar strategy is adopted here.
In the following proposition, an existence-uniqueness result is described for the fixed-point equation Eq. 35. The technical step in the proof involves a Foster-Lyapunov condition known as DV(3) [39]. The proof appears in Appendix F.
Proposition 4.2.
Consider the family of Markov operators defined in Eq. 13. Let , , and , with . Then there exist positive constants , , , , a probability measure , and a number such that for all :
[TABLE]
Consequently,
(i) The chain with transition kernel is geometrically ergodic with invariant density
[TABLE]
(ii) is reversible with respect to the density . It admits a spectral gap as a linear operator that is uniform with respect to . The spectral gap is denoted as .
(iii) There exists a solution to Eq. 35 with the bound
[TABLE]
The proof of the following main result appears in Appendix G.
Theorem 4.3.
Suppose the assumptions (A1)-(A2) hold for the density and the function , and denotes the exact solution of Eq. 28. Consider the approximation of this problem defined by the (diffusion-map) fixed-point equation Eq. 29. For the approximate problem:
(i) Existence-Uniqueness: For each fixed , there exists a unique solution .
(ii) Convergence: In the asymptotic limit as
[TABLE]
4.3 Variance
The analysis of the variance concerns the (empirical) fixed-point equation Eq. 30 whose solution is denoted as . The parameter is assumed to be positive and fixed and is assumed to be finite but large.
The existence-uniqueness of has already been shown as part of Proposition 4.1. Convergence is shown below only for the case where the density has compact support.
Assumption A3: The distribution has compact support given by .
Theorem 4.4.
Suppose the assumptions (A2)-(A3) hold for the density and the function , and denotes the solution of the (kernel) fixed-point equation Eq. 29 for a fixed positive parameter . Consider the approximation of this problem defined by the (empirical) fixed-point equation Eq. 30. For the approximate problem:
(i) Existence-Uniqueness: For each finite , there exists (almost surely) a unique solution .
(ii) Convergence: The approximate solution converges to the kernel solution
[TABLE]
The proof of the convergence is based on classical results in the numerical analysis of integral equations on a grid [1, 2]. It relies on the verification of the following three conditions:
(i) The family of operators is collectively compact as linear operators on .
(ii) For any function ,
[TABLE]
(iii) The inverse exists and is bounded on .
Once these three conditions have been verified, the convergence result Eq. 39 follows from a standard result in the approximation theory of the numerical solutions of integral equations [34, Thm. 7.6.6]. The proof appears in Appendix I.
Remark 4.5 (Convergence rate).
The result in Theorem 4.4 establishes asymptotic convergence of the variance error to zero. However, it does not provide an explicit form for the convergence rate. It is possible to obtain an explicit form based upon a convergence rate estimate for the uniform convergence (40). The latter is difficult because the existing result in [28] holds only under rather strong regularity conditions on and assumes that the distribution is uniform.
Based upon the approximation result Proposition 3.4, suppose a convergence rate holds for (40) with order . In this case, it is straightforward to derive the following explicit form of the convergence rate for the variance:
[TABLE]
The validity and tightness of this bound are studied using numerical experiments in Section 5.
Remark 4.6.
(Unbounded domain) Analysis of the variance error for the case where the support of is unbounded has proved to be difficult. In the unbounded case, it is more appropriate to consider and as linear operators on . Following the same approach as used in the proof of Theorem 4.4, one would need to verify the three conditions noted above. However, for the unbounded case, we could not verify the condition (i) that is collectively compact on . An alternative approach is to follow the spectral method as outlined in [38]. In this approach, one examines the convergence of the empirical matrix where is a given symmetric kernel. However, this approach does not directly apply to the analysis of the empirical operator . This is because the form of the kernel , as it is used in the definition of , is not explicitly given. It too must be empirically approximated as a ratio whose convergence analysis has proved to be rather challenging.
4.4 Relationship to the constant gain approximation
Although the convergence and error analysis pertains to the limit, an important property of the diffusion-map approximation is that the numerical procedure yields a unique solution for arbitrary values of (see Proposition 4.1). In fact, more can be said: one recovers the constant gain approximation formula in the limit.
Before stating the result, it is useful to recall the three formulae for the gain:
(i) Exact formula: is defined using the exact solution ;
(ii) Kernel formula: is defined using the solution to the (diffusion-map) approximation fixed-point equation:
[TABLE]
(iii) Empirical formula: is the empirical version of the kernel formula. It was defined in Eq. 18 using the solution of the finite-dimensional fixed-point problem.
The proof of the following proposition appears in Appendix J.
Proposition 4.7.
Consider the fixed-point problems Eq. 29 and Eq. 30 in the limit as .
(i) The kernel formula of the gain is given by
[TABLE]
(ii) For any finite , the empirical formula of the gain is given by
[TABLE]
This result serves to highlight the connection between the FPF and the EnKF: With the diffusion map approximation of the gain, the FPF approaches EnKF in the limit of large . The parameter can then be regarded as the tuning parameter to “improve” the gain. Of course, for any finite value of , this can only be done up to a point – where variance becomes dominant (see Fig. 1).
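The large-bandwidth limit in Proposition 4.7 can be checked numerically. The sketch below assumes the usual form of the diffusion-map gain approximation: a Gaussian kernel $\exp(-|x_i-x_j|^2/(4\epsilon))$, a density-normalized Markov matrix $T$, the fixed-point iteration $\Phi \leftarrow T\Phi + \epsilon(h-\hat{h})$, and a gain that is linear in the particle positions. Since Algorithm 1 is not reproduced in this section, the normalization details, the iteration count, and the names `dm_gain` and `constant_gain` are assumptions made for illustration.

```python
import numpy as np

def dm_gain(X, hX, eps, n_iter=500):
    """Diffusion-map gain at particles X (N, d) for observation values hX (N,)."""
    N = X.shape[0]
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    g = np.exp(-sq / (4.0 * eps))                   # Gaussian kernel
    k = g / np.sqrt(np.outer(g.sum(1), g.sum(1)))   # density-normalized kernel
    d = k.sum(1)
    T = k / d[:, None]                              # Markov matrix (rows sum to 1)
    pi = d / d.sum()                                # stationary distribution of T
    h_c = hX - pi @ hX                              # centered observation function
    Phi = np.zeros(N)
    for _ in range(n_iter):                         # fixed-point iteration
        Phi = T @ Phi + eps * h_c
        Phi -= pi @ Phi                             # keep zero mean under pi
    r = Phi + eps * hX
    S = T * (r[None, :] - (T @ r)[:, None]) / (2.0 * eps)
    return S @ X                                    # gain at each particle, shape (N, d)

def constant_gain(X, hX):
    """Constant gain approximation: empirical covariance between h and x."""
    return ((hX - hX.mean())[:, None] * X).mean(axis=0)

# Illustrative data: scalar Gaussian particles with h(x) = x.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 1))
K_large_eps = dm_gain(X, X[:, 0], eps=1e8)
K_const = constant_gain(X, X[:, 0])
```

With a very large bandwidth every row of `K_large_eps` agrees with `K_const` up to a small tolerance, while moderate values of `eps` produce a gain that varies from particle to particle.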
5 Numerics
5.1 Example - the vector case
A vector generalization of the scalar example in Section 4.1 is obtained by considering the following form of the probability density function in -dimensions:
[TABLE]
where is the bimodal distribution introduced in Section 4.1, and is the Gaussian distribution . Also suppose the function . This simple example is illustrative of realistic application scenarios where the density has non-Gaussian features along a certain (not necessarily a priori known) low-dimensional subspace. The directions orthogonal to this subspace are modelled here as Gaussian noise.
For this problem, the exact gain function is easily obtained as
[TABLE]
where the function is given by the formula Eq. 33 in Section 4.1. The exact solution is used to compute error properties as dimension increases.
The diffusion-map algorithm (Algorithm 1) is simulated to approximate the gain function for this problem. The number of iterations in Algorithm 1 is set to . For each particle , the first coordinate and the other coordinates for . The constant gain approximation is evaluated according to the explicit integral formula (8).
Fig. 2 depicts the m.s.e Eq. 34 computed from running simulations. A summary of these results is as follows:
1. Fig. 2-(a) depicts the error as a function of the parameters and for a fixed number of particles . Also depicted is the error with the constant gain approximation. The constant gain error serves here as a baseline.
For large values of , the bias error is dominant, and as the error asymptotes to the error for the constant-gain approximation. This is because (see Proposition 4.7) the diffusion map gain approaches the constant gain as . For small values of , the variance error dominates. According to Remark 4.5, the upper bound for the m.s.e. is expected to be of the order . However, the numerical error in Fig. 2-(a) is observed to be . Therefore, the upper bound in Remark 4.5 is not tight for this specific problem.
2. Fig. 2-(b) depicts the bias-variance trade-off as a function of the number of particles for the fixed . As expected, the error decreases, for all choices of , as the number of particles increases. However, the optimal value of , at which the error is the smallest, is relatively insensitive to changes in .
3. Fig. 2-(c) depicts the error as a function of for different values of . The dimension is fixed. The error goes down as and asymptotes to the bias. The is due to the variance error obtained in Proposition 3.4, and the bias error is consistent with the conclusion of Theorem 4.3.
4. Fig. 2-(d) depicts the run time comparison between the diffusion-map algorithm and the constant gain algorithm. The scaling for the diffusion-map algorithm is which is significantly more expensive than the scaling of the constant gain approximation.
Remark 5.1 (Selection of ).
The numerical results in Fig. 2 suggest that there is an optimal value of such that the error is smallest. Given the fact that the constant gain approximation results in the limit as , an optimal choice of may be possible more generally. At the optimal value, one optimally trades off the errors due to variance and bias. The difficulty, of course, is that a formula for this optimal choice is not known and may not even exist in general settings. Instead, in the literature involving kernel methods, a popular heuristic is to set where (med) is the median value of all pairwise distances [12]. The justification is that, with such a choice, the matrix is not close to the identity matrix (which represents the degenerate case).
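The median that enters this heuristic is straightforward to compute from the particles. The sketch below only evaluates the median of the pairwise Euclidean distances; the rule that maps this median to the bandwidth is not reproduced here, so the final scaling is left to the reader. The function name `median_pairwise_distance` is a hypothetical helper.

```python
import numpy as np

def median_pairwise_distance(X):
    """Median of the Euclidean distances between all distinct pairs of particles."""
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=-1))
    upper = np.triu_indices(X.shape[0], k=1)   # each unordered pair counted once
    return float(np.median(dist[upper]))

# Illustrative particles on the real line.
med = median_pairwise_distance([[0.0], [1.0], [3.0]])
```

The computation forms all pairwise distances, so its cost and memory grow quadratically with the number of particles; this matches the cost of building the kernel matrix itself.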
Remark 5.2.
It is worthwhile to also examine the limit as while is fixed at a finite value. In this limit, the Markov matrix converges to the identity matrix. As a result, the solution to the fixed-point problem (31) is unbounded. However, in practice, the value of is large but finite, because the equation (31) is solved in an iterative manner with a finite number of iterations. With a finite value of and equal to the identity, the gain function given by the formula (19) is zero. Consequently, the feedback correction for each particle is zero.
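The degenerate small-bandwidth behaviour can be reproduced in a few lines. The sketch below assumes the same Gaussian-kernel construction of the Markov matrix and the same linear form of the gain as in the usual diffusion-map approximation; the particle locations, bandwidth, iteration count, and function names are illustrative assumptions. With well-separated particles and a tiny bandwidth, the Markov matrix is numerically the identity, the iterate grows linearly with the number of iterations, and the resulting gain vanishes.

```python
import numpy as np

def markov_matrix(X, eps):
    """Density-normalized Gaussian-kernel Markov matrix on the particles X (N, d)."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    g = np.exp(-sq / (4.0 * eps))
    k = g / np.sqrt(np.outer(g.sum(1), g.sum(1)))
    return k / k.sum(1, keepdims=True)

def gain_after_iterations(X, hX, eps, n_iter):
    """Run a fixed number of fixed-point iterations and form the linear gain."""
    T = markov_matrix(X, eps)
    h_c = hX - hX.mean()          # uniform weights: valid when T is the identity
    Phi = np.zeros(X.shape[0])
    for _ in range(n_iter):
        Phi = T @ Phi + eps * h_c
    r = Phi + eps * hX
    S = T * (r[None, :] - (T @ r)[:, None]) / (2.0 * eps)
    return T, Phi, S @ X

# Equally spaced particles, bandwidth far below the squared particle spacing.
X = np.linspace(-2.0, 2.0, 20)[:, None]
T, Phi, K = gain_after_iterations(X, X[:, 0], eps=1e-4, n_iter=100)
```

Here `Phi` equals the iteration count times `eps * (h - h_hat)`, which illustrates why the exact fixed point is unbounded in this limit even though each particle receives no correction.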
5.2 Filtering example
Consider the following filtering problem:
[TABLE]
where , , , and is standard Brownian motion, independent of . The prior distribution is Gaussian and the observation function . For the static filtering problem, the posterior distribution is explicitly given by:
[TABLE]
Three filtering algorithms are implemented for this problem: (i) the FPF algorithm with the diffusion-map gain approximation; (ii) the FPF algorithm with the constant gain approximation (similar to EnKF); (iii) a sequential importance resampling (SIR) particle filter [25]. The simulation parameters are as follows: The measurement noise . The simulation is carried out for time-steps with step-size . All three algorithms use particles with identical initialization. For the diffusion-map approximation, the kernel bandwidth was set to , and the number of iterations in Algorithm 1 is set to .
The numerical results are depicted in Figure 3. The distribution of the particles along with the exact posterior distribution are depicted in Figure 3-(a). It is observed that the FPF algorithm with the diffusion map approximation provides a more accurate approximation of the posterior distribution. In contrast, the constant-gain approximation fails to reproduce the bimodal nature of the posterior distribution.
A quantitative estimate of the performance is provided in terms of the mean squared error (m.s.e.) in estimating the conditional expectation of the function . A Monte Carlo estimate of the m.s.e. is depicted in Figure 3-(b) with runs. At time , it is calculated according to
[TABLE]
At time , the empirical distribution of the particles is an accurate approximation of the prior distribution, because the particles are sampled i.i.d. from the prior distribution. Therefore, the m.s.e. at is small. As time progresses, the difference between the empirical distribution and the exact posterior becomes larger because the filter update is not exact. For the FPF, as the time-step is small, the main source of the m.s.e. is the error in the gain function approximation. Therefore, the diffusion map FPF, with its more accurate approximation of the gain, yields a better m.s.e. than the EnKF using the constant gain approximation. The SIR particle filter, like the FPF with the diffusion map approximation, is able to capture the bimodal distribution. However, due to the stochastic noise introduced by the resampling step, it admits a larger error.
5.3 Benes filter
Consider the following filtering problem:
[TABLE]
where are one-dimensional stochastic processes, and are one-dimensional, independent, Brownian motions, is a known initial condition, and the constants . This filtering problem has a finite-dimensional analytical solution given by a mixture of two Gaussians [3]:
[TABLE]
where
[TABLE]
The three filtering algorithms, as in the previous example, are also implemented and evaluated for this problem. The simulation parameters are chosen according to the values used in [16]: , , , , . The simulations are carried out over the time horizon . The stochastic integrals are approximated with a first-order Euler scheme using the discretization step-size . For the FPF with the DM gain approximation, the kernel bandwidth is selected according to the rule described in Remark 5.1 and the number of iterations in Algorithm 1 is .
The numerical results are depicted in Fig. 4. It is observed that the FPF with the DM and constant gain approximations admit almost the same accuracy. The reason is that the exact bimodal posterior distribution quickly converges to an almost unimodal distribution. This is because the weight of one of the mixture modes converges to zero. The accuracy of the SIR particle filter is poor because of the stochastic noise introduced by the resampling step.
6 Conclusions and Directions for Future Work
In this paper, the diffusion map (DM) algorithm was presented for the problem of gain function approximation in the FPF. It was shown that the approximation error converges to zero in the limit as the number of particles and the kernel bandwidth parameter (Theorems 4.3 and 4.4). In the limit as , the gain obtained using the DM algorithm was shown to converge to the constant gain approximation (Proposition 4.7). Consequently, in this limit, the FPF using the DM algorithm reduces to an EnKF. This is an important property because it suggests a path to improve the performance of an EnKF algorithm by choosing an appropriate (finite) value of the parameter . The bounds, scalings and the numerical experiments described in this paper provide guidance on how to choose the parameter for large but finite . Some directions for future work are as follows:
1. Relaxing the assumptions: The analysis is based on Assumption A1, which is restrictive because it does not include the mixture of Gaussians. Relaxing this assumption, possibly as suggested in Remark 2.1, is one possible avenue of future work.
2. Error analysis for the FPF: The error analysis in this paper concerns primarily the convergence of the function to the exact solution . Extending these results to include the convergence analysis of the gain to the exact gain is important for the complete error analysis of the FPF with finitely many particles.
Appendix A Exact semigroup and its diffusion map approximation for the Gaussian case
In this section, we provide explicit formulae for the exact semigroup and its diffusion map approximation , for the special case when the density is a Gaussian . For the Gaussian case, the semigroup is the Ornstein-Uhlenbeck semigroup [5, Sec. 2.7.1] and its spectral representation is obtained in terms of the Hermite polynomials. For notational ease, after an appropriate change of coordinates, we assume and where are ordered eigenvalues of .
Definition A.1.
The Hermite polynomials are recursively defined as
[TABLE]
where the prime ′ denotes the derivative.
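A derivative-based recursion of this kind is easy to evaluate symbolically on polynomial coefficients. The sketch below assumes the probabilists' convention, $h_0 = 1$ and $h_{n+1}(x) = x\,h_n(x) - h_n'(x)$, which is the natural family for the Ornstein-Uhlenbeck semigroup; the normalization used in the definition above may differ, and the helper name `hermite_by_recursion` is hypothetical. The result is compared against NumPy's `HermiteE` basis.

```python
import numpy as np
from numpy.polynomial import Polynomial
from numpy.polynomial.hermite_e import herme2poly

def hermite_by_recursion(n):
    """n-th Hermite polynomial from h_0 = 1, h_{k+1}(x) = x h_k(x) - h_k'(x)."""
    x = Polynomial([0.0, 1.0])
    p = Polynomial([1.0])
    for _ in range(n):
        p = x * p - p.deriv()
    return p

# Coefficients in increasing degree; h_3 is x^3 - 3x under this convention.
h3 = hermite_by_recursion(3)
h3_reference = herme2poly([0, 0, 0, 1])
```

Under this convention the polynomials are orthogonal with respect to the standard Gaussian density, which is why they diagonalize the Ornstein-Uhlenbeck semigroup.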
Proposition A.2.
Suppose the density is Gaussian with the variance and . Then
(i) The exact semigroup and the diffusion map admit the following integral representations:
[TABLE]
where for .
(ii) The operators and each have a unique invariant Gaussian density given by and , respectively, where with for .
(iii) The eigenvalues and the associated eigenfunctions are as follows:
[TABLE]
for .
(iv) The operator norm and .
Proof A.3.
Omitted. See [62, Prop. 1].
Appendix B Proof of Proposition 2.4
Based on the use of the spectral representation Eq. 10, the weak solution of the Poisson equation is readily seen to be
[TABLE]
This solution Eq. 44 also satisfies the fixed-point equation Eq. 12 because
[TABLE]
The uniqueness of the solution to the fixed-point equation Eq. 12 follows from the contraction mapping principle because .
Appendix C Derivation of the linear form of the gain Eq. 19
By a direct calculation,
[TABLE]
which evaluated at yields
[TABLE]
Using the definitions Eq. 18 for , and Eq. 20 for and ,
[TABLE]
Appendix D Proof of Proposition 4.1
is a Markov matrix because a.s. and
[TABLE]
The stationary distribution is because
[TABLE]
All entries of the Markov matrix are positive. Hence the Markov chain is irreducible and aperiodic. Therefore, the stationary distribution is unique. It is reversible because
[TABLE]
Denote . Then a.s. Therefore, , and is thus a contraction on ([56, Ch. 5]). It follows, from the contraction mapping principle, that the fixed-point equation Eq. 16 has a unique solution.
Evaluating the definition Eq. 17 at concludes because,
[TABLE]
Therefore solves the fixed-point equation Eq. 30, because
[TABLE]
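The structural properties used in this proof (row-stochasticity, positivity, stationarity, reversibility, and contraction on zero-mean vectors) can be verified numerically on a random particle set. The sketch below assumes the standard density-normalized Gaussian-kernel construction of the Markov matrix, since its definition is not restated in this appendix; the data, the bandwidth, and the function names are illustrative assumptions.

```python
import numpy as np

def dm_markov(X, eps):
    """Markov matrix T and its stationary distribution pi for particles X (N, d)."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    g = np.exp(-sq / (4.0 * eps))
    k = g / np.sqrt(np.outer(g.sum(1), g.sum(1)))   # symmetric, positive entries
    d = k.sum(1)
    return k / d[:, None], d / d.sum()

def second_eigenvalue(T, pi):
    """Second-largest eigenvalue of T, computed from its symmetrized form.

    T is similar to D^{1/2} T D^{-1/2} with D = diag(pi), which is symmetric by
    reversibility. Its eigenvalues are nonnegative here because the Gaussian
    kernel matrix is positive semidefinite.
    """
    s = np.sqrt(pi)
    sym = (s[:, None] * T) / s[None, :]
    lam = np.linalg.eigvalsh(0.5 * (sym + sym.T))   # ascending order
    return lam[-2]

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 2))
T, pi = dm_markov(X, eps=1.0)
lam2 = second_eigenvalue(T, pi)
```

A value of `lam2` strictly below one is the finite-dimensional counterpart of the contraction property, and it controls how fast the fixed-point iteration converges.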
Appendix E Proof of the Proposition 3.3
Proof E.1.
Let and as defined in Eqs. 22a and 22b. To obtain the representation Eq. 23 for the semigroup , consider the unitary transformation [5, Sec. 1.15.7]:
[TABLE]
Therefore, for any function ,
[TABLE]
where the stochastic representation (second equality) follows from the Feynman-Kac formula; is a Brownian motion initialized at . Setting ,
[TABLE]
which is the representation Eq. 23.
Next, the representation Eq. 24 is obtained. Using the definitions, (13) of and Eqs. 22a and 22b of and ,
[TABLE]
where the final equality follows from using the stochastic representation of the heat semigroup . The representation Eq. 24 is obtained by iterating this formula times.
Without loss of generality, upon a change of coordinates, assume and in Assumption A1. Using the definitions
[TABLE]
Now, . So, the main calculation is to approximate . Using the definition
[TABLE]
where , and is the semigroup associated with the PDE .
The Taylor expansion of , about , is expressed as
[TABLE]
where , and denotes the Hessian matrix of .
Using the property that , and the assumption (A1) that , we conclude that . Therefore,
[TABLE]
The asymptotic expansion of , as , is obtained as
[TABLE]
where the remainder term has at most linear growth as .
Substituting the asymptotic expression for in Eq. 46,
[TABLE]
where the remainder error term has at most quadratic growth as . This concludes the proof of approximation Eq. 25a.
Based on the above calculation, the following estimate for an upper bound of the function is obtained (it is used in the proof of Proposition 4.2):
[TABLE]
where recall .
Next, the approximation Eq. 25b is derived. Using the definition
[TABLE]
By repeating the steps, just used to approximate , it is shown
[TABLE]
where
[TABLE]
Therefore,
[TABLE]
where the error term has at most quadratic growth as . This concludes the proof of the approximation Eq. 25b.
Based on the above calculation, the following estimate for a lower bound of the function is obtained (it is used in the proof of Proposition 4.2):
[TABLE]
where , and (where recall ).
Let denote the semigroup for the weighted Laplacian with the density . We break the error into two parts:
[TABLE]
The bounds for the two terms on the right-hand side are derived in the following two steps:
Step 1. Using the stochastic representation Eq. 23-Eq. 24,
[TABLE]
where . By the Cauchy-Schwarz inequality
[TABLE]
Next we obtain a bound for . Upon using the inequality ,
[TABLE]
where . Now, is finite because, as , (Assumption A1) and (by (25b)). As a result, by triangle inequality, .
The expectation of the first term is bounded as follows:
[TABLE]
where the second inequality follows from the bound for some constants (see Eq. 25b).
The expectation of the second term in Eq. 49 is bounded as follows:
[TABLE]
where the Taylor expansion of is used to obtain the first inequality, and for the second inequality, Assumption (A1) is used to bound and .
Putting together the two expectation bounds ,
[TABLE]
where is a constant that only depends on . Upon taking the norm
[TABLE]
Step 2. Because and are semigroups with generators and , respectively, we have the identity: . Upon taking the norm of both sides, using the triangle inequality, because is a contraction on ,
[TABLE]
Now,
[TABLE]
where the identity is used in the first step, the Cauchy-Schwarz inequality in the second step, and the bounds and in the third step.
Combining the two sets of bounds in steps 1 and 2, one obtains Eq. 26.
Appendix F Proof of the Proposition 4.2
Proof F.1.
(i) The Lyapunov condition Eq. 36a, known as DV(3) of [39], is the necessary and sufficient condition for geometric ergodicity (and in fact the stronger -uniform ergodicity) [46, Thm. 15.0.1]. The distribution is invariant because ,
[TABLE]
(ii) The invariant density is reversible because
[TABLE]
The spectral gap follows from Lyapunov condition Eq. 36a and the fact that the chain is reversible [52, Thm 2.1]. The spectral gap is denoted as .
(iii) The solution satisfies the bound:
[TABLE]
It remains to verify the Lyapunov condition Eq. 36a: Using Eq. 24
[TABLE]
where the second inequality follows from using the lower bound derived in Eq. 48.
We now claim that
[TABLE]
for where and are defined using the recursions:
[TABLE]
Assuming for now that the claim is true
[TABLE]
An upper-bound for and a lower-bound for are obtained as follows:
For the sequence ,
[TABLE]
For the sequence ,
[TABLE]
Therefore,
[TABLE]
It then follows
[TABLE]
Upon using the two bounds
[TABLE]
where the second inequality follows from using the upper bound derived in Eq. 47. The following estimates are obtained for constants
[TABLE]
It remains to prove the claim Eq. 50. The constants and for are easily verified by direct evaluation and for ,
[TABLE]
The minorization inequality Eq. 36b is obtained next. For :
[TABLE]
where
[TABLE]
because .
Appendix G Proof of the Theorem 4.3
Proof G.1.
The existence of the solution is proved in Proposition 4.2.
We break the error into two parts:
[TABLE]
where is the solution to the fixed point equation with the exact semigroup . The bounds for the two terms on the right-hand side are derived in the following two steps:
Step 1. Iterating the formula for times yields,
[TABLE]
and subtracting this from Eq. 35 gives
[TABLE]
This forms a (discrete) Poisson equation whose solution exists and is bounded according to Proposition 4.2:
[TABLE]
where we used in the second step. This is true because using the formula Eq. 25a.
It remains to bound the three terms inside the bracket in Eq. 51:
[TABLE]
by using the error estimates of Proposition 3.3-(iii). Therefore,
[TABLE]
Step 2. Both and are solutions with the exact semigroup . Using the spectral representation (10),
[TABLE]
Therefore,
[TABLE]
and thus .
Combining the estimates from steps 1 and 2,
[TABLE]
Appendix H Proof of the Proposition 3.4
Proof H.1.
Denote and express:
[TABLE]
where
[TABLE]
To prove the part-(i) of the Proposition 3.4, the strategy is to show that as the stochastic terms converge to zero almost surely. We do this in two steps below, in step 1, and in step 2.
Step 1: Convergence of and follows from direct application of the strong law of large numbers (SLLN). The SLLN applies because the summands for and are independent and identically distributed (i.i.d.) and moreover have finite variance:
[TABLE]
where we used .
Step 2: In order to show the almost sure convergence of and to zero, we first show that in the limit as ,
[TABLE]
with probability larger than for any arbitrary choice of . Assuming for now that the claim is true, it then follows
[TABLE]
with probability larger than . The term inside the bracket converges almost surely to its limit , by SLLN, because
[TABLE]
The proof that is completed by an application of the Borel-Cantelli lemma. Indeed, choose a sequence given by . Then where . Because , then . The proof of is identical.
It remains to prove the claim Eq. 54, which can be established using the Bernstein inequality as follows. We have for any :
[TABLE]
The random variables are i.i.d, bounded by , and the variance
[TABLE]
Therefore by Bernstein inequality,
[TABLE]
with probability higher than . The result is obtained by union bound for and .
Collecting the estimates Eqs. 52, 53, and 55 and application of the Bernstein inequality yields:
[TABLE]
with probability larger than . Therefore one obtains the bound:
[TABLE]
with probability larger than . Upon squaring and integrating both sides with respect to proves the rate:
[TABLE]
Appendix I Proof of the Theorem 4.4
In the proof of Theorem 4.4, the function space of interest is , the Banach space of continuous bounded functions on (a compact set) equipped with the norm. Also, define the space , as subspace of functions in with zero mean. Consider and as linear operators from to .
Part (i) has already been proved as part of Proposition 4.1. The proof of part (ii) relies on the verification of the following three conditions:
(i) The family of operators is collectively compact, as linear operators on .
(ii) For any function ,
[TABLE]
(iii) The operator is a bounded operator on .
Once these three conditions have been verified, the convergence result Eq. 39 follows from a standard result in the approximation theory of the numerical solutions of integral equations [34, Thm. 7.6.6].
The proof of the three conditions is as follows:
Condition (i) holds if the set is relatively compact. Relative compactness follows from an application of the Arzela-Ascoli theorem. In order to apply Arzela-Ascoli theorem, we need to show that is uniformly bounded and equicontinuous. The two conditions hold because
[TABLE]
for all and such that . The detailed calculation to obtain the second inequality appears at the end of the proof.
Fix a function . From Proposition 3.4-(i), we know that converges to almost surely pointwise for all . Because is compact and is equicontinuous, pointwise convergence implies uniform convergence Eq. 56.
From parts (i) and (ii) above, it can be concluded that is a compact operator. Therefore, using the Fredholm alternative theorem, in order to show is bounded, it is enough to show that is injective. The injectivity property is shown by contradiction. Suppose there exists a function such that . Let be a point that achieves the maximum of the function . Such a point exists because is continuous and is compact. Evaluating at yields
[TABLE]
Because and , this implies for all . Therefore, the function is a constant. But the only constant function in is zero. Hence is injective and its inverse is bounded.
It remains to prove the equicontinuity inequality Eq. 57 which is done next:
[TABLE]
where the last inequality is obtained as follows
[TABLE]
where is the diameter of .
Appendix J Proof of Proposition 4.7
Consider first the finite- case. In the asymptotic limit as , we have . Therefore,
[TABLE]
and
[TABLE]
It is also easy to see, e.g., by using a Neumann series solution, that in the asymptotic limit as , the solution of the fixed-point equation Eq. 31 is given by
[TABLE]
Therefore,
[TABLE]
and using the gain approximation formula Eq. 19,
[TABLE]
The calculations for the kernel formula are entirely analogous. In the asymptotic limit as ,
[TABLE]
and, using to denote the coordinate function and to denote function multiplication, the gain approximation formula Eq. 41 evaluates to
[TABLE]
References
- [1] P. M. Anselone, Collectively compact operator approximation theory and applications to integral equations, Prentice Hall, 1971.
- [2] K. Atkinson, A survey of numerical methods for the solution of Fredholm integral equations of the second kind, Soc. for Industrial and Applied Mathematics, Philadelphia, PA, 1976, https://cds.cern.ch/record/107092.
- [3] A. Bain and D. Crisan, Fundamentals of stochastic filtering, vol. 3, Springer, 2009, https://doi.org/10.1007/978-0-387-76896-0.
- [4] D. Bakry, F. Barthe, P. Cattiaux, and A. Guillin, A simple proof of the Poincaré inequality for a large class of probability measures including the log-concave case, Electron. Commun. Probab., 13 (2008), pp. 60–66, https://doi.org/10.1214/ECP.v13-1352.
- [5] D. Bakry, I. Gentil, and M. Ledoux, Analysis and geometry of Markov diffusion operators, vol. 348, Springer Science & Business Media, 2013, https://doi.org/10.1007/978-3-319-00227-9_3.
- [6] M. Belkin, Problems of learning on manifolds, PhD thesis, The University of Chicago, 2003. AAI3097083.
- [7] M. Belkin and P. Niyogi, Convergence of Laplacian eigenmaps, in Advances in Neural Information Processing Systems, 2007, pp. 129–136, https://doi.org/10.7551/mitpress/7503.003.0021.
- [8] K. Bergemann and S. Reich, An ensemble Kalman-Bucy filter for continuous data assimilation, Meteorologische Zeitschrift, 21 (2012), pp. 213–219, https://doi.org/10.1127/0941-2948/2012/0307.
