A numerical approach to Kolmogorov equation in high dimension based on Gaussian analysis
Franco Flandoli, Dejun Luo, Cristiano Ricci

TL;DR
This paper introduces a novel numerical method for solving high-dimensional Kolmogorov equations by leveraging Gaussian analysis, providing an alternative to Monte Carlo methods with proven convergence.
Contribution
The paper develops a Gaussian-based series expansion approach for high-dimensional Kolmogorov equations, inspired by SPDE structures, with convergence proof and numerical validation.
Findings
Convergent series expansion for solutions in high dimensions.
Numerical tests demonstrate efficiency over traditional methods.
Applicable to infinite-dimensional stochastic systems.
Abstract
For Kolmogorov equations associated to finite dimensional stochastic differential equations (SDEs) in high dimension, a numerical method alternative to Monte Carlo simulations is proposed. The structure of the SDE is inspired by stochastic Partial Differential Equations (SPDE) and thus contains an underlying Gaussian process which is the key of the algorithm. A series development of the solution in terms of iterated integrals of the Gaussian process is given, it is proved to converge - also in the infinite dimensional limit - and it is numerically tested in a number of examples.
| Parameter | Value | Description |
|---|---|---|
| dimension of the problem | ||
| parameter of the nonlinearity , Polynomial case | ||
| values where the solution is computed | ||
| noise | ||
| final time of computation for | ||
| threshold for the initial condition |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
A numerical approach to Kolmogorov equation in high dimension based on Gaussian analysis
Franco Flandoli111Email: [email protected]. Scuola Normale Superiore, Piazza dei Cavalieri, 7, 56126 Pisa, Italy. Dejun Luo222Email: [email protected]. RCSDS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China, and School of Mathematical Sciences, University of the Chinese Academy of Sciences, Beijing 100049, China. Cristiano Ricci333Email: [email protected]. University of Florence, Italy.
Abstract
For Kolmogorov equations associated to finite dimensional stochastic differential equations (SDEs) in high dimension, a numerical method alternative to Monte Carlo simulations is proposed. The structure of the SDE is inspired by stochastic Partial Differential Equations (SPDE) and thus contains an underlying Gaussian process which is the key of the algorithm. A series development of the solution in terms of iterated integrals of the Gaussian process is given, it is proved to converge - also in the infinite dimensional limit - and it is numerically tested in a number of examples.
Keywords: Kolmogorov equation, numerical solution, iteration scheme, Gaussian process
1 Introduction
Kolmogorov equations are parabolic equations with a structure directly related to stochastic differential equations (SDEs). The SDEs considered here are in a finite dimensional space but they are inspired by the spatial discretization of stochastic Partial Differential Equations (SPDE). When the noise is additive and the nonlinearity is time-independent, a general form of such SDEs is
[TABLE]
where , is a Brownian motion in (namely where the ’s are independent real valued Brownian motions), defined on a probability space with a filtration , is a positive real number measuring the strength of the noise, is a positive definite symmetric matrix (the so called covariance matrix of the noise) describing the spatial structure of the noise and is its square root, is a matrix and is a function with the degree of regularity specified below. Obviously we could include the scalar inside the matrix but for certain practical arguments it is useful to distinguish between them. The solution is a continuous adapted process in . The associated Kolmogorov equation is
[TABLE]
where , and denote respectively the vector of first partial derivatives and the matrix of second partial derivatives, is the trace of the matrix and denotes the scalar product in . Both for the SDE and the Kolmogorov equation we have used notations which may be adapted to the infinite dimensional case, when is replaced by a Hilbert space (see Section 2 for the general theory); however, the aim of this work is numerical and all objects in the introduction will belong to . The link between the Kolmogorov equation and the SDE is
[TABLE]
where denotes the mathematical expectation on and is the solution of the SDE above, where the initial condition is explicitly indicated. Several elements of theory both in finite and infinite dimensions for SDEs and associated Kolmogorov equations can be found in many books, like [6, 8, 9, 16, 17].
Solving the Kolmogorov equation with suitable initial condition is a way to compute relevant expected values and probabilities associated to the solution of an SDE. For instance, when , is the probability that the solution exceeds a threshold :
[TABLE]
The classical method of computing these expected values is the Monte Carlo method (with important variants, see for instance [13, 19]): several realizations of the process are simulated by solving the SDE – typically by Euler method – and then the corresponding values of are averaged. Going beyond this strategy is a fundamental issue, due to its limitations in relevant applications like Geophysics and Climate change projections [15], especially concerning extreme events. The question is whether Kolmogorov equation can be efficiently solved numerically without using the simulation of the SDE. But the problem is that the dimension is extremely high in these examples and common numerical methods for solution of parabolic equations already require strong computational power when , [5, 18]. A grid of points in , repeated for all dimensions, give rise to grid points, numerically impossible when, for instance, (which still would be an extremely poor approximation). Spectral methods seem to meet the same restrictions: is the cardinality of basis elements obtained by tensorization of basis elements for each space variable.
The problem of dimensionality, the limitations of present methodologies and several motivations are recalled in two recent works [2, 14] which also aim to go beyond Monte Carlo and propose a method based on deep artificial neural networks. We address to these brilliant works for other comments on the problem, see also [7, Introduction]. The approach developed here is however completely different.
Our aim is to take advantage of the probabilistic structure of the problem to devise numerical schemes for the Kolmogorov equation, in particular using Gaussian analysis. We implement a perturbative scheme which links the solution of Kolmogorov equation to a Gaussian process, the solution of the linear stochastic equation
[TABLE]
The idea comes from the theoretical investigations of infinite dimensional Kolmogorov equations associated to SPDEs, see for instance [9, 10]. We modify and adapt that idea giving an explicit formula in terms of a series of Gaussian integrals. We provide here a first glance at the strategy by writing the final formula:
[TABLE]
where
[TABLE]
and for
[TABLE]
The matrix will be defined in the next sections, see (2.3); it is easily computed by and , and it depends on the parameters and . A theoretical analysis of this series is made, proving the following result.
Theorem 1.1**.**
Assume that and are bounded. Then, under suitable conditions on and (see Hypothesis 2.1 for details), we have the following uniform estimate:
[TABLE]
where is the Gamma function, is a constant and the parameter in (iv) of Hypothesis 2.1.
This theorem sustains the numerical method and stresses the independence on the dimension of certain issues of the method (obviously others, like getting a sample of , have a cost which increases with ). When is replaced by a Hilbert space (and below we shall formulate the theorem with assumptions in a Hilbert space) it contains also some theoretical novelties with respect to the literature, especially because it provides an explicit formula.
The numerical evaluation of the terms is made here, in this paper, by Monte Carlo method based on a sample of the process obtained by solving the linear SDE by Euler method. These are the most obvious choices, but other possibilities exist, since is a centered Gaussian process with known covariance function. A main strategy invoked here is to store once for ever a large and accurate sample of (this requires the pair to be given) and use it later in the formula for different values of the other parameters, and even .
This new method is aimed to replace direct Monte Carlo simulations. We should therefore accurately compare them. If the purpose is to make one single computation, classical Monte Carlo wins: the Gaussian method above still requires Monte Carlo simulations of the linear problem, which is less expensive than the nonlinear one but then one has to compute possibly several terms ; some experiments clearly show that classical Monte Carlo is less expensive for a comparable degree of precision. The advantage comes when we want to vary parameters, since the Gaussian method for given allows to store a possibly expensive sample of the process and reuse it for several values of the parameters, just having to compute the averages over the Gaussian sample which give us the terms . On the contrary, classical Monte Carlo method requires to repeat the simulation of the nonlinear problem for each new value of the parameters. By “parameters”, as we have already mentioned above, we mean . Let us comment on the interest in changing them.
The interest in changing is obvious. In certain applications it is necessary to change the initial condition and compare or collect the results. We have in mind for instance the ensemble methods used in weather prediction where the initial condition is uncertain, a first guess is made on the basis of physical observations, but then the initial condition is perturbed in various directions and the final results averaged by suitable methods. See also [2, 14], where the need to change is stressed.
Changing the strength of the noise is a very important issue, related also to Large Deviation Theory. We have to advise that the precision of our simulations degenerates as , or the number of iterates needed to maintain a reasonable precision blows-up, but at least one can detect some tendency by moving in a finite range without arriving to too small values.
Concerning the change of function , unfortunately the main comment is in favor of Monte Carlo: having at disposal a sample of the process immediately gives a way to compute for different functions . Hence the best we can say on this issue is that our formula allows for such computations with a moderate additional effort – but not with an improvement over Monte Carlo.
Finally, changing the nonlinearity is of theoretical interest for the investigation of the performances of the method, and in applications it may be of interest in those – very common – cases when some parameters of are not precisely known and different simulations may be useful for comparison or for ensemble averaging methods performed over the range of those parameters.
Let us finally come to a brief description of numerical results. In Section 3, we present some numerical results based on the method proposed here in the finite dimensional settings with . The results, even if not fully satisfactory yet, should be compared with the fact that the innovative attempts to solve the Kolmogorov equation in by direct methods, see [7], are often restricted to dimensions smaller than . Large dimension is therefore a very difficult problem that deserves strong effort for improvement, and some of our results – although not in all examples – are quite promising.
As a final comment, let us explicitly mention that the class of Kolmogorov equations studied here is particular, because of the additive and very non-degenerate noise and because we have treated only relatively mild nonlinearities. We have not considered relevant cases from fluid mechanics which have more severe nonlinearities and activation of more scales; after a few initial tests on dyadic models – we point in particular to the recent models on trees which may be very relevant for turbulence theory, see [1, 3, 4] – it was clear that covering these examples with this approach requires further research and improvements. Extension to multiplicative transport noises [11, 12] is another challenging open question.
2 The iteration scheme for Kolmogorov equations on Hilbert spaces
In this section we work in an infinite dimensional separable Hilbert space and study the iteration scheme for the Kolmogorov equation:
[TABLE]
Here is an unbounded linear operator, is a nonnegative self-adjoint bounded linear operator on , is a nonlinear measurable mapping and is a real valued measurable function. In this section plays the role of to simplify notation. In the following we write for the Banach space of bounded linear operators on with the norm .
Throughout this section we assume the following conditions:
Hypothesis 2.1**.**
- (i)
* is the infinitesimal generator of a strongly continuous semigroup .*
- (ii)
* is a nonnegative self-adjoint operator in satisfying , and for any the linear operator*
[TABLE]
is of trace class.
- (iii)
We have for any .
- (iv)
Letting , we assume there exist and such that
[TABLE]
The assumptions (i)–(iii) are quite standard in the literature, see for instance [8, Hypothesis 2.1 and 2.24]. The operator appeared in the introduction has the form
[TABLE]
we remark that, in the setting of the introduction, the operator in (2.2) should be replaced by when computing . The following example is taken from [8, Example 2.5] which verifies all the assumptions.
Example 2.2**.**
Let with . We choose , and
[TABLE]
where is the Laplacian operator with Dirichlet boundary condition. is a self-adjoint negative operator in , and
[TABLE]
where for , and
[TABLE]
Choose , so that
[TABLE]
For any , if , then
[TABLE]
So (ii) is satisfied.
Next, (iii) can be checked by explicit computations. Moreover,
[TABLE]
From this we deduce that
[TABLE]
where
[TABLE]
Thus (iv) holds with .
We also need the following technical conditions.
Hypothesis 2.3**.**
The initial datum and the nonlinear part in (2.1) are bounded and measurable.
This section is organized as follows. In Subsection 2.1, we recall some basic facts in Gaussian analysis on Hilbert space and give the formula for the first term of the iteration (2.8). We give in Section 2.2 the details for calculating the second term , which will help us to guess and prove the formula for general terms in Section 2.3. In the last part, we estimate the uniform norm of and show the convergence of the iteration scheme. The limit is the unique mild solution of (2.1), see Theorem 2.15.
2.1 Some preparations
Let be a cylindrical Brownian motion on :
[TABLE]
where is a complete orthonormal basis of and is a family of independent one dimensional standard Brownian motions defined on some probability space . Under the conditions (i) and (ii) in Hypothesis 2.1, the linear SDE
[TABLE]
has a unique solution with the expression
[TABLE]
where is the stochastic convolution:
[TABLE]
For any , is a centered Gaussian variable on with covariance operator . We denote its law by . Accordingly, the law of is denoted as . Recall that for any , \big{\langle}h,Q_{t}^{-1/2}W_{A}(t)\big{>} is a centered real Gaussian variable with variance
[TABLE]
We shall write for the space of bounded measurable functions on and the space of Fréchet differentiable functions, bounded with bounded derivatives. When , its Fréchet derivative will be denoted by . For any and , let
[TABLE]
This defines a Markov semigroup on . We have the following important result which implies is strong Feller (see [8, Proposition 2.28] for a proof).
Proposition 2.4**.**
Assume the conditions (i)–(iii) in Hypothesis 2.1. Then for all and , we have and for any ,
[TABLE]
Moreover,
[TABLE]
Using the semigroup , the mild formulation of the Kolmogorov equation (2.1) is
[TABLE]
This suggests us to consider the iterative scheme:
[TABLE]
with . We define and
[TABLE]
then the new functions satisfy the iteration procedure:
[TABLE]
Before concluding this section, we show how to obtain the first term . Since , Proposition 2.4 implies for any , and thus . Denote by the filtration generated by the cylindrical Brownian motion .
Lemma 2.5**.**
It holds that
[TABLE]
Proof.
Use the property of conditional expectation:
[TABLE]
where the second step follows from the Markov property. Again by the Markov property,
[TABLE]
where the second step is due to (2.5). Substituting this equality into the previous one we obtain the identity. ∎
The above lemma implies
Corollary 2.6**.**
For any and ,
[TABLE]
Moreover,
[TABLE]
and
[TABLE]
Proof.
The formula (2.9) follows directly from Lemma 2.5. Next, by the definition (2.8) of the iteration, for any and ,
[TABLE]
where the last inequality follows from (2.6). Therefore,
[TABLE]
which yields the estimate on . The inequality (2.10) implies that for all , hence by Proposition 2.4, and
[TABLE]
Finally, by (2.6),
[TABLE]
which, together with (2.10), gives us the last estimate. ∎
2.2 The term
In this part, we compute the second term in the iteration to illustrate the ideas. First we prove
Lemma 2.7**.**
One has
[TABLE]
Proof.
By Corollary 2.6, for any , and
[TABLE]
Recall that (2.10) implies , thus by Proposition 2.4,
[TABLE]
According to the proof of Lemma 2.5, we have
[TABLE]
Note that \big{\langle}\Lambda(t-s)B(x),Q_{t-s}^{-1/2}\big{(}Z^{x}_{t-s}-e^{(t-s)A}x\big{)}\big{\rangle} is -measurable. Substituting this equality into the one above and using the property of conditional expectation, we obtain the desired result. ∎
Now we are ready to present the expression and estimates for the second iteration.
Proposition 2.8**.**
For any and ,
[TABLE]
Furthermore,
[TABLE]
and
[TABLE]
Proof.
By Lemma 2.7, for any and ,
[TABLE]
We have
[TABLE]
where the second step follows from the Markov property. Therefore,
[TABLE]
Next, by the definition of and the last inequality in Corollary 2.6,
[TABLE]
This immediately implies
[TABLE]
and we obtain the estimate on . Moreover, by Proposition 2.4,
[TABLE]
which, combined with (2.11), gives us the second estimate. ∎
2.3 The general terms
In order to do further iteration, we rewrite the formula in Proposition 2.8 as
[TABLE]
Moreover, denoting by , then we have
[TABLE]
From this we can guess the general formulae.
Theorem 2.9**.**
Let . For any ,
[TABLE]
Moreover,
[TABLE]
and, letting ,
[TABLE]
Proof.
We proceed by induction. Indeed, in view of the proofs in Section 2.2, we shall also prove inductively the formula
[TABLE]
where and . The discussions in Sections 2.1 and 2.2 show that the assertions on hold for , and the above formula of holds with . Now we assume the assertions on (resp. on ) hold for (resp. for ), and try to prove them in the next iteration.
By the induction hypotheses, we have for all and thus, by the definition of the iteration (2.8), with
[TABLE]
where . Proposition 2.4 implies for all , and from the formula
[TABLE]
we deduce readily the estimates on and .
Next we prove the formula for (note that the induction hypothesis gives us the expression of ). We have
[TABLE]
where we used Proposition 2.4 in the last step. By the induction hypothesis,
[TABLE]
where and . Therefore, by the Markov property,
[TABLE]
Inserting this identity into (2.13) and noticing that \big{\langle}\Lambda(t-s)B(x),Q_{t-s}^{-1/2}(Z^{x}_{t-s}-e^{(t-s)A}x)\big{\rangle} is measurable with respect to , we obtain
[TABLE]
Renaming as gives us the formula of in the new iteration for all and .
Finally we prove the expression for . We have
[TABLE]
Using the formula we have just proved for and the Markov property, we can obtain the expression for in a similar way as above. ∎
We give a slightly different formula which is more appropriate for numerical purpose.
Corollary 2.10**.**
For any ,
[TABLE]
where . Accordingly,
[TABLE]
and, setting ,
[TABLE]
Proof.
We change variables as follows:
[TABLE]
The domain of integration becomes
[TABLE]
and . Therefore, by (2.12),
[TABLE]
In the product, letting , we get the desired formula (2.14). The proofs of the two estimates are similar. ∎
Remark 2.11**.**
Due to the convolution structure (2.8), it seems that (2.14) is not suitable for the induction argument in the proof of Theorem 2.9.
2.4 Convergence of the iteration scheme (2.8)
We need the following technical result, where we use the Gamma function :
[TABLE]
Lemma 2.12**.**
Assume and . Let and . One has
[TABLE]
and
[TABLE]
Proof.
First we prove
[TABLE]
where is the Beta function:
[TABLE]
We proceed by induction. For , noting that , we change the variable and get
[TABLE]
Therefore the equality holds when . Now suppose the equality holds for , we prove it for . By the induction hypothesis,
[TABLE]
thus, noticing that ,
[TABLE]
We have, by changing variable ,
[TABLE]
Substituting this result into the previous one gives us the identity (2.15).
Next, it is well known that
[TABLE]
Therefore,
[TABLE]
Combining this with (2.15) we obtain the desired formula.
The proof of the second identity is similar, by first establishing the identity
[TABLE]
We omit the details here. ∎
As a consequence, we have the following estimates.
Corollary 2.13**.**
Under the Hypotheses 2.1 and 2.3, for any and ,
[TABLE]
and
[TABLE]
Proof.
The case follows directly from (2.6). Combining Lemma 2.12 and Corollary 2.10, we obtain the general cases. ∎
Now we can prove the existence of limit for the iteration scheme (2.8).
Proposition 2.14**.**
Assume the Hypotheses 2.1 and 2.3. For any , the series
[TABLE]
converge uniformly on . Moreover, for any , the series
[TABLE]
converge uniformly on .
Proof.
We only prove the first assertion; the proof of the second one is similar. By Corollary 2.13 and using the ratio test, it is sufficient to show that
[TABLE]
This follows from elementary calculations. Indeed, setting for simplicity of notation,
[TABLE]
where is the decimal part of . Using the simple inequality for all , we have
[TABLE]
Hence,
[TABLE]
Note that the first part on the right hand side tends to as , while the last part is uniformly bounded in , thus we conclude the result. ∎
Thanks to Proposition 2.14, we can define the limit
[TABLE]
moreover, for any , one has and
[TABLE]
which holds uniformly on for any . Finally we can prove the main result.
Theorem 2.15**.**
The limit is the unique solution to the Kolmogorov equation (2.1) in the following sense:
- (a)
for any , is uniformly bounded for , and for any ;
- (b)
for any , one has ;
- (c)
it satisfies the mild formulation (2.7) for any and .
Proof.
Obviously our limit verifies (a). Next,
[TABLE]
Therefore,
[TABLE]
which shows that (b) is also satisfied. Moreover, for any and ,
[TABLE]
This implies the integral in the signs of absolute value makes sense.
It remains to check that verify (2.7). By the iteration scheme (2.8), one has, for any ,
[TABLE]
The left hand side converges uniformly to on for any . It suffices to show the uniform convergence of the right hand side. We have
[TABLE]
Similarly to the calculations in (2.17), we can show that the right hand side vanishes as goes to infinity. Therefore we let on both sides of (2.18) and conclude that satisfies (2.7) uniformly in .
Finally we prove the uniqueness of solutions. Suppose and are two solutions to (2.1) with the properties (a)–(c). Then, for any and ,
[TABLE]
Therefore,
[TABLE]
Moreover, by Proposition 2.4,
[TABLE]
Hence,
[TABLE]
Under Hypothesis 2.1-(iv), there is some such that . Then
[TABLE]
Combining this with (2.19) we see that for any . Next, by the semigroup property, it is easy to show that, for ,
[TABLE]
Repeating the above procedure we can prove the uniqueness on the interval and so on. Thus we complete the proof. ∎
3 Numerical Simulations
In this section we propose some experiment of the iteration scheme (2.8) studied in Section 2 in the finite dimensional setting. We have in mind the framework of Example 2.2, i.e. . Since we are in the finite dimensional setting this choice corresponds to take as the diagonal matrix where . Moreover we consider the matrix where is the identity matrix over , and the parameter will be specified below (see Table 2 for reference parameters).
We will consider two main classes of examples as a benchmark for our approximation scheme. First, we consider the nonlinear vector field
[TABLE]
i.e. we apply the sine function to all the components. This nonlinearity will be the easier one of our examples since it is close to linear, at least for small values of . We will also consider some variation of the previous example, made by
[TABLE]
where is the skew symmetric matrix
[TABLE]
i.e. the Toeplitz matrix with all one above the diagonal and minus one below. This example is more complex than the previous one. It is significant since it deals with skew symmetric matrices, inducing rotations, which are a first simple step in the direction of fluid dynamics. The vector field is also multiplied by the function in order to make example (3.2) nonlinear. Notice that this last example is not covered by our present theory, since it does not satisfy Hypothesis 2.3. However, even if (3.2) is not bounded, it satisfies a linear growth condition. We hope to improve our theory and the generality of the assumptions in such direction in a future research, and limit ourselves to some numerical experiments for the present work.
Second, we consider the following class of polynomial nonlinearities
[TABLE]
where is fixed. Note that this example appeals to the one dimensional case
[TABLE]
for which the dynamical system
[TABLE]
has the singleton as a stable attractor. The reason behind the example (3.3) is the following: it is close to a polynomial nonlinearity, so that it makes a significant test case; at the same time, the normalization by the factor makes it a bounded operator, so that it fulfills Hypothesis 2.3.
In all the examples above we adopt the following choice of initial condition
[TABLE]
where the parameter is set to (see Table 2).
3.1 Approximation schemes
Standard Monte Carlo approach.
Since an explicit solution for Equation (1.2) is not available we will always compare to the solution obtained by means of Monte Carlo simulation of the nonlinear process :
[TABLE]
where is the number of samples considered, and the processes are independent copies of . To compute samples of the process we use the Euler-Maruyama scheme with a very fine time step in order to get a good approximation to be used as a comparison. The solution computed by (3.4) will always be referred to in what follows as the reference case.
Numerical iteration scheme.
Under our assumption, since and are diagonal, we can rewrite the equations for the processes and in a simple way: for ,
[TABLE]
and
[TABLE]
We remark that, differently from , the process depends also on the parameter , but we do not explicitly write for ease of notation. Note that the process depends only on the operators . This opens the possibility of computing , and hence also , for many values of without repeating the computations for . The same reasoning holds for different values of , see Figure 6. Note also that this strategy cannot be applied to the process since in that case the problem is nonlinear.
Once realizations of the process are computed, we can proceed with the iteration algorithm (2.8). In order to compute numerically the quantity appearing in Theorem 2.9 one needs to be able to compute first
[TABLE]
Since and are diagonal and explicit (see the beginning of this section), one has
[TABLE]
[TABLE]
and thus,
[TABLE]
Hence, when integrating expression (3.6), by change of variable we have
[TABLE]
Changing variable provides a significant advantage when performing numerical integration. In fact it is more complex to compute than (resp. ) since is random and hence we would have been obliged to reverse the time for every sample of the process. On the other hand the matrix (resp. ) is deterministic so that changing time can be done only once.
Moreover, thanks to Corollary 2.10, it is possible to compute with a single time integration from the previous step. Introduce
[TABLE]
and notice that, due to Equation (2.14), we have
[TABLE]
Since
[TABLE]
once we have computed , computing is a matter of a single integration. This is really crucial because, otherwise, by using the direct expression (2.12) in Theorem 2.9, to compute one should have done an -dimensional numerical integration, independently on the previous iteration.
Stopping conditions.
Since the numerical scheme is iterative and since an exact solution is not available, we adopt a consecutive-iterations stopping condition. At every step we measure the difference between consecutive iterations and stop when this difference is below a certain threshold . Specifically we adopt two strategies in different situations: when we compute the entire trajectory of for , we measure
[TABLE]
and stop the iterations if (see Figures 1 and 2); when we are interested only in for a fixed , then
[TABLE]
and adopt the same stopping rule (Figures 6 and 7).
The entire procedure can be summarized in the following scheme:
3.2 Examples
Here we collect the results obtained, and all the parameters involved in the simulations. Parameters are divided into two categories: those related to the mathematical problem, and those strictly related to the numerical approximations, see Tables 2 and 2. Those are our reference parameters: we will specify each time any modifications.
In all the figures below, when showing the entire trajectory of the solution for , we also plot the [math]-th order iteration. This corresponds to the solution of the linear case for (1.2), i.e. the Kolmogorov equation with . This will allow us to compare with the linear case, in order to be sure to have introduced a significant nonlinearity into the problem.
Mixed-time-step strategy.
To perform numerical simulation of SDEs and numerical integration we adopt a mixed-time-step strategy. When we compute the reference solution, through the simulation of the process , as well as when computing samples of the linear process , we adopt a time step . On the other hand when we perform numerical integration, to compute successive iterations, we adopt a time discretization parameter , see Table 2. This is due to the fact that, in equation (1.1), as well as in (3.5), a coefficient is present in the -th component of the drift of the equation. This coefficient, and hence the Lipschitz constant of the drift, is growing as the square of the dimension of the problem. This is caused by the intrinsic exponential decay of equation (3.5), which require a high level of precision in computation. Differently, in equation (3.7), part of this exponential decay is absorbed by the convolutional structure of the integration. The limits and what is the proper ratio between and is a difficult topic. A more precise investigation is needed: for the present paper we only highlight the numerical result obtained, and hope to improve the theoretical counterpart in a future work.
Positive results.
For the simpler test case, the sine case (3.1), see Figure 1, convergence is obtained in five iterations. This is due to the simplicity of the example, as is almost linear near the origin. The situation is different when dealing with some more concrete examples like the polynomial case. In Figure 2, where we use formula (3.3) with , we see that the number of iterations to convergence is much bigger (26 in our example). At the same time the difference between the last iteration and the reference case is quite small, comparable to the sine case. However, we notice that the oscillation of the solution computed via our iteration scheme, related to the variance of the estimator, is a bit bigger than that of the reference case. This discrepancy is not completely clear yet, even if we expect it to be due to the low number of samples used to compute averages. In Figure 2 we also add a moving-average smoothing of the solution, to make more perceivable this last intuition.
The same behavior is obtained in the variations of the previous examples. In Figure 3 we see that the same fast convergence as in the sine case, is obtained also in the sine times skew-symmetric case (3.2). The Polynomial cubic case (3.3) with has the same level of complexity as the case with , even if it requires a higher number of iterations to obtain convergence, and presents the same type of oscillations.
We also perform the same tests in much higher dimension. In Figure 4 we show the results of the same examples, performed in dimension with samples. We see that the number of iterations required to convergence are comparable with result in : this confirms the estimate of Corollary 2.13 which is in the infinite dimensional framework and hence is independent of any dimension. The small variations in the number of iterations, as well as the slight increase of the oscillations in the quadratic case, can be explained by the reduction in the number of samples used to compute empirical averages. It is also important to remark that, in the current example, the estimate of Corollary 2.13 is still too rough: by computing the right-hand side of (2.16) one finds that the number of iterations needed to have is far bigger than what we find in the numerical test (in fact it should be bigger than one hundred).
In Figure 6 we followed a different approach: we fix the test case as the sine times skew-symmetric matrix (3.2), and analyze what is the limit of as goes to zero. Also in this case the solution computed through the iteration scheme is quite close to the reference case. At the same time, on the right side of Figure 6, we can appreciate the great advantage in time-saving of the iteration scheme. We remark that the plot on the right side is cumulative, meaning that it takes into account the time spent to compute the solution multiple times. In particular, we note that the reference case is a straight line, since the computational time does not depend on the different values of . On the other hand, for the iteration scheme there is a change in the number of iterations for different values of that justifies the nonlinear shape. Moreover we see that, even including the time of computing samples of the process that can be done only once (since does not appear in (3.5)), we still have a great advantage in time.
As remarked in the introduction, this kind of advantage is a main feature of the new method proposed here and applies also to the variation of other parameters than . In particular, it applies to the change of initial conditions , one of the most fundamental problems in weather and climate prediction, related to the ensemble forecasting method, see [15, Chapter 6]. Again, Monte Carlo pays linearly with the number of variations of , while our method pays the bulk (i.e. in (3.5)) only once and then (here for the initial conditions) roughly linearly in the number of different ’s, but with a linear slope much smaller than the one of Monte Carlo, similarly to the initial slope of Figure 6 right side. We illustrate the interest in varying by Figure 7 right side, where it is illustrated the relative importance of different variations.
Difficulties with small and high dimension.
However, not every situation is well behaved as those presented above: in Figure 7 left side, we present the plot for different values of in the polynomial quadratic case. Here the approximation tends to degenerate for smaller values of (already around ). This is due to the higher level of nonlinearity of the polynomial case with respect to (3.2). It is also important to mention that the number of iterations to convergence is really important for what concerns the computational time. In the polynomial quadratic (and also cubic) case, since the number of iterations to convergence is much higher than in the simpler case, the advantage in the computational time is less relevant. Still for what concerns negative results we also show in Figure 5 that, when the dimension grows (left , right ), the sine times skew-symmetric case (3.2) tends to degenerate. Iterations are still converging but the limit is far form the reference solution. This is definitively the most difficult of our examples since it is the only one which mixes strongly all the components and produces a strong energy flux between them. However we also remark that at present time this case is not covered by our theory, but is still relevant since it has the rotational behavior which appeals to fluid dynamics.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] D. Barbato, L. A. Bianchi, F. Flandoli, F. Morandin, A dyadic model on a tree. J. Math. Phys. 54 (2013), no. 2, 021507, 20 pp.
- 2[2] C. Beck, S. Becker, P. Grohs, N. Jaafari, A. Jentzen, Solving stochastic differential equations and Kolmogorov equations by means of deep learning, ar Xiv:1806.00421.
- 3[3] L. A. Bianchi, Uniqueness for an inviscid stochastic dyadic model on a tree. Electron. Commun. Probab. 18 (2013), no. 8, 12 pp.
- 4[4] L. A. Bianchi, F. Morandin, Structure function and fractal dissipation for an intermittent inviscid dyadic model. Comm. Math. Phys. 356 (2017), no. 1, 231–260.
- 5[5] S. Brenner, R. Scott, The mathematical theory of finite element methods (Vol. 15), Springer Science Business Media, 2007.
- 6[6] S. Cerrai, Second Order PDE’s in Finite and Infinite Dimensions. A Probabilistic Approach. Lecture Notes in Mathematics, 1762. Springer-Verlag, Berlin , 2001.
- 7[7] N. Chen, A. J. Majda, Efficient statistically accurate algorithms for the Fokker-Planck equation in large dimensions. Journal of Computational Physics . 354 (2018), 242–268.
- 8[8] G. Da Prato, Kolmogorov Equations for Stochastic PD Es. Advanced Courses in Mathematics. CRM Barcelona. Birkhäuser Verlag, Basel , 2004.
