Error estimates of the backward Euler-Maruyama method for multi-valued stochastic differential equations
Monika Eisenmann, Mihály Kovács, Raphael Kruse, Stig Larsson

TL;DR
This paper establishes error estimates for the backward Euler-Maruyama method applied to multi-valued stochastic differential equations, including those with non-smooth convex potentials, demonstrating convergence of order at least 1/4.
Contribution
It provides a rigorous error analysis of the backward Euler-Maruyama method for multi-valued SDEs with non-smooth convex potentials, extending techniques developed for deterministic problems.
Findings
Method is well-defined and converges with order at least 1/4
Applicable to stochastic gradient flows with discontinuous gradients
Verified on overdamped Langevin and stochastic p-Laplace equations
Abstract
In this paper, we derive error estimates of the backward Euler-Maruyama method applied to multi-valued stochastic differential equations. An important example of such an equation is a stochastic gradient flow whose associated potential is not continuously differentiable, but assumed to be convex. We show that the backward Euler-Maruyama method is well-defined and convergent of order at least 1/4 with respect to the root-mean-square norm. Our error analysis relies on techniques for deterministic problems developed in [Nochetto, Savaré, and Verdi, Comm. Pure Appl. Math., 2000]. We verify that our setting applies to an overdamped Langevin equation with a discontinuous gradient and to a spatially semi-discrete approximation of the stochastic p-Laplace equation.
Monika Eisenmann
Technische Universität Berlin
Institut für Mathematik, Secr. MA 5–3
Straße des 17. Juni 136
DE–10623 Berlin
Germany

Mihály Kovács
Faculty of Information Technology and Bionics
Pázmány Péter Catholic University
P.O. Box 278, Budapest
Hungary

Raphael Kruse
Technische Universität Berlin
Institut für Mathematik, Secr. MA 5–3
Straße des 17. Juni 136
DE–10623 Berlin
Germany

Stig Larsson
Department of Mathematical Sciences
Chalmers University of Technology and University of Gothenburg
SE–412 96 Gothenburg
Sweden
Key words and phrases:
multi-valued stochastic differential equation, backward Euler–Maruyama method, strong convergence, stochastic gradient flow, discontinuous drift, Hölder continuous drift
2010 Mathematics Subject Classification:
65C30, 60H10, 34A60
1. Introduction
In this paper, we investigate the numerical approximation of multi-valued stochastic differential equations (MSDE). An important example of such equations is provided by stochastic gradient flows with a convex potential. More precisely, let and be a filtered probability space satisfying the usual conditions. By , , we denote a standard -adapted Wiener process. For instance, let us consider the numerical treatment of nonlinear, overdamped Langevin-type equations of the form
[TABLE]
where , , , and are given. These equations have many important applications, for example, in Bayesian statistics and molecular dynamics. We refer to [10, 22, 23, 44, 49] and the references therein.
We recall that, if the gradient is of superlinear growth, then the classical forward Euler–Maruyama method is known to be divergent in the strong and weak sense, see [18]. This problem can be circumvented by using modified versions of the explicit Euler–Maruyama method based on techniques such as taming, truncating, stopping, projecting, or adaptive strategies, cf. [4, 6, 17, 19, 29, 48].
In this paper, we take an alternative approach by considering the backward Euler–Maruyama method. Our main motivation for considering this method lies in its good stability properties, which allow its application to stiff problems arising, for instance, from the spatial semi-discretization of stochastic partial differential equations. Implicit methods have also been studied extensively in the context of stochastic differential equations with superlinearly growing coefficients. For example, see [1, 15, 16, 30, 31].
The error analysis in the above-mentioned papers on explicit and implicit methods typically requires a certain degree of smoothness of the gradient, such as local Lipschitz continuity. The purpose of this paper is to derive error estimates of the backward Euler–Maruyama method for equations of the form (1.1), where the associated potential is not necessarily continuously differentiable, but assumed to be convex.
For the formulation of the numerical scheme, let be the number of temporal steps, let be the step size, and let
[TABLE]
be an equidistant partition of the interval , where for . The backward Euler–Maruyama method for the Langevin equation (1.1) is then given by the recursion
[TABLE]
where .
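To make the recursion concrete, here is a minimal sketch; it is not the authors' implementation. It assumes the scalar case, a constant noise intensity `sigma`, and the illustrative smooth convex potential Phi(x) = x^4/4, whose gradient x^3 grows superlinearly. It also assumes that the scheme evaluates the gradient at the new iterate, so that every step requires solving x + h * grad_phi(x) = y. The names `grad_phi`, `implicit_step`, and `backward_euler_maruyama` are hypothetical.

```python
import math
import random


def grad_phi(x):
    # Gradient of the illustrative potential Phi(x) = x**4 / 4 (an assumption).
    return x ** 3


def implicit_step(y, h, tol=1e-12, max_iter=100):
    """Solve x + h * grad_phi(x) = y for x with Newton's method."""
    x = y  # starting at y keeps the Newton iterates monotone for this potential
    for _ in range(max_iter):
        residual = x + h * grad_phi(x) - y
        if abs(residual) <= tol * (1.0 + abs(y)):
            break
        x -= residual / (1.0 + 3.0 * h * x * x)
    return x


def backward_euler_maruyama(x0, sigma, T, N, rng):
    """Return the iterates X^0, ..., X^N on an equidistant grid with step T / N."""
    h = T / N
    path = [x0]
    for _ in range(N):
        dW = rng.gauss(0.0, math.sqrt(h))  # Wiener increment over one step
        path.append(implicit_step(path[-1] + sigma * dW, h))
    return path


rng = random.Random(0)
path = backward_euler_maruyama(x0=2.0, sigma=1.0, T=1.0, N=100, rng=rng)
```

Since the map x -> x + h * x^3 is strictly increasing and surjective, the implicit equation in this sketch has exactly one solution for every right-hand side y.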
An example of a non-smooth potential is found by setting and , , for . Evidently, the gradient of is not locally Lipschitz continuous at for . Moreover, if , then the gradient has a jump discontinuity of the form
[TABLE]
Here, the value at is not canonically determined. We have to solve a nonlinear equation of the form in each step of the backward Euler method (1.3). However, if , then the sole candidate for a solution is , since otherwise . But is only a solution if . Therefore, the mapping is not surjective for any single-valued choice of .
This problem can be bypassed by considering the multi-valued subdifferential of a convex potential , which is given by
[TABLE]
Recall that if the gradient exists at in the classical sense. See [45, Section 23] for further details.
In the above example, one easily verifies that
[TABLE]
This allows us to solve the nonlinear inclusion where we want to find with for any .
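The effect of passing to the subdifferential can be sketched in a few lines. The snippet assumes the scalar potential Phi(x) = |x| as a representative of the example above; its subdifferential is {-1} for x < 0, the interval [-1, 1] at x = 0, and {1} for x > 0. The helper `resolvent_abs` is a hypothetical name. It returns the solution x of the inclusion y in x + h * dPhi(x) together with the selected element `eta` of the subdifferential.

```python
def resolvent_abs(y, h):
    """Return (x, eta) with x + h * eta = y and eta in the subdifferential of |.| at x."""
    if y > h:
        x = y - h
    elif y < -h:
        x = y + h
    else:
        x = 0.0  # for |y| <= h only the multi-valued choice at 0 yields a solution
    eta = (y - x) / h
    return x, eta


x_small, eta_small = resolvent_abs(0.05, 0.1)  # |y| <= h: x = 0, eta = y / h
x_large, eta_large = resolvent_abs(1.0, 0.1)   # |y| > h: x = y - h, eta = 1
```

For |y| <= h the solution is x = 0 with eta = y / h, a value in [-1, 1] that no fixed single-valued choice of the gradient at 0 can provide for all such y.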
For this reason, we study the more general problem of the numerical approximation of multi-valued stochastic differential equations (MSDE) of the form
[TABLE]
Here, we assume that the mappings and are globally Lipschitz continuous. Moreover, the multi-valued drift coefficient function is assumed to be a maximal monotone operator, cf. Definition 2.1 below. See also Section 4 for a complete list of all imposed assumptions on the MSDE (1.5). Let us emphasize that the subdifferential of a proper, lower semi-continuous and convex potential is an important example of a possibly multi-valued and maximal monotone mapping , cf. [45, Corollary 31.5.2].
The backward Euler–Maruyama method for the approximation of the MSDE (1.5) on the partition is then given by the recursion
[TABLE]
We discuss the well-posedness of this method (1.6) under our assumptions on , , and in Section 5. In particular, it will turn out that both problems, (1.5) and (1.6), admit single-valued solutions and , respectively.
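The following sketch simulates one path of such a scheme under simplifying assumptions that are not taken from the paper: the scalar case, the multi-valued drift chosen as the subdifferential of |x|, the single-valued drift set to zero, and a scheme that treats the multi-valued term implicitly while evaluating the diffusion coefficient at the previous iterate. The names `soft_threshold` and `bem_path` are hypothetical.

```python
import math
import random


def soft_threshold(y, tau):
    """Solve x + tau * eta = y with eta in the subdifferential of |.| at x."""
    if y > tau:
        return y - tau
    if y < -tau:
        return y + tau
    return 0.0


def bem_path(x0, g, T, N, rng):
    """Backward Euler-Maruyama iterates for the assumed model problem (b = 0)."""
    h = T / N
    path = [x0]
    for _ in range(N):
        dW = rng.gauss(0.0, math.sqrt(h))
        y = path[-1] + g(path[-1]) * dW    # explicit part: diffusion at the old iterate
        path.append(soft_threshold(y, h))  # implicit part: resolvent of the subdifferential
    return path


noisy_path = bem_path(1.0, lambda x: 0.5, T=2.0, N=200, rng=random.Random(4))
noise_free_path = bem_path(1.0, lambda x: 0.0, T=2.0, N=20, rng=random.Random(4))
```

In the noise-free run the iterates decrease by the step size until they reach 0 and then stay there, which mirrors the finite-time extinction of the underlying differential inclusion.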
The main result of this paper, Theorem 6.4, then states that the backward Euler–Maruyama method is convergent of order at least 1/4 with respect to the root-mean-square norm. For the error analysis we rely on techniques for deterministic problems developed in [37]. An important ingredient is the additional condition on that there exists with
[TABLE]
for all and , , . This assumption is easily verified for a subdifferential of a convex potential, cf. Lemma 3.2. As already noted in [37] for deterministic problems, this inequality allows us to avoid Gronwall-type arguments in the error analysis for terms involving the multi-valued mapping .
Before we give a more detailed outline of the content of this paper, let us mention that multi-valued stochastic differential equations have been studied in the literature before. The existence of a uniquely determined solution to the MSDE (1.5) has been investigated, e.g., in [7, 21, 41]. We also refer to the more recent monograph [40] and the references therein. In [14, 51] related results have been derived for multi-valued stochastic evolution equations in infinite dimension. The numerical analysis for MSDEs has also been considered in [3, 26, 42, 53, 55]. However, these papers differ from the present paper in terms of the considered numerical methods, the imposed conditions, or the obtained order of convergence.
Further, we also mention that several authors have developed explicit numerical methods for SDEs with discontinuous drifts in recent years. For instance, we refer to [9, 24, 25, 33, 34, 35, 36]. While these results often apply to more irregular drift coefficients, which are beyond the framework of maximal monotone operators, the authors have to employ more restrictive conditions such as the global boundedness of the drift, which is not required in our framework.
This paper is organized as follows: In Section 2 we fix some notation and recall important terminology for multi-valued mappings. In Section 3 we demonstrate how to apply the techniques from [37] to the simplified setting of the Langevin equation (1.1). In addition, we also show that if the gradient is more regular, say Hölder continuous with exponent , then the order of convergence increases to . Moreover, it turns out that the error constant does not grow exponentially with the final time . This is an important insight if the backward Euler method is used within an unadjusted Langevin algorithm [44], which typically requires large time intervals. See Theorem 3.7 and Remark 3.8 below.
In Section 4 we turn to the more general multi-valued stochastic differential equation (1.5). We state all assumptions imposed on the appearing drift and diffusion coefficients and collect some properties of the exact solution. In Section 5 we show that the backward Euler–Maruyama method (1.6) is well-posed under the assumptions of Section 4. In Section 6 we prove the already mentioned convergence result with respect to the root-mean-square norm. Finally, in Section 7 we verify that the setting of Section 4 applies to a Langevin equation with the discontinuous gradient (1.4). Further, we also show how to apply our results to the spatial discretization of the stochastic p-Laplace equation, which indicates their usability for the numerical analysis of stochastic partial differential equations. However, a complete analysis of the latter problem is deferred to future work.
2. Preliminaries
In this section, we collect some notation and introduce some background material. First we recall some terminology for set-valued mappings and (maximal) monotone operators. For a more detailed introduction we refer, for instance, to [47, Abschn. 3.3] or [39, Chapter 6].
By , , we denote the Euclidean space with the standard norm and inner product . Let be a set. A set-valued mapping maps each to an element of the power set , that is, . The domain of is given by
[TABLE]
Definition 2.1.
Let be a non-empty set. A set-valued mapping is called monotone if
[TABLE]
for all , , and .
Moreover, a set-valued mapping is called maximal monotone if is monotone and for all and satisfying
[TABLE]
it follows that and .
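As a small illustration of monotonicity, the sketch below tests the defining inequality on random samples for the subdifferential of |x| in the scalar case, which is an assumed example. At the jump point 0 an arbitrary element of [-1, 1] is selected. The helper names are hypothetical, and a sampling test of this kind can only refute monotonicity; it cannot establish maximality.

```python
import random


def subdiff_abs(v, rng):
    """Return one element of the subdifferential of |.| at v."""
    if v > 0:
        return 1.0
    if v < 0:
        return -1.0
    return rng.uniform(-1.0, 1.0)  # any value in [-1, 1] is admissible at 0


def is_monotone_on_samples(select, points, rng):
    """Check (f_v - f_w) * (v - w) >= 0 for all sampled pairs and selections."""
    for v in points:
        for w in points:
            f_v, f_w = select(v, rng), select(w, rng)
            if (f_v - f_w) * (v - w) < 0.0:
                return False
    return True


rng = random.Random(1)
points = [0.0] + [rng.uniform(-2.0, 2.0) for _ in range(50)]
monotone_ok = is_monotone_on_samples(subdiff_abs, points, rng)
decreasing_ok = is_monotone_on_samples(lambda v, rng: -v, points, rng)
```

The first check passes, while the decreasing map v -> -v violates the inequality for every pair of distinct points.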
Next, we recall a Burkholder–Davis–Gundy-type inequality. For a proof we refer to [28, Chapter 1, Theorem 7.1]. For its formulation we note that the Frobenius or Hilbert–Schmidt norm of a matrix is also denoted by .
Lemma 2.2.
Let and be stochastically integrable. Then, for every with , the inequality
[TABLE]
holds.
Let us also recall a stochastic variant of the Gronwall inequality. A proof that can be modified to this setting can be found in [54]. Compare also with [50].
Lemma 2.3.
Let be -adapted and almost surely continuous stochastic processes such that is a local -martingale with . Moreover, suppose that and are nonnegative. In addition, let be integrable and nonnegative. If, for all , we have
[TABLE]
then, for every , the inequality
[TABLE]
holds.
Moreover, we often make use of generic constants. More precisely, by we denote a finite and positive quantity that may vary from occurrence to occurrence but is always independent of numerical parameters such as the step size and the number of steps .
3. Application to the Langevin equation with a convex potential
In order to illustrate our approach, we first consider a more regular stochastic differential equation with single-valued (Hölder) continuous drift term. More precisely, we consider the overdamped Langevin equation [23, Section 2.2]
[TABLE]
where , , and is a standard -valued Wiener process. In addition, we impose the following assumption on the potential .
Assumption 3.1.
Let be a convex, nonnegative, and continuously differentiable function.
In the following, we denote by the gradient of , that is . It is well-known that the convexity of implies the variational inequality
[TABLE]
see, for example, [45, § 23].
In the following lemma, we collect some properties of , which are direct consequences of Assumption 3.1. Both inequalities are well-known. The proof of (3.4) is taken from [37].
Lemma 3.2.
Under Assumption 3.1 and with , the inequalities
[TABLE]
and
[TABLE]
are fulfilled for all .
Proof.
The first inequality follows directly from (3.2) since
[TABLE]
for all . For the proof of the second inequality we start by rewriting its left-hand side. For arbitrary we rearrange the terms to obtain
[TABLE]
Setting for all , we see that
[TABLE]
But (3.2) says that for all , which completes the proof. ∎
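The two inequalities can be checked numerically on an example. The sketch assumes that (3.4) has the three-point form (f(v) - f(w)) * (w - z) <= (f(v) - f(z)) * (v - z) known from the deterministic theory, which follows from three applications of the convexity inequality. It uses the illustrative scalar potential Phi(x) = |x|^1.5, which is convex, nonnegative, and continuously differentiable, with a gradient that is not Lipschitz continuous at 0. Both the assumed form of the inequality and the potential are assumptions made for this example.

```python
import math
import random


def grad_phi(x):
    """Gradient of the illustrative potential Phi(x) = |x|**1.5 (an assumption)."""
    if x == 0.0:
        return 0.0
    sign = 1.0 if x > 0 else -1.0
    return 1.5 * sign * math.sqrt(abs(x))


def three_point_gap(v, w, z):
    """Right-hand side minus left-hand side of the assumed three-point inequality."""
    lhs = (grad_phi(v) - grad_phi(w)) * (w - z)
    rhs = (grad_phi(v) - grad_phi(z)) * (v - z)
    return rhs - lhs


rng = random.Random(3)
triples = [tuple(rng.uniform(-2.0, 2.0) for _ in range(3)) for _ in range(5000)]
smallest_gap = min(three_point_gap(v, w, z) for v, w, z in triples)
monotone = all((grad_phi(v) - grad_phi(w)) * (v - w) >= 0.0 for v, w, _ in triples)
```

A nonnegative `smallest_gap` (up to rounding) is consistent with the three-point inequality, and `monotone` reports whether the sampled pairs satisfy the monotonicity inequality.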
It follows from Assumption 3.1 and Lemma 3.2 that the drift of the stochastic differential equation (3.1) is continuous and monotone. Therefore, by [43, Thm. 3.1.1] the stochastic differential equation (3.1) has a solution in the strong (probabilistic) sense satisfying -a.s. for all
[TABLE]
Moreover, the solution is unique up to -indistinguishability and it is square-integrable with
[TABLE]
Next, we turn to the numerical approximation of the solution of (3.1). Recall that for a single-valued drift the backward Euler–Maruyama method is given by the recursion
[TABLE]
where , and .
The next lemma contains some a priori estimates for the backward Euler–Maruyama method (3.6).
Lemma 3.3.
Let be given and let Assumption 3.1 be satisfied. For an arbitrary step size , , let be a family of -adapted random variables satisfying (3.6). If , then
[TABLE]
and
[TABLE]
Proof.
First, we recall the identity
[TABLE]
Using also (3.6), we then get
[TABLE]
for every . Hence, an application of (3.2) yields
[TABLE]
for every . From applications of the Cauchy–Schwarz inequality and the weighted Young inequality we then obtain
[TABLE]
for every .
The third term on the right-hand side is absorbed in the third term on the left-hand side. Summation then yields
[TABLE]
An inductive argument over then yields that is square-integrable due to the assumption . Therefore, after taking expectation the last sum vanishes. Moreover, an application of the Itō isometry then gives
[TABLE]
Since this is true for any the assertion follows. ∎
As the next theorem shows, Assumption 3.1 is also sufficient to ensure the well-posedness of the backward Euler–Maruyama method. The result follows directly from the fact that is continuous and monotone due to (3.3). For a proof we refer, for instance, to [4, Sect. 4], [38, Chap. 6.4], and [52, Theorem C.2]. The assertion also follows from the more general result in Theorem 5.3 below.
Theorem 3.4.
Let as well as be given and let Assumption 3.1 be satisfied. Then, for every equidistant step size , , there exists a uniquely determined family of square-integrable and -adapted random variables satisfying (3.6).
We now turn to an error estimate with respect to the -norm. Since we do not impose any (local) Lipschitz condition on the drift , classical approaches based on discrete Gronwall-type inequalities are not applicable. Instead we rely on an error representation formula, which was introduced for deterministic problems in [37].
For the formulation of this, we introduce the following notation: For a given equidistant partition with step size , we denote by the piecewise linear interpolant of the sequence generated by the backward Euler method (3.6). It is defined by and
[TABLE]
In addition, we introduce the processes , which are piecewise constant interpolants of and defined by and
[TABLE]
Analogously, we define the piecewise linear interpolated process by and
[TABLE]
for all , .
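The interpolants can be sketched as follows. The conventions used here are assumptions: the piecewise linear interpolant connects the grid values linearly, one piecewise constant interpolant takes the value at the right end point of each subinterval, and the other takes the value at the left end point. The function names and the sample values are hypothetical.

```python
import math


def linear_interpolant(values, h, t):
    """Piecewise linear interpolant through the points (n * h, values[n])."""
    last = len(values) - 1
    n = min(last, max(1, math.ceil(t / h)))  # index with t in (t_{n-1}, t_n]
    theta = (t - (n - 1) * h) / h
    return (1.0 - theta) * values[n - 1] + theta * values[n]


def upper_constant_interpolant(values, h, t):
    """Assumed convention: values[n] on (t_{n-1}, t_n] and values[0] at t = 0."""
    last = len(values) - 1
    return values[min(last, max(0, math.ceil(t / h)))]


def lower_constant_interpolant(values, h, t):
    """Assumed convention: values[n - 1] on [t_{n-1}, t_n)."""
    last = len(values) - 1
    return values[min(last, max(0, math.floor(t / h)))]


values = [0.0, 1.0, 3.0]  # hypothetical iterates X^0, X^1, X^2
h = 0.5
mid_first = linear_interpolant(values, h, 0.25)
mid_second = linear_interpolant(values, h, 0.75)
```

At grid points all three interpolants agree with the grid value, and inside a subinterval the two piecewise constant interpolants bracket the linear one whenever the iterates are monotone.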
We are now prepared to state Lemma 3.5. The underlying idea of this lemma was introduced in [37], where it is used to derive a posteriori error estimates for the backward Euler method. In fact, in the absence of noise, only the first term on the right-hand side of (3.12) is non-zero. In [37] this term is used as an a posteriori error estimator, since it is explicitly computable by quantities generated by the numerical method.
Lemma 3.5.
Let as well as be given and let Assumption 3.1 be satisfied. Let , , be an arbitrary equidistant step size and let , . Then, for every the estimate
[TABLE]
holds, where and are the solutions of (3.1) and (3.6), respectively.
Proof.
From (3.6) we directly deduce that for every
[TABLE]
Then, one easily verifies for all , , that
[TABLE]
Hence, due to (3.5) the error process fulfills
[TABLE]
for all . Here, we have , since is an interpolant of . Hence, for all ,
[TABLE]
To estimate the norm of , we first note that has absolutely continuous sample paths with . Hence,
[TABLE]
is fulfilled for almost all . Therefore, by integration with respect to , we get
[TABLE]
Next, we write
[TABLE]
and use (3.3) and (3.4) to obtain, for almost every , that
[TABLE]
Furthermore, the expectation of the second integral on the right-hand side of (3.15) is equal to
[TABLE]
Therefore,
[TABLE]
Since the assertion follows. ∎
The next lemma contains an estimate of the difference between the Wiener process and its piecewise linear interpolant .
Lemma 3.6.
For every and every step size , , the equality
[TABLE]
holds.
Proof.
From the definition (3.11) of it follows that
[TABLE]
where we used that the two increments of the Wiener process are independent for every , , and we also applied Itō’s isometry. By symmetry of the two terms it then follows that
[TABLE]
and the proof is complete. ∎
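The computation behind this proof can be written out as a worked equation. This is a sketch under assumed notation: h denotes the step size, t_n = nh the grid points, W a scalar Wiener process, and \mathcal{W} its piecewise linear interpolant on the grid. For a vector-valued Wiener process the same computation applies to each component, so the right-hand sides are multiplied by the dimension.

```latex
% Assumed notation: h = step size, t_n = n h, W scalar Wiener process,
% \mathcal{W} = piecewise linear interpolant of W on the grid (t_n).
\begin{align*}
W(t) - \mathcal{W}(t)
  &= \frac{t_n - t}{h}\,\bigl(W(t) - W(t_{n-1})\bigr)
   - \frac{t - t_{n-1}}{h}\,\bigl(W(t_n) - W(t)\bigr),
   \qquad t \in [t_{n-1}, t_n], \\
\mathbf{E}\bigl[|W(t) - \mathcal{W}(t)|^2\bigr]
  &= \frac{(t_n - t)^2 (t - t_{n-1})}{h^2}
   + \frac{(t - t_{n-1})^2 (t_n - t)}{h^2}
   = \frac{(t - t_{n-1})(t_n - t)}{h}, \\
\int_{t_{n-1}}^{t_n} \mathbf{E}\bigl[|W(t) - \mathcal{W}(t)|^2\bigr] \,\mathrm{d}t
  &= \frac{h^2}{12} + \frac{h^2}{12} = \frac{h^2}{6}.
\end{align*}
```

The second line uses the independence of the two increments, and the two contributions to the integral coincide by symmetry. Summing over the first n subintervals gives t_n h / 6 in the scalar case.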
The error estimates in Lemma 3.5 and Lemma 3.6 allow us to determine the order of convergence of the backward Euler–Maruyama method without relying on discrete Gronwall-type inequalities. The following theorem imposes the additional assumption that the drift is Hölder continuous. We include the parameter value , which simply means that is continuous and globally bounded. The case of less regular is treated in Section 6.
Observe that we recover the standard rate if , that is, if the drift is assumed to be globally Lipschitz continuous. Compare also with the standard literature, for example, [20, Chap. 12] or [32, Sect. 1.3].
For processes and exponents , we define the family of Hölder semi-norms by
[TABLE]
Theorem 3.7.
Let as well as be given, let Assumption 3.1 be fulfilled and let be Hölder continuous with exponent , i.e., there exists such that
[TABLE]
Then there exists such that for every step size , , the estimate
[TABLE]
holds, where and are the solutions to (3.1) and (3.6), respectively.
Proof.
Since is assumed to be -Hölder continuous it follows that
[TABLE]
In particular, grows at most linearly. Therefore, as stated in [28, Chap. 2, Thm 4.3], the solution of (3.1) satisfies .
We will use Lemma 3.5 to prove the error bound. To this end, we first show that
[TABLE]
Indeed, we make use of the Hölder continuity of directly and obtain
[TABLE]
where we also used Hölder’s inequality with and as well as Jensen’s inequality. Due to the a priori estimate (3.8) the sum $\sum_{i=1}^{N}\mathbf{E}\big[|X^{i}-X^{i-1}|^{2}\big]$ is bounded independently of the step size. Hence, we arrive at (3.17).
Therefore, it remains to estimate the second error term in Lemma 3.5:
[TABLE]
where we inserted the definition of from (3.10). Moreover, from (3.11) we get
[TABLE]
for . Hence, the random variable in the second slot of the inner product on the right-hand side (3.18) is centered and is independent of any -measurable random variable. Thus, we may write
[TABLE]
To estimate we first recall the definitions of and from (3.10). Then we apply the Cauchy–Schwarz inequality and obtain
[TABLE]
From the Hölder continuity of we then deduce that
[TABLE]
where the last inequality is in fact an equality if , or if , . Otherwise the inequality follows from Hölder’s inequality with and , followed by an application of Jensen’s inequality. Furthermore, Lemma 3.6 states that
[TABLE]
Therefore, together with (3.8) we arrive at the estimate
[TABLE]
for all .
The estimate of works similarly by additionally making use of the Hölder continuity of the exact solution. To be more precise, we have that
[TABLE]
Together with the Cauchy–Schwarz inequality and (3.19), we therefore obtain
[TABLE]
Inserting the estimates for , and (3.17) into Lemma 3.5 completes the proof. ∎
Remark 3.8.
The precise form of the constant appearing in Theorem 3.7 is, after taking squares,
[TABLE]
with .
Observe that, since we avoid the use of Gronwall-type inequalities, the error constant does not grow exponentially with time . This indicates that the backward Euler–Maruyama method is particularly suited for long-time simulations as is often required in Markov-chain Monte Carlo methods, for example, in the unadjusted Langevin algorithm [44].
4. Properties of the exact solution
In this section, we turn our attention to the multi-valued stochastic differential equation (MSDE) in (1.5). We give a complete account of the assumptions imposed on the coefficient functions. In addition, we collect some results on the existence and uniqueness of a strong solution to the MSDE. We also include useful results on higher moment bounds of the exact solution.
Assumption 4.1.
The set-valued mapping is maximal monotone with . Moreover, there exist constants , , and such that
[TABLE]
for every and .
Assumption 4.2.
The function is Lipschitz continuous; i.e., there exists a constant such that
[TABLE]
for all .
Assumption 4.3.
The function is Lipschitz continuous; i.e., there exists a constant such that
[TABLE]
for all .
Assumption 4.4.
The initial value is an -measurable and -valued random variable. Furthermore,
[TABLE]
where the value of is the same as in Assumption 4.1.
Observe that Assumptions 4.2 and 4.3 directly imply that and grow at most linearly. More precisely, after possibly increasing the values of and , we obtain the bounds
[TABLE]
for all .
Remark 4.5.
Without loss of generality we will assume that . Otherwise, since the graph of is not empty, we take and and replace , , and by suitably shifted mappings, for instance, . Then holds. Compare further with [47, Abschn. 3.3.3].
Next, we introduce the notion of a solution of (1.5), which we use for the remainder of this paper.
Definition 4.6.
A tuple is called a solution of the multi-valued stochastic differential equation (1.5), if the following conditions hold.
- (i) The mapping is an -adapted, almost surely continuous stochastic process such that for all with probability one.
- (ii) The mapping is an -adapted stochastic process such that
[TABLE]
- (iii) The equality
[TABLE]
holds for all and -almost surely.
- (iv) For almost all and , it follows that ; in other words, for every and the inequality
[TABLE]
is satisfied for almost every and -almost surely, cf. Definition 2.1.
This notion of a solution has been considered in, for example, [7], [21], [41], and [51], where the existence of a unique solution is also shown. Due to their importance for the error analysis, we next prove certain moment estimates.
Theorem 4.7.
Let Assumptions 4.1 and 4.4 be satisfied with . Then there exists a unique solution of (1.5) in the sense of Definition 4.6. There is a constant such that
[TABLE]
Furthermore, if and , then
[TABLE]
Proof.
Existence and uniqueness are shown, for instance, in [21]. For
[TABLE]
the equality
[TABLE]
holds by an application of Itō’s formula (see [12, Chap. 4.7, Theorem 7.1]). From the coercivity assumption on we obtain that
[TABLE]
for every and almost every . The fact that for almost every then implies that
[TABLE]
Since and satisfies the linear growth bound (4.1), we have
[TABLE]
as well as
[TABLE]
Thus, we get
[TABLE]
We introduce
[TABLE]
Then , , and are -adapted and almost surely continuous stochastic processes. Furthermore, is a local -martingale with . Thus, an application of Lemma 2.3 yields, for every , that
[TABLE]
Inserting the definition of then proves the first estimate.
Furthermore, if Assumption 4.1 holds with , then we have, for every , , that
[TABLE]
with . Therefore, it follows that
[TABLE]
since for almost every . ∎
Remark 4.8.
Let us mention that, for instance, in [40, Chapter 4] and the references therein, a weaker notion of a solution to (1.5) is found. More precisely, if is a solution in the sense of Definition 4.6, then is a solution in the sense of [40, Chapter 4] with the definition
[TABLE]
In particular, the process is a continuous, progressively measurable process with bounded total variation and almost surely. The stronger condition of absolute continuity of the process , which is required in Definition 4.6, is essential in the proof of Theorem 6.4 below. This explains why we work with the stronger notion of a solution in Definition 4.6.
5. Well-posedness of the backward Euler method
In this section, we show that the backward Euler–Maruyama method (1.6) for the MSDE (1.5) is well-posed under the same assumptions as in the previous section.
Lemma 5.1.
Let Assumptions 4.1 and 4.2 be satisfied. Furthermore, let and be given with . Then there exist uniquely determined and , which satisfy the nonlinear equation
[TABLE]
Proof.
We first show that there exists a unique such that
[TABLE]
To this end, notice that for all , the inequalities
[TABLE]
hold due to the step-size bound. In addition, it follows from (4.1) that
[TABLE]
for all . Hence, is the sum of the maximal monotone operator and the mapping , which is single-valued, Lipschitz continuous, monotone and coercive.
Thus, we can apply [2, Theorem 2.1] and obtain the existence of such that (5.2) holds. Furthermore, there necessarily exists a corresponding unique element with
[TABLE]
It remains to prove the uniqueness of , which directly implies the uniqueness of . Assume that there exist and as well as and such that
[TABLE]
By considering the difference of these equations tested with , we obtain
[TABLE]
Since we must have and the proof is complete. ∎
For later use, we note that the solution operator for (5.1) is Lipschitz continuous.
Lemma 5.2.
Let Assumptions 4.1 and 4.2 be satisfied. For with let be the solution operator that maps to the unique solution of (5.1). Then is globally Lipschitz continuous with
[TABLE]
Proof.
Let and with be given. Let and , , denote the unique solutions of the equations
[TABLE]
By considering the difference of these equations, tested with , we obtain
[TABLE]
By using the Cauchy–Schwarz inequality for the right-hand side as well as the monotonicity and the Lipschitz continuity for the left-hand side, we get
[TABLE]
Reinserting then shows that
[TABLE]
as claimed. ∎
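The Lipschitz property of the solution operator can be observed on a model problem. The sketch assumes that the nonlinear problem has the form x + h * eta - h * b(x) = y with eta in f(x), and it picks the scalar example f = subdifferential of |x| and b(x) = c * x with c * h < 1; these choices and all names are assumptions. For this example the argument of the proof (monotonicity of f, Lipschitz continuity of b, and the Cauchy–Schwarz inequality) gives the Lipschitz constant 1 / (1 - c * h), which the code compares against sampled difference quotients.

```python
import random


def soft_threshold(y, tau):
    """Solve x + tau * eta = y with eta in the subdifferential of |.| at x."""
    if y > tau:
        return y - tau
    if y < -tau:
        return y + tau
    return 0.0


def solution_operator(y, h, c):
    """Solve x + h * eta - h * c * x = y with eta in the subdifferential of |x|."""
    scale = 1.0 - h * c  # positive because c * h < 1 is assumed
    return soft_threshold(y / scale, h / scale)


h, c = 0.1, 2.0
bound = 1.0 / (1.0 - h * c)
rng = random.Random(2)
worst_ratio = 0.0
for _ in range(10000):
    y1, y2 = rng.uniform(-3.0, 3.0), rng.uniform(-3.0, 3.0)
    if abs(y1 - y2) < 1e-6:
        continue  # skip nearly identical inputs to avoid rounding noise
    diff = abs(solution_operator(y1, h, c) - solution_operator(y2, h, c))
    worst_ratio = max(worst_ratio, diff / abs(y1 - y2))
```

The largest sampled difference quotient `worst_ratio` stays below `bound`, and the bound is attained whenever both inputs lie on the same side outside the thresholding region.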
Theorem 5.3.
Let Assumptions 4.1 to 4.4 be satisfied. Then for every step size , , with there exist uniquely determined families of square-integrable, -valued and -adapted random variables and such that , for every and
[TABLE]
for every , -almost surely.
Proof.
We prove the existence of and by induction over . From the assumptions on and it is clear that and are -adapted and square-integrable. In particular, it follows from Assumptions 4.1 and 4.4 that
[TABLE]
Next, we assume that and are -adapted, square-integrable and satisfy (5.3) for all . By Lemma 5.1 there exist uniquely determined and for almost every such that
[TABLE]
By Lemma 5.2, the solution operator that maps to is Lipschitz continuous. As is Lipschitz continuous and, hence, of linear growth it follows that is an -measurable and square-integrable random variable. To be more precise, we have the bound
[TABLE]
This implies, in particular, that
[TABLE]
is also a -measurable and square-integrable random variable as , and have these properties. This finishes the proof of the induction and hence that of the theorem. ∎
Next we state an a priori estimate for the sequence of random variables satisfying recursion (1.6).
Lemma 5.4.
Let Assumptions 4.1 to 4.4 be satisfied. For a step size , , with , let and be two families of -adapted random variables as stated in Theorem 5.3. Then there exists independent of the step size such that
[TABLE]
In addition, if , then there exists independent of the step size such that
[TABLE]
where is given by .
Remark 5.5.
If in Assumption 4.1, then and, hence, are bounded. In particular, (5.5) holds for any and for any step size with .
Proof of Lemma 5.4.
First, we recall the identity
[TABLE]
As , using Assumptions 4.1 and 4.2, it follows that
[TABLE]
where we also applied (4.1). Hence,
[TABLE]
for every , where we also applied the Cauchy–Schwarz and weighted Young inequalities. After a kick-back, we sum from to to obtain
[TABLE]
After taking expectations, the last term on the right-hand side vanishes. Then, applications of Itō’s isometry and (4.1) give
[TABLE]
Since the step-size bound ensures that
[TABLE]
the discrete Gronwall inequality (see, for example, [8]) is applicable and completes the proof of (5.4). Finally, it follows from the polynomial growth bound on that
[TABLE]
and an application of (5.4) then yields (5.5). ∎
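The role of the discrete Gronwall inequality in this proof can be illustrated with a simple explicit variant, which is an assumption chosen for illustration and not necessarily the version cited from [8]: if u_n <= a + c * h * (u_0 + ... + u_{n-1}) for all n, then u_n <= a * exp(c * n * h). The sketch computes the extremal sequence, for which equality holds in the hypothesis, and compares it with the exponential bound. All names and parameter values are hypothetical.

```python
import math


def gronwall_sequence(a, c, h, N):
    """Return u_n = a + c * h * sum_{j < n} u_j for n = 0, ..., N."""
    u, running_sum = [], 0.0
    for _ in range(N + 1):
        u_n = a + c * h * running_sum
        u.append(u_n)
        running_sum += u_n
    return u


a, c, T, N = 1.0, 2.0, 1.0, 200
h = T / N
u = gronwall_sequence(a, c, h, N)
bounds = [a * math.exp(c * n * h) for n in range(N + 1)]
```

The extremal sequence equals a * (1 + c * h)^n, so the bound is sharp up to the difference between (1 + c * h)^n and exp(c * n * h). This also shows why constants obtained from Gronwall-type arguments grow exponentially with the final time.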
6. Error estimates in the general case
In this section, we derive an error estimate for the backward Euler method given by (1.6) for the MSDE (1.5).
To prove the convergence of the scheme (1.6), let us fix some notation. Throughout this section, we assume that the equidistant step size is small enough so that the a priori estimates in Lemma 5.4 hold. Further, as in (3.9) and (3.10), we denote the piecewise linear interpolants of the discrete values by , for and
[TABLE]
for all and . Similarly, we define the piecewise constant interpolant by and
[TABLE]
for all and . Moreover, we introduce the stochastic processes and defined by
[TABLE]
as well as by and, for all and ,
[TABLE]
In view of (1.6) and the definition of for , , we obtain the representation
[TABLE]
We begin the derivation of our error estimate by considering the difference between the stochastic integral and its approximation .
Lemma 6.1.
Let Assumptions 4.1 to 4.4 be satisfied. Then there exists such that, for every equidistant step size , with and every , we have
[TABLE]
In addition, for every , there exists such that, for every and , the following estimates hold:
[TABLE]
and
[TABLE]
Proof.
Recall the definitions of and from (6.1) and (6.2). First, we add and subtract a term and then apply the triangle inequality. Then, for every and we arrive at
[TABLE]
by an application of Itō’s isometry. Furthermore, due to the Lipschitz continuity of we obtain
[TABLE]
where the last step follows from the identity
[TABLE]
which holds for every , . Finally, it follows from the same arguments as in the proof of Lemma 3.6 and by (4.1) for every that
[TABLE]
Together with the a priori bounds from Lemma 5.4 this shows (6.4).
It remains to prove the estimates (6.5) and (6.6). For (6.5) we first apply the Burkholder–Davis–Gundy-type inequality from Lemma 2.2 with constant and obtain for every and that
[TABLE]
where we also made use of the linear growth bound (4.1) in the last step. This proves (6.5). The bound in (6.6) can be shown by analogous arguments. ∎
The next lemma generalizes an important estimate from the proof of Theorem 3.7 to the multi-valued setting. In particular, we refer to Lemma 3.5 and (3.17).
Lemma 6.2.
Let Assumptions 4.1 to 4.4 be satisfied. For every step size , , with , let the families and of random variables be as stated in Theorem 5.3. Then there exists independent of the step size such that
[TABLE]
Proof.
The nonnegativity follows immediately from the monotonicity of . To prove the second inequality, we insert the scheme (5.3) and obtain
[TABLE]
For (6.7) we obtain
[TABLE]
because of the telescopic structure. Furthermore, it follows from Assumptions 4.1 and 4.4 that
[TABLE]
For (6.8) we apply Hölder’s inequality with and to obtain
[TABLE]
Then, from applications of the triangle inequality and Lemma 5.4, we get
[TABLE]
We apply the polynomial growth bound satisfied by and see that, for ,
[TABLE]
is fulfilled, while for we have
[TABLE]
In both cases the appearing terms are finite because of Assumption 4.4. Moreover, a further application of the triangle inequality yields
[TABLE]
Due to the linear growth bound (4.1) on and the a priori bound (5.4), it then follows that
[TABLE]
By application of Lemma 2.2 with constant , we obtain
[TABLE]
Together with the linear growth bound (4.1) on , this shows that
[TABLE]
Putting the estimates together proves the desired bound. ∎
We are now prepared to state and prove the main result of this section. While the main ingredients of the proof still consist of techniques introduced in [37, Sect. 4] for deterministic problems, the proof is somewhat more technical than the proof of Theorem 3.7. In particular, due to the presence of Lipschitz perturbations in the general problem (1.5) it is no longer possible to avoid an application of a Gronwall lemma. Moreover, as in [37, Sect. 4] we impose the following additional assumption on the multi-valued mapping .
Assumption 6.3.
There exists such that, for every , , , and ,
[TABLE]
In Lemma 3.2, we already proved that, if is the subdifferential of a convex potential, then Assumption 6.3 is satisfied with . For a further example, we refer to Section 7.
Theorem 6.4.
Let Assumptions 4.1 – 4.4 and Assumption 6.3 be satisfied. Let the step size , , be such that . Then there exists a constant independent of such that
[TABLE]
Proof.
Let us first introduce some additional notation. We will denote the error between the exact solution to (1.5) and the numerical approximation defined in (6.3) by , . Furthermore, it will be convenient to split the error into two parts
[TABLE]
where we define
[TABLE]
-almost surely for every . We expand the square of the norm of as
[TABLE]
In order to estimate the terms on the right-hand side of (6.11), we first observe in (6.9) that has absolutely continuous sample paths with . Hence we have for almost every . Therefore, after integrating from [math] to we get
[TABLE]
Furthermore, we also have
[TABLE]
Thus, after combining (6.12) and (6.13) we obtain
[TABLE]
For the first integral on the right-hand side of (6.14) we insert the derivative of and the definition of the error process . This yields, for almost every ,
[TABLE]
After recalling the definition of we use Assumptions 4.1 and 6.3. Then, for almost every and all , we get
[TABLE]
where the second term in the last step is non-positive due to the monotonicity of (cf. Definition 2.1). Moreover, because of the Lipschitz continuity of , we have for almost every that
[TABLE]
where we also made use of Young’s inequality. In addition, for every and , we have that
[TABLE]
Therefore,
[TABLE]
Altogether, for every and , we have shown that
[TABLE]
where we also inserted that as well as . It follows that, for every and ,
[TABLE]
Hence, together with Lemma 5.4 and Lemma 6.2 this shows that
[TABLE]
Next, we give an estimate for the second integral on the right-hand side of (6.14). For every and we decompose the integral as follows
[TABLE]
For every we then add and subtract in the second slot of the inner product in the first term on the right-hand side of (6.16). This gives
[TABLE]
After inserting the definition of from (6.10) the first integral is then equal to
[TABLE]
for all , , and . Since is square-integrable and -measurable it therefore follows that
[TABLE]
for all , and . Hence, after taking expectations in (6.16) we arrive at
[TABLE]
Inserting the definitions (6.9) and (6.10) of and and applying Hölder’s inequality with and , we get
[TABLE]
In the following, we will estimate , , , and separately. For we obtain after an application of Hölder’s inequality for sums that
[TABLE]
If then and . In this case all integrals appearing are finite due to the bounds in Theorem 4.7 and Lemma 5.4. Moreover, if then . Then it follows from further applications of Hölder’s inequality and Jensen’s inequality that
[TABLE]
as well as
[TABLE]
Hence, we arrive at the same conclusion. If then the processes and are globally bounded due to the bound on in Assumption 4.1. Using Lemma 6.1 we see that
[TABLE]
Altogether, this yields
[TABLE]
for a suitable constant , which is independent of . To estimate we argue analogously as in the case for to obtain that
[TABLE]
The first factor is bounded as we saw in the case for . Furthermore, using Lemma 6.1, we have that
[TABLE]
Due to the a priori bound (5.4), it follows that there exists a constant , which does not depend on such that
[TABLE]
The estimates for and follow analogously; the only new term that appears is of the form
[TABLE]
which is bounded due to Theorem 4.7 and the a priori bound (5.4). Therefore, there exist constants such that
[TABLE]
Hence, we obtain
[TABLE]
After taking expectations in (6.11) and inserting (6.14), (6.15), (6.17) as well as (6.4) from Lemma 6.1, we obtain for every that
[TABLE]
The assertion then follows from an application of Gronwall’s lemma, see for example, [11, Appendix B]. ∎
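For orientation, the integral form of Gronwall's lemma that is typically used at this point can be stated as follows. This is the textbook version with constants; the symbols $u$, $a$, and $b$ are placeholders chosen for illustration and are not the notation of the paper or of [11, Appendix B].

```latex
% Gronwall's lemma, integral form; a, b >= 0 are illustrative constants.
Let $u \colon [0,T] \to [0,\infty)$ be continuous and let $a, b \ge 0$ satisfy
\[
  u(t) \le a + b \int_0^t u(s) \, \mathrm{d}s
  \quad \text{for all } t \in [0,T].
\]
Then
\[
  u(t) \le a \, \mathrm{e}^{b t}
  \quad \text{for all } t \in [0,T].
\]
```

If $u(t)$ is read as the mean-square error at time $t$, a constant $a$ proportional to $k^{1/2}$ yields a root-mean-square error proportional to $k^{1/4}$, which would be consistent with a convergence order of at least 1/4.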
Remark 6.5.
Up to this point, we only proved convergence for but not for . However, from the existence of we also obtain that
[TABLE]
Analogously, we can write for the exact solution that
[TABLE]
Therefore, from the convergence of to and the Lipschitz continuity of and we also obtain the estimate
[TABLE]
for every .
7. Examples
7.1. Discontinuous drift coefficient
In this example, we show that Assumption 4.1 includes overdamped Langevin-type equations with a possibly discontinuous drift . We consider the convex, nonnegative, yet not continuously differentiable function , , which has a multi-valued subdifferential defined by
[TABLE]
This mapping fulfills Assumption 4.1 for . To be more precise, is a monotone function and there exists no proper monotone extension of its graph. In fact, the subdifferential of any proper, lower semi-continuous and convex function is a maximal monotone mapping by a well-known theorem of Rockafellar, cf. [45, Cor. 31.5.2] or [47, Satz 3.23].
Furthermore, we notice that as well as for every and . This shows that fulfills all the conditions of Assumption 4.1. It remains to verify Assumption 6.3. Since is the subdifferential of the variational inequality (3.2) is still satisfied in the sense that
[TABLE]
for all and . Following the same steps as in the proof of Lemma 3.2 but replacing , , and by arbitrary elements , , and , respectively, shows that Assumption 6.3 is fulfilled. Therefore, the backward Euler–Maruyama method (1.6) is well-defined and yields an approximation of the exact solution of
[TABLE]
where and are Lipschitz continuous and . To be more precise, the piecewise linear interpolant of the values defined in (6.3) fulfills
[TABLE]
for that does not depend on the step size . However, let us mention that the strong order of convergence of is not necessarily optimal in this particular example. We refer the reader to [9] for a corresponding result on the forward Euler–Maruyama method.
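One implicit step of the scheme can be made explicit in a simple special case. The sketch below assumes, purely for illustration, the one-dimensional potential |x| (whose subdifferential is the multi-valued sign function), no Lipschitz drift perturbation, and additive noise with a constant coefficient `sigma`; none of these choices is claimed to be the exact setting of the example above. Under these assumptions the implicit inclusion y - x in k * sign(x) is solved by soft thresholding, that is, by the resolvent of the subdifferential.

```python
import math
import random


def soft_threshold(y, k):
    """Resolvent of the subdifferential of |x|: the unique x with
    (y - x) / k in sign(x), where sign(0) is the interval [-1, 1]."""
    if y > k:
        return y - k
    if y < -k:
        return y + k
    return 0.0


def backward_euler_maruyama(x0, sigma, T, n_steps, rng):
    """Backward Euler-Maruyama path for dX + sign(X) dt = sigma dW
    (illustrative setting: no extra drift, constant noise coefficient)."""
    k = T / n_steps
    path = [x0]
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(k))  # Wiener increment over one step
        path.append(soft_threshold(path[-1] + sigma * dW, k))
    return path


rng = random.Random(0)  # fixed seed for reproducibility
path = backward_euler_maruyama(x0=1.0, sigma=0.5, T=1.0, n_steps=100, rng=rng)
```

The implicit step remains well-defined even though the drift is discontinuous at the origin: whenever the explicit predictor lands within distance k of zero, the scheme returns exactly zero instead of oscillating around the discontinuity.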
7.2. Stochastic p-Laplace equation
As a second example, we consider the discretization of the stochastic p-Laplace equation. A similar setting is studied in [5]. For a more detailed introduction to this class of problems, we refer the reader to this work and the references therein.
For and the stochastic p-Laplace equation is given by
[TABLE]
where , , is a bounded Lipschitz domain. By , , we denote a standard -adapted Wiener process. We also assume that the initial value fulfills
[TABLE]
Furthermore, let be a Lipschitz continuous mapping, where denotes the space of Hilbert–Schmidt operators from to . Note that the Nemytskii operator , given by for , is also Lipschitz continuous and will be of importance in the weak formulation below.
Further, let be the Sobolev space of weakly differentiable and -fold integrable functions on with vanishing trace on the boundary , see [46, Section 1.2.3] or [39, Section 4.5] for a precise definition. The dual space of is denoted by in the following. Then, the stochastic -Laplace equation (7.1) has a solution which is progressively measurable and an element of . For further details we refer to [27, Example 4.1.9, Theorem 4.2.4].
For a spatial discretization of (7.1), we use a family of finite element spaces such that for every . Hereby, we interpret as a spatial refinement parameter. In the following, we consider a fixed parameter value . By we then denote the dimension of the space .
The spatially semi-discrete problem is to find a progressively measurable stochastic process in the space such that
[TABLE]
for every and . Hereby, is the -orthogonal projection onto .
In order to apply our results from the previous sections, we rewrite (7.3) as a problem in . To this end, we consider a one-to-one relation between and given by
[TABLE]
for a basis of . Through (7.4) we induce additional norms on which are given by
[TABLE]
for every . Observe that the norm is also induced by the inner product
[TABLE]
where the mass matrix is symmetric and positive definite. Since all norms on are equivalent, for each there exists such that
[TABLE]
for all .
The p-Laplace operator in the spatially semi-discrete problem (7.3) can be written as which is implicitly defined by
[TABLE]
for all . By the same arguments as in [27, Example 4.1.9] one can easily verify that fulfills
[TABLE]
for all . Then, for and associated , we introduce mappings and implicitly by
[TABLE]
for and use these functions to define as well as for every . As we assumed that is Lipschitz continuous, there exists such that
[TABLE]
for and fulfilling (7.4) and an orthonormal basis of . Thus, fulfills Assumption 4.3. Due to the integrability condition (7.2) for , it follows that fulfills Assumption 4.4.
Moreover, we see that is monotone, coercive, and bounded as we can write
[TABLE]
as well as
[TABLE]
and
[TABLE]
for all and fulfilling (7.4). Here, denotes the matrix norm in which is induced by . Therefore, Assumption 4.1 is satisfied. To prove that fulfills Assumption 6.3 we note that the mapping given by
[TABLE]
is a potential of , compare [46, Example 4.23]. Since is convex it follows that
[TABLE]
where we use [13, Kapitel III, Lemma 4.10]. In the same way as in Lemma 3.2 we obtain that
[TABLE]
for all . Applying the definition of , we then get
[TABLE]
for and fulfilling (7.4). This shows that also fulfills Assumption 6.3.
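The monotonicity and the potential structure verified above can be checked numerically in a toy setting. The sketch below assumes a one-dimensional domain (0, 1), a uniform mesh, piecewise linear finite elements with homogeneous Dirichlet boundary values, and p = 3; all of these choices and all function names are illustrative. It works with the vector of pairings of the discrete operator against the nodal basis functions in the Euclidean inner product, not with the mass-matrix-weighted representation used in the text.

```python
import random


def _slopes(u, h):
    """Element-wise derivatives of the P1 function with interior nodal values u."""
    padded = [0.0] + list(u) + [0.0]  # homogeneous Dirichlet boundary values
    return [(padded[j + 1] - padded[j]) / h for j in range(len(padded) - 1)]


def p_laplace_p1(u, h, p):
    """Entries: integral of |u_h'|^(p-2) u_h' phi_i' over (0, 1), for p >= 2."""
    flux = [abs(s) ** (p - 2) * s for s in _slopes(u, h)]
    return [flux[i] - flux[i + 1] for i in range(len(u))]


def potential(u, h, p):
    """Discrete energy (1/p) * integral of |u_h'|^p over (0, 1)."""
    return sum(h * abs(s) ** p / p for s in _slopes(u, h))


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


rng = random.Random(1)
m, p = 9, 3.0  # number of interior nodes and exponent (illustrative)
h = 1.0 / (m + 1)
u = [rng.uniform(-1.0, 1.0) for _ in range(m)]
v = [rng.uniform(-1.0, 1.0) for _ in range(m)]

Au, Av = p_laplace_p1(u, h, p), p_laplace_p1(v, h, p)
# Monotonicity: (A(u) - A(v), u - v) >= 0.
monotonicity_gap = dot([a - b for a, b in zip(Au, Av)],
                       [a - b for a, b in zip(u, v)])
# Convexity of the potential: Phi(v) >= Phi(u) + (A(u), v - u).
convexity_gap = (potential(v, h, p) - potential(u, h, p)
                 - dot(Au, [b - a for a, b in zip(u, v)]))
```

Both gaps are nonnegative up to rounding, in line with the monotonicity of the operator and the convexity of its potential.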
Consequently, the results of the previous sections are applicable. More precisely, the backward Euler scheme (1.6) has a unique solution (cf. Theorem 5.3). Theorem 6.4 then states that the piecewise linear interpolant of the values defined in (6.3) fulfills
[TABLE]
for that does not depend on the step size where is the solution to the single-valued stochastic differential equation
[TABLE]
Observe that our proof does not yet rule out that the constant above depends on the dimension of the finite element space . Hence, this is not a complete analysis of a full discretization of the stochastic partial differential equation (7.1) and a more detailed analysis is subject to future work. We refer to [5] for a related result in this direction.
Let us emphasize that, unlike the results in [5], we do not have to impose any temporal regularity assumption on the exact solution of (7.1) or on the solution of the semi-discrete problem (7.3). Since such regularity conditions are often not easily verified for quasi-linear stochastic partial differential equations, we are confident that our approach could lead to interesting new insights in the numerical analysis of such infinite-dimensional problems.
Acknowledgment
ME would like to thank the Berlin Mathematical School for the financial support. RK also gratefully acknowledges financial support by the German Research Foundation (DFG) through the research unit FOR 2402 – Rough paths, stochastic partial differential equations and related topics – at TU Berlin.
- [1] A. Andersson and R. Kruse. Mean-square convergence of the BDF2-Maruyama and backward Euler schemes for SDE satisfying a global monotonicity condition. BIT Numer. Math., 57(1):21–53, 2017.
- [2] V. Barbu. Nonlinear Differential Equations of Monotone Types in Banach Spaces. Springer Monographs in Mathematics. Springer, New York, 2010.
- [3] F. Bernardin. Multivalued stochastic differential equations: convergence of a numerical scheme. Set-Valued Anal., 11(4):393–415, 2003.
- [4] W.-J. Beyn, E. Isaak, and R. Kruse. Stochastic C-stability and B-consistency of explicit and implicit Euler-type schemes. J. Sci. Comput., 67(3):955–987, 2016.
- [5] D. Breit and M. Hofmanová. Space-time approximation of stochastic p-Laplace systems. arXiv preprint, arXiv:1904.03134, 2019.
- [6] N. Brosse, A. Durmus, É. Moulines, and S. Sabanis. The tamed unadjusted Langevin algorithm. Stochastic Process. Appl., 2018. (in press).
- [7] E. Cépa. Équations différentielles stochastiques multivoques. In Séminaire de Probabilités, XXIX, volume 1613 of Lecture Notes in Math., pages 86–107. Springer, Berlin, 1995.
- [8] D. S. Clark. Short proof of a discrete Gronwall inequality. Discrete Appl. Math., 16(3):279–281, 1987.
