Donsker-Type Theorem for BSDEs: Rate of Convergence

Philippe Briand; Christel Geiss; Stefan Geiss; Céline Labart

arXiv:1908.01188·math.PR·August 6, 2019

TL;DR

This paper investigates the convergence rate of a Markovian backward stochastic differential equation (BSDE) approximation driven by a scaled random walk, extending Donsker-type theorems to BSDEs and analyzing their Wasserstein distance convergence.

Contribution

It establishes a quantitative rate of convergence, in the Wasserstein distance, for the Donsker-type approximation of BSDEs by equations driven by scaled random walks.

Findings

01

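Establishes the rate $n^{-(\alpha\wedge\frac{\varepsilon}{2})}$ in the Wasserstein distance for the random-walk approximation of Markovian BSDEs (Theorem 10).

02

Makes the Donsker-type theorem for BSDEs of Briand, Delyon and Mémin quantitative in the Markovian case.

03

Provides pointwise error bounds for the associated finite-difference scheme $U^{n}$ and its discrete gradient $\Delta^{n}$ (Proposition 11).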

Abstract

In this paper, we study in the Markovian case the rate of convergence in the Wasserstein distance of an approximation of the solution to a BSDE given by a BSDE which is driven by a scaled random walk as introduced in Briand, Delyon and Mémin (Electron. Comm. Probab. 6 (2001), 1–14).

Equations

$$Y_{t}=G(B)+\int_{t}^{T}f(B_{s},Y_{s},Z_{s})\,ds-\int_{t}^{T}Z_{s}\,dB_{s},\quad 0\leq t\leq T,$$

$$Y^{n}_{t}=G(B^{n})+\int_{]t,T]}f(B^{n}_{s-},Y^{n}_{s-},Z^{n}_{s})\,d\langle B^{n}\rangle_{s}-\int_{]t,T]}Z^{n}_{s}\,dB^{n}_{s},\quad 0\leq t\leq T,$$

$$B^{n}_{t}=\sqrt{\frac{T}{n}}\,\sum_{k=1}^{[nt/T]}\xi_{k},\quad 0\leq t\leq T,$$

$$\partial_{t}u(t,x)+\tfrac{1}{2}\Delta u(t,x)+f(t,x,u(t,x),\nabla u(t,x))=0,\quad (t,x)\in[0,T[\times\mathbb{R},\qquad u(T,\cdot)=g,$$

$$Y_{s}=u(s,B_{s})\quad\text{and}\quad Z_{s}=\nabla u(s,B_{s}).$$

$$W_{r}(Y^{n}_{t},Y_{t})\leq C_{r}\,n^{-(\alpha\wedge\frac{\varepsilon}{2})}\quad\text{and}\quad W_{r}(Z^{n}_{t},Z_{t})\leq\frac{C_{r}}{\sqrt{T-t}}\,n^{-(\alpha\wedge\frac{\varepsilon}{2})}$$

$$Y_{t}=g(B_{T})+\int_{t}^{T}f(s,B_{s},Y_{s},Z_{s})\,ds-\int_{t}^{T}Z_{s}\,dB_{s},\quad 0\leq t\leq T.$$

$$|g(x)-g(x^{\prime})|\leq\|g\|_{\varepsilon}\,|x-x^{\prime}|^{\varepsilon}.$$

$$|f(t,x,y,z)-f(t^{\prime},x^{\prime},y^{\prime},z^{\prime})|\leq\|f_{t}\|_{\alpha}\,|t-t^{\prime}|^{\alpha}+\|f_{x}\|_{\varepsilon}\,|x-x^{\prime}|^{\varepsilon}+\|f_{y}\|_{\text{\rm Lip}}\,|y-y^{\prime}|+\|f_{z}\|_{\text{\rm Lip}}\,|z-z^{\prime}|.$$

$$K_{f}:=\sup_{t\in[0,T]}|f(t,0,0,0)|.$$

$$Y^{t,x}_{s}=g(B^{t,x}_{T})+\int_{s}^{T}f(r,B^{t,x}_{r},Y^{t,x}_{r},Z^{t,x}_{r})\,dr-\int_{s}^{T}Z^{t,x}_{r}\,dB_{r},\quad t\leq s\leq T,$$

$$u(t,x):=Y^{t,x}_{t}=\mathbb{E}\left[g(B^{t,x}_{T})+\int_{t}^{T}f(r,B^{t,x}_{r},Y^{t,x}_{r},Z^{t,x}_{r})\,dr\right].$$

$$\nabla u(t,x)=\mathbb{E}\left[g(B^{t,x}_{T})\,\frac{B_{T}-B_{t}}{T-t}+\int_{t}^{T}f(r,B^{t,x}_{r},Y^{t,x}_{r},Z^{t,x}_{r})\,\frac{B_{r}-B_{t}}{r-t}\,dr\right],\quad (t,x)\in[0,T[\times\mathbb{R}.$$

$$F(s,x):=f(s,x,u(s,x),\nabla u(s,x))\quad\text{for }(s,x)\in[0,T[\times\mathbb{R},$$

$$u(t,x)=\mathbb{E}\left[g(B^{t,x}_{T})+\int_{t}^{T}F(r,B^{t,x}_{r})\,dr\right],\quad (t,x)\in[0,T[\times\mathbb{R},$$

$$\nabla u(t,x)=\mathbb{E}\left[g(B^{t,x}_{T})\,\frac{B_{T}-B_{t}}{T-t}+\int_{t}^{T}F(r,B^{t,x}_{r})\,\frac{B_{r}-B_{t}}{r-t}\,dr\right],\quad (t,x)\in[0,T[\times\mathbb{R}.$$

$$B^{n}_{t}:=\sqrt{h}\,\sum_{k=1}^{[t/h]}\xi_{k},\quad 0\leq t\leq T,$$

$$B^{n,t,x}_{s}:=x+B^{n}_{s}-B^{n}_{t}.$$

$$n_{t}:=[t/h],\quad\underline{t}:=h[t/h]=h\,n_{t}\quad\text{and}\quad\overline{t}:=h\lceil t/h\rceil,\quad t\in[0,T].$$

$$Y^{n}_{t}=g(B^{n}_{T})+\int_{]t,T]}f(s,B^{n}_{s^{-}},Y^{n}_{s^{-}},Z^{n}_{s})\,d\langle B^{n}\rangle_{s}-\int_{]t,T]}Z^{n}_{s}\,dB^{n}_{s},\quad t\in[0,T].$$

$$Y^{n}_{kh}=Y^{n}_{(k+1)h}+hf((k+1)h,B^{n}_{kh},Y^{n}_{kh},Z^{n}_{(k+1)h})-\sqrt{h}\,Z^{n}_{(k+1)h}\,\xi_{k+1},\qquad Y^{n}_{nh}=g(B^{n}_{T}).$$

$$Z^{n}_{(k+1)h}=h^{-1/2}\,\mathbb{E}\left(Y^{n}_{(k+1)h}\,\xi_{k+1}\mid\mathcal{F}^{n}_{kh}\right),\qquad Y^{n}_{kh}=\mathbb{E}\left(Y^{n}_{(k+1)h}\mid\mathcal{F}^{n}_{kh}\right)+hf((k+1)h,B^{n}_{kh},Y^{n}_{kh},Z^{n}_{(k+1)h}),$$

$$D^{n}_{+}u(x):=\tfrac{1}{2}\left(u(x+\sqrt{h})+u(x-\sqrt{h})\right),\qquad D^{n}_{-}u(x):=\tfrac{1}{2}\left(u(x+\sqrt{h})-u(x-\sqrt{h})\right),$$

$$\nabla^{n}u(x):=h^{-1/2}D^{n}_{-}u(x).$$

$$\left\{\begin{array}{l}U^{n}(kh,x)=D^{n}_{+}U^{n}((k+1)h,x)+hf((k+1)h,x,U^{n}(kh,x),h^{-1/2}D^{n}_{-}U^{n}((k+1)h,x)),\\ U^{n}(nh,x)=g(x).\end{array}\right.$$

$$Y^{n}_{kh}=U^{n}(kh,B^{n}_{kh})\quad\text{and}\quad Z^{n}_{kh}=\nabla^{n}U^{n}(kh,B^{n}_{(k-1)h}).$$

$$Y^{n}_{t}=Y^{n}_{\underline{t}}=U^{n}(\underline{t},B^{n}_{t}),\quad t\in[0,T],\quad\text{and}\quad Z^{n}_{t}=Z^{n}_{\overline{t}}=\nabla^{n}U^{n}(\overline{t},B^{n}_{t-}),\quad t\in\,]0,T].$$

$$Y^{n,\underline{t},x}_{s}=g(B^{n,t,x}_{T})+\int_{]s,T]}f(r,B^{n,t,x}_{r^{-}},Y^{n,\underline{t},x}_{r^{-}},Z^{n,\underline{t},x}_{r})\,d\langle B^{n}\rangle_{r}-\int_{]s,T]}Z^{n,\underline{t},x}_{r}\,dB^{n}_{r},\quad s\in[\underline{t},T].$$

$$Y^{n,t,x}_{s}=Y^{n,\underline{t},x}_{\underline{s}}=U^{n}(s,B^{n,t,x}_{s}),\quad 0\leq t\leq s\leq T.$$

Full text

Donsker-Type Theorem for BSDEs:

Rate of Convergence

Philippe Briand ¹ ²

Christel Geiss

Stefan Geiss

Céline Labart ²

¹ In memory of Jean Mémin from whom I learned lots of mathematics.

² Many thanks to Pierre Baras for very fruitful discussions about the heat equation.

Abstract

In this paper, we study in the Markovian case the rate of convergence in the Wasserstein distance of an approximation of the solution to a BSDE given by a BSDE which is driven by a scaled random walk as introduced in Briand, Delyon and Mémin (Electron. Comm. Probab. 6 (2001), 1–14).

1. Introduction

In this paper, we are concerned with the discretization of solutions to BSDEs of the form

$$Y_{t}=G(B)+\int_{t}^{T}f(B_{s},Y_{s},Z_{s})\,ds-\int_{t}^{T}Z_{s}\,dB_{s},\quad 0\leq t\leq T,$$

where $B$ is a standard Brownian motion. These equations have been introduced by Jean-Michel Bismut for linear generators in [2] and by Étienne Pardoux and Shige Peng for Lipschitz generators in [14].

In one of the first studies on this topic, in the case where the generator $f$ may depend on $z$ as well, Philippe Briand, Bernard Delyon and Jean Mémin [5] proposed an approximation based on Donsker’s theorem. They showed that the solution $(Y,Z)$ to the previous BSDE can be approximated by the solution $(Y^{n},Z^{n})$ to the BSDE

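$$Y^{n}_{t}=G(B^{n})+\int_{]t,T]}f(B^{n}_{s-},Y^{n}_{s-},Z^{n}_{s})\,d\langle B^{n}\rangle_{s}-\int_{]t,T]}Z^{n}_{s}\,dB^{n}_{s},\quad 0\leq t\leq T,$$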

where $B^{n}$ is the scaled random walk

$$B^{n}_{t}=\sqrt{\frac{T}{n}}\,\sum_{k=1}^{[nt/T]}\xi_{k},\quad 0\leq t\leq T,$$

and $(\xi_{k})_{k\geq 1}$ is an i.i.d. sequence of symmetric Bernoulli random variables. They proved in full generality, meaning that $G(B)$ is only required to be a square integrable random variable, that $(Y^{n},Z^{n})$ converges to $(Y,Z)$. However, the question of the rate of convergence was left open. At present it seems hopeless to obtain a result in this direction for such a general path-dependent terminal condition $G(B)$. But in the Markovian case, meaning that $G(B)=g(B_{T})$, the problem appears tractable, in particular owing to the underlying PDE structure. Indeed, $(Y,Z)$ is related to the semilinear heat equation

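$$\partial_{t}u(t,x)+\tfrac{1}{2}\Delta u(t,x)+f(t,x,u(t,x),\nabla u(t,x))=0,\quad (t,x)\in[0,T[\times\mathbb{R},\qquad u(T,\cdot)=g,\tag{1}$$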

where, under certain regularity conditions, we can choose

$$Y_{s}=u(s,B_{s})\quad\text{and}\quad Z_{s}=\nabla u(s,B_{s}).$$

In the case where $B^{n}$ is the discretized Brownian motion, the link to PDEs was exploited in [18, 3] to get the rate of convergence, in the Markovian case, of the classical scheme for BSDEs. The convergence of this scheme was already proved in [6, Proposition 13] for a general terminal condition and a generator that is Lipschitz in its spatial coordinates but without any rate of convergence.

Even though the link with PDEs was pointed out in [5], the rate of convergence of the approximation of BSDEs given by scaled random walks was completely open. In two recent papers, Christel Geiss, Céline Labart and Antti Luoto [10, 9] gave a first answer to this question. They showed that the error between $(Y^{n},Z^{n})$ and $(Y,Z)$ is of order $n^{-\varepsilon/4}$ when $g$ is assumed to be $\varepsilon$-Hölder continuous and $f(t,\cdot)$ Lipschitz continuous. One of the main arguments in these papers consists in constructing the random walk from the Brownian motion $B$ using the Skorohod embedding (see [17]) together with generalizations of the pioneering work of Jin Ma and Jianfeng Zhang [13] on representation theorems for BSDEs. This approach allows one to work with convergence in the $\mathrm{L}^{2}$-sense even though the problem naturally arises in the weak sense. The drawback is that the rate of convergence $n^{-\varepsilon/4}$ obtained in these papers is not optimal, as one can expect $n^{-\varepsilon/2}$.

The objective of our study is to confirm this expected rate $n^{-\varepsilon/2}$ . This improvement was possible by using a weak limit approach, where the error is considered in the Wasserstein distance. Our starting point is a result of Emmanuel Rio [15] who proved that, when $T=1$ , for all $r\geq 1$ , there exists a constant $C_{r}$ such that, for all $n\geq 1$ , $W_{r}(B^{n}_{1},G)\leq C_{r}\,n^{-1/2}$ , where $W_{r}$ is the $\mathrm{L}^{r}$ -Wasserstein distance and $G$ a standard normal random variable (see Section 3). Firstly, we generalize this result to cover the case where $f\equiv 0$ which corresponds to the heat equation. Then, using the associated PDE, in particular representation formulas in the spirit of [13], we are able to prove that

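$$W_{r}(Y^{n}_{t},Y_{t})\leq C_{r}\,n^{-(\alpha\wedge\frac{\varepsilon}{2})}\quad\text{and}\quad W_{r}(Z^{n}_{t},Z_{t})\leq\frac{C_{r}}{\sqrt{T-t}}\,n^{-(\alpha\wedge\frac{\varepsilon}{2})}$$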

for $t\in[0,T]$ and $t\in[0,T[$ , respectively, when $g$ and $f(t,\cdot,y,z)$ are $\varepsilon$ -Hölder continuous and $f(\cdot,x,y,z)$ is $\alpha$ -Hölder continuous. We refer to Theorem 10 in Section 5 for the precise statement. One of the main difficulties in the proof concerned various gradient estimates in order to obtain the estimate for $W_{r}(Z^{n}_{t},Z_{t})$ .

For $\varepsilon=1$ and $\alpha\geq 1/2$ we obtain the rate $n^{-\frac{1}{2}}$, which is the same rate as the one obtained by Rio for the random walk approximation of a Gaussian random variable in the Wasserstein distance, as mentioned above.

2. Notation

In all the sequel, $T>0$ is a fixed positive real number. We work on a complete probability space $(\Omega,\mathcal{F},\mathbb{P})$ carrying a standard real Brownian motion $\{B_{t}\}_{0\leq t\leq T},$ and $\{\mathcal{F}_{t}\}_{0\leq t\leq T}$ stands for the augmented filtration of $B$ which is right continuous and complete.

We consider the following BSDE

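$$Y_{t}=g(B_{T})+\int_{t}^{T}f(s,B_{s},Y_{s},Z_{s})\,ds-\int_{t}^{T}Z_{s}\,dB_{s},\quad 0\leq t\leq T.\tag{2}$$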

Throughout this article, we will assume for the function $g$ defining the terminal condition and the generator $f$ the following:

Assumption (A1).

There exist $0<\varepsilon\leq 1$ and $0<\alpha\leq 1$ such that it holds:

(i)

The function $g:\mathbb{R}\longrightarrow\mathbb{R}$ is $\varepsilon$ -Hölder continuous: for all $(x,x^{\prime})\in\mathbb{R}^{2}$ one has

$$|g(x)-g(x^{\prime})|\leq\|g\|_{\varepsilon}\,|x-x^{\prime}|^{\varepsilon}.$$

(ii)

The function $f:[0,T]\times\mathbb{R}\times\mathbb{R}\times\mathbb{R}\longrightarrow\mathbb{R}$ is $\alpha$ -Hölder continuous in time, $\varepsilon$ -Hölder continuous in space and Lipschitz continuous with respect to $(y,z)$ : for all $(t,x,y,z)$ and $(t^{\prime},x^{\prime},y^{\prime},z^{\prime})$ in $[0,T]\times\mathbb{R}\times\mathbb{R}\times\mathbb{R}$ one has

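$$|f(t,x,y,z)-f(t^{\prime},x^{\prime},y^{\prime},z^{\prime})|\leq\|f_{t}\|_{\alpha}\,|t-t^{\prime}|^{\alpha}+\|f_{x}\|_{\varepsilon}\,|x-x^{\prime}|^{\varepsilon}+\|f_{y}\|_{\text{\rm Lip}}\,|y-y^{\prime}|+\|f_{z}\|_{\text{\rm Lip}}\,|z-z^{\prime}|.\tag{3}$$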

Most of the time, we do not need to distinguish between $\|f_{y}\|_{\text{\rm Lip}}$ and $\|f_{z}\|_{\text{\rm Lip}}$ and we let $\|f\|_{\text{\rm Lip}}:=\max\left(\|f_{y}\|_{\text{\rm Lip}},\|f_{z}\|_{\text{\rm Lip}}\right)$ .

Convention: Later the phrase that a constant $C>0$ depends on $(T,\varepsilon,f,g)$ stands for the fact that $C$ can be expressed in terms of $(T,\varepsilon,\|f_{x}\|_{\varepsilon},\|f_{y}\|_{\text{\rm Lip}},\|f_{z}\|_{\text{\rm Lip}},K_{f},\|g\|_{\varepsilon},g(0))$ where

$$K_{f}:=\sup_{t\in[0,T]}|f(t,0,0,0)|.$$

Similarly, a dependence on $(T,\alpha,\varepsilon,f,g)$ means an additional dependence on $(\alpha,\|f_{t}\|_{\alpha})$ .

From [4, Theorem 4.2] it is known that under (A1), the BSDE (2) has a unique $\mathrm{L}^{p}$ -solution $(Y,Z)$ for any $p\in]1,\infty[.$ So for $(t,x)\in[0,T[\times\mathbb{R}$ we let $\left(Y^{t,x}_{s},Z^{t,x}_{s}\right)_{s\in[t,T]}$ be the square integrable solution to the BSDE

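$$Y^{t,x}_{s}=g(B^{t,x}_{T})+\int_{s}^{T}f(r,B^{t,x}_{r},Y^{t,x}_{r},Z^{t,x}_{r})\,dr-\int_{s}^{T}Z^{t,x}_{r}\,dB_{r},\quad t\leq s\leq T,\tag{4}$$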

where $B^{t,x}_{r}:=x+B_{r}-B_{t}$ , and set, as usual, for $x\in\mathbb{R}$ , $u(T,x):=g(x)$ , and, for $(t,x)\in[0,T[\times\mathbb{R}$ ,

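$$u(t,x):=Y^{t,x}_{t}=\mathbb{E}\left[g(B^{t,x}_{T})+\int_{t}^{T}f(r,B^{t,x}_{r},Y^{t,x}_{r},Z^{t,x}_{r})\,dr\right].\tag{5}$$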

It is well known that the function $u$ is continuous on $[0,T]\times\mathbb{R}$ (see also Lemma 6 below) and under Lipschitz assumptions in $(x,y,z)$ and for $\alpha\geq\tfrac{1}{2}$ it is the viscosity solution to (1), see [20, Theorem 5.5.8]. Moreover, in this Markovian setting, for $(t,x)\in[0,T]\times\mathbb{R}$ , we have $Y^{t,x}_{s}=u(s,B^{t,x}_{s})$ a.s. for all $s\in[t,T]$ . In [19, Theorem 3.2], for a generator which is Lipschitz continuous in all space variables and a measurable $g$ with polynomial growth, J. Zhang proved that $u$ belongs to $\mathcal{C}^{0,1}\left([0,T[\times\mathbb{R}\right)$ and that $Z^{t,x}_{s}=\nabla u(s,B^{t,x}_{s})$ a.e. on $[t,T[\times\Omega$ . Moreover, the following representation holds

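$$\nabla u(t,x)=\mathbb{E}\left[g(B^{t,x}_{T})\,\frac{B_{T}-B_{t}}{T-t}+\int_{t}^{T}f(r,B^{t,x}_{r},Y^{t,x}_{r},Z^{t,x}_{r})\,\frac{B_{r}-B_{t}}{r-t}\,dr\right],\quad (t,x)\in[0,T[\times\mathbb{R}.$$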

If $F$ is the function given by

$$F(s,x):=f(s,x,u(s,x),\nabla u(s,x))\quad\text{for }(s,x)\in[0,T[\times\mathbb{R},$$

we thus have

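$$u(t,x)=\mathbb{E}\left[g(B^{t,x}_{T})+\int_{t}^{T}F(r,B^{t,x}_{r})\,dr\right],\quad (t,x)\in[0,T[\times\mathbb{R},\tag{6}$$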

together with

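$$\nabla u(t,x)=\mathbb{E}\left[g(B^{t,x}_{T})\,\frac{B_{T}-B_{t}}{T-t}+\int_{t}^{T}F(r,B^{t,x}_{r})\,\frac{B_{r}-B_{t}}{r-t}\,dr\right],\quad (t,x)\in[0,T[\times\mathbb{R}.\tag{7}$$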

These formulas play an important role in the sequel.
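As a sanity check of the two representation formulas in the simplest case $f\equiv 0$ (so that $F\equiv 0$), the following sketch evaluates $\mathbb{E}[g(B^{t,x}_{T})]$ and $\mathbb{E}[g(B^{t,x}_{T})(B_{T}-B_{t})/(T-t)]$ by a trapezoidal quadrature against the Gaussian density. The helper names and the test function $g=\sin$ are illustrative assumptions, not taken from the paper; for this $g$ the closed forms are $\sin(x)e^{-(T-t)/2}$ and $\cos(x)e^{-(T-t)/2}$.

```python
import math

def gaussian_expectation(func, half_width=10.0, steps=4000):
    """Approximate E[func(G)], G ~ N(0,1), by the composite trapezoidal rule."""
    dz = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        z = -half_width + i * dz
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * func(z) * math.exp(-0.5 * z * z)
    return total * dz / math.sqrt(2.0 * math.pi)

def u_heat(g, t, x, T):
    """u(t,x) = E[g(x + B_T - B_t)] when the generator vanishes."""
    tau = T - t
    return gaussian_expectation(lambda z: g(x + math.sqrt(tau) * z))

def grad_u_heat(g, t, x, T):
    """grad u(t,x) = E[g(x + B_T - B_t) (B_T - B_t)/(T - t)] when f = 0."""
    tau = T - t
    # B_T - B_t = sqrt(tau) * z, so the weight (B_T - B_t)/(T - t) is z / sqrt(tau).
    return gaussian_expectation(
        lambda z: g(x + math.sqrt(tau) * z) * z / math.sqrt(tau)
    )

t, x, T = 0.25, 0.7, 1.0
print(u_heat(math.sin, t, x, T), math.sin(x) * math.exp(-(T - t) / 2.0))
print(grad_u_heat(math.sin, t, x, T), math.cos(x) * math.exp(-(T - t) / 2.0))
```

The second formula needs no derivative of $g$, which is why it remains usable for merely Hölder continuous terminal conditions.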

In Section 4 and in the appendix, we extend these results to the case where $f(t,\cdot,y,z)$ is $\varepsilon$ -Hölder continuous and make the regularity of $u$ and $\nabla u$ precise.

As mentioned before, we are concerned with the approximation of the solution $\left(Y^{t,x},Z^{t,x}\right)$ to (4) by a solution to the BSDE driven by a scaled random walk. To do this, let us consider, on some probability space, not necessarily $\left(\Omega,\mathcal{F},\mathbb{P}\right)$ , an i.i.d. sequence $(\xi_{k})_{k\geq 1}$ of symmetric Bernoulli random variables. For $n\in\mathbb{N}^{*}:=\{1,2,3,...\}$ we set $h:=T/n$ and we consider the scaled random walk

$$B^{n}_{t}:=\sqrt{h}\,\sum_{k=1}^{[t/h]}\xi_{k},\quad 0\leq t\leq T,$$

where $[x]:=\max\{r\in\mathbb{Z}:r\leq x\}$ for any real number $x.$ As we did for the Brownian motion, for $x\in\mathbb{R}$ and $0\leq t\leq s\leq T$ we put

$$B^{n,t,x}_{s}:=x+B^{n}_{s}-B^{n}_{t}.$$

Let us introduce some further notation. We denote the ceiling function by $\lceil x\rceil:=\min\{r\in\mathbb{Z}:r\geq x\}$ for $x\in\mathbb{R}.$ Moreover, we set

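$$n_{t}:=[t/h],\quad\underline{t}:=h[t/h]=h\,n_{t}\quad\text{and}\quad\overline{t}:=h\lceil t/h\rceil,\quad t\in[0,T].$$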
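The notation above can be made concrete with a short sketch. It assumes the usual Donsker scaling, in which each step of the walk has size $\sqrt{h}$ with $h=T/n$; the function names and the round-off tolerance `tol` are illustrative choices, not part of the paper.

```python
import math
import random

def floor_index(t, h, tol=1e-12):
    """n_t = [t/h]; tol guards against round-off when t is a grid point."""
    return math.floor(t / h + tol)

def lower_grid(t, h):
    """Largest grid point h*[t/h] not exceeding t."""
    return h * floor_index(t, h)

def upper_grid(t, h, tol=1e-12):
    """Smallest grid point h*ceil(t/h) not below t."""
    return h * math.ceil(t / h - tol)

def scaled_walk(xi, h):
    """Return t -> B^n_t = sqrt(h) * (xi_1 + ... + xi_[t/h])."""
    partial = [0]
    for step in xi:
        partial.append(partial[-1] + step)
    root_h = math.sqrt(h)

    def path(t):
        return root_h * partial[floor_index(t, h)]

    return path

def shifted_walk(path, t, x):
    """Return s -> B^{n,t,x}_s = x + B^n_s - B^n_t."""
    return lambda s: x + path(s) - path(t)

T, n = 1.0, 8
h = T / n
xi = [random.choice((-1, 1)) for _ in range(n)]  # symmetric Bernoulli steps
walk = scaled_walk(xi, h)
print([round(walk(k * h), 4) for k in range(n + 1)])
print(lower_grid(0.3, h), upper_grid(0.3, h))
```

The path is piecewise constant and right-continuous, so `walk(t)` equals `walk(lower_grid(t, h))` for every `t`.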

For $n\in\mathbb{N}^{*}$ let us consider the following BSDE driven by $B^{n}$ :

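$$Y^{n}_{t}=g(B^{n}_{T})+\int_{]t,T]}f(s,B^{n}_{s^{-}},Y^{n}_{s^{-}},Z^{n}_{s})\,d\langle B^{n}\rangle_{s}-\int_{]t,T]}Z^{n}_{s}\,dB^{n}_{s},\quad t\in[0,T].$$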

It was shown in [5] that, as soon as $h\,\|f\|_{\text{\rm Lip}}<1$ , this BSDE has a unique square integrable solution $(Y^{n},Z^{n})$ , $Y^{n}$ being adapted and $Z^{n}$ being predictable with respect to the filtration generated by $B^{n}$ . By construction, $Y^{n}$ is a piecewise constant càdlàg process with $Y^{n}_{t}=Y^{n}_{\underline{t}}$ . The process $Z^{n}$ is defined as an element of $\mathrm{L}^{2}(\Omega\times[0,T],d\mathbb{P}\otimes d\langle B^{n}\rangle),$ where we start with a $Z^{n}$ defined only on the points $\{kh:k=1,\ldots,n\}$ and extend it to $]0,T]$ as a càglàd process $(Z^{n}_{t})_{t\in]0,T]}$ by setting $Z^{n}_{t}=Z^{n}_{\overline{t}}$ . The previous BSDE is actually a discrete BSDE that can be solved by hand since, for $k=0,\cdots,n-1$ , we have

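$$Y^{n}_{kh}=Y^{n}_{(k+1)h}+hf((k+1)h,B^{n}_{kh},Y^{n}_{kh},Z^{n}_{(k+1)h})-\sqrt{h}\,Z^{n}_{(k+1)h}\,\xi_{k+1},\qquad Y^{n}_{nh}=g(B^{n}_{T}).$$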

Thus, if $Y^{n}_{(k+1)h}$ is given,

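$$\begin{aligned}Z^{n}_{(k+1)h}&=h^{-1/2}\,\mathbb{E}\left(Y^{n}_{(k+1)h}\,\xi_{k+1}\mid\mathcal{F}^{n}_{kh}\right),\\ Y^{n}_{kh}&=Y^{n}_{(k+1)h}+hf((k+1)h,B^{n}_{kh},Y^{n}_{kh},Z^{n}_{(k+1)h})-\sqrt{h}\,Z^{n}_{(k+1)h}\,\xi_{k+1}\\ &=\mathbb{E}\left(Y^{n}_{(k+1)h}\mid\mathcal{F}^{n}_{kh}\right)+hf((k+1)h,B^{n}_{kh},Y^{n}_{kh},Z^{n}_{(k+1)h}),\end{aligned}$$

where the last equality follows by taking the conditional expectation w.r.t. $\mathcal{F}^{n}_{kh}$ of the second line. The first line is obtained in the same way after multiplying the second line by $\xi_{k+1}$, since $\xi_{k+1}^{2}=1$ and $\xi_{k+1}$ is centered and independent of $\mathcal{F}^{n}_{kh}$.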

Since we are in a Markovian setting, there is also an analog of the Feynman-Kac formula. If $u$ is a given function we set

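$$D^{n}_{+}u(x):=\tfrac{1}{2}\left(u(x+\sqrt{h})+u(x-\sqrt{h})\right),\qquad D^{n}_{-}u(x):=\tfrac{1}{2}\left(u(x+\sqrt{h})-u(x-\sqrt{h})\right),$$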

and

$$\nabla^{n}u(x):=h^{-1/2}D^{n}_{-}u(x).$$

Remark 1.

From the definition of $D^{n}_{+}$ and $D^{n}_{-}$ , we get that if $u$ is $\varepsilon$ -Hölder, $D^{n}_{+}u$ and $D^{n}_{-}u$ are also $\varepsilon$ -Hölder with constant $\|u\|_{\varepsilon}$ .
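A small numerical illustration of Remark 1, under the assumption that $D^{n}_{\pm}$ shift the argument by $\pm\sqrt{h}$ (the step size of $B^{n}$): the sketch builds the operators as closures and estimates Hölder quotients on a finite grid. The names `d_plus`, `d_minus`, `grad_n` and `holder_quotient` are hypothetical helpers introduced only for this example.

```python
import math

def d_plus(u, h):
    """D^n_+ u(x) = (u(x + sqrt(h)) + u(x - sqrt(h))) / 2."""
    r = math.sqrt(h)
    return lambda x: 0.5 * (u(x + r) + u(x - r))

def d_minus(u, h):
    """D^n_- u(x) = (u(x + sqrt(h)) - u(x - sqrt(h))) / 2."""
    r = math.sqrt(h)
    return lambda x: 0.5 * (u(x + r) - u(x - r))

def grad_n(u, h):
    """Discrete gradient h^{-1/2} D^n_- u."""
    dm = d_minus(u, h)
    return lambda x: dm(x) / math.sqrt(h)

def holder_quotient(v, eps, points):
    """Largest |v(x) - v(y)| / |x - y|^eps over distinct grid points."""
    return max(
        abs(v(x) - v(y)) / abs(x - y) ** eps
        for x in points for y in points if x != y
    )

eps, h = 0.5, 0.04
u = lambda x: abs(x) ** eps            # eps-Hoelder with constant 1
grid = [i / 10.0 for i in range(-30, 31)]
print(holder_quotient(u, eps, grid))
print(holder_quotient(d_plus(u, h), eps, grid))
print(holder_quotient(d_minus(u, h), eps, grid))
print(grad_n(lambda x: x * x, 0.01)(1.5))  # central difference of x^2 at 1.5
```

Both averaged functions inherit the Hölder constant of $u$ by the triangle inequality, whereas $\nabla^{n}u$ carries the extra factor $h^{-1/2}$.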

Let $U^{n}$ be the solution to the finite difference equation, where for $x\in\mathbb{R}$ and $k=0,\ldots,n-1$ we require

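$$\left\{\begin{array}{l}U^{n}(kh,x)=D^{n}_{+}U^{n}((k+1)h,x)+hf((k+1)h,x,U^{n}(kh,x),h^{-1/2}D^{n}_{-}U^{n}((k+1)h,x)),\\ U^{n}(nh,x)=g(x).\end{array}\right.$$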

Then, we obtain from (8) and (9) (cf. [5, Proposition 5.1]) that

$$Y^{n}_{kh}=U^{n}(kh,B^{n}_{kh})\quad\text{and}\quad Z^{n}_{kh}=\nabla^{n}U^{n}(kh,B^{n}_{(k-1)h}).$$

These formulas rewrite in continuous time to

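$$Y^{n}_{t}=Y^{n}_{\underline{t}}=U^{n}(\underline{t},B^{n}_{t}),\quad t\in[0,T],\quad\text{and}\quad Z^{n}_{t}=Z^{n}_{\overline{t}}=\nabla^{n}U^{n}(\overline{t},B^{n}_{t-}),\quad t\in\,]0,T].$$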

If we set, for $0\leq t\leq T$ and $x\in\mathbb{R}$ , $U^{n}(t,x):=U^{n}(\underline{t},x)$ , we have $Y^{n}_{t}=U^{n}(t,B^{n}_{t})$ .
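The finite difference equation for $U^{n}$ can be solved backward on the recombining lattice reached by the walk. The sketch below is one possible implementation, not the authors' code: it assumes steps of size $\sqrt{h}$, resolves the implicit dependence on $U^{n}(kh,x)$ by a Picard iteration (a contraction when $h\,\|f_{y}\|_{\text{\rm Lip}}<1$), and uses hypothetical names such as `solve_scheme`.

```python
import math

def solve_scheme(f, g, T, n, x0=0.0, max_iter=200, tol=1e-14):
    """Backward induction for U^n on the lattice x0 + (2j - k) sqrt(h).

    Returns (U, Z) with U[k][j] = U^n(kh, x_kj) and
    Z[k][j] = h^{-1/2} D^n_- U^n((k+1)h, x_kj).
    """
    h = T / n
    r = math.sqrt(h)
    U = [None] * (n + 1)
    Z = [None] * n
    U[n] = [g(x0 + (2 * j - n) * r) for j in range(n + 1)]
    for k in range(n - 1, -1, -1):
        level, zlevel = [], []
        for j in range(k + 1):
            x = x0 + (2 * j - k) * r
            up, down = U[k + 1][j + 1], U[k + 1][j]  # values at x + r and x - r
            mean = 0.5 * (up + down)                 # D^n_+ U^n((k+1)h, x)
            z = 0.5 * (up - down) / r                # h^{-1/2} D^n_- U^n((k+1)h, x)
            y = mean
            for _ in range(max_iter):                # fixed point y = mean + h f(., y, z)
                y_new = mean + h * f((k + 1) * h, x, y, z)
                if abs(y_new - y) <= tol:
                    y = y_new
                    break
                y = y_new
            level.append(y)
            zlevel.append(z)
        U[k], Z[k] = level, zlevel
    return U, Z

# Linear generator f = -y/2 with g = 1: the exact solution is u(0, x) = exp(-T/2).
U, _ = solve_scheme(lambda t, x, y, z: -0.5 * y, lambda x: 1.0, T=1.0, n=50)
print(U[0][0], math.exp(-0.5))
```

Storing only the $k+1$ reachable nodes at level $k$ keeps the cost at $O(n^{2})$ evaluations of $f$ per Picard sweep.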

More generally, for $0\leq t<T$ , we define $(Y^{n,t,x},Z^{n,t,x})$ as the solution $Y^{n,\underline{t},x}=(Y^{n,\underline{t},x}_{s})_{s\in[\underline{t},T]}$ and $Z^{n,\underline{t},x}=(Z^{n,\underline{t},x}_{s})_{s\in]\underline{t},T]}$ to the BSDE

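$$Y^{n,\underline{t},x}_{s}=g(B^{n,t,x}_{T})+\int_{]s,T]}f(r,B^{n,t,x}_{r^{-}},Y^{n,\underline{t},x}_{r^{-}},Z^{n,\underline{t},x}_{r})\,d\langle B^{n}\rangle_{r}-\int_{]s,T]}Z^{n,\underline{t},x}_{r}\,dB^{n}_{r},\quad s\in[\underline{t},T].$$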

We set $Y^{n,T,x}_{T}=g(x)$ . Then,

$$Y^{n,t,x}_{s}=Y^{n,\underline{t},x}_{\underline{s}}=U^{n}(s,B^{n,t,x}_{s}),\quad 0\leq t\leq s\leq T.$$

Let us observe that $Z^{n,\underline{t},x}$ is first defined at the points $t=kh$ , $k=n_{t}+1,\ldots,n$ . As before we let $Z^{n,\underline{t},x}_{s}:=Z^{n,\underline{t},x}_{\overline{s}}$ for $s\in\,]\underline{t},T]$ . We have

[TABLE]

In particular,

[TABLE]

Of course, we have

[TABLE]

Similarly, we define, for $(t,x)\in[0,T[\times\mathbb{R}$ ,

[TABLE]

With this notation, (16) rewrites as

[TABLE]

It follows that

[TABLE]

which rewrites, taking into account (15) and (18), to

[TABLE]

where

[TABLE]

We will prove in Section 5 that $(U^{n},\Delta^{n})$ converges to $(u,\nabla u)$ .

From now on we assume that $n\geq n_{0}(T,\|f\|_{\text{\rm Lip}})$ where $n_{0}(T,\|f\|_{\text{\rm Lip}})\in\mathbb{N}^{*}$ is the integer given in Lemma 12 in the appendix and which automatically implies also existence and uniqueness of solutions because $n_{0}>T\|f\|_{\text{\rm Lip}}.$

3. Scaled random walk and Wasserstein distance

One starting point of our paper is the following result of Emmanuel Rio [15] (Theorem 2.1); see also [16]. This result covers, up to a generalization, the case where the generator vanishes, i.e. $f\equiv 0$ .

Let $\psi$ be the convex function defined by $\psi(x)=e^{|x|}-1$ . The Orlicz norm associated to this function $\psi$ of any real random variable $X$ is given by

[TABLE]
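For a random variable with finitely many values the Orlicz norm can be computed by bisection. The sketch assumes the standard Luxemburg form $\|X\|_{\psi}=\inf\{a>0:\mathbb{E}[\psi(X/a)]\leq 1\}$, which is the usual definition but is an assumption here since the displayed formula is not reproduced above; `orlicz_norm` is a hypothetical helper name.

```python
import math

def orlicz_norm(values, probs, iters=200):
    """Luxemburg norm for psi(x) = exp(|x|) - 1 of a finitely supported law."""
    mean_abs = sum(p * abs(v) for v, p in zip(values, probs))
    if mean_abs == 0.0:
        return 0.0
    biggest = max(abs(v) for v, p in zip(values, probs) if p > 0.0)
    # Jensen gives the lower bracket E|X|/ln 2; boundedness gives max|X|/ln 2.
    lo, hi = mean_abs / math.log(2.0), biggest / math.log(2.0)

    def psi_mean(a):
        total = 0.0
        for v, p in zip(values, probs):
            if p == 0.0:
                continue
            z = abs(v) / a
            if z > 700.0:          # avoid overflow in expm1
                return math.inf
            total += p * math.expm1(z)
        return total

    for _ in range(iters):         # psi_mean is decreasing in a
        mid = 0.5 * (lo + hi)
        if psi_mean(mid) <= 1.0:
            hi = mid
        else:
            lo = mid
    return hi

# Symmetric Bernoulli step: E[exp(1/a) - 1] <= 1 exactly when a >= 1/ln 2.
print(orlicz_norm([-1.0, 1.0], [0.5, 0.5]), 1.0 / math.log(2.0))
```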

Let us recall that, for any $r\geq 1$ ,

[TABLE]

Let $X$ and $Y$ be two random variables and let us denote by $\mu$ the law of $X$ and by $\nu$ the law of $Y$. With the usual abuse of notation, the Wasserstein distance associated to $\psi$ is defined by

[TABLE]

Let $(X_{k})_{k\geq 1}$ be an i.i.d. sequence of random variables with $\mathbb{E}\left[X\right]=0$ , $\mathbb{E}\left[X^{2}\right]=1,$ and such that, for some $\sigma>0$ , $\mathbb{E}\left[e^{\sigma|X|}\right]<+\infty$ . Let $G$ be a standard normal random variable. In [15, Theorem 2.1], Emmanuel Rio proved that there exists a constant $C>0$ such that, for $n\geq 1$ ,

[TABLE]

As a byproduct, for any $r\geq 1$ , there exists a constant $c_{r}>0$ such that

[TABLE]

where $W_{r}$ stands for the $\mathrm{L}^{r}$ -Wasserstein distance

[TABLE]
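Rio's bound can be observed numerically for symmetric Bernoulli steps. On the real line and for $r\geq 1$ the monotone (quantile) coupling is optimal, so $W_{r}^{r}=\int_{0}^{1}|Q_{\mu}(v)-Q_{\nu}(v)|^{r}\,dv$; the sketch below approximates this integral by a midpoint rule. The function names and the grid size are illustrative assumptions, and the printed values of $\sqrt{n}\,W_{1}$ are only expected to stay bounded, not to match any constant from the paper.

```python
import bisect
import math
from statistics import NormalDist

def walk_law(n):
    """Atoms and probabilities of B^n_1 = n^{-1/2} (xi_1 + ... + xi_n)."""
    atoms = [(2 * j - n) / math.sqrt(n) for j in range(n + 1)]
    probs = [math.comb(n, j) / 2 ** n for j in range(n + 1)]
    return atoms, probs

def wasserstein_to_gaussian(n, r=1.0, grid=20000):
    """Approximate W_r(B^n_1, G) through the quantile coupling."""
    atoms, probs = walk_law(n)
    cdf, acc = [], 0.0
    for p in probs:
        acc += p
        cdf.append(acc)
    normal = NormalDist()
    total = 0.0
    for i in range(grid):
        v = (i + 0.5) / grid
        j = min(bisect.bisect_left(cdf, v), n)  # smallest j with cdf[j] >= v
        total += abs(atoms[j] - normal.inv_cdf(v)) ** r
    return (total / grid) ** (1.0 / r)

for n in (25, 100, 400):
    w = wasserstein_to_gaussian(n)
    print(n, w, math.sqrt(n) * w)
```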

We have also the result of Kantorovich-Rubinstein, i.e.

[TABLE]

Remark 2.

We could also consider the case where $0<r<1$ by using the fact that, in this case, $\mathbb{E}(|X-Y|^{r})$ is a distance (see the arguments in [1, Section 7.1]). In general, we have $W_{p}(\mu,\nu)\leq W_{q}(\mu,\nu)$ for $0<p<q<\infty.$

Let us start with a straightforward generalization of Rio’s result.

Proposition 3.

There exists a $C>0$ such that, for all $x\in\mathbb{R}$ and all $0\leq t\leq s\leq T$ ,

[TABLE]

As a byproduct, taking into account (21), for any $r\geq 1$ , there exists a $c_{r}>0$ such that, for all $x\in\mathbb{R}$ and all $0\leq t\leq s\leq T$ ,

[TABLE]

Proof of Proposition 3.

We have, for any $x\in\mathbb{R}$ and all $0\leq t\leq s\leq T$ ,

[TABLE]

If $\underline{s}=\underline{t},$ then $B^{n}_{t}-B^{n}_{s}=0,$ and we have

[TABLE]

Let us assume that $\underline{t}<\underline{s}$ and let us write

[TABLE]

Let us treat each term separately. For the first one, Rio’s result gives

[TABLE]

and multiplying by $\sqrt{n_{s}-n_{t}}\sqrt{h}$ , we get, since $\sqrt{h(n_{s}-n_{t})}G$ is equal to $B_{\underline{s}}-B_{\underline{t}}$ in distribution,

[TABLE]

Let us deal with the second term of (25). Let $\beta(s,t):=\min(s-t,\underline{s}-\underline{t})$ . Then

[TABLE]

But $|s-t-(\underline{s}-\underline{t})|\leq h,$ and this concludes the proof. ∎
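The last step of the proof compares two centered Gaussian laws with variances $s-t$ and $\underline{s}-\underline{t}$. For $r=2$ the distance between $\mathcal{N}(0,\sigma_{1}^{2})$ and $\mathcal{N}(0,\sigma_{2}^{2})$ is $|\sigma_{1}-\sigma_{2}|$, which is at most $\sqrt{|\sigma_{1}^{2}-\sigma_{2}^{2}|}$. The sketch checks both the grid mismatch and the resulting $\sqrt{h}$ bound on a set of sample times; the helper names and the tolerance are assumptions made for the example.

```python
import math

def lower_grid(t, h, tol=1e-12):
    """Grid point h*[t/h]; tol guards against round-off at grid points."""
    return h * math.floor(t / h + tol)

def grid_mismatch(t, s, h):
    """Return (|s - t - (s_ - t_)|, W_2 between N(0, s - t) and N(0, s_ - t_))."""
    var_cont = s - t
    var_grid = lower_grid(s, h) - lower_grid(t, h)
    gap = abs(var_cont - var_grid)
    w2 = abs(math.sqrt(var_cont) - math.sqrt(var_grid))
    return gap, w2

h = 0.125
worst_gap, worst_w2 = 0.0, 0.0
for i in range(101):
    for j in range(i, 101):
        gap, w2 = grid_mismatch(i / 100.0, j / 100.0, h)
        worst_gap, worst_w2 = max(worst_gap, gap), max(worst_w2, w2)
print(worst_gap, "<=", h)
print(worst_w2, "<=", math.sqrt(h))
```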

Let us finish with a simple consequence of this result that we will use in the sequel.

Corollary 4.

Let $0<\varepsilon\leq 1$ and let $g:\mathbb{R}\longrightarrow\mathbb{R}$ be an $\varepsilon$-Hölder continuous function. Then there exists a $C>0$ depending on $T$ such that, for all $x\in\mathbb{R}$ and all $0\leq t\leq s\leq T$ ,

[TABLE]

and, setting $\delta(t,s):=\max\left(s-t,\underline{s}-\underline{t}\right)$ ,

[TABLE]

Proof.

Let $x\in\mathbb{R}$ and $0\leq t\leq s\leq T$ . For any coupling $(X,Y)$ of $B^{n,t,x}_{s}$ and $B^{t,x}_{s}$ , using Hölder’s inequality when $0<\varepsilon<1$ ,

[TABLE]

Thus, we have, by (24) for $r=1$ ,

[TABLE]

Choosing $f(x)=x$ in (23), this implies the first result.

Let us prove the second assertion. We start by observing that, since $B_{s}-B_{t}$ and $B^{n}_{s}-B^{n}_{t}$ are centered random variables, we have, setting $h(y):=(g(x+y)-g(x))y$ ,

[TABLE]

Let us remark that, for any real numbers $y$ and $z$ , $|h(y)|\leq\|g\|_{\varepsilon}\,|y|^{1+\varepsilon}$ , and using the fact that $|y-z|^{1-\varepsilon}\leq|y|^{1-\varepsilon}+|z|^{1-\varepsilon}$ ,

[TABLE]

Young’s inequality, $|y|^{\varepsilon}\,|z|^{1-\varepsilon}\leq\varepsilon\,|y|+(1-\varepsilon)\,|z|\leq|y|+|z|$ , leads to

[TABLE]

In the case where $\underline{s}=\underline{t}$ we have

[TABLE]

where $\text{law}(G)=\mathcal{N}(0,1)$ . Since $\underline{s}=\underline{t}$ implies $(s-t)^{1/2}=\delta(t,s)^{1/2}$ and $(s-t)^{\varepsilon/2}\leq h^{\varepsilon/2},$ we have

[TABLE]

using the fact that $\mathbb{E}\left[|G|^{(1+\varepsilon)}\right]\leq 1$ .

Let us turn to the case $\underline{t}<\underline{s}$ . For any coupling $(X,Y)$ of $B^{n}_{s}-B^{n}_{t}$ and $B_{s}-B_{t}$ , using (22) and (26),

[TABLE]

and, by Hölder’s inequality with $p=2/\varepsilon$ and $q=2/(2-\varepsilon)$ ,

[TABLE]

From (22) it follows that

[TABLE]

where we have used (24) for $r=2$ .

Thus, for $0\leq t\leq s\leq T$ ,

[TABLE]

and the result follows as before by choosing $f(x)=x$ in (23). ∎
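The weak error controlled by the first estimate of Corollary 4 can be computed exactly for $g(x)=|x|^{\varepsilon}$ with $t=0$, $x=0$ and $s=T$, since $\mathbb{E}|B_{T}|^{p}=T^{p/2}\,2^{p/2}\,\Gamma(\tfrac{p+1}{2})/\sqrt{\pi}$ and the law of $B^{n}_{T}$ is binomial. The sketch below is only an illustration with hypothetical helper names; it prints the error next to $h^{\varepsilon/2}$ without claiming any value for the constant $C$.

```python
import math

def walk_expectation(g, n, T=1.0, x=0.0):
    """E[g(x + B^n_T)] computed exactly from the binomial weights."""
    h = T / n
    return sum(
        math.comb(n, j) / 2 ** n * g(x + (2 * j - n) * math.sqrt(h))
        for j in range(n + 1)
    )

def gaussian_abs_moment(p, T=1.0):
    """E|B_T|^p for a Brownian motion B."""
    return T ** (p / 2.0) * 2.0 ** (p / 2.0) * math.gamma((p + 1.0) / 2.0) / math.sqrt(math.pi)

eps = 0.5
g = lambda x: abs(x) ** eps
exact = gaussian_abs_moment(eps)
for n in (16, 64, 256, 1024):
    err = abs(walk_expectation(g, n) - exact)
    print(n, err, (1.0 / n) ** (eps / 2.0))
```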

4. Regularity results on $u$ , $U^{n},$ $\nabla u$ and $\Delta^{n}$

Let us start with known regularity properties of the function $u$ that follow from classical a priori estimates for BSDEs.

Lemma 5.

Under Assumption (A1) there exists a constant $C>0$ depending on $(T,\varepsilon,f,g)$ such that, for all $(t,x)\in[0,T]\times\mathbb{R}$ ,

[TABLE]

Proof.

The first two results follow directly from classical a priori estimates for BSDEs, see e.g. [8, Proposition 2.1]. The last one ensues from the following upper bound: for any real $x$ and for $0\leq r\leq t\leq T$ ,

[TABLE]

Since the norm in $\mathcal{S}^{2}\times\mathcal{H}^{2}$ of $(Y^{r,x},Z^{r,x})$ is of order $(1+|x|)^{\varepsilon}$ , we use the Cauchy-Schwarz inequality to bound the first term, and a priori estimates allow us (similarly to the proof of [8, Proposition 4.1]) to bound the second term. ∎

Next we extend [19, Theorem 3.2] to the case where $f(t,\cdot,y,z)$ is Hölder continuous.

Lemma 6.

Recall the notation (5) and let Assumption (A1) hold.

(a)

The function $u$ belongs to $\mathcal{C}^{0,1}([0,T[\times\mathbb{R})$ and, for all $(t,x)\in[0,T[\times\mathbb{R}$ , we have,

[TABLE]

as well as (7) i.e.

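$$\nabla u(t,x)=\mathbb{E}\left[g(B^{t,x}_{T})\,\frac{B_{T}-B_{t}}{T-t}+\int_{t}^{T}F(r,B^{t,x}_{r})\,\frac{B_{r}-B_{t}}{r-t}\,dr\right].\tag{7}$$

(b)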

Moreover, there exists a constant $C>0$ depending on $(T,\varepsilon,f,g)$ such that,

(i)

$\|\nabla u(t,\cdot)\|_{\varepsilon}\leq\frac{C}{\sqrt{T-t}}$ for all $t\in[0,T[,$

(ii)

$|\nabla u(t,x)|\leq\frac{C}{(T-t)^{(1-\varepsilon)/2}}$ for all $(t,x)\in[0,T[\times\mathbb{R}.$

Consequently, for $\mathbb{E}_{r}:=\mathbb{E}[\,\cdot\,|\mathcal{F}_{r}]$ ,

[TABLE]

Proof of Lemma 6.

The proof is divided into two steps.

Step 1. We assume in addition that $f$ is Lipschitz continuous w.r.t. $x$ . Then according to [19], we have only the second point to prove and we know that, for some constant $C$ ,

[TABLE]

(bi) The representation (28) yields

[TABLE]

Since $g$ is $\varepsilon$ -Hölder continuous we get

[TABLE]

Similarly, we obtain by the conditional Cauchy-Schwarz inequality the estimate

[TABLE]

Using (3) for $f$ , we have

[TABLE]

and the Hölder continuity of $u$ stated in Lemma 5 yields

[TABLE]

By combining the above estimates we conclude from (4) that

[TABLE]

for $C_{1}:=\|g\|_{\varepsilon}+2T(\|f_{x}\|_{\varepsilon}+C\,\|f_{y}\|_{\text{\rm Lip}}).$ Because of (29) we have

[TABLE]

with $C_{2}=C_{2}(C,T)>0$ . Hence we may apply Gronwall’s lemma (Lemma 14) and get

[TABLE]

for some $c_{0}=c_{0}(T,\|f_{z}\|_{\text{\rm Lip}})>0.$ Especially, for $r=t$ this implies

[TABLE]

for some $C=C(T,\varepsilon,f,g)>0.$

(bii) We first notice that for any $\varepsilon$ -Hölder continuous function $k$ and for all $0\leq t<s\leq T$ we have

[TABLE]

Therefore, we obtain from (7) that

[TABLE]

Using (31) for $s=t$ and taking into account that $\nabla u$ satisfies (bi) we get

[TABLE]

for some $C=C(T,\varepsilon,f,g)>0.$ This finishes the proof of the first step.

Step 2. General case. The proof relies on a regularization procedure and is postponed to appendix A.3. ∎

Remark 7.

From now on we will always use the continuous version of $Z^{t,x}_{s}$ given by $\nabla u(s,B^{t,x}_{s})$ .

Lemma 8.

For all $(t,x)\in[0,T[\times\mathbb{R}$ and for $n\geq n_{0}\in\mathbb{N}^{*}$ , with $n_{0}$ defined as in Lemma 12, we have

(i)

$|U^{n}(t,x)|\leq C(1+|x|)^{\varepsilon},$

(ii)

$|\Delta^{n}(t,x)|\leq\frac{C_{n}}{(T-t)^{(1-\varepsilon)/2}},$

where $C>0$ depends on $(T,\varepsilon,f,g)$ and $C_{n}>0$ depends on $(T,\varepsilon,f,g,n).$

Proof.

The result on $U^{n}$ ensues from Lemma 12, by choosing $\overline{f}=0$ and $\overline{g}=0$ . Let us prove the result on $\Delta^{n}$ . By (17) and (10) we have that

[TABLE]

We want to use (2), where we realize that

[TABLE]

A similar argument can be used for the integral expression so that we get

[TABLE]

Then

[TABLE]

Since $g$ is $\varepsilon$ -Hölder, $|G|$ is bounded by $\frac{\|g\|_{\varepsilon}}{(T-t)^{(1-\varepsilon)/2}}$ . Concerning the second term, we get, since $f$ satisfies (3),

[TABLE]

We will use that $U^{n}(t,x)$ and $\Delta^{n}(t,x)$ are $\varepsilon$ -Hölder continuous in $x$ , i.e.

[TABLE]

where $c(h)$ tends to infinity when $h$ tends to $0$. For $U^{n}$ , Lemma 12 with $(\bar{x},\bar{g},\bar{f})=(y,g,f)$ gives

[TABLE]

while for $\Delta^{n}$ this is an immediate consequence of Remark 1 and (35) with $c(h)={c_{0}+\frac{c_{0}}{\sqrt{h}}}.$ Then

[TABLE]

Since $\frac{1}{(\underline{s}-\underline{t})^{(1-\varepsilon)/2}}\leq\frac{1}{(s-(\underline{t}+h))^{(1-\varepsilon)/2}}$ for $s\in\,]\underline{t}+h,T]$ we get that $|F|\leq C_{n}$ .

∎

Proposition 9.

Under (A1), there exists a constant $C>0$ depending on $(T,\varepsilon,f,g)$ such that, for all $x\in\mathbb{R}$ ,

[TABLE]

Proof.

From Lemma 6, we know that, for $(t,x)\in[0,T[\times\mathbb{R}$ ,

[TABLE]

where we have set

[TABLE]

It holds $\mathbb{E}[H(t,s)]=0$ and $\|H(t,s)\|_{\mathrm{L}^{2}}=\frac{1}{\sqrt{s-t}}$ . We also have, for $0\leq r\leq t<s$ ,

[TABLE]

Let us observe that, for $0\leq r<t<s\leq T$ and any $\varepsilon$ -Hölder continuous function $h$ , it holds

[TABLE]

Indeed, we have

[TABLE]

and, from Cauchy-Schwarz inequality, we deduce that

[TABLE]

Coming back to (37), we write, for $0\leq r\leq t<T$ ,

[TABLE]

to have, taking into account the fact that $\|F(s,\cdot)\|_{\varepsilon}\leq C(T-s)^{-1/2}$ by (33),

[TABLE]

∎

5. Main results

In this section, we state the main result of this paper, which gives the rate of convergence in the Wasserstein distance between the solution to the BSDE (4) and the solution to the BSDE driven by the scaled random walk (14). In what follows, we remind the reader of Remark 7.

Theorem 10.

Under (A1), for any $r\in[1,\infty[$ , there exists a constant $C_{r}>0$ depending at most on $(T,\alpha,\varepsilon,f,g,r)$ such that for all $x\in\mathbb{R}$ ,

(i)

$W_{r}\left(Y^{n,t,x}_{s},Y^{t,x}_{s}\right)\leq C_{r}\,(1+|x|)^{\varepsilon}\,n^{-(\alpha\wedge\frac{\varepsilon}{2})}\quad\text{ for all }\,\,0\leq t\leq s\leq T,$

(ii)

$W_{r}\left(Z^{n,t,x}_{s},Z^{t,x}_{s}\right)\leq C_{r}\,\frac{(1+|x|)^{\varepsilon}}{\sqrt{T-s}}\,n^{-(\alpha\wedge\frac{\varepsilon}{2})}\quad\text{ for all }\,\,s\in[t,T[.$

This result is a consequence of the following proposition which gives the rate of the point-wise convergence of $U^{n},$ solution to (13), towards the solution $u$ of the semilinear heat equation (1).

Proposition 11.

Under (A1) there exists a constant $C>0$ depending at most on $(T,\alpha,\varepsilon,f,g)$ such that

(i)

$|u(t,x)-U^{n}(t,x)|\leq C\,(1+|x|)^{\varepsilon}\,n^{-(\alpha\wedge\frac{\varepsilon}{2})}\,\,\text{ for all }\,\,(t,x)\in[0,T]\times\mathbb{R},$

(ii)

$|\nabla u(t,x)-\Delta^{n}(t,x)|\leq C\,\frac{(1+|x|)^{\varepsilon}}{\sqrt{T-t}}\,n^{-(\alpha\wedge\frac{\varepsilon}{2})}\,\,\text{ for all }\,\,(t,x)\in[0,T[\times\mathbb{R}.$

Proof.

We split the proof into three parts. We begin by studying $|u-U^{n}|$ , we proceed by obtaining an estimate for $\nabla u-\Delta^{n},$ and then we conclude with a Gronwall argument.

Estimate for $|u-U^{n}|$

From (6) and (2) we conclude that

[TABLE]

Let $F_{n}$ be the function given by

[TABLE]

Using the notation (20) we also have that $F_{n}(s,B^{n,t,x}_{s})=f(s,\Theta^{n,t,x}_{s})$ . With this notation in hand, we have, taking into account (15) and (18),

[TABLE]

In view of the regularity of $f$ in time, we have

[TABLE]

Moreover, the Cauchy-Schwarz inequality leads to

[TABLE]

and, taking into account the growth of $f$ , we have

[TABLE]

where we have used Lemma 12 to get

[TABLE]

Coming back to (38), we derive the following inequality

[TABLE]

From Corollary 4 we get

[TABLE]

We split the second term on the RHS of (5) into two parts

[TABLE]

Since $F$ has the regularity (33), Corollary 4 gives

[TABLE]

By the above estimates we derive from (5) the inequality

[TABLE]

Coming back to the definition of $F$ and $F_{n}$ (see (5) and (39)) and using the Lipschitz continuity of $f$ with respect to $(y,z)$ , we have

[TABLE]

Setting for simplicity, for $s\in[0,T]$ ,

[TABLE]

for $s\in[0,T[$ , Lemma 5, Lemma 6 and Lemma 8 imply that, for some $C>0$ and $C_{n}>0,$

[TABLE]

respectively. We deduce the following estimate

[TABLE]

and get, coming back to (41), for $0\leq t\leq T$ and for any $x\in\mathbb{R}$ ,

[TABLE]

We end up with the inequality

[TABLE]

and since $\gamma_{n}$ belongs to $\mathrm{L}^{1}[0,T],$ Gronwall’s inequality (Lemma 13) gives

[TABLE]

Estimate for $|\nabla u-\Delta^{n}|$

In order to take advantage of the previous inequality, we need to estimate $\gamma_{n}(s)$ . To do this, we use the representations (7) and (34). We will divide the study into two parts

[TABLE]

Study of the $g$ difference

We have

[TABLE]

For the first term, since

[TABLE]

we have, using the fact that $g$ is $\varepsilon$ -Hölder continuous,

[TABLE]

But, exploiting that $(t-\underline{t})^{{1-\frac{\varepsilon}{2}}}\leq(T-\underline{t})^{{1-\frac{\varepsilon}{2}}}$ and $\frac{1}{(T-\underline{t})^{\frac{\varepsilon}{2}}}\leq\frac{1}{(T-t)^{\frac{\varepsilon}{2}}}$ , we obtain

[TABLE]

from which we deduce that

[TABLE]

Since $\delta(t,T)=T-\underline{t}$ , from Corollary 4, the absolute value of the second term on the RHS of (5) is bounded by

[TABLE]

Then we get

[TABLE]

Study of the $f$ difference

Here we have to estimate for $t\in[0,T[$ ,

[TABLE]

When $T-h\leq t<s<T,$ we observe that $B^{n,t,x}_{s}-x=0,$ and combining the regularity (33) of $F$ with the estimate (32) we obtain

[TABLE]

Thus, for $T-h\leq t\leq T$ , $H(t)\leq C\,n^{-\frac{\varepsilon}{2}}$ .

Let us now consider the case where $0\leq t<T-h$ i.e. $\underline{t}+h\leq T-h$ . We first write

[TABLE]

For the second term of the RHS of this equality, we proceed as above and get

[TABLE]

But, since $\underline{t}+h\leq T-h$ , for $t<s<\underline{t}+h$ , $\dfrac{1}{\sqrt{T-s}}\leq\dfrac{1}{\sqrt{h}}$ and, since $0<\underline{t}+h-t\leq h$ ,

[TABLE]

Secondly, we split the term

[TABLE]

into two parts:

[TABLE]

and, the remaining term

[TABLE]

But, due to the uniform regularity of $f$ in time, we have, since $\underline{s}-\underline{t}=(\underline{s}+h)-(\underline{t}+h)\geq s-(\underline{t}+h)$ ,

[TABLE]

Thus, for $0\leq t<T-h$ ,

[TABLE]

We split the integrand of the first term on the RHS of the inequality into three parts,

[TABLE]

so that

[TABLE]

The term $H_{1}$

Since $B^{t,x}_{s}-x$ has mean zero,

[TABLE]

and the regularity (33) of $F$ gives

[TABLE]

Since $(t-\underline{t})=(t-\underline{t})^{\frac{\varepsilon}{2}}(t-\underline{t})^{1-\frac{\varepsilon}{2}}\leq h^{\frac{\varepsilon}{2}}(\underline{s}-\underline{t})^{1-\frac{\varepsilon}{2}}$ and the same upper bound holds for $s-\underline{s}$ (since $s-\underline{s}\leq h\leq\underline{s}-\underline{t}$ ), we get

[TABLE]

Finally, we remark that $\underline{s}-\underline{t}=(\underline{s}+h)-(\underline{t}+h)\geq s-(\underline{t}+h)$ and $s-t\geq s-(\underline{t}+h)$ , to obtain

[TABLE]

and, as a consequence,

[TABLE]

The term $H_{2}$

We use once again the regularity (33) of $F$ together with Corollary 4 to obtain, for $\underline{t}+h<s<T$ ,

[TABLE]

We first use the fact that $\delta(s,t)=\max(s-t,\underline{s}-\underline{t})\leq s-\underline{t}\leq s-\underline{s}+\underline{s}-\underline{t}\leq 2(\underline{s}-\underline{t})$ to get

[TABLE]

and, since $\underline{s}-\underline{t}\geq s-(\underline{t}+h)$ , we have

[TABLE]

from which we deduce the estimate

[TABLE]

The term $H_{3}$

For this last term, we come back to the definitions (5) and (39) of $F$ and $F_{n},$ respectively. By (42) we have, for $\underline{t}+h<s<T$ ,

[TABLE]

and, by the Cauchy-Schwarz inequality, we derive the estimate

[TABLE]

Summary for $H$

Let us summarize the estimates we got for $H$ . For $T-h\leq t\leq T$ , we have $H(t)\leq C\,n^{-\frac{\varepsilon}{2}}$ . For $0\leq t<T-h$ we obtained the upper bound

[TABLE]

Hence we have, for $t\in[0,T]$ ,

[TABLE]

Coming back to (45), we have, for any $x\in\mathbb{R}$ and $t\in[0,T[$ ,

[TABLE]

and, as a byproduct,

[TABLE]

Global estimate

Plugging (43) into (46), we get, for $t\in[0,T[$ ,

[TABLE]

For $t<T-h$ , since $\underline{s}-\underline{t}=(\underline{s}+h)-(\underline{t}+h)\geq s-(\underline{t}+h)$ , we have

[TABLE]

Again, we have,

[TABLE]

from which we deduce

[TABLE]

It follows that, for $t\in[0,T[$ ,

[TABLE]

Thus,

[TABLE]

But we have

[TABLE]

and, for $t<T-h$ , since $\underline{s}+h\leq r$ if and only if $s<\underline{r}$ ,

[TABLE]

But $\underline{r}-\underline{s}\geq\underline{r}-s$ and $\underline{s}-\underline{t}\geq s-(\underline{t}+h)$ , so we get

[TABLE]

Finally, we have

[TABLE]

and from Gronwall’s inequality (Lemma 13)

[TABLE]

Coming back to (43), we have also,

[TABLE]

The proof of Proposition 11 is complete. ∎

Proof of Theorem 10.

Theorem 10 is mainly a corollary of Proposition 11.

Let us begin with the convergence of the $(Y^{n})_{n}$ processes. Let us fix $r\geq 1$ , $x\in\mathbb{R}$ and $0\leq t\leq s\leq T$ . We have

[TABLE]

Since, by Lemma 5, $u$ is $\varepsilon$ -Hölder continuous in space, uniformly in time, we have, by Hölder’s inequality,

[TABLE]

where we have used Proposition 3 (see (24)) for the last inequality. Moreover, by Proposition 11,

[TABLE]

This gives the first part of the result.
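The Hölder step used above can be spelled out in general terms. The following is a sketch under the assumptions that $\varphi$ is $\varepsilon$-Hölder continuous with constant $\|\varphi\|_{\varepsilon}$, that $0<\varepsilon\leq 1$, and that $(X,X')$ is an arbitrary coupling of two real random variables; the symbols $\varphi$, $X$ and $X'$ are introduced here only for illustration.

```latex
\begin{align*}
\|\varphi(X)-\varphi(X')\|_{L^{r}}
  &\leq \|\varphi\|_{\varepsilon}\,\big\||X-X'|^{\varepsilon}\big\|_{L^{r}}
   = \|\varphi\|_{\varepsilon}\,\Big(\mathbb{E}|X-X'|^{r\varepsilon}\Big)^{1/r} \\
  &\leq \|\varphi\|_{\varepsilon}\,\Big(\mathbb{E}|X-X'|^{r}\Big)^{\varepsilon/r}
   = \|\varphi\|_{\varepsilon}\,\|X-X'\|_{L^{r}}^{\varepsilon},
\end{align*}
% The second inequality is Jensen's inequality for the concave map
% a -> a^{\varepsilon} on [0,\infty[. Taking the infimum over all couplings:
\[
W_{r}\big(\varphi(X),\varphi(X')\big)\leq \|\varphi\|_{\varepsilon}\,W_{r}(X,X')^{\varepsilon}.
\]
```

Applied with $\varphi=u(s,\cdot)$, this reduces the distance between $u(s,B^{n,t,x}_{s})$ and $u(s,B^{t,x}_{s})$ to the distance between the scaled random walk and the Brownian motion.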

Let us continue with the convergence of the $(Z^{n})_{n}$ processes. The proof is almost the same except for the grid points. Let $0\leq t<s\leq T$ with $s\neq\underline{s}$ i.e. $s\not\in\{kh,k=n_{t}+1,\ldots,n\}$ . We have as before

[TABLE]

Since, by Lemma 6, $\nabla u(s,\cdot)$ is $\varepsilon$ -Hölder continuous, we have, by Hölder’s inequality,

[TABLE]

where the last inequality follows from (24). Moreover, by Proposition 11,

[TABLE]

and the result follows, in this case, from equalities (27) and (18) together with Remark 7.

Let us now consider the case where $s\in\{kh,k=n_{t}+1,\ldots,n-1\}.$ In this case we have

[TABLE]

which is not equal to $\Delta^{n}\left(s,B^{n,t,x}_{s}\right)$ in general. We first write

[TABLE]

The second term on the RHS can be bounded by using our previous result, namely

[TABLE]

For the first term, one can write

[TABLE]

From Lemma 6, $\nabla u(s-h/2,\cdot)$ is $\varepsilon$ -Hölder continuous and we have

[TABLE]

where $G$ is a standard normal random variable. Finally, for the last term, we use Proposition 9 for the time regularity of $\nabla u$ . We have

[TABLE]

and the estimate for $W_{r}\left(Z^{n,t,x}_{s},Z^{t,x}_{s}\right)$ follows.

This ends the proof. ∎

Appendix A

A.1. A priori estimate for discrete BSDEs

For the convenience of the reader, we prove a generalization of an a priori estimate for BSDEs driven by random walks given in [6, Proposition 7] (see also the appendix in [5]). This generalization allows us to consider two different generators.

Lemma 12.

There exist an integer $n_{0}\in\mathbb{N}^{*}$ and a constant $C>0$ , both depending only on $T$ and $\|f\|_{\text{\rm Lip}}$ , such that for any pair of functions $(\bar{g},\bar{f})$ satisfying (A1) and for all $n\geq n_{0}$ with $n>T\|\bar{f}\|_{\text{\rm Lip}}$ and all $(t,x,\bar{x})\in[0,T]\times\mathbb{R}^{2}$ ,

[TABLE]

where $\delta f(r,x,\bar{x},y,z)=f\left(r,B^{n,t,x}_{r^{-}},y,z\right)-\bar{f}\left(r,B^{n,t,\bar{x}}_{r^{-}},y,z\right)$ and, for all $(t,x)$ , $\left(\bar{Y}^{n,t,x},\bar{Z}^{n,t,x}\right)$ denotes the solution to (14) where $(g,f)$ is replaced by $(\bar{g},\bar{f})$ .

Proof.

Let $n$ be such that $T\|f\|_{\text{\rm Lip}}/n<1$ and $T\|\bar{f}\|_{\text{\rm Lip}}/n<1$ . Since $\langle B^{n}\rangle_{t}-\langle B^{n}\rangle_{s}\leq(t-s)+T/n$ , the same computation as in the proof of Proposition 7 in [6] gives, for a universal constant $c\geq 1$ ,

[TABLE]

for all deterministic $0\leq\sigma<\tau\leq T,$ where

[TABLE]

We choose an integer $m\in\mathbb{N}^{*}$ such that, with $\alpha=T/m$ , $C(\alpha,+\infty)\leq 1/3$ . Then there exists an $n_{0}\in\mathbb{N}^{*}$ such that, for $n\geq n_{0}$ , we have $C(\alpha,n)\leq 1/2$ and $T\|f\|_{\text{\rm Lip}}/n<1.$ As soon as $\tau-\sigma\leq\alpha$ and $n\geq n_{0}$ , we have

[TABLE]

We set, for $0\leq k\leq m-1$ , $I_{k}=\underline{t}+]k(T-\underline{t})/m,(k+1)(T-\underline{t})/m]$ and we introduce the following norm on $\mathcal{S}^{2}\times\mathcal{M}^{2}$ :

[TABLE]

Considering $I_{k}=]\sigma,\tau]$ and summing up (47) over $k$ yields

[TABLE]

from which we get

[TABLE]

This finishes the proof since $\left\|\left(\delta Y^{n},\delta Z^{n}\right)\right\|^{2}_{s}$ upper bounds the LHS of the inequality stated in the Lemma. ∎

A.2. Gronwall lemmas

We recall the Gronwall lemmas used in this article.

Lemma 13.

Suppose that $g(t),\alpha(t):[0,T[\to[0,\infty[$ are integrable functions, and $\beta>0.$ For $0\leq t<T,$ if

[TABLE]

then

[TABLE]

The second lemma is of Volterra type. It can either be proved directly by a convolution argument or be deduced from [12, Exercise 4, page 190].

Lemma 14.

Assume that $g:[0,T[\to[0,\infty[$ is measurable with $g\in L_{1}([0,T[)$ and that $\alpha,\beta>0$ are such that

[TABLE]

for all $t\in[0,T[$ . Then $g(t)\leq C\frac{\alpha}{\sqrt{T-t}}$ for $t\in[0,T[$ for a constant $C=C(T,\beta)>0$ .

A.3. Proof of Lemma 6: Step 2

We assume that $0<\varepsilon<1$ ; the case $\varepsilon=1$ was treated in Step 1. For $\eta>0$ , let us consider the function

[TABLE]

When $f$ satisfies (3), then $f^{\eta}$ does as well and it is $\eta$ -Lipschitz continuous w.r.t. $x$ . Moreover

[TABLE]

with $c(t)=(\varepsilon t)^{\varepsilon/(1-\varepsilon)}t(1-\varepsilon)$ . Indeed,

[TABLE]
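The value of $c(t)$ can be checked by a direct optimization. The sketch below assumes that $f^{\eta}$ is the standard Lipschitz inf-convolution of $f$ in the variable $x$ (the defining display is not reproduced here, so this form is an assumption) and writes $K=\|f_{x}\|_{\varepsilon}$ for the Hölder constant of $f$ in $x$; the remaining arguments of $f$ are frozen.

```latex
% Assumption: f^{\eta}(x) = \inf_{y} \{ f(y) + \eta |x-y| \}.
\begin{align*}
0\leq f(x)-f^{\eta}(x)
  &= \sup_{y}\bigl\{f(x)-f(y)-\eta|x-y|\bigr\}
   \leq \sup_{\rho\geq 0}\bigl\{K\rho^{\varepsilon}-\eta\rho\bigr\}.
\end{align*}
% The supremum is attained at \rho^{*} = (\varepsilon K/\eta)^{1/(1-\varepsilon)},
% where K (\rho^{*})^{\varepsilon-1} = \eta/\varepsilon, hence
\begin{align*}
K(\rho^{*})^{\varepsilon}-\eta\rho^{*}
  &= \eta\,\rho^{*}\,\frac{1-\varepsilon}{\varepsilon}
   = (\varepsilon K)^{\varepsilon/(1-\varepsilon)}\,K\,(1-\varepsilon)\,\eta^{-\varepsilon/(1-\varepsilon)}
   = c(K)\,\eta^{-\varepsilon/(1-\varepsilon)}.
\end{align*}
```

This matches $c(t)=(\varepsilon t)^{\varepsilon/(1-\varepsilon)}t(1-\varepsilon)$ evaluated at $t=K$.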

In particular, $|K_{f}-K_{f^{\eta}}|\leq c(\|f_{x}\|_{\varepsilon})\,\eta^{-\varepsilon/(1-\varepsilon)}$ . Let $(Y^{t,x,\eta},Z^{t,x,\eta})$ be the solution to the BSDE (4) with data $(g,f^{\eta})$ and let $u^{\eta}$ be the function $u^{\eta}(t,x):=Y^{t,x,\eta}_{t}$ . By the classical a priori estimate for BSDEs (see, for instance, [8, Proposition 2.1 and remarks] or [11, Lemma 5.26]), there exists a $C>0$ such that, for all $(t,x)\in[0,T]\times\mathbb{R}$ ,

[TABLE]

In particular, $(u^{\eta})_{\eta}$ converges to $u$ , as $\eta\to+\infty$ , uniformly on $[0,T]\times\mathbb{R}$ .

Proof of (a), (bi), and (bii). Since $f^{\eta}$ is Lipschitz continuous w.r.t. $(x,y,z)$ and satisfies (3) (uniformly in $\eta$ ), by Step 1, we know that $u^{\eta}\in\mathcal{C}^{0,1}([0,T[\times\mathbb{R})$ and $Z^{t,x,\eta}_{s}=\nabla u^{\eta}(s,B^{t,x}_{s})$ for a.e. $(s,\omega)\in[t,T[\times\Omega$ with

[TABLE]

Taking into account the convergence of $(Z^{t,x,\eta})_{\eta}$ to $Z^{t,x}$ in $\mathrm{L}^{2}([t,T[\times\Omega)$ , we have

[TABLE]

We define the function

[TABLE]

First we show that

[TABLE]

which also implies that $v:[0,T[\times\mathbb{R}\to\mathbb{R}$ is measurable. For this we denote

[TABLE]

We also use (28) for $(g,f^{\eta})$ so that

[TABLE]

Then we apply the conditional Cauchy-Schwarz inequality to $\tfrac{(f^{\eta}-f)}{(s-r)^{\varepsilon/4}}\tfrac{B_{s}-B_{r}}{(s-r)^{1-\varepsilon/4}}$ and use (48) to get

[TABLE]

Taking into account the bound for $Y$ given in (A.3), we have

[TABLE]

where $C$ depends on $T$ , $\varepsilon$ , $\|f_{x}\|_{\varepsilon}$ and $\|f\|_{\text{\rm Lip}}$ .

Now we find a sequence $\eta_{k}\uparrow+\infty$ such that the RHS tends to $0$ . Indeed, this follows by dominated convergence, as (A.3) guarantees a sequence $\eta_{k}\uparrow+\infty$ such that

[TABLE]

and because of the equations (50) and (51). For $r\geq t$ , $\nabla u^{\eta}(r,B_{r}^{t,x})$ converges to $\widehat{Z}_{r}^{t,x}$ in $\mathrm{L}^{2}$ and, in particular, for $r=t$ we obtain the desired convergence $\nabla u^{\eta}(t,x)\to v(t,x)$ . Because of $Z^{t,x,\eta}_{s}=\nabla u^{\eta}(s,B^{t,x}_{s})$ for a.e. $(s,\omega)\in[t,T[\times\Omega$ this also gives that

[TABLE]

Coming back to (52), Gronwall’s lemma (Lemma 14) gives,

[TABLE]

In particular, $|\nabla u^{\eta}(t,x)-v(t,x)|\leq C\eta^{-\varepsilon/(1-\varepsilon)}$ , so that $\lim_{\eta\to\infty}\nabla u^{\eta}(t,x)=v(t,x)$ .

Moreover, we conclude from this estimate and from the continuity of $\nabla u^{\eta}$ on $[0,T[\times\mathbb{R}$ that $v$ is also continuous. Finally, $v(t,x)=\nabla u(t,x)$ follows from taking the limit $\eta\uparrow+\infty$ in

[TABLE]

where we use dominated convergence based on the inequality (50).

Bibliography


[1] L. Ambrosio, N. Gigli and G. Savaré, Gradient Flows in Metric Spaces and in the Space of Probability Measures, Birkhäuser, Basel, Boston, Berlin (2005).
[2] J.M. Bismut, Théorie probabiliste du contrôle des diffusions, Mem. AMS 176 (1973).
[3] B. Bouchard and N. Touzi, Discrete-time approximation and Monte-Carlo simulation of backward stochastic differential equations, Stochastic Process. Appl. 111 (2004), no. 2, 175–206.
[4] P. Briand, B. Delyon, Y. Hu, E. Pardoux and L. Stoica, Lp solutions of backward stochastic differential equations, Stochastic Process. Appl. 108 (2003), 109–129.
[5] Ph. Briand, B. Delyon, and J. Mémin, Donsker-type theorem for BSDEs, Electron. Comm. Probab. 6 (2001), 1–14 (electronic).
[6] Ph. Briand, B. Delyon, and J. Mémin, On the robustness of backward stochastic differential equations, Stochastic Process. Appl. 97 (2002), no. 2, 229–253.
[7] J. Jacod and A. N. Shiryaev, Limit theorems for stochastic processes, Springer (2003).
[8] N. El Karoui, S. Peng, and M.C. Quenez, Backward Stochastic Differential Equations in Finance, Math. Finance 7 (1997), 1–71.
