An optimal polynomial approximation of Brownian motion

James Foster; Terry Lyons; Harald Oberhauser

arXiv:1904.06998·math.NA·May 21, 2020·SIAM J. Numer. Anal.

TL;DR

This paper introduces an optimal polynomial-based method for approximating Brownian motion, leveraging orthogonal polynomials with Gaussian coefficients, leading to improved numerical solutions for stochastic differential equations.

Contribution

It presents a new strong approximation of Brownian motion using orthogonal polynomials with independent Gaussian coefficients, optimizing in a weighted $L^{2}$ sense and enhancing SDE discretization.

Findings

1. Orthogonal polynomial expansion provides an optimal pathwise approximation.

2. Piecewise parabola discretization yields higher order numerical methods.

3. Demonstrated improved simulation of Inhomogeneous Geometric Brownian Motion.

Abstract

In this paper, we will present a strong (or pathwise) approximation of standard Brownian motion by a class of orthogonal polynomials. The coefficients that are obtained from the expansion of Brownian motion in this polynomial basis are independent Gaussian random variables. Therefore it is practical (requires $N$ independent Gaussian coefficients) to generate an approximate sample path of Brownian motion that respects integration of polynomials with degree less than $N$ . Moreover, since these orthogonal polynomials appear naturally as eigenfunctions of an integral operator defined by the Brownian bridge covariance function, the proposed approximation is optimal in a certain weighted $L^{2} (P)$ sense. In addition, discretizing Brownian paths as piecewise parabolas gives a locally higher order numerical method for stochastic differential equations (SDEs) when compared to the…

Tables

Table 1: Notation used throughout the paper.

Symbol	Meaning	Page
$W$	a standard real-valued Brownian motion.	2
$B$	a standard real-valued Brownian bridge on $[0, 1]$.	5
$\mu$	a Borel measure on $[0, 1]$ defined by a singular weight function: $\mu(a, b) = \int_{a}^{b} \frac{1}{x(1 - x)} \, dx$ for all open intervals $(a, b) \subset [0, 1]$.	5
$\{e_{k}\}_{k \geq 1}$	a family of Jacobi-like polynomials with $\deg(e_{k}) = k + 1$ that are orthogonal with respect to the weight function $w(x) := \frac{1}{x(1 - x)}$.	5
$I_{k}$	a time integral of $B$ times the polynomial $e_{k}(t) w(t)$ over $[0, 1]$: $I_{k} = \int_{0}^{1} B_{t} \cdot \frac{e_{k}(t)}{t(1 - t)} \, dt$.	5
$K_{B}$	the covariance function of $B$, that is $K_{B}(s, t) = \min(s, t) - st$.	5
$P_{k}^{(\alpha, \beta)}$	the $k$-th order $(\alpha, \beta)$-Jacobi polynomial on $[-1, 1]$ $(\alpha, \beta > -1)$.	11
$Q_{k}$	the $k$-th order Legendre polynomial on $[-1, 1]$, i.e. $Q_{k} = P_{k}^{(0, 0)}$.	13
$y$	a solution of the Stratonovich SDE on the finite interval $[0, T]$: $dy_{t} = f_{0}(y_{t}) \, dt + f_{1}(y_{t}) \circ dW_{t}$, $y_{0} = \xi$, where $y, \xi \in \mathbb{R}^{e}$ and $f_{i} : \mathbb{R}^{e} \to \mathbb{R}^{e}$ denote smooth vector fields. (Itô SDEs will be defined on fixed intervals with the same form.)	14
$[s, t]$	a general closed subinterval of $[0, T]$, usually considered small.	14
$h$	the step size that a numerical method uses, typically $h = t - s$.	14
$W_{s, t}$	the increment of Brownian motion over $[s, t]$, $W_{s, t} := W_{t} - W_{s}$.	14
$\wideparen{W}$	the Brownian parabola corresponding to $W$ over some interval: $\wideparen{W}_{u} = W_{s} + \frac{u - s}{h} W_{s, t} + \frac{6 (u - s)(t - u)}{h^{2}} H_{s, t}$ for all $u \in [s, t]$.	15
$Z$	the Brownian arch corresponding to $W$, defined as $Z := W - \wideparen{W}$.	15
$H_{s, t}$	the rescaled space-time Lévy area of Brownian motion on $[s, t]$: $H_{s, t} = \frac{1}{h} \int_{s}^{t} \big( W_{s, u} - \frac{u - s}{h} W_{s, t} \big) \, du$.	15
$L_{s, t}$	the space-space-time Lévy area of Brownian motion over $[s, t]$: $L_{s, t} = \frac{1}{6} \big( \int_{s}^{t} \int_{s}^{u} \int_{s}^{v} \circ dW_{r} \circ dW_{v} \, du - 2 \int_{s}^{t} \int_{s}^{u} \int_{s}^{v} \circ dW_{r} \, dv \circ dW_{u} + \int_{s}^{t} \int_{s}^{u} \int_{s}^{v} dr \circ dW_{v} \circ dW_{u} \big)$.	17
$Y$	an approximation for the true solution $y$ of a Stratonovich SDE.	19
$[\cdot, \cdot]$	the standard Lie bracket of vector fields, $[f_{0}, f_{1}] = f_{1}^{\prime} f_{0} - f_{0}^{\prime} f_{1}$.	19

Table 2: Estimated simulation times for computing 100,000 sample paths that achieve a given accuracy using a single-threaded C++ program on a desktop computer.

Target accuracy	Log-ODE	Parabola	Linear	Milstein	Euler
$S_{N} = 10^{-4}$	0.179 s	0.405 s	1.47 s	15.4 s	0.437 days
$S_{N} = 10^{-5}$	0.827 s	3.90 s	14.9 s	157 s	61.2 days

Table 3: Simulation times for computing 100,000 sample paths with 100 steps per path using a single-threaded C++ program on a desktop computer.

	Log-ODE	Parabola	Linear	Milstein	Euler
Computation time (s)	2.44	2.95	1.48	1.18	1.17

Table 4: Estimated simulation times for computing 100,000 sample paths that achieve a given accuracy using a single-threaded C++ program on a desktop computer.

Target accuracy	Log-ODE	Parabola	Linear	Milstein	Euler
$E_{N} = 10^{-5}$	$<$ 0.240 s	1.69 s	2.15 s	2.78 s	25.5 s
$E_{N} = 10^{-6}$	0.240 s	16.9 s	21.6 s	24.1 s	252 s

Equations

\int_{0}^{1} u^{k} \, dW_{u}^{n}

\wideparen{W}_{1} = W_{1}, \quad \int_{0}^{1} \wideparen{W}_{u} \, du = \int_{0}^{1} W_{u} \, du.

\mathbb{E}\left[ \int_{0}^{1} W_{u}^{2} \, du \,\Big|\, W_{1}, \int_{0}^{1} W_{u} \, du \right] = \int_{0}^{1} \wideparen{W}_{u}^{2} \, du + \frac{1}{15}.

dy_{t}

\wideparen{y}_{t} + \frac{1}{30} t^{2} \sigma^{2} \Delta f(y_{0}),

d\wideparen{y}_{t}

\wideparen{y}_{0}

\operatorname{Var}\left( \int_{0}^{1} W_{u}^{2} \, du \,\Big|\, W_{1}, \int_{0}^{1} W_{u} \, du \right) = \frac{11}{6300} + \frac{1}{180} W_{1}^{2} + \frac{1}{175} \left( \int_{0}^{1} W_{u} \, du - \frac{1}{2} W_{1} \right)^{2}.

dy_{t} = a(b - y_{t}) \, dt + \sigma y_{t} \, dW_{t},

\phi^{q,p}(t)

\int_{0}^{1} \phi^{q,p}(t) \, \phi^{q,r}(t) \, dt

\int_{0}^{1} t^{k} \phi^{q,p}(t) \, dt

\phi_{nk}^{q,p}(t) := \frac{1}{2^{n}} \phi^{q,p}(2^{n} t - k),

\mu(a, b) := \int_{a}^{b} \frac{1}{x(1 - x)} \, dx, \quad \text{for all open intervals } (a, b) \subset [0, 1].

\int_{0}^{1} e_{i} \, e_{j} \, d\mu = \delta_{ij},

B = \sum_{k=1}^{\infty} I_{k} \, e_{k},

I_{k} := \int_{0}^{1} B_{t} \cdot \frac{e_{k}(t)}{t(1 - t)} \, dt,

\operatorname{Var}(I_{k}) = \frac{1}{k(k + 1)}.

\| X \|_{L_{\mu}^{2}(\mathbb{P})} := \mathbb{E}\left[ \int_{0}^{1} (X_{s})^{2} \, d\mu(s) \right],

\mathbb{E}\left[ \int_{0}^{1} (B_{s})^{2} \, d\mu(s) \right] = \int_{0}^{1} \mathbb{E}\big[ (B_{s})^{2} \big] \, d\mu(s) = \int_{0}^{1} s(1 - s) \cdot \frac{1}{s(1 - s)} \, ds = 1 < \infty.

\| K_{B} \|_{L^{2}([0, 1]^{2}, \mu^{2})}^{2} = \int_{0}^{1} \int_{0}^{1} (\min(s, t) - st)^{2} \, d\mu(s) \, d\mu(t) = \frac{1}{3} \pi^{2} - 3 < \infty.

(T_{K} f)(t) := \int_{0}^{1} K_{B}(s, t) f(s) \, d\mu(s),

\int_{0}^{1} | k_{B}(x) | \, d\mu(x) = \int_{0}^{1} x(1 - x) \cdot \frac{1}{x(1 - x)} \, dx = 1 < \infty.

K_{B}(s, t) = \sum_{k=1}^{\infty} \lambda_{k} \, e_{k}(s) \, e_{k}(t),

\int_{0}^{1} \frac{\min(s, t) - st}{s(1 - s)} e_{k}(s) \, ds = \lambda_{k} e_{k}(t).

t(1 - t) \lambda_{k} e_{k}^{\prime\prime}(t) + e_{k}(t) = 0.

t(1 - t) \frac{d^{2}}{dt^{2}} (e_{k}^{\prime}) + (1 - 2t) \frac{d}{dt} (e_{k}^{\prime}) + \frac{1}{\lambda_{k}} e_{k}^{\prime} = 0.

y_{k}(x) := e_{k}^{\prime}\big( \tfrac{1}{2}(1 + x) \big).

(1 - x^{2}) y_{k}^{\prime\prime}(x) - 2x y_{k}^{\prime}(x) + \frac{1}{\lambda_{k}} y_{k}(x) = 0.

I_{k} := \int_{0}^{1} B_{t} \cdot \frac{e_{k}(t)}{t(1 - t)} \, dt.

\mathbb{E}[I_{k}]

\mathbb{E}[I_{i} I_{j}]

Code & Models

Repositories

james-m-foster/igbm-simulation (official)

Full text

An optimal polynomial approximation of Brownian motion

James Foster (Mathematical Institute, University of Oxford, Woodstock Road, Oxford, OX2 6GG, UK. Research supported by the Engineering and Physical Sciences Research Council [EP/N509711/1].)

Terry Lyons (Supported by the EPSRC grant DATASIG, Alan Turing Institute and Oxford-Man Institute.)

Harald Oberhauser

Abstract

In this paper, we will present a strong (or pathwise) approximation of standard Brownian motion by a class of orthogonal polynomials. The coefficients that are obtained from the expansion of Brownian motion in this polynomial basis are independent Gaussian random variables. Therefore it is practical (requires $N$ independent Gaussian coefficients) to generate an approximate sample path of Brownian motion that respects integration of polynomials with degree less than $N$ . Moreover, since these orthogonal polynomials appear naturally as eigenfunctions of the Brownian bridge covariance function, the proposed approximation is optimal in a certain weighted $L^{2}(\mathbb{P})$ sense.

In addition, discretizing Brownian paths as piecewise parabolas gives a locally higher order numerical method for stochastic differential equations (SDEs) when compared to the piecewise linear approach. We shall demonstrate these ideas by simulating Inhomogeneous Geometric Brownian Motion (IGBM). This numerical example will also illustrate the deficiencies of the piecewise parabola approximation when compared to a new version of the asymptotically efficient log-ODE (or Castell-Gaines) method.

Keywords: Brownian motion, polynomial approximation, numerical methods for SDEs

AMS: 41A10, 60J65, 60L90, 65C30

1 Introduction

Brownian motion is a central object for modelling real-world systems that evolve under the influence of random perturbations [1]. In applications where methods discretize Brownian motion, usually only increments of the path are generated [2]. In this setting, the best $L^{2}(\mathbb{P})$ approximation of Brownian motion that is measurable with respect to these increments is given by the piecewise linear path that agrees on discretization points [3]. This motivates the following natural question: Are there better discrete approximations of Brownian motion than piecewise linear? The next simplest approximant would be a piecewise polynomial, though it is not clear whether this would be advantageous for tackling problems such as SDE simulation. This paper can be viewed as a logical continuation of [4], where a polynomial wavelet representation of Brownian motion was proposed. These wavelets were constructed to capture certain “geometrical features” of the path, namely the integrals of the Brownian motion against monomials. We shall investigate the practical applications of these polynomials and their geometrical features in the numerical analysis of SDEs.

The paper is organised as follows. In Section 2, we shall state and prove the main result of the paper (Theorem 5). This will be a Karhunen-Loève theorem for the Brownian bridge, where the orthogonal functions used in the approximation are polynomials. Furthermore, we shall explicitly show that each basis function is proportional to a shifted $(\alpha,\beta)$-Jacobi polynomial but with the nonstandard exponents $\alpha=\beta=-1$. This enables us to construct these orthogonal polynomials using recurrence relations, or as the difference of two shifted Legendre polynomials whose degrees differ by two. The resulting polynomial expansion of Brownian motion was independently discovered by Habermann in [5], where a sharp $L^{2}(\mathbb{P})$ convergence rate of $O\big(\frac{1}{\sqrt{n}}\big)$ is established (a Matlab demonstration can be found at chebfun.org/examples/stats/RandomPolynomials.html). In Section 3, we shall investigate some significant consequences of the main theorem.

Theorem 1.

Let $W$ denote a standard real-valued Brownian motion on $[0,1]$. Let $W^{n}$ be the unique $n$-th degree random polynomial with a root at $0$ and satisfying

$$\int_{0}^{1} u^{k} \, dW_{u}^{n} = \int_{0}^{1} u^{k} \, dW_{u}, \quad \text{for } k = 0, 1, \cdots, n-1.$$

Then $W=\hskip 0.7113ptW^{n}+Z^{n}$ , where $Z^{n}$ is a centered Gaussian process independent of $W^{n}$ .

The above theorem has a simple yet striking conclusion, namely that polynomials can be unbiased approximants of Brownian motion. In addition, the first non-trivial case ($n=2$) already has interesting applications within the numerical analysis of SDEs. One reason is that parabolas can capture the “space-time area” of Brownian motion.

Therefore discretizing Brownian motion using a piecewise parabola gives a locally high order methodology for numerically solving one-dimensional SDEs. However, since certain triple iterated integrals of Brownian motion and time are partially matched by these parabolas, we expect this method to have only an $O(h)$ rate of convergence (where $h$ denotes the step size used). This gives motivation for the following theorem:

Theorem 2.

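Let $\wideparen{W}$ be the (unique) quadratic polynomial with a root at $0$ and

$$\wideparen{W}_{1} = W_{1}, \quad \int_{0}^{1} \wideparen{W}_{u} \, du = \int_{0}^{1} W_{u} \, du.$$

Then the following third order iterated integral of Brownian motion can be estimated:

$$\mathbb{E}\left[ \int_{0}^{1} W_{u}^{2} \, du \,\Big|\, W_{1}, \int_{0}^{1} W_{u} \, du \right] = \int_{0}^{1} \wideparen{W}_{u}^{2} \, du + \frac{1}{15}. \tag{2}$$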

The above theorem can be directly incorporated into the stochastic Taylor method as well as the log-ODE or Castell-Gaines method (see [6], [7]). We will show that by estimating this non-trivial iterated integral with its conditional expectation, we can design numerical methods that enjoy high orders of both strong and weak convergence. Specifically, for a general SDE that is driven by a one-dimensional Brownian motion and governed by sufficiently regular vector fields (smooth with bounded derivatives), the numerical methods that correctly utilize the above conditional expectation will have a strong convergence rate of $O(h^{\frac{3}{2}})$ as well as a weak convergence rate of $O(h^{2})$ .
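The ingredients of Theorem 2 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the parabola formula is the one from the notation table, and the sampler relies on the standard fact (not stated in this passage) that $W_{s,t} \sim N(0,h)$ and $H_{s,t} \sim N(0,\frac{h}{12})$ are independent. The closed form in `estimate_int_w_squared` comes from integrating the squared parabola on $[0,1]$ by hand, with $H_{0,1} = \int_{0}^{1} W_{u}\,du - \frac{1}{2}W_{1}$.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss


def sample_increment_and_area(h, rng):
    """Sample (W_{s,t}, H_{s,t}) for one step of size h (assumed independent Gaussians)."""
    w_st = rng.normal(0.0, np.sqrt(h))
    h_st = rng.normal(0.0, np.sqrt(h / 12.0))
    return w_st, h_st


def brownian_parabola(u, s, t, w_s, w_st, h_st):
    """Brownian parabola on [s, t], following the formula in the notation table."""
    h = t - s
    return w_s + (u - s) / h * w_st + 6.0 * (u - s) * (t - u) / h**2 * h_st


def estimate_int_w_squared(w1, h1):
    """Integral of the squared parabola over [0, 1] plus 1/15 (hand-derived closed form)."""
    return w1**2 / 3.0 + w1 * h1 + 1.2 * h1**2 + 1.0 / 15.0


rng = np.random.default_rng(seed=0)
w1, h1 = sample_increment_and_area(1.0, rng)

# A 5-point Gauss-Legendre rule on [0, 1] integrates polynomials up to degree 9 exactly.
x, wts = leggauss(5)
u, wts = 0.5 * (x + 1.0), 0.5 * wts
para = brownian_parabola(u, 0.0, 1.0, 0.0, w1, h1)

print("endpoint mismatch:", brownian_parabola(1.0, 0.0, 1.0, 0.0, w1, h1) - w1)
print("area mismatch:", np.sum(wts * para) - (0.5 * w1 + h1))
print("closed form mismatch:", np.sum(wts * para**2) + 1.0 / 15.0 - estimate_int_w_squared(w1, h1))
```

All three printed differences should vanish up to rounding, which reflects the two matching conditions of Theorem 2 and the hand-derived closed form.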

These high orders of convergence can also be achieved in the multidimensional setting provided the vector fields governing the SDE satisfy certain commutativity conditions. For example, this estimator has applications for simulating SDEs with additive noise:

$$dy_{t} = f(y_{t}) \, dt + \sigma \, dW_{t},$$

where $f$ is a smooth vector field on $\mathbb{R}^{d}$ , $\sigma>0$ is constant and $W$ is now $d$ -dimensional. By considering Theorem 2, we expect that $y_{t}$ is well approximated (for small $t$ ) by

$$\wideparen{y}_{t} + \frac{1}{30} t^{2} \sigma^{2} \Delta f(y_{0}),$$

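where $\wideparen{y}$ denotes the solution of the below ODE driven by a “Brownian parabola” $\wideparen{W}$,

$$d\wideparen{y}_{t} = f(\wideparen{y}_{t}) \, dt + \sigma \, d\wideparen{W}_{t}, \quad \wideparen{y}_{0} = y_{0}.$$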

This parabola-driven ODE can then be discretized using a three-stage Runge-Kutta method and the resulting SDE approximation shall be investigated in a future work.

Since these methods are based on the conditional expectation given by Theorem 2, they are designed to minimize the leading error term within local Taylor expansions. This sense of optimality is conceptually similar to that of the asymptotically efficient SDE approximations developed by Clark [8], Newton [9, 10] and Castell & Gaines [6]. The key difference is that we are employing additional integral information about $W$ . Hence, this line of research could provide a further insight into the approximation of Itô integrals using linear path information, where there already are a number of results concerning the computational complexity of methods (see [11], [12] and [13]).

Most notably, Tang and Xiao [11] consider the same triple iterated integral as in (2) and present an asymptotically optimal approximation that performs well when a limited number of random variables are used (see Table 2 for these numerical results). Whilst there are other senses of optimality (such as those discussed in [14] and [15]) that could be used when analysing the proposed approximations of Brownian motion and SDE solutions, we shall estimate errors in an $L^{2}(\mathbb{P})$ sense throughout the paper. In particular, we can apply the main result to quantify the error of the new estimator.

Theorem 3.

Using the same notation as before, we have the following variance:

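$$\operatorname{Var}\left( \int_{0}^{1} W_{u}^{2} \, du \,\Big|\, W_{1}, \int_{0}^{1} W_{u} \, du \right) = \frac{11}{6300} + \frac{1}{180} W_{1}^{2} + \frac{1}{175} \left( \int_{0}^{1} W_{u} \, du - \frac{1}{2} W_{1} \right)^{2}.$$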
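As a sanity check on the constants in Theorems 2 and 3, the two formulas can be tested against the law of total variance. The worked computation below is a sketch rather than part of the original argument. It assumes the shorthand $H := \int_{0}^{1} W_{u}\,du - \frac{1}{2}W_{1}$, so that $\wideparen{W}_{u} = uW_{1} + 6u(1-u)H$ on $[0,1]$, and it uses the standard facts that $W_{1} \sim N(0,1)$ and $H \sim N(0,\frac{1}{12})$ are independent and that $\operatorname{Var}\big(\int_{0}^{1} W_{u}^{2}\,du\big) = \frac{7}{12} - \frac{1}{4} = \frac{1}{3}$.

```latex
\begin{align*}
\int_0^1 \wideparen{W}_u^2 \, du
  &= \tfrac{1}{3} W_1^2 + W_1 H + \tfrac{6}{5} H^2, \\
\mathbb{E}\Big[\int_0^1 \wideparen{W}_u^2 \, du\Big] + \tfrac{1}{15}
  &= \tfrac{1}{3} + \tfrac{1}{10} + \tfrac{1}{15}
   = \tfrac{1}{2}
   = \mathbb{E}\Big[\int_0^1 W_u^2 \, du\Big], \\
\operatorname{Var}\Big(\int_0^1 \wideparen{W}_u^2 \, du\Big)
  &= \tfrac{2}{9} + \tfrac{1}{12} + \tfrac{1}{50}
   = \tfrac{293}{900}, \\
\mathbb{E}\Big[\tfrac{11}{6300} + \tfrac{1}{180} W_1^2 + \tfrac{1}{175} H^2\Big]
  &= \tfrac{11}{6300} + \tfrac{35}{6300} + \tfrac{3}{6300}
   = \tfrac{7}{900}, \\
\tfrac{293}{900} + \tfrac{7}{900}
  &= \tfrac{1}{3}
   = \operatorname{Var}\Big(\int_0^1 W_u^2 \, du\Big).
\end{align*}
```

The three terms in the third line are $\operatorname{Var}(\frac{1}{3}W_{1}^{2}) = \frac{2}{9}$, $\operatorname{Var}(W_{1}H) = \frac{1}{12}$ and $\operatorname{Var}(\frac{6}{5}H^{2}) = \frac{1}{50}$; the cross-covariances vanish because they involve odd moments of independent centered Gaussians.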

In Section 4, we demonstrate the applicability of these ideas to SDE simulation through various discretizations of Inhomogeneous Geometric Brownian Motion (IGBM)

$$dy_{t} = a(b - y_{t}) \, dt + \sigma y_{t} \, dW_{t},$$

where $a\geq 0$ and $b\in\mathbb{R}$ are the mean reversion parameters and $\sigma\geq 0$ is the volatility.

In mathematical finance, IGBM is an example of a short rate model that can be both mean-reverting and non-negative. It is therefore suitable for modelling interest rates, stochastic volatilities and default intensities [16]. From a mathematical viewpoint, IGBM is one of the simplest SDEs that has no known method of exact simulation [17]. By incorporating the ideas provided by the main theorem into the log-ODE method, we will produce a state-of-the-art numerical approximation of IGBM. Although the vector fields for IGBM are not bounded, our numerical evidence indicates that the method has a strong convergence rate of $O(h^{\frac{3}{2}})$ and a weak convergence rate of $O(h^{2})$ .
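For orientation, a baseline discretization of IGBM can be sketched as follows. This is the classical Milstein scheme, one of the comparison methods listed in the timing tables, and not the log-ODE or parabola method developed in the paper. The sketch assumes the IGBM equation is read in the Itô sense; the function name and all parameter values are hypothetical.

```python
import numpy as np


def igbm_milstein(y0, a, b, sigma, T, n_steps, rng):
    """Simulate one IGBM path with the Milstein scheme (Ito form assumed)."""
    h = T / n_steps
    y = np.empty(n_steps + 1)
    y[0] = y0
    dw = rng.normal(0.0, np.sqrt(h), size=n_steps)  # Brownian increments
    for i in range(n_steps):
        drift = a * (b - y[i]) * h
        diffusion = sigma * y[i] * dw[i]
        # Milstein correction: 0.5 * g * g' * (dW^2 - h) with g(y) = sigma * y
        correction = 0.5 * sigma**2 * y[i] * (dw[i] ** 2 - h)
        y[i + 1] = y[i] + drift + diffusion + correction
    return y


rng = np.random.default_rng(seed=0)
path = igbm_milstein(y0=0.06, a=0.1, b=0.04, sigma=0.6, T=5.0, n_steps=500, rng=rng)
print(path[-1])
```

With `sigma=0` the scheme reduces to the explicit Euler method for the mean-reverting ODE, whose solution $b + (y_{0} - b)e^{-at}$ gives a quick deterministic check of the drift term.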

1.1 Notation

Below is some of the notation that is used throughout the paper.

2 Main result

It was shown in [4] that Brownian motion can be generated using Alpert-Rokhlin multiwavelets (see [18]). The mother functions that generate this wavelet basis are supported on $[\hskip 0.28453pt0,1]$ and are defined using polynomials as follows:

Definition 4 (Alpert-Rokhlin wavelets).

For $q\geq 1$ , define the $q$ functions $\phi^{q,1},\cdots,\phi^{q,q}:[\hskip 0.42677pt0,1\hskip 0.42677pt]\rightarrow\mathbb{R}$ as piecewise polynomials of degree $q-1$ with pieces on $[\hskip 0.42677pt0,\frac{1}{2}\hskip 0.42677pt]$ , $[\hskip 0.42677pt\frac{1}{2},1\hskip 0.42677pt]$ that satisfy the following conditions for all $p\in\left\{1,\,\cdots\,,q\hskip 0.42677pt\right\}$ and $t\in[\hskip 0.42677pt0,\frac{1}{2}\hskip 0.42677pt)$ $:$

[TABLE]

The Alpert-Rokhlin multiwavelets of order $q$ can now be generated by translating and scaling the mother functions $\phi^{q,p}$:

$$\phi_{nk}^{q,p}(t) := \frac{1}{2^{n}} \, \phi^{q,p}(2^{n} t - k),$$

for $n\geq 0$ and $k\in\{0,\cdots,2^{n}-1\}$.

Whilst our results will not be presented in terms of the above wavelets, we shall see that the polynomials of interest are directly related to conditions (3), (4) and (5). The main result of this paper gives an effective method for approximating sample paths of Brownian motion by a class of Jacobi-like polynomials. The proof is based on the interpretation of these polynomials as eigenfunctions of an integral operator defined by the Brownian bridge covariance function (the Brownian bridge is the centered Gaussian process with covariance $K_{B}(s,t)=\min(s,t)-st$). These orthogonal polynomials, which lie at the heart of this paper, will also help us interpret the geometrical features that certain normally distributed iterated integrals encode about the Brownian path.

Theorem 5 (A polynomial Karhunen-Loève theorem for the Brownian bridge).

Let $B$ denote a Brownian bridge on $[0,1]$ and consider the Borel measure $\mu$ given by

$$\mu(a,b) := \int_{a}^{b} \frac{1}{x(1-x)} \, dx, \quad \text{for all open intervals } (a,b) \subset [0,1].$$

Then there exists a family of orthogonal polynomials $\{e_{k}\}_{k\geq 1}$ with $\deg(e_{k})=k+1$ and

$$\int_{0}^{1} e_{i} \, e_{j} \, d\mu = \delta_{ij},$$

with $\delta_{ij}$ denoting the Kronecker delta, such that $B$ admits the following representation

$$B = \sum_{k=1}^{\infty} I_{k} \, e_{k}, \tag{6}$$

where $\{I_{k}\}$ is the collection of independent centered Gaussian random variables with

$$I_{k} := \int_{0}^{1} B_{t} \cdot \frac{e_{k}(t)}{t(1-t)} \, dt, \tag{7}$$

and

$$\operatorname{Var}(I_{k}) = \frac{1}{k(k+1)}.$$

Furthermore, $\{e_{k}\}$ is an optimal orthonormal basis of $L^{2}([0,1],\mu)$ for approximating $B$ by truncated series expansions with respect to the following weighted $L^{2}(\mathbb{P})$ norm

$$\|X\|_{L_{\mu}^{2}(\mathbb{P})} := \mathbb{E}\left[ \int_{0}^{1} (X_{s})^{2} \, d\mu(s) \right],$$

where $X$ is a square $\mu$-integrable process.

Proof.

Our argument is that of the Karhunen-Loève theorem in general $L^{2}$ spaces. Note that $B$ is a square $\mu$-integrable process as

$$\mathbb{E}\left[ \int_{0}^{1} (B_{s})^{2} \, d\mu(s) \right] = \int_{0}^{1} \mathbb{E}\big[ (B_{s})^{2} \big] \, d\mu(s) = \int_{0}^{1} s(1-s) \cdot \frac{1}{s(1-s)} \, ds = 1 < \infty.$$

Let $K_{B}$ denote the covariance function for the standard Brownian bridge on $[0,1]$. Since $K_{B}(s,t)=\min(s,t)-st$, it can be shown by direct calculation that $K_{B}$ satisfies

$$\|K_{B}\|_{L^{2}([0,1]^{2},\,\mu^{2})}^{2} = \int_{0}^{1} \int_{0}^{1} (\min(s,t) - st)^{2} \, d\mu(s) \, d\mu(t) = \frac{1}{3}\pi^{2} - 3 < \infty.$$

Hence, it follows that the integral operator $T_{K}:L^{2}([0,1],\,\mu)\rightarrow L^{2}([0,1],\,\mu)$ given by

$$(T_{K} f)(t) := \int_{0}^{1} K_{B}(s,t) f(s) \, d\mu(s),$$

is well-defined and continuous. In addition, the variance function $k_{B}(x):=K_{B}(x,x)$ for $x\in[0,1]$ is $\mu$-integrable as

$$\int_{0}^{1} |k_{B}(x)| \, d\mu(x) = \int_{0}^{1} x(1-x) \cdot \frac{1}{x(1-x)} \, dx = 1 < \infty.$$

Therefore, we can apply Mercer’s theorem for kernels on general $L^{2}$ spaces (see [19]). It then follows from Mercer’s theorem that there exists an orthonormal set $\{e_{k}\}_{k\geq 1}$ of $L^{2}([0,1],\,\mu)$ consisting of eigenfunctions of $T_{K}$ such that the corresponding sequence of eigenvalues $\{\lambda_{k}\}_{k\geq 1}$ is non-negative. Moreover, the eigenfunctions corresponding to non-zero eigenvalues are continuous on $[0,1]$ and the kernel $K_{B}$ has the representation

$$K_{B}(s,t) = \sum_{k=1}^{\infty} \lambda_{k} \, e_{k}(s) \, e_{k}(t), \tag{8}$$

where the series (8) converges absolutely and uniformly on compact subsets of $[0,1]$.

In the next part of the proof, we will see that each $e_{k}$ is a polynomial of degree $k+1$. As each $e_{k}$ is an eigenfunction of $T_{K}$, we have

$$\int_{0}^{1} \frac{\min(s,t) - st}{s(1-s)} \, e_{k}(s) \, ds = \lambda_{k} e_{k}(t). \tag{9}$$

Since $e_{k}\in L^{2}([0,1],\,\mu)$, it follows that $e_{k}(0)=0$ and $e_{k}(1)=0$ for each $k\geq 1$. Therefore by using the Leibniz integral rule to twice differentiate both sides of (9) and then multiplying by $t(1-t)$, we observe that $e_{k}$ satisfies the differential equation

$$t(1-t) \lambda_{k} e_{k}^{\prime\prime}(t) + e_{k}(t) = 0. \tag{10}$$

Since $e_{k}\neq 0$, we have that $\lambda_{k}\neq 0$. Differentiating the LHS of the ODE (10) produces

$$t(1-t) \frac{d^{2}}{dt^{2}} (e_{k}^{\prime}) + (1-2t) \frac{d}{dt} (e_{k}^{\prime}) + \frac{1}{\lambda_{k}} e_{k}^{\prime} = 0.$$

For $x\in[-1,1]$, we define the function

$$y_{k}(x) := e_{k}^{\prime}\big( \tfrac{1}{2}(1+x) \big).$$

Thus $y_{k}$ satisfies the following differential equation

$$(1-x^{2}) y_{k}^{\prime\prime}(x) - 2x y_{k}^{\prime}(x) + \frac{1}{\lambda_{k}} y_{k}(x) = 0.$$

Remarkably, this is the Legendre differential equation [20]. It then follows using classical Sturm-Liouville theory that $\frac{1}{\lambda_{k}}=k(k+1)$ and $y_{k}$ is proportional to the $k$-th Legendre polynomial. Therefore, the derivative $e_{k}^{\prime}$ will be a constant multiple of the $k$-th shifted Legendre polynomial and hence each $e_{k}$ is a polynomial of degree $k+1$.

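We can now define the following integrals for $k\geq 1$,

$$I_{k} := \int_{0}^{1} B_{t} \cdot \frac{e_{k}(t)}{t(1-t)} \, dt.$$

It follows from Fubini’s theorem that

$$\mathbb{E}[I_{k}] = \int_{0}^{1} \mathbb{E}[B_{t}] \cdot \frac{e_{k}(t)}{t(1-t)} \, dt = 0,$$

$$\mathbb{E}[I_{i} I_{j}] = \int_{0}^{1} \int_{0}^{1} K_{B}(s,t) \, e_{i}(s) \, e_{j}(t) \, d\mu(s) \, d\mu(t) = \lambda_{i} \int_{0}^{1} e_{i} \, e_{j} \, d\mu = \lambda_{i} \delta_{ij}.$$

Since each $I_{k}$ is defined by a linear functional on the same Gaussian process $B$, we see from the above that $\{I_{k}\}$ is a collection of uncorrelated (and therefore independent) Gaussian random variables with

$$\operatorname{Var}(I_{k}) = \lambda_{k} = \frac{1}{k(k+1)}.$$

Finally, the $L^{2}(\mathbb{P})$ convergence we require follows as

$$\mathbb{E}\Big[ \Big( B_{t} - \sum_{k=1}^{n} I_{k} e_{k}(t) \Big)^{2} \Big] = K_{B}(t,t) - \sum_{k=1}^{n} \lambda_{k} \, e_{k}(t)^{2},$$

which converges to $0$ by Mercer’s theorem (8).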

All that remains is to prove optimality for the truncated series expansions of (6). Let $\left\{f_{k}\right\}_{k\geq 1}$ denote an orthonormal basis of $L^{2}([\hskip 0.28453pt0,1],\mu)$ such that

[TABLE]

For $n\geq 1$ , we consider an error process associated with the above: $r_{n}:=\sum\limits_{k=n+1}^{\infty}J_{k}f_{k}\,$ .

Then the square $L^{2}(\mathbb{P})$ norm of the $n$ -th error process admits the following expansion,

[TABLE]

Integrating the above with respect to $\mu$ and using the orthogonality of $\{f_{k}\}_{k\geq 1}$ gives

[TABLE]

Note that any optimal orthonormal basis of $L^{2}([\hskip 0.28453pt0,1],\mu)$ solves the following problem:

[TABLE]

By introducing Lagrange multipliers $\nu_{k}$ , we wish to find functions $\{f_{k}\}$ that minimize

[TABLE]

We will now consider the following square integrable functions, defined for $s,t\in\left(0,1\right)$ :

[TABLE]

Therefore it is enough to find a family of functions $\{\tilde{f}_{k}\}$ in $L^{2}([\hskip 0.28453pt0,1])$ which minimizes

[TABLE]

To find a minimizer, we set the functional derivative of $\tilde{E}_{n}$ with respect to $\tilde{f}_{k}$ to zero.

[TABLE]

By using the definitions of $\tilde{f}_{k}$ and $\tilde{K}_{B}$ , it is trivial to show the above is equivalent to

[TABLE]

which is satisfied if and only if $f_{k}$ are eigenfunctions of $T_{K}$ . ∎
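The conclusions of Theorem 5 can be checked numerically. The sketch below builds candidate eigenfunctions as differences of shifted Legendre polynomials, which the proof implies up to a constant, and then tests orthonormality with respect to $\mu$ together with the ODE (10) for $\lambda_{k} = \frac{1}{k(k+1)}$. The normalising constant $c_{k} = \frac{1}{2}\sqrt{k(k+1)/(2k+1)}$ and the function names are assumptions of this sketch: the constant was derived here from $\int_{0}^{1} e_{k}^{2}\,d\mu = 1$ and is not quoted from the paper.

```python
import numpy as np
from numpy.polynomial import Legendre
from numpy.polynomial.legendre import leggauss


def e(k):
    """Candidate e_k on [0, 1]: c_k * (Q_{k+1}(2t - 1) - Q_{k-1}(2t - 1))."""
    coeffs = np.zeros(k + 2)
    coeffs[k + 1], coeffs[k - 1] = 1.0, -1.0
    c_k = 0.5 * np.sqrt(k * (k + 1) / (2 * k + 1))  # assumed normalisation (derived, not quoted)
    return Legendre(c_k * coeffs, domain=[0, 1])     # domain [0, 1] maps t to x = 2t - 1


def gram_matrix(n, nodes=40):
    """Gram matrix of e_1, ..., e_n in L^2([0, 1], mu) with mu(dt) = dt / (t(1 - t))."""
    x, w = leggauss(nodes)               # Gauss-Legendre rule on [-1, 1]
    t, w = 0.5 * (x + 1.0), 0.5 * w      # rescaled to [0, 1]; nodes never hit the endpoints
    vals = np.array([e(k)(t) for k in range(1, n + 1)])
    return (vals * (w / (t * (1.0 - t)))) @ vals.T


def ode_residual(k, t):
    """Residual of t(1 - t) * lambda_k * e_k'' + e_k with lambda_k = 1 / (k(k + 1))."""
    p = e(k)
    return t * (1.0 - t) * p.deriv(2)(t) / (k * (k + 1)) + p(t)


grid = np.linspace(0.0, 1.0, 11)
print(np.round(gram_matrix(5), 12))
print(max(np.abs(ode_residual(k, grid)).max() for k in range(1, 6)))
```

For $k=1$ the candidate reduces to $e_{1}(t) = \sqrt{6}\,t(t-1)$, for which both checks can also be done by hand.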

This result can naturally be extended to express Brownian motion using polynomials.

Theorem 6.

If $W$ is a standard Brownian motion and $B$ is the associated bridge process on $[0,1]$, then by Theorem 5, we have the below representation of $W$:

$$W = W_{1} \, e_{0} + \sum_{k=1}^{\infty} I_{k} \, e_{k}, \tag{12}$$

where $e_{0}(t):=t$ for $t\in[0,1]$, and the random variables $\{I_{k}\}$ are independent of $W_{1}$.
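Truncating the representation in Theorem 6 gives a direct recipe for sampling a polynomial approximation of a Brownian path from $n$ independent Gaussian variables. The sketch below is illustrative only: the function names are hypothetical, and the basis is built as a normalised difference of shifted Legendre polynomials with a constant derived for this sketch rather than quoted from the paper. The second function evaluates the variance of the truncated expansion, which by Mercer's theorem should increase towards $\operatorname{Var}(W_{t}) = t$.

```python
import numpy as np
from numpy.polynomial import Legendre


def e(k):
    """Candidate e_k on [0, 1]: c_k * (Q_{k+1}(2t - 1) - Q_{k-1}(2t - 1))."""
    coeffs = np.zeros(k + 2)
    coeffs[k + 1], coeffs[k - 1] = 1.0, -1.0
    c_k = 0.5 * np.sqrt(k * (k + 1) / (2 * k + 1))  # assumed normalisation (derived, not quoted)
    return Legendre(c_k * coeffs, domain=[0, 1])


def sample_polynomial_path(n, t, rng):
    """Degree-n polynomial W_1 * t + sum_{k=1}^{n-1} I_k e_k(t) evaluated at times t."""
    w1 = rng.standard_normal()
    path = w1 * t                                   # W_1 * e_0(t) with e_0(t) = t
    for k in range(1, n):
        i_k = rng.standard_normal() / np.sqrt(k * (k + 1))  # Var(I_k) = 1 / (k(k + 1))
        path = path + i_k * e(k)(t)
    return w1, path


def approximation_variance(n, t):
    """Variance of the truncated expansion: t^2 + sum_{k=1}^{n-1} e_k(t)^2 / (k(k + 1))."""
    var = t**2
    for k in range(1, n):
        var = var + e(k)(t) ** 2 / (k * (k + 1))
    return var


rng = np.random.default_rng(seed=0)
t = np.linspace(0.0, 1.0, 101)
w1, path = sample_polynomial_path(8, t, rng)
print(path[0], path[-1] - w1)  # the polynomial starts at 0 and ends at W_1
print(approximation_variance(50, np.array([0.25, 0.5, 0.75])))
```

Because every $e_{k}$ vanishes at both endpoints, the sampled polynomial always starts at zero and reproduces the increment $W_{1}$ exactly, whatever the truncation level.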

In the rest of this section, we shall study the key objects introduced in Theorem 5. Since each orthogonal polynomial lies in $L^{2}([0,1],\,\mu)$, it must have roots at $0$ and $1$. Therefore $e_{k}\cdot\frac{1}{t(1-t)}$ is itself a polynomial but with degree $k-1$, and one can repeatedly apply the integration by parts formula to the stochastic integrals $\{I_{k}\}$ defined by (7). This enables us to express each $I_{k}$ in terms of iterated integrals of Brownian motion. Moreover, as $e_{k}\cdot\frac{1}{t(1-t)}$ has precisely $k-2$ non-zero derivatives, the highest order iterated integral that is required to fully describe $I_{k}$ is $\int_{0<s_{1}<\cdots<s_{k}<1}B_{s_{1}}\,ds_{1}\,\cdots\,ds_{k}$.

So by applying the integration by parts formula as above, we can construct a lower triangular $n\times n$ matrix $M_{n}$ with non-zero diagonal entries that characterizes the relationship between $\{I_{k}\}_{1\leq k\leq n}$ and a set of $n$ iterated integrals of Brownian motion.

Hence, for $n\geq 1$ , we can express the $n$ independent Gaussian integrals $\{I_{k}\}_{1\leq k\leq n}$ as

[TABLE]

Since $M_{n}$ is an invertible matrix, it follows that the column vectors appearing in (13) both encode the same information about the Brownian bridge. This enables us to establish a connection between Brownian motion, iterated integrals and polynomials.

Theorem 7.

Consider the below conditional expectation of Brownian motion,

[TABLE]

where $t\in[0,1]$. Then $W^{n}$ is the unique polynomial of degree $n$ with a root at $0$ that matches the increment $W_{1}$ and $n-1$ iterated time integrals of the path $W$ given by:

[TABLE]

Proof.

It is a direct consequence of (13) that $W_{t}^{n}=\mathbb{E}[W_{t}\,|\,W_{1},I_{1},\cdots,I_{n-1}]$. Hence by (12) and independence of the random variables $\{W_{1},I_{1},\cdots\}$, we have that

$$W_{t}^{n} = W_{1} \, e_{0}(t) + \sum_{k=1}^{n-1} I_{k} \, e_{k}(t). \tag{16}$$

Thus $W^{n}$ is indeed a polynomial of degree $n$ with a root at $0$ and that matches the increment of the Brownian path. Without loss of generality we can now assume $n\geq 2$. All that remains is to argue $W^{n}$ matches the $n-1$ iterated integrals given in (15). Using the orthogonality of $\{e_{k}\}$, it follows directly from (16) that for $1\leq k\leq n-1$:

[TABLE]

Hence $W^{n}$ matches the integrals of Brownian motion against polynomials with degree at most $n-1$ . By the same argument used in the derivation of (13), it follows that $W^{n}$ matches the various iterated time integrals given in the statement of the theorem. The uniqueness of $W^{n}$ is now a consequence of having $n+1$ different constraints. ∎
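The matching property used in this proof is a statement about orthogonal projection in $L^{2}([0,1],\mu)$, so it holds path by path and can be illustrated without any randomness. The sketch below projects a smooth function vanishing at both endpoints, $\sin(\pi t)$, which is a hypothetical stand-in for a Brownian bridge sample path, onto $e_{1},\cdots,e_{n-1}$ and checks that the remainder integrates to zero against $1, t, \cdots, t^{n-2}$. The basis construction and its normalising constant are assumptions of this sketch, as are the function names.

```python
import numpy as np
from numpy.polynomial import Legendre
from numpy.polynomial.legendre import leggauss


def e(k):
    """Candidate e_k on [0, 1]: c_k * (Q_{k+1}(2t - 1) - Q_{k-1}(2t - 1))."""
    coeffs = np.zeros(k + 2)
    coeffs[k + 1], coeffs[k - 1] = 1.0, -1.0
    c_k = 0.5 * np.sqrt(k * (k + 1) / (2 * k + 1))  # assumed normalisation (derived, not quoted)
    return Legendre(c_k * coeffs, domain=[0, 1])


def project(path_values, t, w, n):
    """Return the coefficients I_1..I_{n-1} and the projection sum_k I_k e_k at the nodes t."""
    weight = w / (t * (1.0 - t))                     # quadrature weights for the measure mu
    coeffs = [np.sum(weight * path_values * e(k)(t)) for k in range(1, n)]
    approx = sum(c * e(k)(t) for k, c in enumerate(coeffs, start=1))
    return np.array(coeffs), approx


x, w = leggauss(60)                                  # Gauss-Legendre rule on [-1, 1]
t, w = 0.5 * (x + 1.0), 0.5 * w                      # rescaled to [0, 1]
bridge = np.sin(np.pi * t)                           # hypothetical stand-in for a bridge path
n = 5
coeffs, approx = project(bridge, t, w, n)

# Integrals of the remainder against 1, t, ..., t^(n-2); these should all vanish.
moments = np.array([np.sum(w * (bridge - approx) * t**j) for j in range(n - 1)])
print(moments)
```

The check works because $e_{k}(t)/(t(1-t))$ is a polynomial of degree $k-1$, so the functions $e_{1}/(t(1-t)),\cdots,e_{n-1}/(t(1-t))$ span all polynomials of degree at most $n-2$.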

2.1 Properties of orthogonal polynomials

Although Theorem 5 and Theorem 6 are interesting results from a theoretical point of view, both lack an explicit construction of the polynomials $\left\{e_{k}\right\}$ that could be implemented in practice. On the other hand, it was shown that the defining eigenfunction property of each $e_{k}$ implies that its derivative $e_{k}^{\prime}$ is proportional to the $k$ -th shifted Legendre polynomial. Hence the family $\left\{e_{k}\right\}$ is the (normalized) shifted $(\alpha,\beta)$ -Jacobi polynomials but with $\alpha=\beta=-1$ . Since Jacobi polynomials are typically studied with $\alpha,\beta>-1$ , it is necessary to show there exists a well-defined limit when the parameters approach $-1$ .

Definition 8.

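For $k\geq 2$, the $k$-th degree $(\text{-}1,\text{-}1)$-Jacobi polynomial $P_{k}^{(\text{-}1,\text{-}1)}$ is

$$P_{k}^{(\text{-}1,\text{-}1)}(x) := \lim_{\alpha,\beta\,\rightarrow\,-1^{+}} P_{k}^{(\alpha,\beta)}(x), \quad \text{for } x\in[-1,1].$$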

Naturally, for this definition to be unambiguous, we will require the following lemma.

Lemma 9.

Let $P_{k}^{(\alpha,\beta)}$ denote the $k$ -th degree $(\alpha,\beta)$ -Jacobi polynomial on $[-1,1]$ . Then for $k\geq 2$ , there exists a real-valued polynomial $P_{k}$ such that $\big{\|}P_{k}-P_{k}^{(\alpha,\beta)}\big{\|}_{\infty}\rightarrow 0$ as $\alpha,\beta\rightarrow-1^{+}$ .

Proof.

Below is an identity for Jacobi polynomials, with $\alpha,\beta>-1$ , given in [21].

[TABLE]

Therefore, we shall define the $k$ -th degree polynomial $P_{k}$ over the interval $[-1,1]$ by

[TABLE]

It is straightforward to verify that $\lim_{\hskip 0.7113pt\alpha,\beta\,\rightarrow\,0}\|\hskip 1.42262ptP_{n}^{(\alpha,\beta)}-\hskip 0.7113ptP_{n}^{(0,0)}\hskip 0.7113pt\|_{\infty}=0$ for $n\in\{0,1\}$ . So by induction and the recurrence relation for Jacobi polynomials (see [21]), we have:

[TABLE]

for all $n\geq 0$ . Hence by the dominated convergence theorem with (17) and (18), it follows that $P_{k}^{(\alpha,\beta)}$ will converge pointwise to $P_{k}$ as $\alpha,\beta\,\rightarrow\,-1$ for each $k\geq 1$ . Finally, the result follows as $P_{k}^{(\alpha,\beta)}$ and $P_{k}$ are always polynomials with degree $k$ . ∎

Using the above definition for $(\text{-}1,\text{-}1)$ -Jacobi polynomials, we can give an explicit formula for the orthonormal polynomials $\{e_{k}\}_{k\geq 1}$ appearing in Theorems 5 and 6.

Theorem 10.

Suppose each $e_{k}$ has a positive leading coefficient. Then for $k\geq 1$ ,

[TABLE]

Proof.

The following identity for $(\alpha,\beta)$ -Jacobi polynomials is stated in [21]:

[TABLE]

for $n\geq 1$ and $\alpha,\beta>-1$ . Applying the change of variables, $t:=\frac{1}{2}(x+1)$ , we have

[TABLE]

for $n\geq 1$ and $\alpha,\beta>-1$ . By definition 8, taking the limit $\alpha,\beta\rightarrow-1^{+}$ will yield

[TABLE]

Therefore by setting $k:=n-1$ , we have

[TABLE]

Recall that $e_{k}^{\prime}$ is proportional to the $k$ -th shifted Legendre polynomial $P_{k}^{(0,0)}(2t-1)$ . Similarly, we saw in the proof of Lemma 9 that the derivative of $P_{k+1}^{(\text{-}1,\text{-}1)}$ is $\frac{k}{2}\hskip 0.7113ptP_{k}^{(0,0)}$ . As $e_{k}$ and $P_{k+1}^{(\text{-}1,\text{-}1)}$ are zero at their respective endpoints, we have that each $e_{k}$ must be proportional to $P_{k+1}^{(\text{-}1,\text{-}1)}(2t-1)$ . The result now follows from the above calculations. ∎

Having identified an explicit formula for the eigenfunctions $\{e_{k}\}$ in (19), we shall now describe two methodologies for computing the Jacobi-like polynomials $\big{\{}P_{k}^{(\text{-}1,\text{-}1)}\big{\}}$ .

The first approach is to use the three-term recurrence relation in the theorem below.

Theorem 11 (Recurrence relation for $(\text{-}1,\text{-}1)$ -Jacobi polynomials).

For $n\geq 2$ ,

[TABLE]

where the initial polynomials are given by

[TABLE]

Proof.

The below recurrence relation for Jacobi polynomials is presented in [21],

[TABLE]

for $k\geq 1$ and $\alpha,\beta>-1$. By definition 8, it is possible to take the limit $\alpha,\beta\rightarrow-1^{+}$ provided that $k\geq 3$. Therefore, taking this limit and setting $n=k-1\geq 2$ produces

[TABLE]

for $n\geq 2$ . Dividing the above by $4n$ gives the required recurrence relation (20). Finally the below formula, stated in [21], can be used to compute $P_{2}^{(\text{-}1,\text{-}1)}$ and $P_{3}^{(\text{-}1,\text{-}1)}$ :

[TABLE]

As before, we take $\alpha,\beta\rightarrow-1^{+}$ in the above to obtain an explicit formula for $P_{n}^{(\text{-}1,\text{-}1)}$. ∎

In addition to computing these polynomials via a recurrence relation, it is also possible to represent each $P_{n}^{(\text{-}1,\text{-}1)}$ as the difference of two (rescaled) Legendre polynomials. Since the Legendre polynomials already have efficient implementations in the majority of high-level programming languages, this second approach is particularly appealing.

Theorem 12 (Relationship between the Jacobi-like and Legendre polynomials).

For $n\geq 1$ , we have

[TABLE]

where $Q_{k}$ denotes the $k$ -th degree Legendre polynomial defined on $[-1,1]$ .

Proof.

Recall that $\frac{d}{dx}\big{(}P^{(\text{-}1,\text{-}1)}_{n+1}\hskip 0.7113pt\big{)}=\frac{n}{2}P^{(0,0)}_{n}$ for $n\geq 1$ , where $P^{(0,0)}_{n}(=Q_{n})$ is the $n$ -th degree Legendre polynomial. Therefore differentiating both sides of (20) yields

[TABLE]

Hence by simplifying and rearranging the above, we have that for $n\geq 1$ ,

[TABLE]

We see the last term is zero by a recurrence relation for Legendre polynomials [20]. ∎
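As an illustration of this second approach, the following Python sketch builds $P_{n+1}^{(\text{-}1,\text{-}1)}$ from NumPy's Legendre series. The constant $\frac{n}{2(2n+1)}$ is not copied from the displayed formula; it is an assumption obtained by integrating the identity $\frac{d}{dx}P^{(\text{-}1,\text{-}1)}_{n+1}=\frac{n}{2}Q_{n}$ from $-1$ and using $(2n+1)\int Q_{n}=Q_{n+1}-Q_{n-1}$. The function name `jacobi_like` is hypothetical.

```python
import numpy as np
from numpy.polynomial import Legendre


def jacobi_like(m: int) -> Legendre:
    """Return P_m^{(-1,-1)} on [-1, 1] for m >= 2 as a Legendre series.

    Assumed form: P_{n+1}^{(-1,-1)} = n / (2(2n+1)) * (Q_{n+1} - Q_{n-1}),
    which follows from d/dx P_{n+1}^{(-1,-1)} = (n/2) Q_n and P(-1) = 0.
    """
    if m < 2:
        raise ValueError("m must be at least 2")
    n = m - 1
    scale = n / (2.0 * (2 * n + 1))
    coef = np.zeros(m + 1)
    coef[n + 1] = scale   # coefficient of Q_{n+1}
    coef[n - 1] = -scale  # coefficient of Q_{n-1}
    return Legendre(coef)


if __name__ == "__main__":
    p2 = jacobi_like(2)
    # For m = 2 the series reduces to (x^2 - 1) / 4.
    print(p2(0.0))
```

Since the result is a `Legendre` object, it can be evaluated, differentiated or integrated with the usual NumPy series methods.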

In addition to viewing the polynomials $\{e_{k}\}$ as orthogonal with respect to the weight function $w(x):=\frac{1}{x(1-x)}$ , we can characterize them via their iterated time integrals. In particular, for $1\leq k\leq n-1$ , it follows from the integration by parts formula that

[TABLE]

Hence for $k\geq 1$, $e_{k}$ is a polynomial of degree $k+1$ that has roots at $0$ and $1$ as well as $k-1$ trivial iterated integrals against time. By additionally specifying the $k$-th iterated time integral, it is then possible to characterize the $k$-th polynomial $e_{k}$.
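The orthogonality with respect to $w(x)=\frac{1}{x(1-x)}$ can be checked numerically. The sketch below is a minimal illustration under stated assumptions: it takes $e_{k}(t)=\frac{1}{2}\sqrt{\frac{k(k+1)}{2k+1}}\,\big(Q_{k-1}(2t-1)-Q_{k+1}(2t-1)\big)$, a normalization derived here from the derivative identity for $P_{k+1}^{(\text{-}1,\text{-}1)}$ rather than quoted from (19), with the sign chosen so that $e_{1}(t)=\sqrt{6}\,t(1-t)$. The names `e` and `weighted_gram` are hypothetical.

```python
import numpy as np
from numpy.polynomial import Legendre


def e(k: int, t):
    """Evaluate the assumed form of e_k at t in [0, 1] (k >= 1)."""
    x = 2.0 * np.asarray(t, dtype=float) - 1.0
    scale = 0.5 * np.sqrt(k * (k + 1) / (2 * k + 1))
    return scale * (Legendre.basis(k - 1)(x) - Legendre.basis(k + 1)(x))


def weighted_gram(K: int, nodes: int = 40) -> np.ndarray:
    """Gram matrix of e_1..e_K under the weight 1 / (t (1 - t)) on [0, 1]."""
    x, w = np.polynomial.legendre.leggauss(nodes)
    t, w = 0.5 * (x + 1.0), 0.5 * w          # map nodes from [-1, 1] to [0, 1]
    E = np.stack([e(k, t) for k in range(1, K + 1)])
    # Gauss nodes are interior points, so the weight is finite at every node.
    return (E * (w / (t * (1.0 - t)))) @ E.T


if __name__ == "__main__":
    print(np.round(weighted_gram(4), 10))
```

Because each $e_{k}$ vanishes at both endpoints, the weighted integrand is itself a polynomial, so Gauss-Legendre quadrature with enough nodes evaluates the Gram matrix exactly up to rounding.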

To conclude this section, we will address the relationship between the orthogonal Jacobi-like polynomials $\{e_{k}\}$ and the Alpert-Rokhlin wavelets given in definition 4. Since each $e_{k}^{\prime}$ is proportional to the $k$ -th shifted Legendre polynomial, the family of polynomials $\{e_{k}^{\prime}\}$ is orthogonal with respect to the standard $L^{2}([\hskip 0.28453pt0,1])$ inner product. This orthogonality is exactly what is needed to satisfy the conditions (4) and (5). Hence for any $q\geq 1$ there exists an Alpert-Rokhlin mother function of order $q$ that is a piecewise polynomial where both pieces can be rescaled and translated to give $e_{q-1}^{\prime}$ .

3 Applications to SDEs

Consider the Stratonovich SDE on the interval $[\hskip 0.28453pt0,T]$

[TABLE]

where $\xi\in\mathbb{R}^{e}$ and $f_{i}$ denote bounded $C^{\infty}$ vector fields on $\mathbb{R}^{e}$ with bounded derivatives. It then follows from the standard Picard iteration argument that there exists a unique strong solution $y$ to (21). An important tool in the numerical analysis of this solution is the stochastic Taylor expansion (see chapter 5 of [22] for a comprehensive review). For the purposes of this paper, we only require the following specific Taylor expansion.

Theorem 13 (High order Stratonovich-Taylor expansion).

Let $y$ denote the unique strong solution to (21) and let $0\leq s\leq t$ . Then $y_{t}$ can be expanded as follows:

[TABLE]

where $h:=t-s$ and the remainder term has the following uniform estimate for $h<1$ ,

[TABLE]

where the constant $C>0$ depends only on the vector fields of the differential equation.

From a numerical perspective, the most challenging terms presented in (22) are those that involve non-trivial third order iterated integrals of Brownian motion and time. Moreover, the most significant source of discretization error that high order numerical methods will experience is generally due to approximating these stochastic integrals. By representing Brownian motion as a (random) polynomial plus independent noise, we shall derive a new optimal and unbiased estimator for these third order integrals.

Theorem 14.

Let $W$ denote a standard real-valued Brownian motion on $[0,1]$. Let $W^{n}$ be the unique $n$-th degree random polynomial with a root at $0$ and satisfying

[TABLE]

Then $W=\hskip 0.7113ptW^{n}+Z^{n}$ , where $Z^{n}$ is a centered Gaussian process independent of $W^{n}$ .

Furthermore, $Z^{n}$ has the following covariance function:

[TABLE]

where $K_{B}$ denotes the standard Brownian bridge covariance function and $\{\lambda_{k}\}$ , $\{e_{k}\}$ are the eigenvalues and eigenfunctions that were defined in the proof of Theorem 5.

Proof.

It follows from the integration by parts formula that $W^{n}$ matches the increment and $n-1$ iterated time integrals of Brownian motion that appear in (15). Hence $W^{n}$ is also the polynomial defined in Theorem 7 and $W=W^{n}+Z^{n}$ where

[TABLE]

Then by Theorem 5, $Z^{n}$ is a centered Gaussian process that is independent of $W^{n}$ . In addition, the covariance function defining $Z^{n}$ can be directly computed as follows:

[TABLE]

Note that the final line is achieved using the representation of $K_{B}$ given by (8). ∎

The above theorem has an interesting conclusion, namely that there exist unbiased polynomial approximants of Brownian motion for which the error process can be independently estimated in an $L^{2}(\mathbb{P})$ sense. In particular, this theorem already has numerical applications in the case when $n=2$ and motivates the following definitions:

Definition 15.

The standard Brownian parabola $\,\wideparen{W}$ is the unique quadratic polynomial on $[0,1]$ with a root at $0$ and satisfying

[TABLE]

Definition 16.

The standard Brownian arch $\,Z$ is the process $Z:=W-\wideparen{W}$ . By Theorem 14, $Z$ is the centered Gaussian process on $[\hskip 0.28453pt0,1]$ with covariance function

[TABLE]

Definition 17.

The rescaled space-time Lévy area of Brownian motion over an interval $[s,t]$ with length $h$ encodes the signed area of the associated bridge process,

[TABLE]

Remark 3.18.

Since $e_{1}(t)=\sqrt{6}\,t(1-t)$ , we have that $H_{0,1}$ corresponds to $\frac{\sqrt{6}}{6}I_{1}$ as defined in Theorem 5. Thus, $H_{s,t}\sim\mathcal{N}\left(0,\frac{1}{12}h\right)$ and $H_{s,t}$ is independent of $\hskip 0.7113ptW_{s,t}\hskip 0.7113pt$ .

By applying the natural scaling of Brownian motion, one can define the Brownian parabola and Brownian arch processes over any interval $[s,t]$ with finite size $h=t-s$. Whilst the Brownian arch can be viewed in a similar light to the Brownian bridge, there are clear qualitative and quantitative differences in their covariance functions. In particular, the Brownian arch has less variance at its midpoint compared to most points in $[s,t]$ (by which we mean that $\big|\{u\in[s,t]:\operatorname{Var}(Z_{u})\leq\operatorname{Var}(Z_{\frac{1}{2}(s+t)})\}\big|<\frac{1}{2}h$). This is in contrast to the Brownian bridge, which has most variance at its midpoint. In fact, the Brownian parabola gives a relatively uniform estimate of the original path.
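The rescaling can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's code: it uses $e_{1}(t)=\sqrt{6}\,t(1-t)$ and $I_{1}=\sqrt{6}\,H_{0,1}$ from Remark 3.18 to write the parabola on $[s,t]$ as $W_{s,t}\,\tau+6H_{s,t}\,\tau(1-\tau)$ with $\tau=\frac{u-s}{h}$, and it assumes the normalization $\int_{s}^{t}W_{s,u}\,du=h\big(\frac{1}{2}W_{s,t}+H_{s,t}\big)$ for the space-time Lévy area. The function names are hypothetical.

```python
import numpy as np


def sample_increment_and_area(h: float, rng: np.random.Generator):
    """Draw independent W_{s,t} ~ N(0, h) and H_{s,t} ~ N(0, h/12)."""
    w = rng.normal(0.0, np.sqrt(h))
    area = rng.normal(0.0, np.sqrt(h / 12.0))
    return w, area


def brownian_parabola(u, s: float, h: float, w: float, area: float):
    """Evaluate the parabola on [s, s + h], measured from the path value at s."""
    tau = (np.asarray(u, dtype=float) - s) / h
    return w * tau + 6.0 * area * tau * (1.0 - tau)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    h = 0.25
    w, area = sample_increment_and_area(h, rng)
    grid = np.linspace(0.0, h, 5)
    print(brownian_parabola(grid, 0.0, h, w, area))
```

By construction the parabola starts at zero, ends at the increment, and has time integral $h\big(\frac{1}{2}W_{s,t}+H_{s,t}\big)$ over the interval.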

Using these new definitions, we can study the high order integrals appearing in (22).

Theorem 3.19 (Conditional expectation of a non-trivial Brownian time integral).

[TABLE]

Proof 3.20.

By the natural Brownian scaling it is enough to prove the result on $[\hskip 0.28453pt0,1]$ . Recall that $W=\wideparen{W}+Z$ where the parabola $\wideparen{W}$ is completely determined by $\left(W_{1},H_{1}\right)$ and $Z$ is independent of $\left(W_{1},H_{1}\right)$ . This leads to a decomposition for the LHS of (24).

[TABLE]

The result now follows by evaluating the above integrals.
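The evaluation of these integrals on $[0,1]$ can be reproduced with exact rational arithmetic. The sketch below is a worked check under assumptions that are derived here rather than quoted from (24): the parabola is taken to be $W_{1}u+6H_{1}u(1-u)$, and the arch variance is taken to be $u(1-u)-\lambda_{1}e_{1}(u)^{2}=u(1-u)-3u^{2}(1-u)^{2}$ with $\lambda_{1}=\frac{1}{2}$ (the value implied by Remark 3.18). Cross terms between the parabola and the arch vanish because $Z$ is centered and independent of $(W_{1},H_{1})$.

```python
from fractions import Fraction


def integrate01(coeffs):
    """Exact integral over [0, 1] of sum_m coeffs[m] * u**m."""
    return sum(Fraction(c) / (m + 1) for m, c in enumerate(coeffs))


def polymul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += Fraction(a) * Fraction(b)
    return out


lin = [0, 1]        # u, the coefficient of W_1 in the parabola
bump = [0, 6, -6]   # 6u(1 - u), the coefficient of H_1 in the parabola

coef_WW = integrate01(polymul(lin, lin))        # multiplies W_1^2
coef_WH = 2 * integrate01(polymul(lin, bump))   # multiplies W_1 * H_1
coef_HH = integrate01(polymul(bump, bump))      # multiplies H_1^2

bump_sq = polymul(bump, bump)                   # 36 u^2 (1 - u)^2
bridge_var = [0, 1, -1, 0, 0]                   # u(1 - u), padded to degree 4
arch_var = [Fraction(b) - c / 12 for b, c in zip(bridge_var, bump_sq)]
const = integrate01(arch_var)                   # integral of Var(Z_u)

print(coef_WW, coef_WH, coef_HH, const)  # 1/3 1 6/5 1/15
```

Under these assumptions the conditional expectation of $\int_{0}^{1}W_{u}^{2}\,du$ given $(W_{1},H_{1})$ is $\frac{1}{3}W_{1}^{2}+W_{1}H_{1}+\frac{6}{5}H_{1}^{2}+\frac{1}{15}$, and Brownian scaling then attaches the appropriate powers of $h$ on a general interval.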

The above theorem has practical applications for SDE simulation as $W_{s,t}$ and $H_{s,t}$ are independent Gaussian random variables and can be easily generated or approximated. That said, we should first discuss how the iterated integrals within (22) are connected.

Definition 3.21.

The space-space-time Lévy area of Brownian motion over an interval $[s,t]$ is defined as

[TABLE]

We can interpret $L_{s,t}$ as an area between the processes $\{W_{s,u}\}_{u\in[s,t]}$ and $\{H_{s,u}\}_{u\in[s,t]}$ . Moreover, rough path theory provides an algebraic structure (called the log-signature) that relates $(W_{s,t},H_{s,t},L_{s,t})$ to the iterated integrals of space-time Brownian motion and ultimately to SDE solutions via the log-ODE method (see [23] for an overview). For our purposes, it is enough to give formulae relating these Lévy areas to integrals.

Theorem 3.22.

Let $H_{s,t}$ and $L_{s,t}$ denote the Lévy areas of Brownian motion given by definitions 17 and 3.21 respectively. Then the following integral relationships hold,

[TABLE]

Proof 3.23.

The result follows from numerous applications of integration by parts.

We can now present the new unbiased estimator for third order iterated integrals of Brownian motion and time. The proposed estimator is fast to compute and the best $L^{2}(\mathbb{P})$ approximation of these integrals that is measurable with respect to $\left(W_{s,t},H_{s,t}\right)$ .

Theorem 3.24 (Conditional moments of Brownian space-space-time Lévy area).

[TABLE]

Proof 3.25.

The expectation (25) is simply a consequence of Theorems 3.19 and 3.22. Without loss of generality, we will consider the above conditional variance on $[\hskip 0.28453pt0,1]$ . Since $\wideparen{W}$ is determined using the increment $W_{1}$ and space-time Lévy area $H_{1}$ , we have

[TABLE]

Recall $Z=\sum\limits_{k=2}^{\infty}I_{k}\hskip 0.7113pte_{k}$ where $\left\{I_{k}\right\}$ are independent centered Gaussian random variables.

In particular, this means that $Z$ and $-Z$ have the same law. Therefore, we have that

[TABLE]

The remaining two terms were resolved with assistance from Wolfram Mathematica.

[TABLE]

By Theorem 3.22, the above gives an explicit formula for the conditional variance (26).

[TABLE]

By the natural Brownian scaling, the result on the interval $[s,t]$ directly follows.

Remark 3.26.

The conditional variance (26) allows one to estimate local $L^{2}(\mathbb{P})$ errors for certain numerical methods and thus may be useful when choosing step sizes.

Therefore in order to propagate a numerical solution of (21) over an interval $[s,t]$ , one can generate $\left(W_{s,t},H_{s,t}\right)$ exactly and then approximate $L_{s,t}$ using Theorem 3.24. However, there are many numerical methods that could be used to solve a given SDE.

3.1 Examples of ODE methods

We will consider the following two methods:

Definition 3.27 (High order log-ODE method).

For a fixed number of steps $N$ we can construct a numerical solution $\left\{Y_{k}\right\}_{0\leq k\leq N}$ of (21) by setting $Y_{0}:=\xi$ and for each $k\in\{0,\ldots,N-1\}$, defining $Y_{k+1}$ to be the solution at $u=1$ of the following ODE:

[TABLE]

where $h:=\frac{T}{N}$ , $t_{k}:=kh$ and $[\,\cdot\hskip 1.42262pt,\cdot\,]$ denotes the standard Lie bracket of vector fields.

Definition 3.28 (The parabola-ODE method).

For a fixed number of steps $N$ we can construct a numerical solution $\left\{Y_{k}\right\}_{0\leq k\leq N}$ of (21) by setting $Y_{0}:=\xi$ and for each $k\in\{0,\ldots,N-1\}$, defining $Y_{k+1}$ to be the solution at $u=1$ of the following ODE:

[TABLE]

where $h:=\frac{T}{N}$ and $t_{k}:=kh$ .
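A single step of the parabola-ODE method can be sketched for a scalar SDE as follows. This is a minimal illustration and makes two assumptions: the ODE is taken to be $\frac{dz}{du}=f_{0}(z)\,h+f_{1}(z)\big(W_{t_{k},t_{k+1}}+(6-12u)\,H_{t_{k},t_{k+1}}\big)$ with $z_{0}=Y_{k}$, which is what one obtains by differentiating the parabola $Wu+6Hu(1-u)$ in $u$, and the classical fourth order Runge-Kutta scheme is swapped in as the ODE solver. The name `parabola_ode_step` and the `substeps` parameter are hypothetical.

```python
import math
from typing import Callable


def parabola_ode_step(y: float, h: float, w: float, area: float,
                      f0: Callable[[float], float],
                      f1: Callable[[float], float],
                      substeps: int = 8) -> float:
    """Advance Y_k to Y_{k+1} by integrating dz/du over u in [0, 1] with RK4."""

    def rhs(u: float, z: float) -> float:
        # w + (6 - 12u) * area is the u-derivative of w*u + 6*area*u*(1 - u).
        return f0(z) * h + f1(z) * (w + (6.0 - 12.0 * u) * area)

    du = 1.0 / substeps
    z, u = y, 0.0
    for _ in range(substeps):
        k1 = rhs(u, z)
        k2 = rhs(u + 0.5 * du, z + 0.5 * du * k1)
        k3 = rhs(u + 0.5 * du, z + 0.5 * du * k2)
        k4 = rhs(u + du, z + du * k3)
        z += du * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        u += du
    return z


if __name__ == "__main__":
    # Driftless geometric case: the H-term integrates to zero over [0, 1].
    out = parabola_ode_step(1.0, 0.01, 0.2, 0.05, lambda z: 0.0, lambda z: z)
    print(out, math.exp(0.2))
```

Only evaluations of $f_{0}$ and $f_{1}$ are needed, which reflects the practical point that this method avoids explicit vector field derivatives.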

In both numerical methods the true solution $y$ at time $t_{k}$ can be approximated by $Y_{k}$ . Whilst there are different ways of interpolating between the successive approximations $Y_{k}$ and $Y_{k+1}$ , for this paper we will simply interpolate between such points linearly. To analyse the above methods, we shall first note the key differences between them. The first important distinction between the two methods is a purely practical one. Although both of these methods involve computing a numerical solution of an ODE, the parabola method does not require one to explicitly resolve vector field derivatives. The second significant difference can be seen in the Taylor expansions of the methods.

Theorem 3.29.

Let $Y^{\text{log}}$ be the one-step approximation defined by the log-ODE method on the interval $[s,t]$ with initial value $Y_{0}^{\text{log}}=y_{s}$ . Then for sufficiently small $h$

[TABLE]

Similarly, let $Y^{\text{para}}$ denote the one-step approximation given by the parabola-ODE method on the interval $[s,t]$ with the same initial value. Then for sufficiently small $h$

[TABLE]

Note that $O(h^{\frac{5}{2}})$ denotes terms which can be estimated in an $L^{2}(\mathbb{P})$ sense as in (23).

Proof 3.30.

In order to derive (29), we must compute the Taylor expansion of (27). Let $F$ denote the vector field defined in (27) that was constructed from $f_{0}$ and $f_{1}$ . Then $F$ is smooth, and it follows from the classical Taylor’s theorem for ODEs that

[TABLE]

We shall first consider the remainder term, which can be directly estimated as follows:

[TABLE]

One can define the degree of each term in the above Taylor expansion by counting the number of times functions from $\{F,F^{\prime},F^{\prime\prime},\cdots\}$ appear. Therefore, after expanding the fifth derivative of $Y^{\text{log}}$ we can see that the remainder term has a degree of five. Since the largest component of $F$ is $f_{1}(\cdot)\,W_{s,t}$, both $F$ and its derivatives are $O(h^{\frac{1}{2}})$. Hence the remainder term in the above Taylor expansion will be $O(h^{\frac{5}{2}})$ as in (23). Moreover, the only terms of degree four that are not $O(h^{\frac{5}{2}})$ are those involving $W_{s,t}^{4}$.

It is now enough to analyse just the terms appearing in the first line of the expansion. By substituting the formula for $F$ given by (27) into the first line and then rearranging the resulting terms, we can obtain a Taylor expansion for $Y_{1}^{\text{log}}$ that resembles (22) as

[TABLE]

Therefore, by summing the above formulae for $F$ (and its derivatives) we can derive an expansion of $Y_{1}^{\text{log}}$ in terms of $f_{0},f_{1}$ and $\big{(}h,W_{s,t},H_{s,t}\big{)}$ that has an $O\big{(}h^{\frac{5}{2}}\big{)}$ remainder. By comparing this with the stochastic Taylor expansion (22), the result (29) follows.

Arguing (30) is fairly straightforward and does not require extensive computations. Using the substitution $\wideparen{Y}_{u}=z_{\frac{1}{h}(u-s)}$ for $u\in[s,t]$ , the ODE (28) can be rewritten as

[TABLE]

where $\wideparen{W}$ denotes the Brownian parabola defined by $(W_{s,t},H_{s,t})$ on the interval $[s,t]$ .

By emulating the derivation of the Stratonovich-Taylor expansion (22), it is possible to Taylor expand (31) in the same fashion. The only difference is that Stratonovich integrals with respect to $W$ are replaced with Riemann-Stieltjes integrals against $\wideparen{W}$ .

In particular, by the change-of-variable formula for ODEs (exercise 3.17) given in [24], we see that the remainder term of such a Taylor expansion will have the below form:

[TABLE]

where we have identified an additional “zero” coordinate of $\wideparen{W}$ with time, $\wideparen{W}_{t}^{0}:=t$, and for each index $(i_{1},\cdots,i_{n})$, the function $f_{i_{1},\cdots,i_{n}}:\mathbb{R}^{e}\rightarrow\mathbb{R}^{e}$ consists of finitely many compositions of $f_{0},f_{1}$ along with their derivatives (and thus is Lipschitz continuous).

Therefore each term in the expansion of (31) can be estimated in $L^{2}(\mathbb{P})$ by applying the natural Brownian scaling to the corresponding iterated integral of $\wideparen{W}$ with time. As before, the largest differences are the $O(h^{2})$ terms involving third order integrals. Fortunately, iterated integrals of the Brownian parabola can be computed explicitly:

[TABLE]

The result (30) is now a direct consequence of Theorem 3.22 along with the above.

Theorem 3.29 shows that both methods give a one-step approximation error of $O(h^{2})$ . This means that the log-ODE and parabola-ODE methods are both locally high order; however there is a significant difference in how these methods propagate local errors. The reason is that the $O(h^{2})$ components of the log-ODE local errors give a martingale, whilst the $O(h^{2})$ part for each parabola-ODE local error has non-zero expectation. Thus the log-ODE method is globally high order whilst the parabola method is not. However, since the parabola-ODE method is straightforward to implement and locally high order, one could expect it to perform well compared to other low order methods. In the numerical example, we shall see that the parabola method has the same order of convergence as the piecewise linear approach but gives significantly smaller errors. To conclude this section, we will present the orders of convergence for both methods.

Definition 3.31 (Strong convergence).

A numerical solution $Y$ for (21) is said to converge in a strong sense with order $\alpha$ if there exists a constant $C>0$ such that

[TABLE]

for all sufficiently small step sizes $h=\frac{T}{N}$ .

Definition 3.32 (Weak convergence).

A numerical solution $Y$ for (21) is said to converge in a weak sense with order $\beta$ if for any polynomial $p$ there exists $C_{p}>0$ such that

[TABLE]

for all sufficiently small step sizes $h=\frac{T}{N}$ .

Theorem 3.33 (Orders of convergence).

For a general SDE (21), the log-ODE method converges in a strong sense with order 1.5 and a weak sense with order 2.0. The parabola-ODE method converges in both a strong and weak sense with order 1.0.

Proof 3.34.

Note that Theorem 3.29 establishes the Taylor expansions of both methods. The strong convergence can then be shown as in the proof of Theorem 11.5.1 in [22]. Moreover, the proof of Theorem 11.5.1 also provides the orders of strong convergence. Similarly weak convergence follows directly from the Taylor expansions (29) & (30), and the rate of convergence can be shown as in the proof of Theorem 14.5.2 in [22].

4 A numerical example

We shall demonstrate the ideas presented so far using various discretizations of Inhomogeneous Geometric Brownian Motion (IGBM)

$$dy_{t}=a(b-y_{t})\,dt+\sigma\,y_{t}\,dW_{t},\qquad(32)$$

where $a\geq 0$ and $b\in\mathbb{R}$ are the mean reversion parameters and $\sigma\geq 0$ is the volatility. As the vector fields are smooth, the SDE (32) can be expressed in Stratonovich form:

$$dy_{t}=\tilde{a}\big(\tilde{b}-y_{t}\big)\,dt+\sigma\,y_{t}\circ dW_{t},\qquad(33)$$

where $\tilde{a}:=a+\frac{1}{2}\sigma^{2}$ and $\tilde{b}:=\frac{2ab}{2a+\sigma^{2}}$ denote the “adjusted” mean reversion parameters.

IGBM is an example of a one-factor short rate model and has seen recent attention in the mathematical finance literature as an alternative to popular models [16, 17]. IGBM is also one of the simplest SDEs that has no known method of exact simulation. We will investigate the strong and weak convergence rates of the following methods:

1. Log-ODE method (see definition 3.27)

Since the vector fields of (33) give constant Lie brackets, this method becomes

[TABLE]

2. Parabola-ODE method (see definition 3.28)

As the SDE (33) is quite analytically tractable, this method is expressible as

[TABLE]

The integral above will be computed by $3$-point Gauss-Legendre quadrature.

3. Piecewise linear method (see [25] for definition and proof of convergence)

Just as above, this method can be simplified to give a straightforward formula.

[TABLE]

4. Milstein method (see section 6 of [2] and section 10.3 of [22] for overviews)

For this method, we shall take the positive part to guarantee non-negativity.

[TABLE]

5. Euler-Maruyama method (see sections 4, 5 of [2] and section 10.2 of [22])

Just as above, we take the positive part of each step to ensure non-negativity.

[TABLE]

Note that the explicit formula for the log-ODE method comes from the Lie brackets:

[TABLE]

and the formula for the parabola-ODE method was derived using a change of variable.
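For concreteness, one step of the first two methods can be sketched in Python. The closed forms below are reconstructions, not transcriptions of the displayed formulae: they come from solving the linear ODEs that arise for IGBM, assuming $[f_{1},f_{0}]=-\tilde{a}\tilde{b}\sigma$ and $[f_{1},[f_{1},f_{0}]]=\tilde{a}\tilde{b}\sigma^{2}$ (sign conventions assumed), the estimate $\frac{3}{5}hH^{2}+\frac{1}{30}h^{2}$ for $L_{s,t}$, and the identity $\tilde{a}\tilde{b}=ab$. The function names are hypothetical.

```python
import math
import numpy as np


def igbm_log_ode_step(y, h, w, area, a, b, sigma):
    """One log-ODE step for IGBM with (w, area) = (W, H) over the step."""
    a_adj = a + 0.5 * sigma**2                    # adjusted mean reversion speed
    A = -a_adj * h + sigma * w
    phi = math.expm1(A) / A if A != 0.0 else 1.0  # (e^A - 1) / A
    level = 0.6 * area**2 + h / 30.0              # assumed estimate of L / h
    forcing = a * b * h * (1.0 - sigma * area + sigma**2 * level)
    return y * math.exp(A) + forcing * phi


def igbm_parabola_step(y, h, w, area, a, b, sigma):
    """One parabola-ODE step for IGBM using 3-point Gauss-Legendre quadrature."""
    a_adj = a + 0.5 * sigma**2
    x, wts = np.polynomial.legendre.leggauss(3)
    s, wts = 0.5 * (x + 1.0), 0.5 * wts           # nodes and weights on [0, 1]
    parabola = w * s + 6.0 * area * s * (1.0 - s)
    integral = float(np.sum(wts * np.exp(a_adj * h * s - sigma * parabola)))
    return math.exp(-a_adj * h + sigma * w) * (y + a * b * h * integral)


if __name__ == "__main__":
    args = dict(h=0.01, w=0.1, area=0.03, a=0.1, b=0.04, sigma=0.6)
    print(igbm_log_ode_step(0.06, **args), igbm_parabola_step(0.06, **args))
```

With $\sigma=0$ both functions reduce to the exact solution $b+(y-b)e^{-ah}$ of the mean-reverting ODE, which gives a quick sanity check of the implementation.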

The Euler-Maruyama and Milstein methods are included in the numerical experiment as benchmarks to test how the proposed methods compare to well-known methods. As before, we will be discretizing the SDE over a uniform partition with mesh size $h$ .

Below is the definition of the error estimators used to analyse the numerical methods.

Definition 4.35 (Strong and weak error estimators).

For each $N\geq 1$ , let $Y_{N}$ denote a numerical solution of (32) computed at time $T$ using a fixed step size $h=\frac{T}{N}$ . We can define the following estimators for quantifying strong and weak convergence:

[TABLE]

where the above expectations are approximated by standard Monte-Carlo simulation and $Y_{T}^{fine}$ is the numerical solution of (32) obtained at time $T$ using the log-ODE method with a “fine” step size of $\min\left(\frac{h}{10},\frac{T}{1000}\right)$. The fine step size is chosen so that the $L^{2}(\mathbb{P})$ error between $Y_{T}^{fine}$ and the true solution $y$ is negligible compared to $S_{N}$. Note that $Y_{N}$ and $Y_{T}^{fine}$ are both computed with respect to the same Brownian paths.
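Computing both solutions along the same Brownian path requires the coarse pair $(W_{s,t},H_{s,t})$ to be consistent with the fine ones. One way to arrange this is to generate the fine increments and areas first and then aggregate them, as in the sketch below. It assumes the normalization $\int_{s}^{t}W_{s,u}\,du=h\big(\frac{1}{2}W_{s,t}+H_{s,t}\big)$ on every interval; the function name `coarsen` is hypothetical and this is not taken from the linked repository.

```python
import numpy as np


def coarsen(w_fine: np.ndarray, area_fine: np.ndarray, dt: float):
    """Combine consecutive fine-step (W, H) pairs into one coarse-step pair."""
    m = len(w_fine)
    h = m * dt
    w_total = float(np.sum(w_fine))
    # Path value at the start of each fine interval, relative to the coarse start.
    w_start = np.concatenate(([0.0], np.cumsum(w_fine)[:-1]))
    # Time integral of W_{s,u} over the coarse interval, assembled piece by piece.
    integral = float(np.sum(dt * (w_start + 0.5 * w_fine + area_fine)))
    return w_total, integral / h - 0.5 * w_total


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dt, m = 0.001, 10
    w_fine = rng.normal(0.0, np.sqrt(dt), size=m)
    area_fine = rng.normal(0.0, np.sqrt(dt / 12.0), size=m)
    print(coarsen(w_fine, area_fine, dt))
```

For two equal sub-intervals this reduces to $H=\frac{1}{2}(H_{1}+H_{2})+\frac{1}{4}(W_{1}-W_{2})$, which has variance $\frac{h}{12}$ and is uncorrelated with $W_{1}+W_{2}$, as required.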

In this numerical example, we shall use the same parameter values as in [16], namely $a=0.1$ , $b=0.04$ , $\sigma=0.6$ and $y_{0}=0.06$ . We will also fix the time horizon at $T=5$ .

We will now present our results for the numerical experiment that is described above. (Code for this example can be found at github.com/james-m-foster/igbm-simulation)

From the above graph we see that the log-ODE method is by far the most accurate. This is epitomized by the fact that the numerical error produced by $100$ steps of the log-ODE method is comparable to the error of the parabola method with $1000$ steps. In addition, whilst there are three methods that share the same order of convergence it is evident there are magnitudes of difference between their respective accuracies. For example, the parabola method is seven times more accurate than piecewise linear. As one might expect, the Euler-Maruyama and Milstein schemes both perform poorly.

Nevertheless, in order to truly measure the performance of these numerical methods, we should consider the computational costs required for achieving a specified accuracy.

The above graph demonstrates that the log-ODE method is especially well-suited for weak approximation as it achieves a second order convergence rate in this example. Surprisingly, the middle three methods exhibit almost identical convergence profiles. As before, we can estimate the computational time needed to achieve given accuracies.

We expect the log-ODE and parabola methods to have about twice the computational cost of the other methods because each step requires generating two random variables. Table 3 confirms this, and thus sampling may be a bottleneck for these methods. Overall, the numerical evidence supports our claim that the high order log-ODE method is currently a state-of-the-art method for the pathwise discretization of IGBM.

5 Conclusion

There are primarily three new results established in this paper:

$\bullet$ An efficient strong polynomial approximation of Brownian motion

The main result allows one to construct a “smoother” Brownian motion as a finite sum of $(\text{-}1,\text{-}1)$-Jacobi polynomials with independent Gaussian weights. Moreover, it was shown that the approximation is optimal in a weighted $L^{2}(\mathbb{P})$ sense and the surrounding noise is an independent centered Gaussian process.

$\bullet$ Unbiased approximation of third order Brownian iterated integrals

Iterated integrals of Brownian motion and time are important objects in the study of SDEs as they appear naturally within stochastic Taylor expansions. We have derived the $L^{2}(\mathbb{P})$-optimal estimator for a class of such integrals that is measurable with respect to the path’s increment and space-time Lévy area.

$\bullet$ Simulation of Inhomogeneous Geometric Brownian Motion (IGBM)

IGBM is a mean-reverting short rate model used in mathematical finance and also one of the simplest SDEs that has no known method of exact simulation. By incorporating the new iterated integral estimator into the log-ODE method we have developed a high order state-of-the-art numerical method for IGBM.

Furthermore, the results of this paper naturally lead to the following open questions:

$\bullet$ Which weight functions give “explicit eigenfunctions” for Brownian motion? (For example, we could try $w(x)=x$ or $w(x)=\frac{1}{x}$ with $K_{W}(s,t)=\min(s,t)$)

$\bullet$ Is it possible to generalize the main theorem to fractional Brownian motion?

$\bullet$ What are the most efficient Runge-Kutta methods for general one-dimensional SDEs that correctly use the new estimator for third order iterated integrals?

$\bullet$ Is this polynomial expansion optimal for approximating Lévy area? (see [13])

$\bullet$ Which conditional moments can be computed for a given stochastic integral?

$\bullet$ How might we construct a piecewise linear path $\gamma$ with the below properties?

[TABLE]

$\bullet$ Would this method of construction lead to effective cubature paths? (see [26])

Given such a path, we can approximate (21) with a “piecewise linear” ODE.

[TABLE]

(Along each piece of $\gamma$, we would discretize (36) using an appropriate solver)

$\bullet$ How effective is the above piecewise linear ODE method for simulating SDEs?

$\bullet$ Can we extend the approximations given in this paper to the SPDE setting?

Bibliography

[1] R. M. Mazo, Brownian Motion: Fluctuations, Dynamics and Applications, Clarendon Press, Oxford, 2002.
[2] D. J. Higham, An algorithmic introduction to numerical simulation of stochastic differential equations, SIAM Review, Volume 43, 2001.
[3] J. M. C. Clark and R. J. Cameron, The maximum rate of convergence of discrete approximations for stochastic differential equations, Stochastic Differential Systems Filtering and Control, Volume 25 in Lecture Notes in Control and Information Sciences, Springer, 1980.
[4] D. S. Grebenkov, D. Belyaev and P. W. Jones, A multiscale guide to Brownian motion, Journal of Physics A, Volume 49, 2015.
[5] K. Habermann, A semicircle law and decorrelation phenomena for iterated Kolmogorov loops, https://arxiv.org/abs/1904.11484, 2019.
[6] F. Castell and J. G. Gaines, The ordinary differential equation approach to asymptotically efficient schemes for solution of stochastic differential equations, Annales de l’Institut Henri Poincaré, Volume 32, 1996.
[7] S. J. A. Malham and A. Wiese, Stochastic Lie group integrators, SIAM Journal on Scientific Computing, Volume 30, 2008.
[8] J. M. C. Clark, An efficient approximation for a class of stochastic differential equations, Advances in Filtering and Optimal Stochastic Control, Proceedings of the IFIP-WG 7/1 Working Conference, Cocoyoc, Mexico, 1982. Lecture Notes in Control and Information Sciences, Volume 42, Berlin, 1982.
