Controlling the time discretization bias for the supremum of Brownian Motion

Krzysztof Bisewski; Daan Crommelin; Michel Mandjes

arXiv:1705.06567·math.PR·April 9, 2019·ACM Trans. Model. Comput. Simul.

TL;DR

This paper investigates the bias from time discretization in estimating Brownian motion crossing probabilities, proposing threshold-dependent grids that improve efficiency and reduce bias compared to equidistant grids.

Contribution

It introduces a novel threshold-dependent discretization method that reduces the number of grid points needed, making bias correction more efficient and broadly applicable.

Findings

01

Threshold-dependent grids require fewer points, independent of threshold b.

02

The proposed algorithm is strongly efficient for estimating crossing probabilities.

03

Empirical results show significant performance improvements over equidistant grids.

Abstract

We consider the bias arising from time discretization when estimating the threshold crossing probability $w (b) := P (sup_{t \in [0, 1]} B_{t} > b)$ , with $(B_{t})_{t \in [0, 1]}$ a standard Brownian Motion. We prove that if the discretization is equidistant, then to reach a given target value of the relative bias, the number of grid points has to grow quadratically in $b$ , as $b$ grows. When considering non-equidistant discretizations (with threshold-dependent grid points), we can substantially improve on this: we show that for such grids the required number of grid points is independent of $b$ , and in addition we point out how they can be used to construct a strongly efficient algorithm for the estimation of $w (b)$ . Finally, we show how to apply the resulting algorithm for a broad class of stochastic processes; it is empirically shown that the threshold-dependent grid significantly…

Equations

$$w(b):=\mathbb{P}\Big(\sup_{t\in[0,1]}X_{t}>b\Big).$$

$$\Delta(T)=\sup_{t\in[0,1]}X_{t}-\sup_{t\in T}X_{t}.$$

$$w(b):=\mathbb{P}\Big(\sup_{t\in[0,1]}B_{t}>b\Big).$$

$$w_{T}(b):=\mathbb{P}\Big(\sup_{t\in T}B_{t}>b\Big),$$

$$\beta_{T}(b):=\frac{w(b)-w_{T}(b)}{w(b)}=\mathbb{P}\Big(\sup_{t\in T}B_{t}<b\ \Big|\ \sup_{t\in[0,1]}B_{t}>b\Big)$$

$$\beta_{T}(b)=1-\frac{w_{T}(b)}{w(b)}\leq 1-\frac{\mathbb{P}\big(B_{1}>b\big)}{2\,\mathbb{P}\big(B_{1}>b\big)}=\frac{1}{2}.$$

$$C_{1}n^{-1/2}\leq\beta_{n}(b)\leq C_{2}n^{-1/2}.$$

$$\beta_{n}(b)\leq C_{0}\cdot b\,n^{-1/2},$$

$$\beta_{n}(b)\leq C_{1}\cdot n^{-1/2},$$

$$\inf_{n\leq m(b)}\beta_{n}(b)\longrightarrow\frac{1}{2}.$$

$a_{j}(b)$, $w_{j}(b)$, $\tau_{b}$

$$\underline{\beta}_{T}(b)\leq\beta_{T}(b)\leq\bar{\beta}_{T}(b)$$

$$\underline{\beta}_{T}(b):=\frac{1}{2}\sum_{j=1}^{n}a_{j+1}(b)\,w_{j}(b),\qquad\bar{\beta}_{T}(b):=\sum_{j=1}^{n}a_{j}(b)\,w_{j}(b).$$

$$C^{*}_{1}n^{-1/2}\leq\mathbb{P}\Big(B_{1}>0,\ldots,B_{n}>0\Big)\leq C^{*}_{2}n^{-1/2}$$

$$a_{1}\cdot w_{1}(b)\leq C_{2}^{*}n^{-1/2}$$

$$\begin{aligned}\sum_{j=2}^{n-1}a_{j}\cdot w_{j}(b)&\leq C_{1}\cdot b\,n^{-1/2}\cdot\sum_{j=2}^{n-1}\frac{1}{n}\cdot\frac{b}{\sqrt{1-\frac{j}{n}}}\cdot\Big(\frac{j}{n}\Big)^{-3/2}e^{-\frac{b^{2}}{2}\cdot(\frac{n}{j}-1)}\\&\leq C_{1}\cdot b\,n^{-1/2}\cdot\int_{0}^{1}\frac{b}{\sqrt{1-x}}\cdot x^{-3/2}\cdot e^{-\frac{b^{2}}{2}(1/x-1)}\,dx\\&\leq C_{1}\cdot b\,n^{-1/2},\end{aligned}$$

$$f(b,x):=\frac{b}{\sqrt{1-x}}\,x^{-3/2}e^{-\frac{b^{2}}{2}(1/x-1)};$$

$$a_{n}\cdot w_{n}(b)\leq\frac{1}{2}\cdot\frac{b\,(b+\sqrt{b^{2}+4})}{4}\cdot\frac{\sqrt{n}}{(n-1)^{3/2}}\leq C_{2}\frac{b^{2}}{n},$$

$$a_{n}\cdot w_{n}(b)\leq\min\Big\{C_{2}\frac{b^{2}}{n},\frac{1}{2}\Big\}\leq C_{2}\,b\,n^{-1/2}.$$

$$\beta_{n}(b)\leq C_{0}\,b\,n^{-1/2},$$

$$\begin{aligned}w(b)\,\beta_{T}(b)&=\sum_{j=1}^{n}\mathbb{P}\Big(\sup_{t\in\{t_{j},\ldots,t_{n}\}}B_{t}<b,\ \tau_{b}\in(t_{j-1},t_{j}]\Big)\geq\mathbb{P}\Big(B_{t_{n}}<b,\ \tau_{b}\in(t_{n-1},t_{n}]\Big)\\&=\int_{t_{n-1}}^{t_{n}}\mathbb{P}\big(B_{t_{n}}<b\mid B_{s}=b\big)\,\mathbb{P}(\tau_{b}\in ds)=\frac{1}{2}\,\mathbb{P}\big(\tau_{b}\in(t_{n-1},t_{n})\big)\end{aligned}$$

$$\inf_{n\leq m(b)}\beta_{n}(b)\geq\frac{1}{2}-\frac{1}{2}\,\frac{\Phi\big(-b/\sqrt{(m-1)/m}\big)}{\Phi(-b)}.$$

$$\begin{aligned}\lim_{b\to\infty}\inf_{n\leq m(b)}\beta_{n}(b)&\geq\frac{1}{2}-\frac{1}{2}\lim_{b\to\infty}\frac{\frac{\sqrt{(m-1)/m}}{b}\,\phi\big(b/\sqrt{(m-1)/m}\big)}{\frac{1}{b}\,\phi(b)}\\&=\frac{1}{2}-\frac{1}{2}\lim_{b\to\infty}e^{-b^{2}/(2(m-1))}=\frac{1}{2}\end{aligned}$$

Full text

Controlling the time discretization bias

for the supremum of Brownian Motion

Krzysztof Bisewski (Email: [email protected])

Centrum Wiskunde & Informatica, Amsterdam

Daan Crommelin

Centrum Wiskunde & Informatica, Amsterdam

Korteweg de Vries Institute for Mathematics, University of Amsterdam

Michel Mandjes

Korteweg de Vries Institute for Mathematics, University of Amsterdam

(September 13, 2017)

Abstract

We consider the bias arising from time discretization when estimating the threshold crossing probability $w(b):=\mathbb{P}(\sup_{t\in[0,1]}B_{t}>b)$ , with $(B_{t})_{t\in[0,1]}$ a standard Brownian Motion. We prove that if the discretization is equidistant, then to reach a given target value of the relative bias, the number of grid points has to grow quadratically in $b$ , as $b$ grows. When considering non-equidistant discretizations (with threshold-dependent grid points), we can substantially improve on this: we show that for such grids the required number of grid points is independent of $b$ , and in addition we point out how they can be used to construct a strongly efficient algorithm for the estimation of $w(b)$ . Finally, we show how to apply the resulting algorithm for a broad class of stochastic processes; it is empirically shown that the threshold-dependent grid significantly outperforms its equidistant counterpart.

Submitted on March 17, 2017

1 Introduction

Extreme values of random processes play a prominent role in a broad range of practical problems. It is often of interest to find the tail of the distribution of the supremum of a continuous-time stochastic process $(X_{t})_{t\geq 0}$ over a finite time interval. In this paper the focus is on the level crossing probability

$$w(b):=\mathbb{P}\Big(\sup_{t\in[0,1]}X_{t}>b\Big).$$

For many classes of processes, such as the Gaussian processes [Adler, 1990], typically no explicit expressions for $w(b)$ are available, with Brownian Motion and the Ornstein-Uhlenbeck process being notable exceptions. When an explicit expression for $w(b)$ is unavailable one usually resorts to using high-dimensional numerical integration and simulation-based methods, see e.g. [Genz and Bretz, 2009] for further reading.

For most of the available numerical methods, the underlying continuous-time process needs to be discretized in time. One chooses a finite grid $T\subset[0,1]$ and then approximates $w(b)$ with $w_{T}(b):=\mathbb{P}\big(\sup_{t\in T}X_{t}>b\big)$. This always leads to an underestimation, i.e., $w_{T}(b)\leq w(b)$. We quantify this underestimation by $\beta_{T}(b):=(w(b)-w_{T}(b))/w(b)$, the relative discretization bias. (As $b\to\infty$, both $w_{T}(b)$ and $w(b)$ tend to $0$, so the absolute bias is not a meaningful accuracy measure.) Typically $T$ is chosen to be an equidistant grid $T=\{\frac{1}{n},\frac{2}{n},\ldots,1\}$, and in that case $\beta_{T}(b)$ can be reduced only by changing the grid size $n$. The finer the grid, the smaller the bias, but also the larger the computational effort to estimate $w_{T}(b)$. The main drawback of equidistant grids is that, to reach a given target value of the discretization bias, the grid size $n$ typically has to grow with the threshold $b$. For large $b$, the appropriate grid size can then become so large that the computation is not feasible. Two central questions arise from these observations: How fast does $n$ have to grow in $b$? And can we identify a more efficient family of grids?

In this paper we address these issues for standard Brownian Motion. Although in this case $w(b)$ can be computed explicitly, there are no available expressions for $\beta_{T}(b)$ . We conduct a thorough study of the influence of the choice of the grid on the corresponding relative bias. Furthermore, we argue that exploring the case of standard Brownian Motion is a first step towards finding efficient grids for a more general class of processes. We demonstrate numerically how our analysis of efficient grids for Brownian Motion leads to a useful procedure to determine efficient grids for a broad range of other processes.

The contributions of this paper are the following. (i) The first finding can be seen as a negative result: we show that to uniformly control the relative bias, the size $n$ of the equidistant grid must grow at least quadratically in $b$; see Theorem 1 in Section 3. (In this context, uniform control means that for a fixed $\varepsilon>0$ we have $\beta_{T}(b)<\varepsilon$ for all $b>0$; the grid $T$ can change with $b$.) (ii) The second finding is that we can do much better by using a threshold-dependent family of grids, meaning that grid points change their location with $b$ (but the number of points does not increase). The discretization bias induced by this particular family of grids is uniformly controlled without having to increase the number of grid points; see Theorem 2 in Section 4. To the best of the authors’ knowledge, this is the first result which shows that a careful choice of the grid can drastically increase the accuracy of the discrete estimator of $w(b)$. Using threshold-dependent grids makes it feasible to estimate $w(b)$ with moderate grid sizes even for very high thresholds $b$, which would be impossible using equidistant grids. In particular, in Section 5 we present a strongly efficient algorithm for the estimation of $w(b)$ that relies on threshold-dependent grids. (iii) Thirdly, we point out how the ideas underlying our threshold-dependent grid can be used for a broad class of stochastic processes (including Gaussian processes, such as fractional Brownian Motion, and Lévy processes); it is empirically shown that the threshold-dependent grid significantly outperforms its equidistant counterpart.

An efficient grid (both small in size and inducing a small discretization bias) is particularly relevant for situations with large $b$. In this respect, the work presented here connects to the rare event simulation literature. As $b$ approaches infinity, $w(b)$ decays exponentially to $0$, and standard simulation-based methods such as Crude Monte Carlo become extremely time consuming when used to estimate $w(b)$. We emphasize that rare event simulation methods commonly aim to control the sampling error, not the bias due to the discretization. [Adler et al., 2012] develop an algorithm that is strongly efficient (with bounded relative sampling error) for estimation of $w_{T}(b)$ (rather than $w(b)$). We will show that combining their algorithm with the use of threshold-dependent grids provides a strongly efficient algorithm for estimation of $w(b)$.

A topic closely related to ours concerns the quantification of the difference between the supremum of the stochastic process taken over $[0,1]$ and the supremum taken over a finite grid $T\subset[0,1]$ , i.e.

$$\Delta(T)=\sup_{t\in[0,1]}X_{t}-\sup_{t\in T}X_{t}.$$

There are several results in the literature that study the behavior of $\Delta(T)$ for standard Brownian Motion. [Asmussen et al., 1995] showed that for the equidistant grids $T^{\text{eq}}_{n}=\{\frac{1}{n},\ldots,\frac{n}{n}\}$, $\sqrt{n}\,\Delta(T^{\text{eq}}_{n})$ has a tight, non-degenerate weak limit as $n\to\infty$, and [Janssen and Van Leeuwaarden, 2009] derived an expansion for $\mathbb{E}\Delta(T^{\text{eq}}_{n})$. For random grids $T^{\text{rnd}}_{n}=\{U_{1},\ldots,U_{n}\}$, where $U_{1},\ldots,U_{n}$ are i.i.d. uniform samples on $(0,1)$, independent of the Brownian Motion $(X_{t})_{t\in[0,1]}$, [Calvin and Glynn, 1997] establish the weak limit of $\sqrt{n}\,\Delta(T^{\text{rnd}}_{n})$. Finally, [Calvin, 1997] proposed a class of adaptive grids, meaning that each consecutive grid point $t_{k+1}$ is chosen based on $((t_{1},B_{t_{1}}),\ldots,(t_{k},B_{t_{k}}))$; given any $\delta>0$, an adaptive grid $T^{\delta}_{n}=\{t_{1}^{\delta},\ldots,t_{n}^{\delta}\}$ is provided such that $n^{1-\delta/2}\Delta(T^{\delta}_{n})$ has a weak limit.

In our study we do not focus on the difference $\Delta(T)$ between the values of the maxima of the discrete and continuous-time Brownian Motion, but rather on $\beta_{T}(b)$, i.e., the relative difference between the probabilities that these maxima lie above a certain fixed threshold.

There are several approaches to tackle the discretization bias available in the literature. Arguably, the most widely applicable method is Multilevel Monte Carlo (MLMC) [Giles, 2008]. It can be applied together with any numerical method that relies on discretization. The idea is to use several different levels of discretization and spend less computational effort (draw fewer samples) at the finest levels of discretization. MLMC effectively reduces the computational effort, and the time saved can be used to produce even finer levels of discretization. It could be interesting to combine MLMC with threshold-dependent grids, but exploring this combination lies beyond the scope of this article.

One of the methods that aims to directly decrease the bias induced by equidistant grids is continuity correction. Since the discrete-time approximation $w_{T}(b)$ is always smaller than $w(b)$, one could slightly lower the threshold $b$ to compensate for the underestimation. [Broadie et al., 1997], using the machinery developed in [Siegmund, 1985], proposed a way of lowering the threshold which improves the rate of convergence of the relative bias from $O(n^{-1/2})$, cf. Proposition 1, to $O(n^{-1})$, as the number of grid points $n$ grows large. However, in the non-Brownian case, it remains a non-trivial problem to determine by how much $b$ should be decreased. In fact, there is no direct way of checking whether lowering $b$ decreases the absolute value of the relative bias, as lowering $b$ by too much leads to overcompensation and thus to an estimate that is larger than $w(b)$. By contrast, it is straightforward to compare the bias induced by two different grids: the larger the discrete estimator $w_{T}(b)$, the smaller the relative bias.

There are also several simulation-based algorithms that do not rely on pre-discretization. [Li and Liu, 2015] propose a strongly efficient algorithm for estimation of $w(b)$ for a large class of Gaussian processes (most prominently, processes with constant variance function). However, when the underlying process has a unique point of maximal variance (such as Brownian Motion), the algorithm requires the simulation of a random time $\tau\in[0,1]$ from a density $f(t)\propto\mathbb{P}(X_{t}>b)$ , which becomes a rare event simulation problem when $b$ is large. While for an arbitrary process, the random discretization proposed in the algorithm requires a computational effort cubic in the number of grid points (in order to simulate a discrete Gaussian path), pre-discretization requires only quadratic effort; see the discussion in Section 5.

This paper is organized as follows. Section 2 provides definitions and preliminaries, and develops a general intuition. In Section 3 we introduce useful upper and lower bounds for the discretization bias (see Lemma 1) and show that the number of points on the equidistant grid has to grow quadratically in the threshold $b$ in order to uniformly control the discretization bias. In Section 4, as an alternative to equidistant grids, we study threshold-dependent grids, which control the relative bias with a constant grid size, independently of $b$. The proofs of all lemmas and of Proposition 1 are postponed to Section 8. In Section 5 we present an algorithm by [Adler et al., 2012] that we use throughout the paper for producing the numerical results; combining this algorithm with the use of threshold-dependent grids yields a strongly efficient algorithm for estimation of $w(b)$, see Corollary 1. In Section 6 we apply the threshold-dependent grids developed in the previous sections to stochastic processes other than Brownian Motion: Brownian Motion with jumps, the Ornstein-Uhlenbeck process and fractional Brownian Motion. Lastly, in Section 7 we present concluding remarks and discuss some ideas for future research on optimal grids. In the appendices we collect various technical results used throughout the paper.

2 Preliminary results

Let $(B_{t})_{t\in[0,1]}$ be a standard Brownian Motion on the time interval $[0,1]$ with $B_{0}=0$ . We consider the probability of crossing a positive threshold $b$ , that is

$$w(b):=\mathbb{P}\Big(\sup_{t\in[0,1]}B_{t}>b\Big).\qquad(1)$$

For a standard Brownian Motion, an explicit formula for the threshold-crossing probability (1) is known, namely $w(b)=2\,\mathbb{P}(B_{1}>b)$ , which follows directly using the reflection principle (see e.g. [Mörters and Peres, 2010]). Given a finite grid $T$ we define a discrete-time approximation of $w(b)$ :

$$w_{T}(b):=\mathbb{P}\Big(\sup_{t\in T}B_{t}>b\Big),$$

where $T=\{t_{1},\ldots,t_{n}\}$ is a finite subset of the interval $[0,1]$ , ordered such that $t_{1}<\ldots<t_{n}$ . As we are mostly interested in choosing the grid $T$ efficiently, we define the following performance measure.

Definition 1.

Let $T$ be a finite grid on $[0,1]$ , then

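$$\beta_{T}(b):=\frac{w(b)-w_{T}(b)}{w(b)}=\mathbb{P}\Big(\sup_{t\in T}B_{t}<b\ \Big|\ \sup_{t\in[0,1]}B_{t}>b\Big)$$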

is called the relative bias induced by the grid $T$ .

The second representation of relative bias in Definition 1 is especially intuitive. It means that the relative bias is the probability that $B_{t}$ stays below $b$ on the grid $T$ , given that its supremum over $[0,1]$ is greater than $b$ . Notice that any grid which includes the endpoint $t=1$ will induce a relative bias no greater than $\frac{1}{2}$ . Indeed, if $1\in T$ , then $w_{T}(b)=\mathbb{P}(\sup_{t\in T}B_{t}>b)\geq\mathbb{P}(B_{1}>b)$ and thus

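$$\beta_{T}(b)=1-\frac{w_{T}(b)}{w(b)}\leq 1-\frac{\mathbb{P}\big(B_{1}>b\big)}{2\,\mathbb{P}\big(B_{1}>b\big)}=\frac{1}{2}.$$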

Our objective is to accurately estimate $w(b)$ using discrete approximations $w_{T}(b)$ , in a computationally efficient manner. Brownian Motion has continuous paths and thus it is always possible for a given $b$ to find a fine enough grid to bound the bias up to a desired accuracy. However, the computational cost of estimating $w_{T}(b)$ grows in the grid size and thus it might be infeasible to numerically compute $w_{T}(b)$ for large grids.
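As an illustration of this trade-off, the following minimal Python sketch estimates $w_{T}(b)$ by crude Monte Carlo on an equidistant grid and compares it with the exact value $w(b)=2\,\mathbb{P}(B_{1}>b)$. The function names, the sample size and the seed are illustrative assumptions and not part of the paper; crude Monte Carlo of this kind is only practical for moderate $b$.

```python
import math
import random


def w_exact(b: float) -> float:
    """Exact crossing probability w(b) = 2 P(B_1 > b) = erfc(b / sqrt(2))."""
    return math.erfc(b / math.sqrt(2.0))


def w_grid_mc(b: float, n: int, n_paths: int, rng: random.Random) -> float:
    """Crude Monte Carlo estimate of w_T(b) on the grid {1/n, 2/n, ..., 1}."""
    step_sd = math.sqrt(1.0 / n)  # standard deviation of one Brownian increment
    hits = 0
    for _ in range(n_paths):
        x = 0.0
        for _ in range(n):
            x += rng.gauss(0.0, step_sd)
            if x > b:  # threshold exceeded at a grid point
                hits += 1
                break
    return hits / n_paths


def relative_bias_mc(b: float, n: int, n_paths: int = 20000, seed: int = 0) -> float:
    """Estimate of beta_T(b) = (w(b) - w_T(b)) / w(b); includes sampling noise."""
    rng = random.Random(seed)
    w = w_exact(b)
    return (w - w_grid_mc(b, n, n_paths, rng)) / w


if __name__ == "__main__":
    for n in (10, 40, 160):
        print(n, round(relative_bias_mc(1.0, n), 3))
```

The running time of this sketch is proportional to the grid size times the number of sample paths, which is the cost that a smaller grid reduces.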

At this point, we emphasize that we are not so much interested in the behaviour of $\beta_{T}(b)$ for a fixed $b$ or a fixed $n$, but rather in asymptotic regimes in which $b$ and/or $n$ approach infinity. We allow a different grid for every $b$, so it is natural to treat the grid as a function of the threshold. For every $b$ we define a collection of grids of all possible sizes $\{T_{1}(b),T_{2}(b),\ldots\}$, where $T_{n}(b)$ has $n$ elements, and we denote $\beta_{n}(b):=\beta_{T_{n}(b)}(b)$. For a given family of grids we are interested in the behavior of $\beta_{n}(b)$ as $n$ or $b$ tends to infinity. The most straightforward choice for the family of grids is the following.

Definition 2.

The family $\{T_{n}\}_{n\in\mathbb{N}}$ , where $T_{n}:=\{t^{n}_{1},\ldots,t^{n}_{n}\}$ with $t^{n}_{k}:=\frac{k}{n}$ is called the equidistant family of grids.

Notice that the location of grid points on the equidistant grid is independent of $b$. Since the distance between neighboring points is equal to $\frac{1}{n}$, and since Brownian paths are continuous, it follows that $\beta_{n}(b)\to 0$ as $n\to\infty$ for any fixed $b$. It has been established in [Asmussen et al., 1995] that for the equidistant grid $T_{n}$, the difference between the continuous-time and discrete-time supremum $\varepsilon_{n}=\sup_{t\in[0,1]}B_{t}-\sup_{t\in T_{n}}B_{t}$ is of order $n^{-1/2}$. More precisely, the sequence $(\sqrt{n}\varepsilon_{n})_{n\in\mathbb{N}}$ has a tight and non-degenerate weak limit.

Proposition 1.

Let $(B_{t})_{t\in[0,1]}$ denote standard Brownian Motion and $\{T_{n}\}_{n\in\mathbb{N}}$ be the equidistant family of grids from Definition 2 with $\beta_{n}(b):=\beta_{T_{n}}(b)$ . For any threshold $b>0$ there exist positive constants $C_{1},C_{2}$ such that

$$C_{1}n^{-1/2}\leq\beta_{n}(b)\leq C_{2}n^{-1/2}.$$

The proof of Proposition 1 is given in Section 8. The proof we give strongly resembles the proof of Theorem 1 in Section 3 below, but we remark that it is also possible to derive it using the tools developed in [Broadie et al., 1997].

Proposition 1 states that $\beta_{n}(b)$ decays like $n^{-1/2}$ , when $n$ grows large for a fixed $b$ but it does not describe the behavior of the relative bias when $b$ varies. In Theorem 1 in the following section, we derive an upper bound for $\beta_{n}(b)$ for $n$ and $b$ simultaneously.

Figure 1 shows the evolution of the relative bias for four different thresholds $b=5,6,7,8$ against the size of the equidistant grid. Even though all four graphs show the $n^{-1/2}$ decay, the graphs shift upwards as the threshold grows. In particular, for thresholds $b=5$ and $b=8$, about $n=700$ and $n=1700$ points, respectively, are needed to arrive at around $10\%$ relative bias. This indicates that, as $b$ grows, increasingly many grid points are needed to arrive at the target relative bias. Using the threshold-dependent grid that we develop in Section 4, one can arrive at $10\%$ relative bias using approximately $n=100$ grid points, independently of the value of the threshold. This amounts to a substantial improvement of the computational efficiency.

In some cases, the equidistant family of grids is the best possible choice, in the sense that other grid families require at least equally fast asymptotic growth of $n$ as $b$ increases, in order to control the relative bias. [Adler et al., 2012] prove that for centered, homogeneous and twice continuously differentiable (in a mean squared sense) Gaussian processes, $n$ has to grow linearly in $b$ to uniformly control the relative bias. Moreover, if $n$ grows sublinearly in $b$, then the relative bias of any family of grids (not necessarily equidistant) tends to its maximal value as $b$ approaches infinity. Note, however, that Brownian Motion does not belong to the family of Gaussian processes to which the result of [Adler et al., 2012] applies.

In the following two sections we analyze the asymptotic behavior of the relative bias $\beta_{n}(b)$ for two families of grids. We prove that the equidistant grid requires quadratic growth of $n$ in $b$ (see Theorem 1 in Section 3). As an alternative, we develop the threshold-dependent family of grids, for which we prove that the relative bias can be made arbitrarily small, uniformly in $b$ for fixed $n$ (see Theorem 2 in Section 4). We obtain a uniform rate of convergence in $n$ and also provide a closed-form expression for the threshold-dependent family of grids (see Definition (9) in Section 4).

3 Equidistant family of grids for Brownian Motion

This section is devoted to analyzing the asymptotic behavior of the relative bias for the equidistant family of grids. The methodology developed in this section will be used later to prove Theorem 2; in particular, the crucial part of the proof concerns bounds for the relative bias induced by an arbitrary finite grid, developed in Lemma 1.

The following theorem describes the asymptotic behaviour of the relative bias, under the equidistant family of grids.

Theorem 1.

Let $(B_{t})_{t\in[0,1]}$ denote standard Brownian Motion and $\{T_{n}\}_{n\in\mathbb{N}}$ be the equidistant family of grids from Definition 2 with $\beta_{n}(b):=\beta_{T_{n}}(b)$ .

(a)

Let $b_{0}$ be any positive, real number. There exist positive constants $C_{0},C_{1}$ , independent of $b$ and $n$ such that

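$$\beta_{n}(b)\leq C_{0}\cdot b\,n^{-1/2},$$

for all $b\geq b_{0}$, and

$$\beta_{n}(b)\leq C_{1}\cdot n^{-1/2},$$

for all $b\in(0,b_{0}]$.

(b)

Let $m:(0,\infty)\to(0,\infty)$ be such that $\lim_{b\to\infty}m(b)/b^{2}=0$. Then, as $b\to\infty$,

$$\inf_{n\leq m(b)}\beta_{n}(b)\longrightarrow\frac{1}{2}.$$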

Part (a) of Theorem 1 states that $\beta_{n}(b)\leq C_{0}\,bn^{-1/2}$, so that in order to bound $\beta_{n}(b)$ uniformly in $b$ it suffices to let $n$ grow proportionally to $b^{2}$. The second part of Theorem 1 states that if $n=o(b^{2})$ then $\beta_{n}(b)\to 1/2$, meaning that the relative bias cannot be bounded by an arbitrarily small number. Together, the two parts entail that quadratic growth of $n$ in $b$ is sufficient and that no slower growth guarantees a uniformly bounded relative bias.

The crucial part of the proof of Theorem 1 is the method of bounding the relative bias. Since no explicit expressions for $w_{T}(b)$ or $\beta_{T}(b)$ are known (even if $T$ is an equidistant grid) we develop a general upper bound for $\beta_{T}(b)$ in the following lemma, in which we use the quantities

[TABLE]

Notice that in this definition of $a_{j}(b)$ and $w_{j}(b)$ we allow grid points $t_{1},\ldots,t_{n}$ to change their location with $b$ . In the present section, which is on equidistant grids, the grid points obviously do not depend on $b$ , but in later sections they do.

Lemma 1.

Let $T(b)=\{t_{1}(b),\ldots,t_{n}(b)\}\subset[0,1]$ , where $0<t_{1}(b)<\ldots<t_{n}(b)\leq 1$ , and let $t_{0}(b)=0$ . The following lower and upper bounds for $\beta_{T}(b)$ apply:

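$$\underline{\beta}_{T}(b)\leq\beta_{T}(b)\leq\bar{\beta}_{T}(b)$$

with

$$\underline{\beta}_{T}(b):=\frac{1}{2}\sum_{j=1}^{n}a_{j+1}(b)\,w_{j}(b),\qquad\bar{\beta}_{T}(b):=\sum_{j=1}^{n}a_{j}(b)\,w_{j}(b).$$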

A short proof of Lemma 1 is included in Section 8. The bounds consist of elements of two types: $a_{j}(b)$ , the probability that $B_{t}$ stays negative at times $t_{j}-t_{j-1},\ldots,t_{n}-t_{j-1}$ , and $w_{j}(b)$ , the probability that $B_{t}$ hits $b$ for the first time in the interval $[t_{j-1},t_{j}]$ given that its supremum over $[0,1]$ is greater than $b$ .

For a general grid $T(b)$ , the probabilities $a_{j}(b)$ are difficult to control. However, when $T(b)$ is equidistant (thus independent of $b$ ), then also the probabilities $a_{j}$ are independent of $b$ ; we emphasize this independence by writing $a_{j}$ instead of $a_{j}(b)$ throughout this section. As a result, there exists a tight asymptotic bound for them (see Lemma 2 below); we were inspired to look into such quantities while reading [Mörters and Peres, 2010, Section 5]. The probabilities $w_{j}(b)$ are controlled using a mean value theorem, see Appendix B.V.

Lemma 2.

There exist constants $C^{*}_{1},C^{*}_{2}>0$ such that:

$$C^{*}_{1}n^{-1/2}\leq\mathbb{P}\Big(B_{1}>0,\ldots,B_{n}>0\Big)\leq C^{*}_{2}n^{-1/2}$$

for all $n\in\mathbb{N}$ .

In fact, the assertion in Lemma 2 is true for any symmetric random walk; see [Feller, 1971, Theorem 4 in Section XII.7, and Lemma 1 in Section XII.8]. Before proving Theorem 1 we present one more lemma.
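For Brownian Motion sampled at integer times, the probability in Lemma 2 can be evaluated exactly. By a classical identity for random walks with symmetric, continuous increments (usually attributed to Sparre Andersen, and quoted here as an outside fact rather than taken from the paper), $\mathbb{P}(B_{1}>0,\ldots,B_{n}>0)=\binom{2n}{n}4^{-n}$, which behaves like $(\pi n)^{-1/2}$ for large $n$. A small Python sketch, with the hypothetical helper name `stay_positive_prob`, shows $\sqrt{n}$ times this probability settling near $1/\sqrt{\pi}$:

```python
import math


def stay_positive_prob(n: int) -> float:
    """C(2n, n) / 4^n, computed as the product of (2k - 1) / (2k) for k = 1..n."""
    p = 1.0
    for k in range(1, n + 1):
        p *= (2 * k - 1) / (2 * k)
    return p


if __name__ == "__main__":
    print("limit 1/sqrt(pi) =", 1.0 / math.sqrt(math.pi))
    for n in (1, 10, 100, 1000, 10000):
        print(n, math.sqrt(n) * stay_positive_prob(n))
```

The product form avoids the very large integers that a direct evaluation of the binomial coefficient would involve.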

Lemma 3.

Let $T=\{t_{1},\ldots,t_{n}\}$ be such that $t_{k}=\frac{k}{n}$ and let $t_{0}=0$ . Then the upper bound $\bar{\beta}_{T}(b)$ developed in Lemma 1 is an increasing function of $b$ .

An important implication of Lemma 3 is that for any $b_{0}>0$ we have that $\beta_{T}(b)\leq\bar{\beta}_{T}(b)\leq\bar{\beta}_{T}(b_{0})$ uniformly for all $b\leq b_{0}$ , which completely covers the statement on the situation that $b\leq b_{0}$ in part (a) of Theorem 1. The proof of Lemma 3 is given in Section 8.

Proof of Theorem 1(a).

Thanks to Lemma 3 it suffices to prove the first part of Theorem 1(a), i.e. we assume that $b\geq b_{0}$ . Without loss of generality we put $b_{0}=1$ . Exploiting the upper bound developed in Lemma 1 we decompose the sum $\sum_{j=1}^{n}a_{j}\cdot w_{j}(b)$ into three parts, which we treat separately:

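$$\beta_{n}(b)\leq\sum_{j=1}^{n}a_{j}\cdot w_{j}(b)=a_{1}\cdot w_{1}(b)+\sum_{j=2}^{n-1}a_{j}\cdot w_{j}(b)+a_{n}\cdot w_{n}(b).\qquad(3)$$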

Using the definition of the equidistant grid and the scaling property of Brownian motion we can see that $a_{j}=\mathbb{P}\big{(}B_{t_{j}-t_{j-1}}<0,\ldots,B_{t_{n}-t_{j-1}}<0\big{)}=\mathbb{P}\big{(}B_{1}<0,\ldots,B_{n-j+1}<0\big{)}$ and the bound in Lemma 2 yields $a_{j}\leq C^{*}_{2}(n-j+1)^{-1/2}$ for all $j\in\{1,\ldots,n\}$ . Since all $w_{j}(b)\leq 1$ , we thus have a straightforward bound for the first term in (3):

$$a_{1}\cdot w_{1}(b)\leq C_{2}^{*}n^{-1/2}.$$

The second term we bound in the following fashion, relying on the upper bound that we have for $w_{j}(b)$ (stated in Result B.V),

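$$\begin{aligned}\sum_{j=2}^{n-1}a_{j}\cdot w_{j}(b)&\leq C_{1}\cdot b\,n^{-1/2}\cdot\sum_{j=2}^{n-1}\frac{1}{n}\cdot\frac{b}{\sqrt{1-\frac{j}{n}}}\cdot\Big(\frac{j}{n}\Big)^{-3/2}e^{-\frac{b^{2}}{2}\cdot(\frac{n}{j}-1)}\qquad(4)\\&\leq C_{1}\cdot b\,n^{-1/2}\cdot\int_{0}^{1}\frac{b}{\sqrt{1-x}}\cdot x^{-3/2}\cdot e^{-\frac{b^{2}}{2}(1/x-1)}\,dx\qquad(5)\\&\leq C_{1}\cdot b\,n^{-1/2},\end{aligned}$$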

where $C_{2}^{*}$ comes from Lemma 2 and $C_{1}$ is a positive constant, independent of $b$ and $n$ . To arrive at (4) we use that $2(j-1)\geq j$ for all relevant $j$ . In the transition from (4) to (5) we use the definition of the Riemann sum for the function

$$f(b,x):=\frac{b}{\sqrt{1-x}}\,x^{-3/2}e^{-\frac{b^{2}}{2}(1/x-1)};$$

note that, since $f(b,x)$ is an increasing function of $x$ when $b\geq 1$ (see Result B.VI in the Appendix), the Riemann sum in (4) underestimates the integral, i.e., $\sum_{j=2}^{n}\frac{1}{n}f(b,\frac{j-1}{n})\leq\int_{0}^{1}f(b,x)\,dx=\sqrt{2\pi}$ .
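The value $\sqrt{2\pi}$ can be checked by a substitution. The following is a sketch that assumes $f(b,x)=\frac{b}{\sqrt{1-x}}\,x^{-3/2}e^{-\frac{b^{2}}{2}(1/x-1)}$; the square root on $1-x$ is an assumption made here, being the form for which the integral is finite and equals the stated value. Setting $u=1/x-1$, so that $x=(1+u)^{-1}$, $1-x=u\,(1+u)^{-1}$ and $dx=-(1+u)^{-2}\,du$, gives

$$\int_{0}^{1}f(b,x)\,dx=\int_{0}^{\infty}b\,\frac{(1+u)^{1/2}}{u^{1/2}}\,(1+u)^{3/2}\,e^{-\frac{b^{2}}{2}u}\,(1+u)^{-2}\,du=b\int_{0}^{\infty}u^{-1/2}e^{-\frac{b^{2}}{2}u}\,du=b\,\Gamma\big(\tfrac{1}{2}\big)\Big(\frac{b^{2}}{2}\Big)^{-1/2}=\sqrt{2\pi},$$

which does not depend on $b$.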

Lastly, since $a_{n}=\mathbb{P}(B_{t_{n}}<0)=\frac{1}{2}$ we have

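$$a_{n}\cdot w_{n}(b)\leq\frac{1}{2}\cdot\frac{b\,(b+\sqrt{b^{2}+4})}{4}\cdot\frac{\sqrt{n}}{(n-1)^{3/2}}\leq C_{2}\frac{b^{2}}{n},$$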

where $C_{2}$ is a positive constant independent of $n$ and $b$ . Since $w_{n}(b)\leq 1$ this results in

$$a_{n}\cdot w_{n}(b)\leq\min\Big\{C_{2}\frac{b^{2}}{n},\frac{1}{2}\Big\}\leq C_{2}\,b\,n^{-1/2}.$$

Combining the above bounds,

$$\beta_{n}(b)\leq C_{0}\,b\,n^{-1/2},$$

where $C_{0}$ is a positive constant independent of $b$ and $n$ . This concludes the proof. ∎

Proof of Theorem 1(b).

Without loss of generality we can assume $m(b)\to\infty$ as $b\to\infty$ . Similar to the proof of Lemma 1 in Section 8 we obtain:

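$$\begin{aligned}w(b)\,\beta_{T}(b)&=\sum_{j=1}^{n}\mathbb{P}\Big(\sup_{t\in\{t_{j},\ldots,t_{n}\}}B_{t}<b,\ \tau_{b}\in(t_{j-1},t_{j}]\Big)\geq\mathbb{P}\Big(B_{t_{n}}<b,\ \tau_{b}\in(t_{n-1},t_{n}]\Big)\\&=\int_{t_{n-1}}^{t_{n}}\mathbb{P}\big(B_{t_{n}}<b\mid B_{s}=b\big)\,\mathbb{P}(\tau_{b}\in ds)=\frac{1}{2}\,\mathbb{P}\big(\tau_{b}\in(t_{n-1},t_{n})\big).\end{aligned}$$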

Dividing both sides of the inequality by $w(b)$ yields an elementary lower bound on $\beta_{T}(b)$ :

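$$\beta_{T}(b)\geq\frac{1}{2}\cdot\frac{\mathbb{P}\big(\tau_{b}\in(t_{n-1},t_{n})\big)}{w(b)}=\frac{1}{2}\cdot\frac{\Phi\big(-b/\sqrt{t_{n}}\big)-\Phi\big(-b/\sqrt{t_{n-1}}\big)}{\Phi(-b)},$$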

where $\Phi(\cdot)$ denotes the standard normal cdf, and we use the fact that $\mathbb{P}(\tau_{b}\leq t)=2\,\mathbb{P}(B_{t}>b)$ . In our case $t_{n}=1$ and $t_{n-1}=\frac{n-1}{n}\leq\frac{m-1}{m}$ , so that due to the monotonicity of $\Phi(\cdot)$

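$$\inf_{n\leq m(b)}\beta_{n}(b)\geq\frac{1}{2}-\frac{1}{2}\,\frac{\Phi\big(-b/\sqrt{(m-1)/m}\big)}{\Phi(-b)}.\qquad(7)$$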

Taking the limit $b\to\infty$ on both sides of inequality (7) yields:

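$$\begin{aligned}\lim_{b\to\infty}\inf_{n\leq m(b)}\beta_{n}(b)&\geq\frac{1}{2}-\frac{1}{2}\lim_{b\to\infty}\frac{\frac{\sqrt{(m-1)/m}}{b}\,\phi\big(b/\sqrt{(m-1)/m}\big)}{\frac{1}{b}\,\phi(b)}\qquad(8)\\&=\frac{1}{2}-\frac{1}{2}\lim_{b\to\infty}e^{-b^{2}/(2(m-1))}=\frac{1}{2},\end{aligned}$$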

where $\phi(\cdot)$ denotes the standard normal pdf, we use result B.II in (8), and the last equality is a consequence of the assumption that $\lim_{b\to\infty}{m(b)}/{b^{2}}=0$ . ∎
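The elementary lower bound used in this proof is explicit for the equidistant grid: with $t_{n}=1$, $t_{n-1}=\frac{n-1}{n}$ and $\mathbb{P}(\tau_{b}\leq t)=2\,\Phi(-b/\sqrt{t})$, it reads $\beta_{n}(b)\geq\frac{1}{2}-\frac{1}{2}\,\Phi\big(-b/\sqrt{(n-1)/n}\big)/\Phi(-b)$. The Python sketch below evaluates it for grid sizes that grow slower than, like, and faster than $b^{2}$. The helper names are hypothetical, and the numbers illustrate the lower bound only, not the relative bias itself.

```python
import math


def norm_sf(x: float) -> float:
    """Standard normal upper tail Phi(-x), computed via erfc for tail accuracy."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def bias_lower_bound(b: float, n: int) -> float:
    """Lower bound 1/2 - Phi(-b / sqrt((n-1)/n)) / (2 Phi(-b)) on beta_n(b)."""
    if n == 1:
        return 0.5  # t_0 = 0, so no crossing can occur before the last interval
    t_prev = (n - 1) / n  # second-to-last grid point of the equidistant grid
    return 0.5 - 0.5 * norm_sf(b / math.sqrt(t_prev)) / norm_sf(b)


if __name__ == "__main__":
    for b in (4.0, 8.0, 16.0):
        sizes = (int(b), int(b * b), int(100 * b * b))
        print(b, [round(bias_lower_bound(b, n), 4) for n in sizes])
```

For $n$ of order $b$ the bound stays close to $\frac{1}{2}$, while for $n$ much larger than $b^{2}$ it becomes small, in line with the quadratic growth discussed in this section.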

In this section we have proven that in order to uniformly control the relative bias, the size of the equidistant grid must grow at least quadratically in $b$ , as $b$ approaches infinity. In the next section we present a threshold-dependent grid, which yields a uniform bound on the relative bias using a grid of given size. In other words, in order to control the relative bias with increasing $b$ , instead of adding more and more points to the grid, it suffices to suitably shift their location.

4 Threshold-dependent grids for Brownian Motion

In this section we prove the main result of the paper. We explicitly present a threshold-dependent family of grids which uniformly (in $b$ ) bounds the relative bias.

Before we introduce the result, we give some intuition as to why it is possible to control the relative bias as $b$ grows, without increasing $n$ . First, for any given $\varepsilon>0$ , we have that

[TABLE]

as $b\to\infty$ . Therefore,

[TABLE]

This means that as $b$ grows, the ‘hitting of the threshold’ occurs closer and closer to time $t=1$ , which indicates that the grid points should be gradually shifted towards $t=1$ as $b$ increases. Moreover, Theorem 1 indicates how fast the points should be shifted: for the family of equidistant grids, a uniform bound on the bias is achieved if the number of grid points grows quadratically in $b$ or, equivalently, if the distances between neighboring points decrease proportionally to $b^{-2}$ . It turns out that this is indeed the pace at which the points should be shifted towards $t=1$ .
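The concentration of the hitting time near $t=1$ is easy to observe numerically. The following minimal sketch uses only the identity $\mathbb{P}(\tau_{b}\leq t)=2\,\Phi(-b/\sqrt{t})$ for Brownian Motion; the function name `early_hit_fraction` and the choice $\varepsilon=0.1$ are illustrative assumptions.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cdf


def early_hit_fraction(b, eps):
    """P(tau_b <= 1 - eps | tau_b <= 1), using P(tau_b <= t) = 2 * Phi(-b / sqrt(t))."""
    return PHI(-b / math.sqrt(1.0 - eps)) / PHI(-b)


# The fraction of crossings that happen before time 1 - eps shrinks as b grows.
for b in (1.0, 2.0, 4.0, 8.0):
    print(b, early_hit_fraction(b, eps=0.1))
```

For fixed $\varepsilon$ the printed fraction decreases towards zero as $b$ increases, so almost all of the conditional hitting-time mass ends up in $(1-\varepsilon,1]$.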

In the following result, $\Phi(\cdot)$ and $\Phi^{-1}(\cdot)$ denote the standard normal cdf and its inverse, respectively.

Theorem 2.

Let $(B_{t})_{t\in[0,1]}$ be a standard Brownian Motion. Fix $b_{0}>0$ and let $\{T_{n}(b)\}_{n\in\mathbb{N},b>0}$ be a family of grids such that $T_{n}(b)=\{t^{n}_{1}(b),\ldots,t^{n}_{n}(b)\}$ ; here $t^{n}_{k}(b):=\frac{k}{n}$ for $b\leq b_{0}$ , and

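$$t^{n}_{k}(b):=\left(\frac{b}{\Phi^{-1}\big(\frac{k}{n}\,\Phi(-b)\big)}\right)^{2},\qquad k\in\{1,\ldots,n\},\tag{9}$$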

for $b>b_{0}$ . Denote $\beta_{n}(b):=\beta_{T_{n}(b)}(b)$ . There exists a positive $C$ , independent of $b$ and $n$ , such that

[TABLE]

for all $b>0$ .

We emphasize that the bound for the relative bias $\beta_{n}(b)$ developed above does not depend on the threshold $b$ and thus holds uniformly, for all $b$ . Figure 2 shows the comparison between the relative bias of the equidistant and the threshold-dependent grid, both of size $n=100$ . The bias induced by the threshold-dependent grid remains uniformly bounded (by circa $0.1$ ), while that of the equidistant grid tends to $0.5$ , the worst possible relative bias, cf. Theorem 1, part (b).

Notice that for small $b$ , $\{T_{n}(b)\}_{n\in\mathbb{N},b\in(0,b_{0}]}$ in Theorem 2 is identical to the equidistant family of grids; this is exactly the setting of the second part of Theorem 1(a). The real contribution of Theorem 2 concerns the regime $b>b_{0}$ . The grid defined in (9) is the unique solution to the set of equations

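$$\mathbb{P}\big(\tau_{b}\in(t^{n}_{k-1}(b),t^{n}_{k}(b)]\big)=\frac{1}{n}\,\mathbb{P}\big(\tau_{b}\in(0,1]\big)\tag{10}$$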

for all $k\in\{1,\ldots,n\}$ and $t_{0}:=0$ . To see this, we sum up the first $k$ equations in (10) and obtain an explicit equation for $t_{k}^{n}(b)$ :

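$$\mathbb{P}\big(\tau_{b}\in(0,t^{n}_{k}(b)]\big)=\frac{k}{n}\,\mathbb{P}\big(\tau_{b}\in(0,1]\big).\tag{11}$$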

Since for Brownian Motion it holds that

$$\mathbb{P}\big(\tau_{b}\in(0,t]\big)=2\,\mathbb{P}(B_{t}>b)=2\,\Phi\big(-b/\sqrt{t}\big),$$

and in particular $\mathbb{P}(\tau_{b}\in(0,1])=2\,\mathbb{P}(B_{1}>b)=2\,\Phi(-b)$ , Eqn. (11) can be equivalently expressed as

$$\mathbb{P}\big(B_{t^{n}_{k}(b)}>b\big)=\frac{k}{n}\,\mathbb{P}(B_{1}>b),\tag{12}$$

or, in terms of the cdf $\Phi(\cdot)$ ,

$$\Phi\Big(-\frac{b}{\sqrt{t^{n}_{k}(b)}}\Big)=\frac{k}{n}\,\Phi(-b).$$

Finally, after taking the inverse $\Phi^{-1}(\cdot)$ from both sides of the equation above we see that $t_{k}^{n}(b)$ satisfies (9). Figure 3 shows the placement of the grid-points on the grid $T_{5}(b)$ , as defined in (9), for increasing $b$ . In fact, one can prove that

[TABLE]

and thus

[TABLE]

for large $b$ . This means that the points of the grid (9) are clustered around $t=1$ , with distances between the points proportional to $b^{-2}$ . Here we see an important connection with Theorem 1(a), where the distances between grid points decrease at the same pace, as already mentioned in the opening paragraph of this section.
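The construction of the threshold-dependent grid can be sketched in a few lines. The code below assumes the equiprobable characterization derived above, namely that $t^{n}_{k}(b)$ solves $\Phi(-b/\sqrt{t})=\frac{k}{n}\Phi(-b)$ for $b>b_{0}$ , and falls back to the equidistant grid otherwise. The function name `threshold_grid` and the default $b_{0}=\sqrt{3}$ (the value used in the proof of Theorem 2) are illustrative assumptions; for extremely large $b$ the tail $\Phi(-b)$ underflows in double precision and a log-scale implementation would be needed.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal law: cdf and inverse cdf


def threshold_grid(n, b, b0=math.sqrt(3.0)):
    """Grid t_1 < ... < t_n in (0, 1]: equidistant for b <= b0, equiprobable otherwise."""
    if b <= b0:
        return [k / n for k in range(1, n + 1)]
    tail = STD.cdf(-b)  # Phi(-b) = P(tau_b <= 1) / 2
    grid = []
    for k in range(1, n):
        # Solve Phi(-b / sqrt(t_k)) = (k / n) * Phi(-b) for t_k.
        q = STD.inv_cdf(k / n * tail)  # negative quantile
        grid.append((b / q) ** 2)
    grid.append(1.0)  # k = n gives t_n = 1
    return grid


n = 5
for b in (2.0, 4.0, 8.0):
    grid = threshold_grid(n, b)
    # Rescaled distances to t = 1; these should stabilise as b grows.
    print(b, [round(b * b * (1.0 - t), 3) for t in grid])
```

Printing $b^{2}(1-t^{n}_{k}(b))$ rather than the grid points themselves makes the $b^{-2}$ clustering around $t=1$ directly visible.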

For $b>b_{0}$ , the points $t^{n}_{1}(b),\ldots,t^{n}_{n}(b)$ of the threshold-dependent grid (9) do not coincide with the equidistant grid, entailing that we cannot directly use Lemma 2 to control the terms of type $a_{j}(b)$ in the upper bound developed in Lemma 1 in Section 3. The following lemma resolves this issue.

Lemma 4.

Let $t_{0}=0<t_{1}<t_{2}<\ldots<t_{n}<\infty$ . Then

[TABLE]

for any $N\leq N_{n}$ , where

[TABLE]

A proof of this lemma is provided in Section 8. Lemma 2 applied to the upper bound in Lemma 4 yields a simple upper bound for $\mathbb{P}\big{(}B_{t_{1}}>0,\ldots,B_{t_{n}}>0\big{)}$ for any choice of $t_{0}=0<t_{1}<t_{2}<\ldots<t_{n}<\infty$ . In our case, after applying Lemma 1 we have to control probabilities of the type $\mathbb{P}\big{(}B_{t_{j}-t_{j-1}}<0,\ldots,B_{t_{n}-t_{j-1}}<0\big{)}$ , and thus we need a lower bound on

[TABLE]

which we give in the following lemma.

Lemma 5.

For the grid in (9), for $k>j$ , $b>0$ and $n\in\mathbb{N}$ we have:

[TABLE]

and when additionally $b\geq\sqrt{3}$ we have

[TABLE]

Lemma 5 is proven in Section 8. The lower bound in part (a) of Lemma 5 is in fact

[TABLE]

With these lemmas we can prove Theorem 2.

Proof of Theorem 2.

Part (a) of Theorem 1 states that for any choice of $b_{0}$ there exists a positive $C_{1}$ such that $\beta_{n}(b)\leq C_{1}n^{-1/2}$ for $b\leq b_{0}$ , and thus also $\beta_{n}(b)\leq C_{1}n^{-1/4}$ . Without loss of generality, from now on we assume that $b>b_{0}=\sqrt{3}$ . Fix $n\in\mathbb{N}$ and denote $t_{k}:=t^{n}_{k}(b)$ for notational simplicity. After combining the general upper bound from Lemma 1 with the equivalent definition (10) of the threshold-dependent grid (9) we obtain

[TABLE]

observe that in our setting $w_{n}(b)=\frac{1}{n}.$ Moreover, Lemma 4 yields (recalling the definition of $a_{n}(b)$ )

[TABLE]

where

[TABLE]

Combining Lemma 2 with Lemma 5 gives

[TABLE]

with a constant $C>0$ that is independent of $n$ and $b$ . Notice that $\widetilde{N}_{n}(j)$ does not depend on $b$ . For $b>b_{0}$ we thus obtain

[TABLE]

where $C$ is a constant, independent of $n$ and $b$ , that might differ from line to line. In (14) we use the inequality $\log(1+x)\leq x$ and in (15) we use the convergence of the Riemann sum to the integral. This concludes the proof of Theorem 2. ∎

Remark 1.

To show that, for any confidence level $\alpha$ and bias $\varepsilon$ (see also (16)), the ‘equiprobable’ grid defined through (9) requires a computational effort that is bounded in $b$ , it suffices that the upper bound for $\beta_{n}(b)$ in Theorem 2 decays at order $n^{-1/4}$ ; see Corollary 1 in Section 5. As an aside, we hypothesize that this decay is actually of order $n^{-1/2}$ . This is supported by numerical experiments; see Figure 4, where plots of $\beta_{n}(b)$ versus $n$ are shown for the threshold-dependent grid (9). The step we expect to be ‘loose’ in obtaining the bound of Theorem 2 is the one corresponding to Lemma 4. We conjecture that Lemma 4 is valid with

[TABLE]

(i.e., without the square root), which suffices to yield the $n^{-1/2}$ decay of $\beta_{n}(b)$ .

5 Numerical algorithm for estimation of $w(b)$

As mentioned in the introduction, the family of threshold-dependent grids (9) can be used to construct a strongly efficient algorithm for estimation of $w(b)$ , see Corollary 1 below. In this paper, by ‘strongly efficient’ we mean that for any given accuracy $\varepsilon>0$ and confidence level $\alpha>0$ the computational time of an estimator $\widehat{w}(b)$ for $w(b)$ that satisfies

[TABLE]

is bounded independently of the threshold $b$ .

In all numerical experiments throughout this paper we used an algorithm developed by [Adler et al., 2012], see Algorithm 1 below. Although it is applicable for estimation of quantities such as $\mathbb{P}(\max_{i\in\{1,\ldots,n\}}X_{i}>b)$ , where $X\in\mathbb{R}^{n}$ is normally distributed with an arbitrary positive-definite covariance matrix, we present their algorithm for the specific case of Brownian Motion, as considered in this paper.

Algorithm 1 ([Adler et al., 2012]).

Choose a threshold $b$ and a finite grid $T=\{t_{1},\ldots,t_{n}\}\subset[0,1]$ . The estimator $\widehat{w}_{T}(b)$ , computed according to the following algorithm, is an unbiased estimator of $w_{T}(b)$ .

1. Generate a random time $\tau$ on the grid, i.e. $\tau\in T$ , according to the law

[TABLE]

2. Generate $B_{\tau}$ under the condition $B_{\tau}>b$ .

3. Generate a discrete path of the Brownian Motion $(B_{t_{1}},\ldots,B_{t_{n}})$ conditioned on the pair $(\tau,B_{\tau})$ generated in the previous steps.

4. Compute

[TABLE]

[Adler et al., 2012] prove that Algorithm 1 gives an unbiased estimator of $w_{T}(b)$ (not of $w(b)$ ) and that for a fixed $T$ (independent of $b$ ), the relative variance $\mathbb{V}\textnormal{{ar}}(\widehat{w}_{T}(b))/w_{T}^{2}(b)\to 0$ , as $b\to\infty$ . The authors also propose an estimator for $w(b)$ , which relies on a random discretization. However, with growing $b$ , one needs increasingly many random grid points in order to control the relative bias; the continuous-time algorithm is therefore not strongly efficient. In order to reduce the sampling error one generates multiple replicas of the estimator and takes their average. Since every replica is based on a different grid, one must repeatedly calculate the Cholesky decomposition (whose computational time is cubic in the number of grid points) in order to sample discrete Gaussian paths in Step 3 of Algorithm 1. Choosing a predefined grid speeds up this computation, as in that case the Cholesky decomposition has to be performed only once, making its computational cost negligible.
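The four steps of Algorithm 1 can be sketched for Brownian Motion as follows. Two ingredients are assumptions taken from the construction in the cited work rather than from the text above: $\tau=t_{k}$ is drawn with probability proportional to $\mathbb{P}(B_{t_{k}}>b)$ , and the output is $\sum_{j}\mathbb{P}(B_{t_{j}}>b)$ divided by the number of grid points at which the sampled path exceeds $b$ . Step 3 is carried out here with the Markov property (a backward Brownian bridge before $\tau$ , free increments after $\tau$ ) instead of a Cholesky factorization. The name `adler_replica` is illustrative.

```python
import math
import random
from statistics import NormalDist

STD = NormalDist()


def adler_replica(grid, b, rng):
    """One replica of the discrete-grid estimator of P(max_k B_{t_k} > b)."""
    # Marginal exceedance probabilities p_k = P(B_{t_k} > b).
    p = [STD.cdf(-b / math.sqrt(t)) for t in grid]
    total = sum(p)
    # Step 1: index of tau, drawn with probability p_k / total (assumed law).
    i = rng.choices(range(len(grid)), weights=p)[0]
    # Step 2: B_tau given B_tau > b, by inverting the normal tail.
    c = b / math.sqrt(grid[i])
    u = 1.0 - rng.random()  # uniform on (0, 1]
    path = [0.0] * len(grid)
    path[i] = -math.sqrt(grid[i]) * STD.inv_cdf(u * STD.cdf(-c))
    # Step 3a: after tau the path continues with independent increments.
    for k in range(i + 1, len(grid)):
        path[k] = path[k - 1] + rng.gauss(0.0, math.sqrt(grid[k] - grid[k - 1]))
    # Step 3b: before tau it is a Brownian bridge from 0, sampled backwards.
    for k in range(i - 1, -1, -1):
        mean = path[k + 1] * grid[k] / grid[k + 1]
        var = grid[k] * (grid[k + 1] - grid[k]) / grid[k + 1]
        path[k] = rng.gauss(mean, math.sqrt(var))
    # Step 4: total mass divided by the number of exceedances; index i is
    # always counted, which guards against rounding in Step 2.
    hits = sum(1 for k, x in enumerate(path) if k == i or x > b)
    return total / hits


rng = random.Random(0)
grid = [k / 50 for k in range(1, 51)]
replicas = [adler_replica(grid, 2.0, rng) for _ in range(2000)]
print(sum(replicas) / len(replicas), 2.0 * STD.cdf(-2.0))
```

The printed average estimates $w_{T}(b)$ and should therefore lie below the exact continuous-time value $w(b)=2\,\Phi(-b)$ printed next to it, up to sampling error.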

Combining the threshold-dependent grids as proposed in Section 4 with Algorithm 1 yields a strongly efficient estimator for $w(b)$ which is given in the corollary below.

Corollary 1 (Strongly efficient algorithm for the estimation of $w(b)$ ).

Fix an accuracy $\varepsilon>0$ and a confidence level $\alpha>0$ . Choose a grid $T:=T_{n}(b)$ from the family of grids defined in (9) such that $\beta_{T}(b):=\beta_{n}(b)<\varepsilon$ for all $b>0$ (this is possible due to the result in Theorem 2). Let $\widehat{w}^{(1)}_{T}(b),\ldots,\widehat{w}^{(N)}_{T}(b)$ be i.i.d. copies of the estimator from Algorithm 1, with

[TABLE]

Then

[TABLE]

satisfies

[TABLE]

and the computational effort to simulate $\widehat{w}(b)$ is bounded independently of $b$ .

Proof.

First notice that $\beta_{T}(b)$ is uniformly bounded in $b$ (see Theorem 2), so that $N$ is fixed independently of $b$ ; it follows that $\widehat{w}(b)$ can be computed in bounded time, independently of $b$ . It remains to prove that $\widehat{w}(b)$ satisfies the strong efficiency property (17). Note that $\widehat{w}(b)$ is an unbiased estimator of $w_{T}(b)$ , not of $w(b)$ . The relative variance of $\widehat{w}(b)$ with respect to $w_{T}(b)$ can be bounded independently of $b$ for an arbitrary choice of the grid in terms of the grid size $n$ ,

[TABLE]

Due to Chebyshev’s inequality,

[TABLE]

This concludes the proof. ∎

We conclude this section by a remark on the simulation of the conditioned Brownian Motion in Step 3 of Algorithm 1. The naïve method would be to construct the covariance matrix of the conditioned process, calculate the Cholesky decomposition of that matrix (cubic in the number of grid points) and then simulate the process in a standard manner. Notice that this step must be repeated for every replica $\widehat{w}^{(i)}_{T}(b)$ and thus its computational cost scales with the number of samples. The following algorithm, which can be found e.g. in [Doucet, 2010], requires only a single calculation of the Cholesky decomposition for all replicas.

Algorithm 2 ([Doucet, 2010]).

Let $X=(X_{1},X_{2})^{T}\in\mathbb{R}^{n}$ , where $X_{1}\in\mathbb{R}^{n-1}$ and $X_{2}\in\mathbb{R}$ , be normally distributed with mean $\mu$ and covariance matrix $\Sigma$ ,

[TABLE]

The following algorithm generates a sample $\overline{X}\sim(X_{1}|X_{2}=x_{2})$ :

1. Sample $Z=(X_{1},X_{2})^{T}\sim N(\mu,\Sigma)$ .

2. Compute $\overline{X}=X_{1}+\Sigma_{12}\Sigma_{22}^{-1}(x_{2}-X_{2})$ .

Note that the computational effort to produce the conditioned Gaussian random variable $\overline{X}$ in Step 2 of Algorithm 2 is linear in the dimension $n$ . Thus, this algorithm significantly reduces the computation time of Step 3 of Algorithm 1 when that step is repeated for each replica.
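For Brownian Motion the correction in Step 2 of Algorithm 2 is explicit, because $\mathbb{C}\textnormal{ov}(B_{s},B_{t})=\min(s,t)$ . The sketch below conditions a discrete path on the value at one grid point; it is a minimal illustration, not the general implementation. The unconditioned sample in Step 1 is built from independent increments, which stands in for the Cholesky-based sampler one would use for a general covariance matrix $\Sigma$ . The function name and the zero-based index `i` are illustrative assumptions.

```python
import math
import random


def conditioned_bm_path(grid, i, x, rng):
    """Sample (B_{t_1}, ..., B_{t_n}) given B_{grid[i]} = x via the shift of Algorithm 2."""
    # Step 1: unconditioned Brownian path from independent Gaussian increments.
    path, prev_t, value = [], 0.0, 0.0
    for t in grid:
        value += rng.gauss(0.0, math.sqrt(t - prev_t))
        path.append(value)
        prev_t = t
    # Step 2: add Sigma_12 / Sigma_22 * (x - B_{grid[i]}), where
    # Sigma_12[k] = min(t_k, grid[i]) and Sigma_22 = grid[i].
    shift = x - path[i]
    return [v + min(t, grid[i]) / grid[i] * shift for v, t in zip(path, grid)]


rng = random.Random(3)
grid = [0.25, 0.5, 0.75, 1.0]
print(conditioned_bm_path(grid, 2, 1.5, rng))  # third entry equals 1.5 up to rounding
```

The shift costs one pass over the grid, in line with the linear cost noted above, and the conditioned coordinate is reproduced up to floating-point rounding.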

6 Efficient grids for a broad class of stochastic processes

In this section we discuss how the idea of threshold-dependent grids can be applied to stochastic processes other than Brownian Motion. We let $(X_{t})_{t\in[0,1]}$ be a real-valued stochastic process and $t^{*}(b):=\operatorname*{arg\,max}_{t\in[0,1]}\mathbb{P}(X_{t}>b)$ . For simplicity we here assume that $t\mapsto\mathbb{P}(X_{t}>b)$ is continuous and strictly increasing so that $t^{*}(b)=1$ (but situations in which $t^{*}(b)\in(0,1)$ can be dealt with similarly, see also the discussion in Section 7).

As argued in the previous sections, it is efficient to let the position of the grid points depend on $b$ . We constructed for Brownian Motion a grid by finding $T(b)=\{t_{1}(b),\ldots,t_{n}(b)\}$ such that

[TABLE]

cf. (11). An inherent problem is that the class of processes for which the distribution of $\tau_{b}$ is known is very limited, so that the approach does not seem to be useful for relevant stochastic processes other than Brownian motion. We saw, however, that for Brownian Motion the $t_{k}(b)$ satisfying (18) also solve

[TABLE]

cf. (12). The idea now is to use the level-dependent (or: ‘equiprobable’) grid (19) for general real-valued processes. The major advantage of the grid (19) is that to calculate the position of the grid points $t_{k}$ the sole prerequisite is that the marginals of the process are known (rather than the distribution of $\tau_{b}$ ). In addition, even if the marginal distributions of $X_{t}$ are not available, but the asymptotics of $\mathbb{P}(X_{t}>b)$ (as $b\to\infty$ ) are, then a good approximation of this grid can be found. (In the sequel we write, for brevity, $T=\{t_{1},\ldots,t_{n}\}$ instead of $T(b)=\{t_{1}(b),\ldots,t_{n}(b)\}$ and $t^{*}$ instead of $t^{*}(b)$ .)
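When only the marginal tails are available, the equiprobable grid can be computed numerically. The sketch below assumes that (19) has the form $\mathbb{P}(X_{t_{k}}>b)=\frac{k}{n}\,\mathbb{P}(X_{1}>b)$ , in analogy with the Brownian case, and that $t\mapsto\mathbb{P}(X_{t}>b)$ is continuous and strictly increasing, so that bisection applies. The helper names `equiprobable_grid` and `fbm_tail` are illustrative; the fractional Brownian Motion marginal $\mathbb{P}(X_{t}>b)=\Phi(-b/t^{H})$ follows from $\mathbb{V}\textnormal{ar}\,X_{t}=t^{2H}$ .

```python
import math
from statistics import NormalDist

STD = NormalDist()


def equiprobable_grid(tail, n, b, tol=1e-12):
    """Solve tail(t_k, b) = (k / n) * tail(1, b), k = 1..n, by bisection on (0, 1]."""
    full = tail(1.0, b)
    grid = []
    for k in range(1, n):
        target = k / n * full
        lo, hi = 0.0, 1.0  # the root lies in (lo, hi] because tail is increasing in t
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if tail(mid, b) < target:
                lo = mid
            else:
                hi = mid
        grid.append(hi)
    grid.append(1.0)  # k = n is solved by t = 1
    return grid


def fbm_tail(hurst):
    """Marginal tail P(X_t > b) = Phi(-b / t^H) of fractional Brownian Motion."""
    return lambda t, b: STD.cdf(-b / t ** hurst)


for hurst in (0.4, 0.5, 0.6):
    print(hurst, [round(t, 4) for t in equiprobable_grid(fbm_tail(hurst), 5, 3.0)])
```

Passing the tail as a function keeps the solver independent of the process; replacing `fbm_tail` by an asymptotic approximation of $\mathbb{P}(X_{t}>b)$ gives the approximate grid mentioned in the text.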

We now provide the rationale behind the grid (19). Let $T$ be a grid such that $t^{*}\in T$ . Evidently, by the union bound,

[TABLE]

Now notice that if the grid $T$ is such that for $t\in T\setminus\{t^{*}\}$

[TABLE]

then it does not make sense to include the point $t$ for large $b$ . Property (20) clearly compromises the performance of equidistant grids as $b\to\infty.$ Considering however the grid points $t_{k}$ of the threshold-dependent grid, as defined by (19), these will by design not experience (20).

To assess the performance of the above threshold-dependent grid (19), we introduce a measure of performance closely related to the relative bias. Note that when no formulas for $w(b)$ are available, nor is it known how to reliably approximate $w(b)$ , we cannot determine the exact value of the relative bias. We now make the following two observations. (1) As $w_{T}(b)<w(b)$ for any choice of $T$ , the larger $w_{T}(b)$ is, the better; if $w_{T_{1}}(b)>w_{T_{2}}(b)$ for grids $T_{1},T_{2}$ , then also $\beta_{T_{1}}(b)<\beta_{T_{2}}(b)$ . (2) The crude lower bound $w(b)\geq\mathbb{P}(X_{t^{*}}>b)$ provides us with a useful benchmark. Combining these two thoughts motivates the following performance measure of a grid $T$ :

[TABLE]

Notice that for any $T$ such that $t^{*}\in T$ we have

[TABLE]

What is more, for any two grids $T_{1},T_{2}$ we have $\gamma_{T_{1}}(b)\geq\gamma_{T_{2}}(b)$ if and only if $\beta_{T_{1}}(b)\leq\beta_{T_{2}}(b)$ ; this means that the larger $\gamma_{T}(b)$ is, the better. As our main aim is to efficiently approximate $w(b)$ using discrete-time approximations $w_{T}(b)$ , we see that if $\gamma_{T}(b)\approx 1$ then there is little gain from using $w_{T}(b)$ over a deterministic estimator $\mathbb{P}(X_{t^{*}}>b)$ .

In a series of examples we compare $\gamma_{T}(b)$ induced by (i) the threshold-dependent (equiprobable) grid and (ii) the equidistant grid of the same size; we consistently use $n=100$ grid points. In all cases $t\mapsto\mathbb{P}(X_{t}>b)$ is a continuous, strictly increasing function (so that $t^{*}=1$ ). The most important conclusion is that the experiments below uniformly indicate that the equiprobable grid outperforms the equidistant one, not only in the asymptotic regime, as the threshold $b$ grows large, but already for moderate values of $b$ . This shows how the ideas that we developed earlier in this paper, which have provable optimality properties for Brownian Motion, lead to an efficient estimation procedure for a much broader class of stochastic processes. In all examples, we observe that $\gamma_{T}(b)$ induced by the equidistant grid converges to $1$ , thus the corresponding $w_{T}(b)$ is asymptotically equivalent to $\mathbb{P}(X_{t^{*}}>b)$ , as $b\to\infty$ .

Example 1 (Brownian Motion with jumps).

Let $(X_{t})_{t\in[0,1]}$ be a Brownian Motion with jumps, i.e.

[TABLE]

where $B_{t}$ is a standard Brownian Motion and $N_{t}$ is a standard Poisson process with intensity $\lambda=1$ .

Even though there are no closed-form expressions for $w(b)$ , it is still possible to generate exact samples from $\sup_{t\in[0,1]}X_{t}$ (see [Dębicki and Mandjes, 2015, Section 10.1]). We can use this to construct an unbiased estimator of $w(b)$ and thus can estimate the relative bias of the tested grids. The results in Figure 5 show the substantial gain achieved by the level-dependent grid. The graphs look similar to those of Brownian Motion, which is indicative of the threshold-dependent grid having a uniformly bounded relative bias.

Example 2 (Ornstein-Uhlenbeck Process).

Let $(X_{t})_{t\in[0,1]}$ be an Ornstein-Uhlenbeck process, i.e., a strong solution to the following SDE: with $X_{0}=0$ ,

[TABLE]

Then $(X_{t})_{t\in[0,1]}$ is a zero-mean Markovian Gaussian process with covariance function

[TABLE]

The exact value of $w(b)$ is known only in terms of special functions, see [Alili et al., 2005] and it is not straightforwardly evaluated. However, the exact asymptotics of $w(b)$ , as $b$ grows large, are known:

[TABLE]

where $C$ is a positive constant independent of $b$ , see e.g. [Dębicki and Mandjes, 2003, Theorem 5.1] or the original theorem by [Piterbarg and Prisyazhnyuk, 1978]; this explains why for the level-dependent grid $\gamma_{n}(b)$ goes to a constant in Figure 6. Again the equidistant grid is significantly outperformed by the threshold-dependent grid.

Example 3 (Fractional Brownian Motion).

Let $(X_{t})_{t\in[0,1]}$ be a fractional Brownian Motion (fBM) with a Hurst parameter $H\in(0,1)$ , that is, a zero-mean Gaussian process with the covariance function

[TABLE]

Observe that fBM with Hurst parameter $H=1/2$ is a standard Brownian Motion. For any $H$ we have $C_{H}(t,t)=t^{2H}$ (strictly increasing variance in time) and thus $t^{*}=1$ .

The exact value of the probability $w(b)$ for $H\neq 1/2$ remains unknown. However, like in Example 2, the exact asymptotics of $w(b)$ are known:

[TABLE]

where $C_{H}$ is a constant only depending on $H$ ; we again refer to [Dębicki and Mandjes, 2003, Theorem 5.1] or the original theorem by [Piterbarg and Prisyazhnyuk, 1978]. We apply threshold-dependent grids in these two different asymptotic regimes for $H=0.4$ and $H=0.6$ , see the results in Figure 7. Again the threshold-dependent grid performs considerably better. In case $H=0.4$ the above asymptotic result explains why for the level-dependent grid $\gamma_{n}(b)$ keeps increasing ( $w(b)/{\mathbb{P}}(X_{1}>b)$ behaves as the increasing function $b^{1/H-2}$ ). In case $H=0.6$ , again using the asymptotic result, $\gamma_{n}(b)\to 1$ as $b$ grows large, both for the equidistant grid and for the threshold-dependent grid (equivalently, the relative bias vanishes for both as $b\to\infty$ ). Note however that with the threshold-dependent grid, $\gamma_{n}(b)$ tends to 1 slower than with the equidistant grid, as can be seen in Figure 7 (right panel), showing the more favorable performance of the threshold-dependent grid.

7 Concluding remarks and discussion

In this paper we have demonstrated that the errors due to time discretization when estimating threshold-crossing probabilities $w(b)$ can be significantly reduced by using other grids than the commonly used equidistant grid. We have analyzed this in considerable detail for the case of standard Brownian Motion. In particular, we have shown that in order to control the error as $b$ grows large, it suffices to properly shift the grid points instead of refining the grid with more and more points. At the same time, controlling the error using equidistant grids requires quadratic growth of the number of grid points, as $b$ grows large.

Numerical estimation is evidently not needed for Brownian Motion due to the availability of analytical results. Our paper however indicates that the underlying ideas can be used to construct efficient grids for a broad class of stochastic processes (notably, Lévy processes and Gaussian processes, such as fractional Brownian Motion). The results presented in this paper are intended to develop valuable insight and useful heuristics for tackling the estimation of tail probabilities of these more general classes of processes. We have demonstrated such heuristics for several processes in Section 6. There, we presented a procedure that is empirically shown to work well for stochastic processes $(X_{t})_{t\in[0,1]}$ whose marginal distributions are known:

(i) Identify

[TABLE]

in case $(X_{t})_{t\in[0,1]}$ is a zero-mean Gaussian process, $t^{*}$ is a point of maximal variance, i.e., $\operatorname*{arg\,max}_{t\in[0,1]}\mathbb{V}\textnormal{{ar}}\,X_{t}$ . As argued, for many key models we have that $t^{*}=1.$

(ii) Construct a grid $T=\{t_{1},\ldots,t_{n}\}$ clustered around it, such that $t_{k}$ solves (19), for $k\in\{1,\ldots,n\}$ .

As we pointed out, even if the marginal distribution of $X_{t}$ is not available but only the corresponding asymptotics, as $b\to\infty$ , this procedure can be applied. It is also noted that it is straightforward to compare two different grids: the larger the value of $w_{T}(b)$ , the closer it is to the target quantity $w(b)$ .

A natural question that arises in relation to Theorem 2 is whether we can find a grid that is even better than the one defined in (9). Constructing an optimal $n$-grid $T^{*}_{n}(b)$ , i.e. a grid of size $n$ that minimizes the relative bias for a given $b$ , remains elusive. However, we have been able to find an explicit formula for an optimal 2-grid, namely $T^{*}_{2}(b)=\{t^{*}_{1}(b),t^{*}_{2}(b)\}$ , with

[TABLE]

where $\lim_{b\to\infty}\beta_{T^{*}_{2}(b)}(b)=1-\frac{1}{2}\Phi(\sqrt{2/\pi})-\frac{1}{4}e^{-1/\pi}\approx 0.4244$ . For comparison, the threshold-dependent grid defined in (9) yields $\lim_{b\to\infty}\beta_{2}(b)=\frac{3}{8}+\frac{1}{2}\Phi(-\sqrt{2\log 2})\approx 0.4348$ , hence the grid (9) does not minimize the bias (although the difference with the optimal 2-grid is small). Additionally, we were able to prove that for an optimal $n$-grid, $T^{*}_{n}(b)=\{t_{1}^{*}(b),\ldots,t_{n}^{*}(b)\}$ , the limits $\lim_{b\to\infty}b^{2}(1-t^{*}_{k}(b))$ must exist, and are all finite and pairwise distinct. As a result we were able to numerically calculate the limit $\lim_{b\to\infty}\beta_{T^{*}_{3}(b)}(b)\approx 0.3796$ . Finding optimal grids for larger $n$ remains an open problem. We note, however, that with the threshold-dependent grid we can bound the relative bias uniformly in $b$ (see Theorem 2) and in this sense the grid (9) is already (asymptotically) optimal.
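The two closed-form limits quoted above can be evaluated directly as a quick arithmetic check. The variable names are illustrative; only the two formulas from the text are used.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cdf

# Limiting relative bias of the optimal 2-grid.
optimal_2grid_limit = (
    1.0 - 0.5 * PHI(math.sqrt(2.0 / math.pi)) - 0.25 * math.exp(-1.0 / math.pi)
)
# Limiting relative bias of the threshold-dependent grid (9) with n = 2.
equiprobable_2grid_limit = 3.0 / 8.0 + 0.5 * PHI(-math.sqrt(2.0 * math.log(2.0)))

print(round(optimal_2grid_limit, 4), round(equiprobable_2grid_limit, 4))
```

The gap between the two values is about $0.01$ , which quantifies the remark that the grid (9) is close to, but not exactly, optimal for $n=2$ .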

8 Proofs of Lemmas 1, 3, 4, 5 and Proposition 1

Proof of Proposition 1.

In part (a) of Theorem 1 it has been proven already that $\beta_{n}(b)\leq C_{0}bn^{-1/2}$ . Thus, when $b$ is fixed it is straightforward that the upper bound in the assertion of the proposition holds.

The lower bound developed in Lemma 1 reads $\beta_{n}(b)\geq\frac{1}{2}\sum_{j=1}^{n-1}a_{j+1}\cdot w_{j}(b)+\frac{1}{2}w_{n}(b)$ . Since we have $a_{j}<a_{j+1}$ for the equidistant grid and all $a_{j}$ and $w_{j}$ are non-negative, we may use the weaker inequality

[TABLE]

In the following we use Lemma 2 for a lower bound on terms $a_{j}$ and Result B.V for a lower bound on $w_{j}$ .

[TABLE]

where $C$ is a positive constant independent of $n$ (but dependent on $b$ ) that may vary from line to line. To arrive at (22) we use the convergence of the Riemann sum, noting that $b$ is fixed and that the function

[TABLE]

is integrable on $(0,1)$ . This concludes the proof. ∎

Proof of Lemma 1.

Notice that the events $\{\sup_{t\in[0,1]}B_{t}>b\}$ and $\{\tau_{b}\in(0,1]\}$ are equivalent. We thus find

[TABLE]

To prove the upper bound we use the fact that $\mathbb{P}(B_{t_{j}-s}<0,\ldots,B_{t_{n}-s}<0)$ is a non-increasing function of $s\in[t_{j-1},t_{j}]$ (see Appendix A, Transformation T2), so that

[TABLE]

Dividing both sides of the inequality by $w(b)=\mathbb{P}(\tau_{b}\in(0,1])$ gives $\beta_{T}(b)\leq\bar{\beta}_{T}(b)$ . To prove the lower bound we use Result B.IV from the Appendix, so as to obtain

[TABLE]

Dividing both sides of the inequality by $w(b)$ leads to $\beta_{T}(b)\geq\underline{\beta}_{T}(b)$ and concludes the proof. ∎

Proof of Lemma 3.

Recall the definitions of $a_{j}(b)$ and $w_{j}(b)$ , and $\bar{\beta}_{T}(b):=\sum_{j=1}^{n}a_{j}(b)w_{j}(b)$ . Notice that if we put $t_{k}=\frac{k}{n}$ , then by the scaling property of Brownian Motion

[TABLE]

and thus $a_{1}<a_{2}<\ldots<a_{n}$ (since the $a_{j}(b)$ are independent of $b$ , we abbreviate $a_{j}:=a_{j}(b)$ ).

Assume that for any $0<b_{1}<b_{2}$ there exists $k\in\{1,\ldots,n-1\}$ such that

[TABLE]

Since the weights $w_{j}(b)$ must satisfy $\sum_{j=1}^{n}w_{j}(b)=1$ we have $\sum_{j=1}^{n}\big{(}w_{j}(b_{2})-w_{j}(b_{1})\big{)}=0$ and thus

[TABLE]

Finally,

[TABLE]

For the remainder of the proof we prove the existence of $k\in\{1,\ldots,n-1\}$ satisfying (24). Let $\tau_{b}:=\inf\{t\geq 0:B_{t}\geq b\}$ be the first hitting time of level $b$ and let $f(b,t)$ be the density of $\tau_{b}$ given that $\tau_{b}\leq 1$ , i.e.,

[TABLE]

where $b>0$ , $t\in(0,1)$ , and $\phi(\cdot)$ denotes the density of a standard normal random variable. We will prove that for any $0<b_{1}<b_{2}$ there exists $t^{*}$ such that:

[TABLE]

Then the weights

[TABLE]

are decreasing for all $\frac{j}{n}\leq t^{*}$ , and increasing for all $\frac{j}{n}\geq\frac{1}{n}+t^{*}$ . If $nt^{*}$ is not an integer, it is not known whether $w_{[t^{*}n]+1}(b)$ increases or not, but in either case there exists $k\in\{1,\ldots,n-1\}$ satisfying (24). For the remainder we prove the existence of $t^{*}$ satisfying (25). For $t\in(0,1)$ :

[TABLE]

Note that $\lim_{t\to 0^{+}}g(t)=+\infty$ and $g(1)<0$ (for example due to Result B.VII in the Appendix) and that $g(\cdot)$ is strictly decreasing, hence $g(\cdot)$ has exactly one zero $t^{*}$ and $g(t)>0$ for $t<t^{*}$ and $g(t)<0$ for $t>t^{*}$ . The observation that $\text{sign}(f(b_{1},t)-f(b_{2},t))=\text{sign}(g(t))$ concludes the proof. ∎
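The single-crossing behaviour of the weights can be observed numerically. The sketch below assumes that $w_{j}(b)$ is the conditional probability $\mathbb{P}(\tau_{b}\in(t_{j-1},t_{j}]\,|\,\tau_{b}\leq 1)$ on the equidistant grid, which is how the weights are used in this proof; the function name `hitting_weights` is illustrative.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf


def hitting_weights(n, b):
    """w_j(b) = P(tau_b in (t_{j-1}, t_j] | tau_b <= 1) for t_j = j / n."""
    # P(tau_b <= t) = 2 * Phi(-b / sqrt(t)); the factor 2 cancels in the ratio.
    cdf_vals = [0.0] + [PHI(-b / math.sqrt(j / n)) for j in range(1, n + 1)]
    return [(cdf_vals[j] - cdf_vals[j - 1]) / cdf_vals[n] for j in range(1, n + 1)]


low, high = hitting_weights(10, 1.0), hitting_weights(10, 3.0)
# Negative entries first, positive entries afterwards: mass moves to late intervals.
print([round(h - l, 4) for l, h in zip(low, high)])
```

Raising the threshold from $b_{1}$ to $b_{2}$ removes weight from the early intervals and adds it to the late ones, with a single change of sign, as required for the index $k$ in (24).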

Proof of Lemma 4.

Let $h:=\max_{k=1,\ldots,n}(t_{k}-t_{k-1})$ . We transform the grid $T=\{t_{1},\ldots,t_{n}\}$ with Transformations T1–T3, see Appendix A, in such a way that after all transformations we end up with $\{h,\ldots,Nh\}$ .

1. Using Transformation T2, translate the grid to the right by $h-t_{1}$ , i.e., put

[TABLE]

2. Put $\sigma_{1}:=1$ , $c_{1}:=1$ and $k:=2$ . While $k\leq N$ do:

• Put $\sigma_{k}:=\inf\{j:t_{j}\geq kh\}$ .

• Using Transformation T3, contract the grid after time $t_{\sigma_{k-1}}$ by a factor $c_{k}$ , where $c_{k}$ is defined by ${h}/({t_{\sigma_{k}}-t_{\sigma_{k-1}}})$ . Formally, we put

[TABLE]

Notice that after this operation $t_{\sigma_{k}}=kh$ .

• Put $k:=k+1$ .

3. Using Transformation T1, delete all the points $t_{k}$ such that $t_{k}\not\in\{h,\ldots,hN\}$ .

Now we prove that the algorithm is well-defined; more precisely, we confirm that all $\sigma_{k}$ ’s exist. First, note that $\sigma_{1}$ is well-defined. By induction, assume that $\sigma_{k}$ is well-defined and prove that $\sigma_{k+1}$ is well-defined as well. Notice that after the $k$ th loop in Step 2 of the algorithm, the distances between the points have shrunk at most by a factor $p_{k}=\prod_{j=1}^{k}c_{j}$ compared with the initial maximal distance $h$ . Moreover, we observe that

[TABLE]

We prove by induction that $p_{k}=\prod_{j=1}^{k}c_{j}\geq\frac{1}{k}$ for all $k\in\{1,\ldots,N\}$ . Obviously $p_{2}=c_{2}\geq\frac{1}{2}$ . Assume that $p_{k-1}\geq\frac{1}{k-1}$ . After multiplying inequality (26) by $p_{k-1}$ we obtain

[TABLE]

which ends the inductive proof. Next, in order to show that $\sigma_{k+1}$ is well defined for $k\in\{1,\ldots,N-1\}$ it suffices to prove that the endpoint $t_{n}$ , after the $k$ th loop of Step 2, is greater than $h(k+1)$ . We prove a stronger statement, namely that the endpoint $t_{n}$ after being shrunk by a factor $p_{k}$ is still greater than $h(k+1)$ , i.e. $h(k+1)\leq t_{n}p_{k}$ . By the definition of $N$ , $h$ satisfies the inequality $h\leq{t_{n}}/{N^{2}}$ , thus

[TABLE]

which concludes the proof that $\sigma_{k+1}$ is well-defined. As all transformations used in Steps 1–3 satisfy (30) we have

[TABLE]

We finish the proof by observing that $\mathbb{P}\big{(}B_{h}>0,\ldots,B_{Nh}>0\big{)}=\mathbb{P}\big{(}B_{1}>0,\ldots,B_{N}>0\big{)}$ , due to the scaling property of Brownian Motion. ∎
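The grid reduction used in this proof can be traced on a small example. The sketch below makes two assumptions about formulas that are not reproduced above: Transformation T2 adds a constant to every grid point, and Transformation T3 maps $t_{j}\mapsto t_{k}+c\,(t_{j}-t_{k})$ for all $j>k$ . It also requires $N^{2}h\leq t_{n}$ , the inequality quoted in the proof. The function name `reduce_to_equidistant` and the zero-based indices are illustrative.

```python
def reduce_to_equidistant(grid, big_n):
    """Translate, contract and delete grid points until {h, 2h, ..., N h} remains."""
    gaps = [t - s for s, t in zip([0.0] + grid[:-1], grid)]
    h = max(gaps)  # largest spacing, including the gap from t_0 = 0
    if big_n * big_n * h > grid[-1]:
        raise ValueError("need h <= t_n / N^2")
    # Step 1 (T2, assumed form): translate right so that the first point sits at h.
    pts = [t + (h - grid[0]) for t in grid]
    sigma = [0]  # zero-based index playing the role of sigma_1
    # Step 2 (T3, assumed form): contract everything after t_{sigma_{k-1}}
    # so that t_{sigma_k} lands on k * h.
    for k in range(2, big_n + 1):
        prev = sigma[-1]
        idx = next(j for j, t in enumerate(pts) if t >= k * h)
        pivot = pts[prev]
        c = h / (pts[idx] - pivot)  # contraction factor c_k <= 1
        pts = [t if j <= prev else pivot + c * (t - pivot) for j, t in enumerate(pts)]
        sigma.append(idx)
    # Step 3 (T1): keep only the points that landed on multiples of h.
    return h, [pts[j] for j in sigma]


print(reduce_to_equidistant([0.125, 0.25, 0.5, 0.625, 0.875, 1.0], 2))
```

In this example $h=0.25$ and $N=2$ , and the surviving points are $h$ and $2h$ up to rounding, matching the target grid $\{h,\ldots,Nh\}$ of the proof.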

Proof of Lemma 5.

Notice that the grid points $t^{n}_{k}(b)$ defined in (9) depend only on the threshold $b$ and the ratio $\frac{k}{n}\in[0,1]$ . We are able to extend the definition of $t^{n}_{k}(b)$ to $t:(0,1]\times(0,\infty)\to[0,1]$ ,

[TABLE]

such that $t^{n}_{k}(b)=t(\frac{k}{n},b)$ . Equivalently, $t(s,b)$ can be defined as the unique solution to

[TABLE]

This extension makes it possible to inspect the derivative of $t^{n}_{k}(b)$ with respect to the ratio $\frac{k}{n}$ . Using the extension function of $t^{n}_{k}(b)$ , we aim to prove the more general statement that for $0<s_{1}<s_{2}<1$ ,

[TABLE]

Moreover, using the definition (27) we may substitute

[TABLE]

and arrive at another equivalent form of inequality (28):

[TABLE]

which is Result B.IX in the Appendix. For part (b) see that the density of the first hitting time,

[TABLE]

is an increasing function on the interval $s\in[0,\frac{b^{2}}{3}]$ and thus part (b) follows from the second definition of the grid points $t^{n}_{k}(b)$ in (10). ∎

Appendix A Grid transformations

Let $T=\{t_{1},\ldots,t_{n}\}$ with $0<t_{1}<\ldots<t_{n}<\infty$ . We introduce three grid transformations, i.e. operations $T\mapsto\widetilde{T}$ satisfying

[TABLE]

(T1) Deleting. For any $k\in\{1,\ldots,n\}$

[TABLE]

(T2) Translation to the right of the whole sequence. For any $s>0$

[TABLE]

(T3) Contraction of time after some point. For any $k\in\{1,\ldots,n-1\}$ and $c\in(0,1)$ :

[TABLE]

Proof that Transformations T1-T3 satisfy (30).

Assertion T1 is straightforward to verify. Observe for T2 that

[TABLE]

and for T3 that

[TABLE]

∎

Appendix B Miscellaneous results

Let $\Phi(\cdot)$ denote the standard normal cumulative distribution function and $\phi(\cdot)$ the standard normal density function. Below we list various results that we use. Results B.I–B.III are standard, and not proven here.

(B.I) For $x>0$ :

[TABLE]

(B.II) As $x\to\infty$ ,

[TABLE]

(B.III) [Szarek and Werner, 1999]. For $x>-1$ :

[TABLE]

(B.IV) Let $0<t_{1}<\ldots<t_{n}<\infty$ , then:

[TABLE]

(B.V) Let $T=\{t_{1},\ldots,t_{n}\}$ , where $t_{j}:=\frac{j}{n}$ , $\tau_{b}:=\inf\{t\geq 0:B_{t}\geq b\}$ and $b>0$ , then:

[TABLE]

and

[TABLE]

for $j\in\{2,\ldots,n\}$ .

(B.VI) Let $f:(0,\infty)\times(0,1)\to(0,\infty)$ such that

[TABLE]

Then $f(b,x)$ is an increasing function of $x$ , when $b\geq 1$ .

(B.VII) Let $f:(0,\infty)\to(0,\infty)$ such that

[TABLE]

Then $f$ is a strictly decreasing function.

(B.VIII) Let $f:(0,\infty)\to(0,\infty)$ such that

[TABLE]

Then $f$ is a strictly increasing function.

(B.IX) Let $f:[0,1]\to[0,\infty)$ be such that

[TABLE]

Then $f$ is continuous and increasing.

B.1 Proofs of results B.IV–B.IX

Proof of B.IV.

The proof is very similar to the proofs from Appendix A. Note that

[TABLE]

which concludes the proof. ∎

Proof of B.V.

Using the mean value theorem and monotonicity of $\phi(\cdot)$ on the negative half-line, we have $|\Phi(-x)-\Phi(-y)|\leq|x-y|\cdot\phi(-y)$ for $0<y<x$ . Furthermore,

[TABLE]

Thus, for $b>0$ and $j\in\{2,\ldots,n\}$, after substituting $t_{j}=\frac{j}{n}$, the above combined with the inequality B.III yields:

[TABLE]

The proof of the second inequality is analogous. ∎
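The mean value step at the start of this proof can be written out explicitly for $0<y<x$. This is a routine calculus sketch added for convenience; it uses only that $\phi$ is even and decreasing on $(0,\infty)$.

```latex
\begin{align*}
\Phi(-y)-\Phi(-x) \;=\; \int_{y}^{x}\phi(u)\,du
\;\leq\; (x-y)\,\sup_{u\in[y,x]}\phi(u)
\;=\; (x-y)\,\phi(y) \;=\; |x-y|\,\phi(-y).
\end{align*}
```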

Proof of B.VI.

It suffices to prove that $\frac{d}{dx}f(b,x)\geq 0$ for $b\geq 1$ . See that

[TABLE]

Note that $g(x)$ has at most one root when $b\in[1,3]$, thus $g(x)\geq 0$ for $b\in[1,3]$. Moreover, when $b>3$, we have $g^{\prime}(x)=8x-(b^{2}+3)<-1$ for $x\in[0,1]$, thus $g(x)$ is strictly decreasing on $[0,1]$. From the observation that $g(0)=b^{2}>0$ and $g(1)=1>0$ we conclude that $g(x)$ is nonnegative on the interval $[0,1]$ for $b\geq 1$, and thus $\frac{d}{dx}f(b,x)\geq 0$ when $b\geq 1$. ∎
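The root-counting claim for $b\in[1,3]$ can be made concrete. The explicit form of $g$ is not reproduced in this text; the expression below is inferred from the quoted derivative $g^{\prime}(x)=8x-(b^{2}+3)$ together with $g(0)=b^{2}$, and it agrees with the quoted value $g(1)=1$. It should be read as a reconstruction rather than a quotation; $\Delta(b)$ is our own notation for the discriminant.

```latex
\begin{align*}
g(x) &= 4x^{2}-(b^{2}+3)\,x+b^{2}, \\
\Delta(b) &= (b^{2}+3)^{2}-16\,b^{2} = (b-1)(b-3)(b+1)(b+3) \;\leq\; 0 \qquad\text{for } b\in[1,3].
\end{align*}
```

A non-positive discriminant and a positive leading coefficient give $g(x)\geq 0$ for all real $x$, which covers the case $b\in[1,3]$.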

Proof of B.VII.

We have that

[TABLE]

thus $f^{\prime}(x)\leq 0$ iff $-\phi(x)+\Phi(-x)\,x\leq 0$ , which is equivalent to Result B.I. ∎

Proof of B.VIII.

See that

[TABLE]

thus $f^{\prime}(x)\geq 0$ iff $\frac{\Phi(-x)}{\phi(x)}\geq\frac{x}{1+x^{2}}$, which follows from the lower bound in Result B.III. ∎
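The proofs of B.VII and B.VIII reduce to the two-sided bound $\frac{x}{1+x^{2}}\leq\frac{\Phi(-x)}{\phi(x)}\leq\frac{1}{x}$ for $x>0$. The following Python sketch checks this numerically on a finite grid; it is a sanity check under floating-point arithmetic, not a proof, and the helper names are our own.

```python
import math


def phi(x: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def upper_tail(x: float) -> float:
    """Phi(-x) = P(Z > x), computed via erfc for accuracy in the tail."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def mills_ratio(x: float) -> float:
    """The ratio Phi(-x) / phi(x)."""
    return upper_tail(x) / phi(x)


def bounds_hold(x: float) -> bool:
    """Check x / (1 + x^2) <= Phi(-x) / phi(x) <= 1 / x for a given x > 0."""
    r = mills_ratio(x)
    return x / (1.0 + x * x) <= r <= 1.0 / x


xs = [0.1 * i for i in range(1, 101)]  # grid on (0, 10]
print(all(bounds_hold(x) for x in xs))
```

Using `math.erfc` rather than `1 - Phi(x)` avoids catastrophic cancellation for large $x$, where both bounds become tight.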

Proof of B.IX.

It is easy to see that $\lim_{t\to 0^{+}}f(t)=0$. To see that $\lim_{t\to 1^{-}}f(t)=\frac{2\,\Phi(-b)}{b\,\phi(b)}$ we expand $\log\big(\Phi(-b/\sqrt{t})\big)$ in a series around $t_{0}=1$ and obtain

[TABLE]

Thus

[TABLE]

To prove that $f$ is increasing we study the first derivative. For $t\in(0,1)$ :

[TABLE]

Due to Result B.VII we have the lower bound

[TABLE]

and thus the numerator of the fraction in (33) can be bounded from below by the function $g:(0,1)\to\mathbb{R}$ defined as follows:

[TABLE]

Notice that $g(t)\geq 0$ implies $\frac{d}{dt}f(t)\geq 0$, which is exactly what we want to establish. For the remainder of the proof we show that $g(t)$ is non-negative. Since $\lim_{t\to 0^{+}}g(t)=+\infty$ and $g(1)=0$, it suffices to show that $g$ is non-increasing, i.e. that $g^{\prime}(t)\leq 0$. We study the first derivative

[TABLE]

where the last inequality follows from Result B.VIII, that is,

[TABLE]

is an increasing function of $t$ . ∎
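The series expansion around $t_{0}=1$ invoked at the start of the proof of B.IX follows from the chain rule. The first-order version is written out below as a convenience; the order of the remainder term is our own bookkeeping.

```latex
\begin{align*}
\frac{d}{dt}\log\Phi\big(-b/\sqrt{t}\big)
  &= \frac{b\,\phi\big(b/\sqrt{t}\big)}{2\,t^{3/2}\,\Phi\big(-b/\sqrt{t}\big)}, \\
\log\Phi\big(-b/\sqrt{t}\big)
  &= \log\Phi(-b)+\frac{b\,\phi(b)}{2\,\Phi(-b)}\,(t-1)+O\big((t-1)^{2}\big), \qquad t\to 1.
\end{align*}
```

The reciprocal of the linear coefficient, $\frac{2\,\Phi(-b)}{b\,\phi(b)}$, is the limit quoted in the proof.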

Acknowledgments. The authors would like to thank Ankush Agarwal and Johan van Leeuwaarden for useful discussions and suggestions. In addition, the reviewers’ reports helped improve the quality of our work considerably. This work is part of the research programme ‘Rare Event Simulation for Climate Extremes’ with grant number 657.014.033, which is (partly) funded by the Netherlands Organisation for Scientific Research (NWO). Michel Mandjes’ research is partly funded by the NWO Gravitation Programme NETWORKS, grant number 024.002.003.

Bibliography

[Adler, 1990] Adler, R. (1990). An Introduction to Continuity, Extrema, and Related Topics for General Gaussian Processes. IMS Lecture Series. Institute of Mathematical Statistics.
[Adler et al., 2012] Adler, R. J., Blanchet, J. H., and Liu, J. (2012). Efficient Monte Carlo for high excursions of Gaussian random fields. The Annals of Applied Probability, 22(3):1167–1214.
[Alili et al., 2005] Alili, L., Patie, P., and Pedersen, J. L. (2005). Representations of the first hitting time density of an Ornstein-Uhlenbeck process. Stochastic Models, 21(4):967–980.
[Asmussen et al., 1995] Asmussen, S., Glynn, P., and Pitman, J. (1995). Discretization error in simulation of one-dimensional reflecting Brownian motion. The Annals of Applied Probability, 5:875–896.
[Broadie et al., 1997] Broadie, M., Glasserman, P., and Kou, S. (1997). A continuity correction for discrete barrier options. Mathematical Finance, 7(4):325–349.
[Calvin, 1997] Calvin, J. M. (1997). Average performance of a class of adaptive algorithms for global optimization. The Annals of Applied Probability, 7:711–730.
[Calvin and Glynn, 1997] Calvin, J. M. and Glynn, P. W. (1997). Average case behavior of random search for the maximum. Journal of Applied Probability, 34(3):632–642.
[Dębicki and Mandjes, 2003] Dębicki, K. and Mandjes, M. (2003). Exact overflow asymptotics for queues with many Gaussian inputs. Journal of Applied Probability, 40(3):704–720.
