Semi-tractability of optimal stopping problems via a weighted stochastic mesh algorithm

D. Belomestny; M. Kaledin; J. Schoenmakers

arXiv:1906.09431 · q-fin.CP · June 25, 2019

TL;DR

This paper introduces a Weighted Stochastic Mesh algorithm for approximating optimal stopping problems, demonstrating semi-tractability in discrete cases and providing tight complexity bounds in continuous cases, supported by numerical examples.

Contribution

The paper presents a novel Weighted Stochastic Mesh algorithm that achieves semi-tractability for discrete optimal stopping problems and offers the tightest known complexity bounds for continuous problems.

Findings

01. Discrete case: complexity bounded in order by $\varepsilon^{-4}\log^{d+2}(1/\varepsilon)$

02. Continuous case: the complexity bounds are the tightest among those known for existing algorithms

03. A numerical example illustrates the theoretical results

Abstract

In this article we propose a Weighted Stochastic Mesh (WSM) Algorithm for approximating the value of a discrete and continuous time optimal stopping problem. We prove that in the discrete case the WSM algorithm leads to semi-tractability of the corresponding optimal problems in the sense that its complexity is bounded in order by $\varepsilon^{-4}\log^{d+2}(1/\varepsilon)$ with $d$ being the dimension of the underlying Markov chain. Furthermore we study the WSM approach in the context of continuous time optimal stopping problems and derive the corresponding complexity bounds. Although we cannot prove semi-tractability in this case, our bounds turn out to be the tightest ones among the bounds known for the existing algorithms in the literature. We illustrate our theoretical findings by a numerical example.

Tables

Table 1. Semi-tractability index $\Gamma$ of different algorithms for discrete time optimal stopping problems.

LS	WSM	QTM
$3/\alpha$	$0$	$2$

Table 2. Semi-tractability index $\Gamma$ of different algorithms for continuous time optimal stopping problems.

LS	WSM	QTA
$\infty$	$2$	$6$

Equations

5^{L}\left(\frac{m^{d}}{N}+\frac{1}{m^{\alpha}}\right),

\mathcal{C}_{L}(\varepsilon,d)=\frac{L\,5^{L(2+3d/\alpha)}}{\varepsilon^{2+3d/\alpha}}.

\limsup_{d\nearrow\infty}\limsup_{\varepsilon\searrow 0}\frac{\log\mathcal{C}_{L}(\varepsilon,d)}{d\log(\varepsilon^{-1})}=3/\alpha.

\mathcal{C}_{\infty}(\varepsilon,d)=O\left(\frac{\varepsilon^{-1/\beta}\,5^{(2+3d/\alpha)\varepsilon^{-1/\beta}}}{\varepsilon^{2+3d/\alpha}}\right),

\lim_{\varepsilon\searrow 0}\frac{\log\mathcal{C}_{\infty}(\varepsilon,d)}{\log(1/\varepsilon)}=\infty,

\lim_{d\nearrow\infty}\lim_{\varepsilon\searrow 0}\frac{\log\mathcal{C}(\varepsilon,d)}{d\log(1/\varepsilon)}=0.

\lim_{d+\varepsilon^{-1}\nearrow\infty}\frac{\log\mathcal{C}(\varepsilon,d)}{d+\varepsilon^{-1}}=0.

\Gamma\overset{\text{def}}{=}\limsup_{d\nearrow\infty}\limsup_{\varepsilon\searrow 0}\frac{\log\mathcal{C}(\varepsilon,d)}{d\log(1/\varepsilon)}.

U_{l}=U_{l}(Z_{l})\overset{\text{def}}{=}\operatorname*{ess\,sup}_{\tau\in\mathcal{T}_{l,L}}\mathsf{E}_{l}\left[g_{\tau}(Z_{\tau})\right],

P\left[Z_{k+1}\in dy\mid Z_{k}=x\right]=p(y|x)\,dy.

U_{L}(Z_{L})=g_{L}(Z_{L}),

U_{l}(Z_{l})=\max\left\{g_{l}(Z_{l}),\,\mathsf{E}\left[U_{l+1}(Z_{l+1})\mid Z_{l}\right]\right\},\quad l=0,\ldots,L-1.

\widetilde{U}_{L}(Z_{L})=g_{L}(Z_{L})\cdot\mathbf{1}_{\{Z_{L}\in B_{R}\}},

\widetilde{U}_{l}(Z_{l})=\max\left\{g_{l}(Z_{l}),\,\mathsf{E}\left[\widetilde{U}_{l+1}(Z_{l+1})\mid Z_{l}\right]\right\}\cdot\mathbf{1}_{\{Z_{l}\in B_{R}\}},\quad l=0,\ldots,L-1,

\|\widetilde{U}_{l}\|_{\infty}\leq G_{R}\overset{\text{def}}{=}\max_{0\leq l\leq L}\sup_{z\in B_{R}}g_{l}(z),

\mathsf{E}\left[\widetilde{U}_{l+1}(Z_{l+1})\mid Z_{l}=x\right]=\int\widetilde{U}_{l+1}(y)\frac{p(y|x)}{p_{l+1}(y|x_{0})}p_{l+1}(y|x_{0})\,dy.

\mathsf{E}\left[\widetilde{U}_{l+1}(Z_{l+1})\mid Z_{l}=x\right]\approx\frac{1}{N}\sum_{n=1}^{N}\widetilde{U}_{l+1}(Z_{l+1}^{(n)})\frac{p(Z_{l+1}^{(n)}|x)}{p_{l+1}(Z_{l+1}^{(n)}|x_{0})},

p_{l+1}(Z_{l+1}^{(n)}|x_{0})=\int p(Z_{l+1}^{(n)}|z)\,p_{l}(z|x_{0})\,dz\approx\frac{1}{N}\sum_{m=1}^{N}p(Z_{l+1}^{(n)}|Z_{l}^{(m)}).

\mathsf{E}\left[\widetilde{U}_{l+1}(Z_{l+1})\mid Z_{l}=x\right]\approx\sum_{n=1}^{N}\widetilde{U}_{l+1}(Z_{l+1}^{(n)})\frac{p(Z_{l+1}^{(n)}|x)}{\sum_{m=1}^{N}p(Z_{l+1}^{(n)}|Z_{l}^{(m)})}.

\overline{U}_{L}(Z_{L}^{(n)})\overset{\text{def}}{=}g_{L}(Z_{L}^{(n)})\,\mathbf{1}_{\{Z_{L}^{(n)}\in B_{R}\}}

\overline{U}_{l}(Z_{l}^{(r)})\overset{\text{def}}{=}\max\left\{g_{l}(Z_{l}^{(r)}),\,\sum_{n=1}^{N}\overline{U}_{l+1}(Z_{l+1}^{(n)})\frac{p(Z_{l+1}^{(n)}|Z_{l}^{(r)})}{\sum_{m=1}^{N}p(Z_{l+1}^{(n)}|Z_{l}^{(m)})}\right\}\mathbf{1}_{\{Z_{l}^{(r)}\in B_{R}\}},

\overline{U}_{0}=\max\left[g_{0}(x_{0}),\,\sum_{n=1}^{N}\overline{U}_{1}(Z_{1}^{(n)})\frac{p(Z_{1}^{(n)}|x_{0})}{\sum_{m=1}^{N}p(Z_{1}^{(n)}|x_{0})}\right]

\frac{1}{N}\sum_{m=1}^{N}p(Z_{l+1}^{(n)}|Z_{l}^{(m)})

\varepsilon_{l,R}\overset{\text{def}}{=}\int_{|x-x_{0}|>R}U_{l}(x)\,p_{l}(x|x_{0})\,dx

\int\bigl|U_{l}(x)-\widetilde{U}_{l}(x)\bigr|\,p_{l}(x|x_{0})\,dx\leq\sum_{j=l}^{L}\varepsilon_{j,R}.

\max_{0\leq l\leq L}g_{l}(x)\leq c_{g}(1+|x|),\quad x\in\mathbb{R}^{d}

\mathsf{E}\left[\max_{l\leq l'\leq L}|Z_{l'}|\;\Big|\;Z_{l}=x\right]\leq c_{Z}(1+|x|),\quad x\in\mathbb{R}^{d}.

p_{l}(y|x)\leq\frac{\varkappa}{(2\pi\alpha l)^{d/2}}e^{-\frac{|x-y|^{2}}{2\alpha l}}.

\int\bigl|U_{l}(x)-\widetilde{U}_{l}(x)\bigr|\,p_{l}(x|x_{0})\,dx\leq Lc_{g}\varkappa\left(1+c_{Z}+c_{Z}\left|x_{0}\right|+c_{Z}\sqrt{d\alpha L}\right)2^{d/4}e^{-\frac{R^{2}}{8\alpha L}}.

F_{R}^{2}\overset{\text{def}}{=}\int\int_{|y-x_{0}|\leq R}\frac{p^{2}(y|x)}{p_{l+1}(y|x_{0})}\,p_{l}(x|x_{0})\,dx\,dy,

\mathsf{E}\left[\bigl|\overline{U}_{0}-\widetilde{U}_{0}\bigr|\right]\leq\left(3+\sqrt{2}\right)LG_{R}\frac{1+F_{R}}{\sqrt{N}}.

Taxonomy

Topics: Stochastic processes and financial applications · Risk and Portfolio Optimization · Insurance, Mortality, Demography, Risk Management

Full text

1 Introduction

The theory of optimal stopping is concerned with the problem of choosing a time to take a particular action, in order to maximize an expected reward or minimize an expected cost. Such problems can be found in many areas of statistics, economics, and mathematical finance (e.g. the pricing problem of American options). Primal and dual approaches have been developed in the literature, giving rise to Monte Carlo algorithms for high-dimensional discrete time stopping problems. Solving high-dimensional discrete optimal stopping problems is usually based on a backward dynamic programming principle, which is in some sense contradictory to the forward nature of Monte Carlo simulation. Much research has focused on the development of fast methods to compute approximations to the optimal value function. Most of these methods are based on some type of regression on Monte Carlo paths, see [4] for an overview. One of the regression algorithms most widely adopted by practitioners is the Longstaff-Schwartz algorithm. It is based on approximating conditional expectations by least-squares regression on a given basis of functions. Longstaff and Schwartz [13] demonstrated the efficiency of their least-squares approach through a number of numerical examples, and in [6] and [17] general convergence properties of the method were established. In particular, it follows from Corollary 3.10 in [17] that for a fixed number $L$ of stopping opportunities and a popular choice of polynomial basis functions of degree less than or equal to $m,$ the error of estimating the corresponding value function at one point is of order

$$5^{L}\left(\frac{m^{d}}{N}+\frac{1}{m^{\alpha}}\right),\tag{1}$$

where $N$ is the number of paths used to perform regression, $\alpha\geq 1$ is related to smoothness of the corresponding conditional expectation operator, $d$ is dimension of the underlying state space. On the other hand, the computational cost of the least-squares MC algorithm is of order $Nm^{2d}L$ due to the computation of a (random) pseudo-inverse at every stopping date. After balancing the variance and the approximation errors in (1), one obtains that complexity of the least-squares approach, that is, the (minimal) number of “elementary” evaluations needed to construct an approximation for the value function with accuracy $\varepsilon,$ is bounded up to a constant not depending on $L$ by

$$\mathcal{C}_{L}(\varepsilon,d)=\frac{L\,5^{L(2+3d/\alpha)}}{\varepsilon^{2+3d/\alpha}}.\tag{2}$$

This implies

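$$\limsup_{d\nearrow\infty}\limsup_{\varepsilon\searrow 0}\frac{\log\mathcal{C}_{L}(\varepsilon,d)}{d\log(\varepsilon^{-1})}=3/\alpha.\tag{3}$$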

Furthermore, if we next want to construct an approximation for a continuous time optimal stopping problem, then we need to let $L\rightarrow\infty$ resulting in the complexity bound

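$$\mathcal{C}_{\infty}(\varepsilon,d)=O\left(\frac{\varepsilon^{-1/\beta}\,5^{(2+3d/\alpha)\varepsilon^{-1/\beta}}}{\varepsilon^{2+3d/\alpha}}\right),$$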

where it is assumed that the error due to the time discretization is of order $L^{-\beta}$ for some $0<\beta<1,$ independent of $d.$ This implies that

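$$\lim_{\varepsilon\searrow 0}\frac{\log\mathcal{C}_{\infty}(\varepsilon,d)}{\log(1/\varepsilon)}=\infty,$$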

showing that complexity of the least squares algorithms for continuous optimal stopping problems may even grow faster than $\exp(1/\varepsilon)$ . Similar complexity bounds can be derived for other simulation based approximation algorithms, see [9] for a novel nested type MC approach with complexity depending polynomially on $d$ and exponentially in $1/\varepsilon.$

We call a problem semi-tractable if there is an algorithm to solve it with complexity $\mathcal{C}(\varepsilon,d)$ satisfying

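$$\lim_{d\nearrow\infty}\lim_{\varepsilon\searrow 0}\frac{\log\mathcal{C}(\varepsilon,d)}{d\log(1/\varepsilon)}=0.\tag{4}$$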

Our definition of tractability should be contrasted to the definition in [14] where a problem is said to be (weakly) tractable, if there is an algorithm to solve it with complexity $\mathcal{C}(\varepsilon,d)$ satisfying

$$\lim_{d+\varepsilon^{-1}\nearrow\infty}\frac{\log\mathcal{C}(\varepsilon,d)}{d+\varepsilon^{-1}}=0.$$

This definition seems counter-intuitive, as it renders a problem with, for example, an algorithmic complexity of order $d^{2}\exp\left(1/\left(\varepsilon\log\log\cdots\log\varepsilon^{-1}\right)\right)$ (weakly) tractable, while a problem with an algorithm of complexity $2^{d}/\varepsilon$ is not. In our setting the dimension $d$ is typically fixed and the complexity rate with respect to $\varepsilon$ is of primary importance. In this paper we show that discrete time optimal stopping problems are semi-tractable in the sense of (4). To this end we revisit the mesh method of Broadie and Glasserman [5]. By enhancing it with a suitable regularisation, we prove that under mild conditions, the complexity of the resulting WSM (Weighted Stochastic Mesh) algorithm satisfies (4), provided the transition densities of the underlying Markov chain are analytically known or can be well approximated. Our algorithm bears some similarity to the random grid algorithm of Rust [15]. Rust [15], however, studied Markovian decision problems in discrete time with compact state space. Let us also remark that a complete convergence and complexity analysis of the mesh method is still missing in the literature; for some preliminary results see Agarwal and Juneja [1]. In the case of continuous time optimal stopping problems we need not assume that the transition densities are known but can use the Gaussian transition densities of the corresponding Euler scheme. This results in an algorithm which has complexity of order $O(c^{d}\varepsilon^{-(2d+14)})$ for some constant $c>1.$ Although this does not imply semi-tractability of continuous time optimal stopping problems, the proposed algorithm is very simple and its complexity remains provably polynomial in $1/\varepsilon,$ in contrast to that of the least squares approaches. To compare different algorithms for continuous time optimal stopping problems, we introduce the so-called semi-tractability index

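$$\Gamma\overset{\text{def}}{=}\limsup_{d\nearrow\infty}\limsup_{\varepsilon\searrow 0}\frac{\log\mathcal{C}(\varepsilon,d)}{d\log(1/\varepsilon)}.\tag{5}$$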

It turns out that the WSM algorithm has the smallest semi-tractability index among existing algorithms for continuous time optimal stopping problems.
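To make the semi-tractability index concrete, the following sketch evaluates the ratio $\log\mathcal{C}(\varepsilon,d)/(d\log(1/\varepsilon))$ for two cost bounds quoted in this paper: the least-squares bound $L\,5^{L(2+3d/\alpha)}\varepsilon^{-(2+3d/\alpha)}$ and the WSM bound $\varepsilon^{-4}\log^{d+2}(1/\varepsilon)$. It is an illustration only. The function names, the values $L=3$ and $\alpha=2,$ and the omission of all constants in the WSM bound are assumptions made here. The ratio is parametrised by $t=\log(1/\varepsilon)$ so that very small $\varepsilon$ can be used without floating-point underflow.

```python
import math

def log_cost_ls(t, d, L=3, alpha=2.0):
    """log of L * 5**(L*q) / eps**q with q = 2 + 3d/alpha and t = log(1/eps)."""
    q = 2.0 + 3.0 * d / alpha
    return math.log(L) + L * q * math.log(5.0) + q * t

def log_cost_wsm(t, d):
    """log of eps**(-4) * log(1/eps)**(d + 2); constants dropped, requires t > 1."""
    return 4.0 * t + (d + 2) * math.log(t)

def index_ratio(log_cost, t, d):
    """Finite-(eps, d) proxy for the index: log C / (d * log(1/eps))."""
    return log_cost(t, d) / (d * t)

if __name__ == "__main__":
    t = 1e6  # log(1/eps): eps is taken small first, then d is increased
    for d in (10, 100, 1000):
        print(d, index_ratio(log_cost_ls, t, d), index_ratio(log_cost_wsm, t, d))
```

For small $\varepsilon$ and growing $d$ the first ratio approaches $3/\alpha$ while the second approaches $0,$ in line with the indices reported for LS and WSM in the discrete time case.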

The paper is organized as follows. A description of the proposed algorithm is given in Section 2. Section 2.2 is devoted to convergence and complexity analysis of our algorithm. In Section 3 we turn to continuous time optimal stopping problems. All proofs are collected in Section 5.

2 Discrete time optimal stopping problems

We begin with the description of the WSM algorithm for discrete time optimal stopping problems. Let us assume a finite set of stopping dates $\left\{0,\ldots,L\right\},$ for some natural $L>0,$ and let $(Z_{l},$ $l=0,\ldots,L)$ be a Markov chain in $\mathbb{R}^{d},$ adapted to a filtration $\left(\mathcal{F}_{l},\,l=0,\ldots,L\right).$ For a given set of nonnegative reward functions $g_{l},$ $l=0,\ldots,L,$ on $\mathbb{R}^{d},$ we then consider the discrete Snell envelope process:

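$$U_{l}=U_{l}(Z_{l})\overset{\text{def}}{=}\operatorname*{ess\,sup}_{\tau\in\mathcal{T}_{l,L}}\mathsf{E}_{l}\left[g_{\tau}(Z_{\tau})\right],\tag{6}$$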

where $\mathcal{T}_{l,L}$ stands for the set of $\mathcal{F}$ -stopping times with values in the set $\{l,\ldots,L\},$ and $\mathsf{E}_{l}:=\mathsf{E}_{\mathcal{F}_{l}}$ stands for the $\mathcal{F}_{l}$ -conditional expectation, and the measurable functions $U_{l}(\cdot)$ exist due to Markovianity of the process $(Z_{l})_{l\geq 0}.$

For simplicity and without loss of generality we assume that the Markov chain $(Z_{l})_{l\geq 0}$ is time homogeneous with $l$ -steps transition density denoted by $p_{l}(y|x)$ and one-step density denoted by $p(y|x)=p_{1}(y|x),$ so that

$$P\left[Z_{k+1}\in dy\mid Z_{k}=x\right]=p(y|x)\,dy.$$

Fix some $x_{0}\in\mathbb{R}^{d}$ and assume that $Z_{0}=x_{0}.$ It is well known that the Snell envelope (6) satisfies the dynamic programming principle,

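$$U_{L}(Z_{L})=g_{L}(Z_{L}),\qquad U_{l}(Z_{l})=\max\left\{g_{l}(Z_{l}),\,\mathsf{E}\left[U_{l+1}(Z_{l+1})\mid Z_{l}\right]\right\},\quad l=0,\ldots,L-1.$$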

Next we fix some $R>0$ and define a truncated version of the above dynamic program via

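$$\widetilde{U}_{L}(Z_{L})=g_{L}(Z_{L})\cdot\mathbf{1}_{\{Z_{L}\in B_{R}\}},\qquad\widetilde{U}_{l}(Z_{l})=\max\left\{g_{l}(Z_{l}),\,\mathsf{E}\left[\widetilde{U}_{l+1}(Z_{l+1})\mid Z_{l}\right]\right\}\cdot\mathbf{1}_{\{Z_{l}\in B_{R}\}},\quad l=0,\ldots,L-1,$$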

where $B_{R}\overset{\text{def}}{=}\left\{z:\left|z-x_{0}\right|\leq R\right\}.$ Thus, by construction, $\widetilde{U}_{l}$ vanishes outside the ball $B_{R}.$ Also by construction it holds that

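$$\|\widetilde{U}_{l}\|_{\infty}\leq G_{R}\overset{\text{def}}{=}\max_{0\leq l\leq L}\sup_{z\in B_{R}}g_{l}(z),$$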

which is easily seen by backward induction. In view of (8) we may write

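$$\mathsf{E}\left[\widetilde{U}_{l+1}(Z_{l+1})\mid Z_{l}=x\right]=\int\widetilde{U}_{l+1}(y)\frac{p(y|x)}{p_{l+1}(y|x_{0})}p_{l+1}(y|x_{0})\,dy.$$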

Now assume that we have a set of trajectories $Z_{l}^{(n)},$ $l=0,\ldots,L,$ with $Z_{0}^{(n)}=x_{0},$ $n=1,\ldots,N,$ simulated according to the one-step transition density $p,$ and consider the approximation:

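$$\mathsf{E}\left[\widetilde{U}_{l+1}(Z_{l+1})\mid Z_{l}=x\right]\approx\frac{1}{N}\sum_{n=1}^{N}\widetilde{U}_{l+1}(Z_{l+1}^{(n)})\frac{p(Z_{l+1}^{(n)}|x)}{p_{l+1}(Z_{l+1}^{(n)}|x_{0})},$$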

where in view of the Chapman-Kolmogorov equation

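$$p_{l+1}(Z_{l+1}^{(n)}|x_{0})=\int p(Z_{l+1}^{(n)}|z)\,p_{l}(z|x_{0})\,dz\approx\frac{1}{N}\sum_{m=1}^{N}p(Z_{l+1}^{(n)}|Z_{l}^{(m)}).$$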

Hence we have approximately

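$$\mathsf{E}\left[\widetilde{U}_{l+1}(Z_{l+1})\mid Z_{l}=x\right]\approx\sum_{n=1}^{N}\widetilde{U}_{l+1}(Z_{l+1}^{(n)})\frac{p(Z_{l+1}^{(n)}|x)}{\sum_{m=1}^{N}p(Z_{l+1}^{(n)}|Z_{l}^{(m)})}.$$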

We thus propose the following algorithm. We start with

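$$\overline{U}_{L}(Z_{L}^{(n)})\overset{\text{def}}{=}g_{L}(Z_{L}^{(n)})\,\mathbf{1}_{\{Z_{L}^{(n)}\in B_{R}\}}$$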

for $n=1,\ldots,N.$ Once $\overline{U}_{l+1}$ is constructed on the grid for $0<l+1\leq L,$ we set

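$$\overline{U}_{l}(Z_{l}^{(r)})\overset{\text{def}}{=}\max\left\{g_{l}(Z_{l}^{(r)}),\,\sum_{n=1}^{N}\overline{U}_{l+1}(Z_{l+1}^{(n)})\frac{p(Z_{l+1}^{(n)}|Z_{l}^{(r)})}{\sum_{m=1}^{N}p(Z_{l+1}^{(n)}|Z_{l}^{(m)})}\right\}\mathbf{1}_{\{Z_{l}^{(r)}\in B_{R}\}},$$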

for $r=1,\ldots,N.$ By construction, each function $\overline{U}_{l}$ vanishes outside the ball $B_{R}.$ Working all the way down to $l=0$ results in the approximation:

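$$\overline{U}_{0}=\max\left[g_{0}(x_{0}),\,\sum_{n=1}^{N}\overline{U}_{1}(Z_{1}^{(n)})\frac{p(Z_{1}^{(n)}|x_{0})}{\sum_{m=1}^{N}p(Z_{1}^{(n)}|x_{0})}\right]$$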

for $U_{0}.$ As such the presented algorithm is closely related to the mesh method of Broadie and Glasserman [5] apart from truncation at level $R$ and a special choice of weights.
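The backward recursion described above can be sketched in a few lines of NumPy. The example is a toy under stated assumptions rather than the authors' implementation: the chain is a Gaussian random walk $Z_{l+1}=Z_{l}+\sigma\xi$ with known one-step density, the reward is $g_{l}(z)=\max(z_{1},0),$ and all names (`gaussian_step_density`, `wsm_value`, `simulate_walk`, `call_reward`) as well as the parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def gaussian_step_density(y, x, sigma):
    """Matrix of p(y[n] | x[r]) for the walk Z_{l+1} = Z_l + sigma * xi, xi ~ N(0, I_d)."""
    d = x.shape[-1]
    sq_dist = np.sum((y[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2) ** (d / 2.0)

def wsm_value(paths, reward, density, x0, radius):
    """Backward recursion on the mesh; paths has shape (L + 1, N, d), paths[0] == x0."""
    def in_ball(z):
        # Truncation indicator 1{|z - x0| <= R}
        return np.linalg.norm(z - x0, axis=-1) <= radius

    last = paths.shape[0] - 1
    u = reward(last, paths[last]) * in_ball(paths[last])
    for l in range(last - 1, -1, -1):
        dens = density(paths[l + 1], paths[l])            # dens[n, r] = p(Z_{l+1}^(n) | Z_l^(r))
        weights = dens / dens.sum(axis=1, keepdims=True)  # normalise by the sum over m
        continuation = weights.T @ u                      # one estimate per mesh point Z_l^(r)
        u = np.maximum(reward(l, paths[l]), continuation) * in_ball(paths[l])
    return float(u[0])  # at l = 0 every mesh point equals x0

def simulate_walk(x0, sigma, n_steps, n_paths, rng):
    """Simulate n_paths trajectories of the Gaussian random walk started at x0."""
    steps = sigma * rng.standard_normal((n_steps, n_paths, x0.shape[0]))
    start = np.broadcast_to(x0, (1, n_paths, x0.shape[0]))
    return np.concatenate([start, x0 + np.cumsum(steps, axis=0)], axis=0)

def call_reward(l, z):
    """Time-independent reward g_l(z) = max(z_1, 0) (an illustrative assumption)."""
    return np.maximum(z[:, 0], 0.0)

rng = np.random.default_rng(0)
x0 = np.zeros(1)
paths = simulate_walk(x0, sigma=1.0, n_steps=4, n_paths=1500, rng=rng)
estimate = wsm_value(
    paths, call_reward, lambda y, x: gaussian_step_density(y, x, 1.0), x0, radius=10.0
)
print(estimate)
```

In this toy setting the walk is a martingale and the reward is convex, so stopping at the last date is optimal and the exact value is $\mathsf{E}[\max(Z_{4},0)]=\sqrt{4/(2\pi)}\approx 0.798.$ This gives a reference against which the mesh estimate can be compared; the estimate depends on the seed and on $N.$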

2.1 Cost estimation

Let us estimate the cost of carrying out the backward dynamic program (11). One needs to compute $p(Z_{l+1}^{(n)}|Z_{l}^{(m)})$ for all $l=1,\ldots,L,$ $n,$ $m=1,\ldots,N.$ This can be done at a cost of order $N^{2}Lc_{f}^{(d)},$ where $c_{f}^{(d)}$ is the cost of evaluating a (typical) function of $2d$ arguments. In the typical situation $c_{f}^{(d)}$ is proportional to $d.$ The evaluation of

$$\frac{1}{N}\sum_{m=1}^{N}p(Z_{l+1}^{(n)}|Z_{l}^{(m)})$$

for $l=1,...,L,$ $n=1,...,N,$ has a cost of order $N^{2}Lc_{\ast}$ with $c_{\ast}$ being the cost of an elementary numerical operation, which is negligible if $c_{\ast}\ll c_{f}^{(d)}.$ So the overall cost of carrying out the backward dynamic program (11) is of order $N^{2}Lc_{f}^{(d)}.$

2.2 Error and complexity analysis

In this section we analyze convergence of the WSM estimate (11) to the solution of the discrete optimal stopping problem (6) for $l=0$ and a fixed $x_{0}\in\mathbb{R}^{d}$ as $N\to\infty.$ Let us first bound a distance between $U_{l}$ and $\widetilde{U}_{l},$ $l=0,\ldots,L.$

Proposition 1

With

$$\varepsilon_{l,R}\overset{\text{def}}{=}\int_{|x-x_{0}|>R}U_{l}(x)\,p_{l}(x|x_{0})\,dx$$

$l=0,\ldots,L,$ it holds that

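$$\int\bigl|U_{l}(x)-\widetilde{U}_{l}(x)\bigr|\,p_{l}(x|x_{0})\,dx\leq\sum_{j=l}^{L}\varepsilon_{j,R}.\tag{12}$$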

Proposition 2

Suppose that

$$\max_{0\leq l\leq L}g_{l}(x)\leq c_{g}(1+|x|),\quad x\in\mathbb{R}^{d}\tag{13}$$

and that

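$$\mathsf{E}\left[\max_{l\leq l'\leq L}|Z_{l'}|\;\Big|\;Z_{l}=x\right]\leq c_{Z}(1+|x|),\quad x\in\mathbb{R}^{d}.\tag{14}$$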

Suppose further that for some $\varkappa,$ $\alpha>0,$ and $l=1,\ldots,L,$

$$p_{l}(y|x)\leq\frac{\varkappa}{(2\pi\alpha l)^{d/2}}e^{-\frac{|x-y|^{2}}{2\alpha l}}\tag{15}$$

for all $x,y\in\mathbb{R}^{d}.$ One then has

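$$\int\bigl|U_{l}(x)-\widetilde{U}_{l}(x)\bigr|\,p_{l}(x|x_{0})\,dx\leq Lc_{g}\varkappa\left(1+c_{Z}+c_{Z}\left|x_{0}\right|+c_{Z}\sqrt{d\alpha L}\right)2^{d/4}e^{-\frac{R^{2}}{8\alpha L}}.\tag{16}$$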
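The truncation bound of Proposition 2 has the form $K e^{-R^{2}/(8\alpha L)}$ with $K=Lc_{g}\varkappa\left(1+c_{Z}+c_{Z}|x_{0}|+c_{Z}\sqrt{d\alpha L}\right)2^{d/4},$ so it can be inverted to find a radius $R$ that keeps the truncation error below a target $\varepsilon.$ The sketch below does this inversion. The function names and the numerical constants in the usage lines are illustrative assumptions, not values from the paper.

```python
import math

def truncation_constant(L, d, alpha, kappa, c_g, c_Z, x0_norm):
    """Prefactor K of the bound K * exp(-R^2 / (8 * alpha * L))."""
    growth = 1.0 + c_Z + c_Z * x0_norm + c_Z * math.sqrt(d * alpha * L)
    return L * c_g * kappa * growth * 2.0 ** (d / 4.0)

def truncation_bound(R, L, d, alpha, kappa, c_g, c_Z, x0_norm):
    """Right-hand side of the truncation estimate for a given radius R."""
    K = truncation_constant(L, d, alpha, kappa, c_g, c_Z, x0_norm)
    return K * math.exp(-R**2 / (8.0 * alpha * L))

def radius_for_accuracy(eps, L, d, alpha, kappa, c_g, c_Z, x0_norm):
    """Smallest R >= 0 for which the bound does not exceed eps."""
    K = truncation_constant(L, d, alpha, kappa, c_g, c_Z, x0_norm)
    if K <= eps:
        return 0.0
    return math.sqrt(8.0 * alpha * L * math.log(K / eps))

# Hypothetical constants, chosen only to exercise the formulas.
params = dict(L=10, d=5, alpha=1.0, kappa=1.0, c_g=1.0, c_Z=1.0, x0_norm=0.0)
R = radius_for_accuracy(1e-3, **params)
print(R, truncation_bound(R, **params))
```

Since $\log K$ grows linearly in $d$ through the factor $2^{d/4},$ the resulting radius grows only like $\sqrt{\alpha L\,(d+\log(1/\varepsilon))}$ up to constants.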

Next we control the discrepancy between $\overline{U}_{0}$ and $\widetilde{U}_{0}.$

Proposition 3

With

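$$F_{R}^{2}\overset{\text{def}}{=}\int\int_{|y-x_{0}|\leq R}\frac{p^{2}(y|x)}{p_{l+1}(y|x_{0})}\,p_{l}(x|x_{0})\,dx\,dy,\tag{17}$$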

and $N$ such that $\left(1+F_{R}\right)/\sqrt{N}<1,$ it holds that

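$$\mathsf{E}\left[\bigl|\overline{U}_{0}-\widetilde{U}_{0}\bigr|\right]\leq\left(3+\sqrt{2}\right)LG_{R}\frac{1+F_{R}}{\sqrt{N}}.$$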
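Proposition 3 bounds the mesh error by $(3+\sqrt{2})LG_{R}(1+F_{R})/\sqrt{N},$ which translates directly into a sufficient number of trajectories for a target accuracy. The following sketch performs this calculation. It treats $G_{R}$ and $F_{R}$ as given inputs, and the function names and sample values are assumptions made for illustration.

```python
import math

def mesh_error_bound(N, L, G_R, F_R):
    """Bound (3 + sqrt(2)) * L * G_R * (1 + F_R) / sqrt(N) on E|U_bar_0 - U_tilde_0|."""
    return (3.0 + math.sqrt(2.0)) * L * G_R * (1.0 + F_R) / math.sqrt(N)

def paths_for_accuracy(eps, L, G_R, F_R):
    """Smallest N meeting the bound <= eps and the condition (1 + F_R) / sqrt(N) < 1."""
    n_eps = math.ceil(((3.0 + math.sqrt(2.0)) * L * G_R * (1.0 + F_R) / eps) ** 2)
    n_cond = math.floor((1.0 + F_R) ** 2) + 1
    return max(n_eps, n_cond)

# Hypothetical inputs: L = 5 dates, G_R = 2, F_R = 3, target accuracy 0.01.
N = paths_for_accuracy(0.01, L=5, G_R=2.0, F_R=3.0)
print(N, mesh_error_bound(N, 5, 2.0, 3.0))
```

Because the backward recursion costs of order $N^{2}L$ density evaluations, the $\varepsilon^{-2}$ growth of $N$ seen here is what produces the leading $\varepsilon^{-4}$ factor in the overall complexity.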

Corollary 4

Under the assumptions of Proposition 2, we have for (17) the estimate

[TABLE]

where the last inequality follows from $\Gamma\left(1+a\right)\geq a^{a}e^{-a}$ for any $a\geq 1/2.$ Then by combining (16) with Proposition 3 we obtain the error estimate,

[TABLE]

Proposition 5

Under the assumptions of Proposition 2 the complexity of the WSM algorithm is bounded from above by

[TABLE]

where $c_{1}>0$ and $c_{2}>1$ are natural constants and $c_{f}^{(d)}$ stands for the cost of computing the transition density $p_{l}(y|x)$ at one point $(x,y).$

Corollary 6

For a fixed $L>0$ the discrete time optimal stopping problem (6) with $g$ and $(Z_{l})_{l\geq 0}$ satisfying (13), (14) and (15) is semi-tractable, provided that the complexity of computing the transition density $p_{l}(y|x)$ at one point $(x,y)$ is at most polynomial in $d.$ Different approximation algorithms for discrete time optimal stopping problems can be compared using the semi-tractability index (5). For example, it follows from (3) that the semi-tractability index of the least-squares (LS) approach is equal to $3/\alpha.$ Hence it tends to $0$ as the smoothness $\alpha$ of the problem increases. Moreover, from inspection of Theorem 2.4 in [3], we see that the Quantisation Tree (QT) method has semi-tractability index $2.$

2.3 Approximation of the transition density

A crucial condition for semi-tractability to hold is availability of the transition density $p(y|x)$ of the chain $(Z_{l})_{l\geq 0}$ in closed form. However it can be shown that if a sequence of approximating densities $p^{n}(y|x),$ $n\in\mathbb{N},$ converging to $p(y|x)$ can be constructed in such a way that

[TABLE]

for some $m\in\mathbb{N}$ and a sequence $R_{n}\nearrow\infty,$ $n\nearrow\infty,$ then under proper assumptions on the growth of $R_{n}$ and the cost of computing $p^{n}$ (in fact it should be at most polynomial in $d$ ), one can derive a complexity bound $\mathcal{C}(\varepsilon,d)$ satisfying

[TABLE]

To construct a sequence of approximations $p^{n}(y|z)$ satisfying the assumption (20), one can use various small-time expansions for transition densities of stochastic processes, see, for example, [2] and [12]. Let us exemplify this type of approximation in the case of one-dimensional diffusion processes of the form:

[TABLE]

where $b$ is a bounded function, twice continuously differentiable, with bounded derivatives and $\sigma$ is a function with three continuous and bounded derivatives such that there exist two positive constants $\sigma_{\circ},\sigma^{\circ}$ with $\sigma_{\circ}\leq\sigma(x)\leq\sigma^{\circ}.$ Consider a Markov chain $(Z_{l})_{l\geq 0}$ defined as a time discretization of $(X_{t})_{t\geq 0},$ that is, $Z_{l}\overset{\text{def}}{=}X_{\Delta l},$ $l=0,1,2,\ldots$ for some $\Delta>0.$ Under the above conditions the following representation for the (one-step) transition density $p$ of the chain $Z$ is proved in [8] (see also [7] for more general setting):

[TABLE]

with $U_{\Delta}(x,y)=R_{\Delta}(x,y)\exp\left[\int_{0}^{x}\bar{b}(z)\,dz-\int_{0}^{y}\bar{b}(z)\,dz\right],$

[TABLE]

where $B_{z}$ is a standard Brownian bridge, $s(x)=\int_{0}^{x}\frac{dy}{\sigma(y)},$ $g=s^{-1}$ and

[TABLE]

By expanding the exponent in (21) into Taylor series, we get for $\Delta$ small enough

[TABLE]

with

[TABLE]

If $\bar{\rho}$ is uniformly bounded by a constant $D>0$ , then the above series converges uniformly in $x$ and $y$ for all $\Delta$ small enough. Set

[TABLE]

It obviously holds $p^{n}(y|x)>0$ for $\Delta<\Delta_{0}(D)$ and

[TABLE]

uniformly for all $x,y\in\mathbb{R}.$ Hence the assumption (20) is satisfied with $m=0,$ provided that $\Delta<\Delta_{0}$ for some $\Delta_{0}$ depending only on $D.$ Similarly if $\bar{\rho}\leq 0,$ then (20) holds. To sample from $p^{n}$ we can use the well-known acceptance rejection method which does not require the exact knowledge of a scaling factor $\int p^{n}(y|x)\,dy$ .

3 Continuous time optimal stopping for diffusions

In this section we consider diffusion processes of the form

[TABLE]

where $b:\mathbb{R}^{d}\rightarrow\mathbb{R}^{d}$ and $\sigma:\mathbb{R}^{d}\rightarrow\mathbb{R}^{d\times m}$ are Lipschitz continuous and $W=(W^{1},\ldots,W^{m})$ is an $m$-dimensional standard Wiener process on a probability space $(\Omega,\mathcal{F},P).$ As usual, the (augmented) filtration generated by $(W_{s})_{s\geq 0}$ is denoted by $(\mathcal{F}_{s})_{s\geq 0}.$ We are interested in solving optimal stopping problems of the form:

[TABLE]

where $f$ is a given real valued function on $\mathbb{R}^{d},$ $r\geq 0,$ and $\mathcal{T}_{t,T}$ stands for the set of stopping times $\tau$ taking values in $[t,T]$ . The problem (24) is related to the so-called free boundary problem for the corresponding partial differential equation. Let us introduce the differential operator $L_{t}$ :

[TABLE]

where

[TABLE]

We denote by $X_{s}^{t,x}$ (or $X^{t,x}(s)$), $s\geq t,$ the solution of (23) starting at moment $t$ from $x:\;X_{t}^{t,x}=x.$ Denote by $u(t,x)$ a regular solution of the following system of partial differential inequalities:

[TABLE]

then under some mild conditions (see, e.g. [10])

[TABLE]

that is, $u(t,x)=U_{t}^{\star}(x).$

With this notation established, it is worth discussing the main issue that we are going to address in this section. Our goal is to estimate $u(t,x)$ at a given point $(t_{0},x_{0})$ with accuracy less than $\varepsilon$ by an algorithm with complexity $\mathcal{C}^{\star}(\varepsilon,d)$ which is polynomial in $1/\varepsilon$ . As already mentioned in the introduction some well known algorithms such as the regression ones fail to achieve this goal (at least according to the existing complexity bounds in the literature).

Let us introduce the Snell envelope process:

[TABLE]

where (somewhat more general than in (24)) $g$ is a given nonnegative function on $\mathbb{R}_{\geq 0}\times\mathbb{R}^{d}.$ In the first step we perform a time discretization by introducing a finite set of stopping dates $t_{l}=lh,$ $l=1,\ldots,L,$ with $h=T/L$ and $L$ some natural number, and next consider the discretized Snell envelope process:

[TABLE]

where $\mathcal{T}_{l,L}$ stands for the set of stopping times with values in the set $\{t_{l},\ldots,t_{L}\}.$ Note that the measurable functions $U_{t_{l}}^{\circ}(\cdot)$ exist due to Markovianity of the process $X.$ The error due to the time discretization is well studied in the literature. We will rely on the following result which is implied by Thm. 2.1 in [3] for instance.

Proposition 7

Let $g:[0,T]\times\mathbb{R}^{d}\rightarrow$ $\mathbb{R}$ be Lipschitz continuous and $p\geq 1.$ Then one has that

[TABLE]

where the constants $c_{\circ},C_{\circ}>0$ depend on the Lipschitz constants for $b,\sigma,$ and $g,$ respectively.

In order to achieve an acceptable discretization error we choose a sufficiently large $L,$ and then concentrate on the computation of $U^{\circ}.$

In the next step we approximate the underlying process $X$ using some strong discretization scheme on the time grid $t_{i}=iT/L,$ $i=0,\ldots,L,$ yielding an approximation $\overline{X}.$ It is assumed that the one step transition densities of this scheme are explicitly known. The simplest and the most popular scheme is the Euler scheme,

[TABLE]

$i=1,\ldots,d,$ which in general has strong convergence order $1/2,$ and the one-step transition density of the chain $(\overline{X}_{t_{l+1}})_{l\geq 0}$ is given by

[TABLE]

with $\Sigma=\sigma\sigma^{\top}\in\mathbb{R}^{d\times d}$ and $h=T/L.$ Now we will turn to the discrete time optimal stopping problem with possible stopping times $\{t_{l}=lh,$ $l=0,\ldots,L\}$ . To this end we introduce the discrete time Markov chain $Z_{l}\overset{\text{def}}{=}\overline{X}_{t_{l}}$ adapted to the filtration $(\mathcal{F}_{l})\overset{\text{def}}{=}(\mathcal{F}_{t_{l}}),$ and $g_{l}(x)\overset{\text{def}}{=}g(t_{l},x)$ (while abusing notation slightly) and consider the discretized Snell envelope process

[TABLE]

where $\mathcal{I}_{l,L}$ stands for the set of stopping indices with values in $\{l,\ldots,L\},$ and the measurable functions $U_{t_{l}}(\cdot)$ (or $U_{l}(\cdot)$ ) exist due to Markovianity of the process $\overline{X}$ (or $Z$ ). The distance between $U$ and $U^{\circ}$ is controlled by the next proposition.

Proposition 8

There exists a constant $C^{\text{Euler}}>0$ depending on the Lipschitz constants of $b,\sigma,$ and $g,$ such that

[TABLE]

Combining Proposition 7 and Proposition 8 thus yields the following corollary.

Corollary 9

If $\overline{X}$ is constructed by the Euler scheme with time step size $h=T/L,$ where $L$ is the number of discretization steps, then under the conditions of Proposition 7 and Proposition 8 we have that

[TABLE]

where $\lesssim$ stands for inequality up to constant depending on $c_{\circ},C_{\circ}$ and $C^{\text{Euler}}.$
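The Euler scheme used in this section has Gaussian one-step transitions: given $\overline{X}_{t_{l}}=x,$ the next state is normal with mean $x+b(x)h$ and covariance $h\,\Sigma(x),$ where $\Sigma=\sigma\sigma^{\top}.$ The sketch below evaluates this standard density so that it could serve as the one-step density in a mesh algorithm. It assumes $\Sigma(x)$ is invertible; the function name and the test coefficients are illustrative assumptions.

```python
import numpy as np

def euler_transition_density(y, x, b, sigma, h):
    """Density at y of the Euler step x -> x + b(x) h + sigma(x) (W_{t+h} - W_t).

    b(x) returns a vector of shape (d,), sigma(x) a matrix of shape (d, m);
    the covariance h * sigma(x) sigma(x)^T is assumed to be invertible.
    """
    mean = x + b(x) * h
    cov = sigma(x) @ sigma(x).T * h
    diff = y - mean
    d = x.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    quad = diff @ np.linalg.solve(cov, diff)
    return float(np.exp(-0.5 * (quad + logdet + d * np.log(2.0 * np.pi))))

# One-dimensional check with constant (hypothetical) coefficients.
drift = lambda x: np.array([0.5])
vol = lambda x: np.array([[2.0]])
value = euler_transition_density(np.array([1.5]), np.array([1.0]), drift, vol, h=0.25)
print(value)
```

Working with the log-determinant and a linear solve avoids forming the inverse covariance explicitly, which is a common choice for numerical stability when $h$ is small.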

Since the transition densities of the Euler scheme are explicitly known (see (29)), the WSM algorithm can be directly used for constructing an approximation $\overline{U}_{0}(x_{0})$ based on the paths of the Markov chain $(Z_{l}).$ To derive the complexity bounds of the resulting estimate, we shall make the following assumptions.

(AG)

Suppose that $c_{g}>0$ is such that

[TABLE]

(AX)

Assume that there exists a constant $c_{\bar{X}}>0$ such that for all $0\leq l\leq L,$

[TABLE]

uniformly in $L$ (hence $h$ ). This assumption is satisfied under Lipschitz conditions on the coefficients of the SDE (23), and can be proved using the Burkholder-Davis-Gundy inequality and the Gronwall lemma.

(AP)

Assume furthermore that $\left(\overline{X}_{lh},\text{ }l=0,\ldots,L\right)$ is time homogeneous with transition densities $\overline{p}_{lh}(y|x)$ that satisfy the Aronson type inequality: there exist positive constants $\overline{\varkappa}$ and $\overline{\alpha}$ such that for any $x,y\in\mathbb{R}^{d}$ and any $l>0,$ it holds that

[TABLE]

This assumption holds if the coefficients in (23) are bounded and $\sigma$ is uniformly elliptic.

The next proposition provides complexity bounds for the WSM algorithm in the case of continuous time optimal stopping problems.

Proposition 10

Assume that the assumptions (AG), (AX) and (AP) hold, then

• the cost of computing $U_{0}(x_{0})$ in (30) for a fixed $L>0$ with precision $\varepsilon>0$ via the WSM algorithm is bounded above by

[TABLE]

• the cost of computing $U_{0}^{\star}(x_{0})$ with an accuracy $\varepsilon>0$ via the WSM algorithm is bounded by

[TABLE]

The first statement follows directly from Proposition 5 by taking in (19), $\alpha=\overline{\alpha}h,$ $c_{Z}=c_{\bar{X}},$ and $L=T/h.$ Then by setting $h\asymp\varepsilon^{2}$ we obtain (35) (with possibly modified natural constants $c_{1},c_{2}$ ).

Discussion

As can be seen from (35),

[TABLE]

and this shows the efficiency of the proposed algorithm as compared to the existing algorithms for continuous time optimal stopping problems, at least as far as the semi-tractability index is concerned. Indeed, the only algorithm available in the literature with a provably finite limit of type (36) is the quantization tree algorithm (QTA) of Bally, Pagès, and Printems [3]. By letting the number of stopping times and the quantization number tend to infinity such that the corresponding errors in Thm. 2.4-b in [3] are balanced, we derive the following complexity upper bound

[TABLE]

Hence $\Gamma_{\text{QTA}}=6.$

4 Numerical experiments

In the following experiments we illustrate the WSM algorithm in the case of continuous time optimal stopping problems. Lower bounds for the WSM algorithm can be obtained using a suboptimal policy computed on an independent set of trajectories. This policy can be constructed either directly via (10) or by using interpolation of the likelihood weights

[TABLE]

The fastest and simplest way to do this is to use nearest neighbour interpolation based on a training set of trajectories. In all experiments below the number of neighbours was set to $500.$

4.1 An American put on a single asset

In order to illustrate the performance of the WSM algorithm in continuous time, we consider the financial problem of pricing an American put option on a single log-Brownian asset

[TABLE]

with $r$ denoting the riskless rate of interest, assumed to be constant, and $\sigma$ denoting the constant volatility. The payoff function is given by $g(x)=(K-x)^{+}$ and a fair price of the option is given by

[TABLE]

No closed-form solution for the price of this option is known, but various numerical methods give accurate approximations to $V_{0}$. The parameter values used are $r=0.08,$ $\sigma=0.20,$ $\delta=0,$ $K=100,$ $T=3$. An accurate estimate of the true price, obtained via a binomial tree type algorithm, is $6.9320$ (see [11]). In Figure 1 we show lower bounds due to WSM, the least squares approach of Longstaff and Schwartz [13] (LS), and the value function regression algorithm of Tsitsiklis and Van Roy [16] (VF) as functions of the number of stopping times $L$ forming a uniform grid on $[0,T].$ These lower bounds are constructed using a suboptimal stopping rule based on estimated continuation values and evaluated on a new independent set of trajectories. The maximal degrees of the polynomials used as basis functions in LS and VF are indicated by the numbers ($2$ and $4$) in the legend. As can be seen, the WSM lower bounds are more stable as $L$ increases, whereas the VF lower bounds seem to diverge as $L\rightarrow\infty.$
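For readers who wish to reproduce a reference value of this kind, the following is a minimal sketch of a Cox-Ross-Rubinstein binomial tree for an American put. It is not the algorithm of [11]; the function name `american_put_crr`, the CRR parametrisation, and the initial asset value $x_{0}=100$ are assumptions made here, and the zero dividend yield ($\delta=0$) is hard-wired.

```python
import math


def american_put_crr(s0, strike, rate, sigma, maturity, n_steps):
    """American put price on a CRR binomial tree (no dividends)."""
    dt = maturity / n_steps
    up = math.exp(sigma * math.sqrt(dt))
    down = 1.0 / up
    disc = math.exp(-rate * dt)
    # Risk-neutral probability of an up-move.
    p = (math.exp(rate * dt) - down) / (up - down)

    # Terminal payoffs; node j has j up-moves and n_steps - j down-moves.
    values = [
        max(strike - s0 * up ** (2 * j - n_steps), 0.0)
        for j in range(n_steps + 1)
    ]
    # Backward induction: compare immediate exercise with continuation.
    for step in range(n_steps - 1, -1, -1):
        for j in range(step + 1):
            continuation = disc * (p * values[j + 1] + (1.0 - p) * values[j])
            exercise = max(strike - s0 * up ** (2 * j - step), 0.0)
            values[j] = max(exercise, continuation)
    return values[0]


# Parameters from the text; s0 = 100 is an assumption of this sketch.
price = american_put_crr(100.0, 100.0, 0.08, 0.20, 3.0, 500)
print(round(price, 4))
```

The in-place update is safe because node `j` at a given step only reads entries `j` and `j + 1` of the next step, and entry `j + 1` has not yet been overwritten when entry `j` is computed.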

5 Proofs

5.1 Proof of Proposition 1

For $l=L$ the statement reads

[TABLE]

so it holds. Suppose (12) is true for $0<l+1\leq L.$ Then, by using $\left|\max(a,b)-\max(a,c)\right|\leq|b-c|$ and the fact that $\widetilde{U}_{l}(x)$ vanishes for $\left|x-x_{0}\right|>R,$

[TABLE]

Hence we have by induction,

[TABLE]
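
The elementary inequality $\left|\max(a,b)-\max(a,c)\right|\leq|b-c|$ used in the induction step can be checked directly. For arbitrary reals $a,b,c,$ assume without loss of generality that $b\geq c.$ Then

$$
\max(a,c)\;\leq\;\max(a,b)\;=\;\max\bigl(a,\,c+(b-c)\bigr)\;\leq\;\max\bigl(a+(b-c),\,c+(b-c)\bigr)\;=\;\max(a,c)+(b-c),
$$

so that $0\leq\max(a,b)-\max(a,c)\leq b-c=|b-c|.$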

5.2 Proof of Proposition 2

Combining the assumptions (32) and (33) yields,

[TABLE]

Using

[TABLE]

we get (note that $\left(4/3\right)^{1/2}<2^{1/4}$ ),

[TABLE]

for $l\geq 1$ ( $\varepsilon_{0,R}=0$ for $R>0$ ). Now by (12), i.e. Proposition 1, we get

[TABLE]

whence the estimate (16).

5.3 Proof of Proposition 3

Let us write the sample based backward dynamic program (11) for step $l<L$ in the form,

[TABLE]

by defining the weights

[TABLE]

where $l$ is fixed and suppressed. Let us further abbreviate

[TABLE]

for a generic Borel function $f\geq 0.$ Using,

[TABLE]

(38), and $\left|\max(a,b)-\max(a,c)\right|\leq|b-c|,$ we thus get

[TABLE]

using that the weights in (39) sum up to one. One thus gets by iterating (40),

[TABLE]

since $\overline{U}_{L}-\widetilde{U}_{L}=0.$ Let us now introduce

[TABLE]

and consider the generic term

[TABLE]

Due to (9) one has,

[TABLE]

and due to (39) and (42) we may write,

[TABLE]

and so obtain,

[TABLE]

We are now going to estimate

[TABLE]

It holds that

[TABLE]

with

[TABLE]

Now consider the i.i.d. random variables,

[TABLE]

which have zero mean. Then, by Cauchy-Schwarz one has that

[TABLE]
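
The Cauchy-Schwarz step above follows a standard pattern. As a sketch in generic notation (the symbols $\xi_{1},\ldots,\xi_{N}$ are placeholders introduced here and $\mathbb{E}$ denotes expectation; neither is taken from the paper's notation): for i.i.d. zero-mean random variables with finite second moment,

$$
\mathbb{E}\left|\frac{1}{N}\sum_{j=1}^{N}\xi_{j}\right|
\;\leq\;\left(\mathbb{E}\left[\Bigl(\frac{1}{N}\sum_{j=1}^{N}\xi_{j}\Bigr)^{2}\right]\right)^{1/2}
\;=\;\left(\frac{\mathbb{E}\bigl[\xi_{1}^{2}\bigr]}{N}\right)^{1/2},
$$

since the cross terms $\mathbb{E}[\xi_{i}\xi_{j}],$ $i\neq j,$ vanish by independence and the zero-mean property. This is consistent with the factor $1/\sqrt{N}$ that appears in the final estimate of this proof.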

Concerning Term2, let us write

[TABLE]

where $Z^{0,x_{0}}$ is an independent dummy trajectory. We thus have

[TABLE]

where for $j=2,...,N,$ the random variables

[TABLE]

are i.i.d. and have zero mean. We thus have, by Cauchy-Schwarz again,

[TABLE]

Secondly, one has

[TABLE]

Next it follows that

[TABLE]

Further, one obviously has that $E_{R}^{2}\leq 2+2F_{R}^{2},$ and $H_{R}\leq 1+F_{R}^{2}$ since

[TABLE]

By now taking the expectation in (41) and gathering all together we obtain,

[TABLE]

assuming that $N$ is taken such that $(1+F_{R})/\sqrt{N}<1.$

5.4 Proof of Proposition 5

In order to achieve a required accuracy $\varepsilon>0,$ let us take $R$ and $N$ large enough such that both error terms in (18) are equal to $\varepsilon/2.$ Hence, we first take

[TABLE]

that is $R\nearrow\infty$ when $d+\varepsilon^{-1}\nearrow\infty.$ Then take, with $\asymp$ denoting asymptotic equivalence for $R\nearrow\infty$ up to some natural constant,

[TABLE]

Thus, the computational work load (complexity) is given by

[TABLE]

where $c_{1}$ is a natural constant. Now let us write

[TABLE]

Then, using the elementary estimate $\left(a+b\sqrt{d}\right)^{1/d}\leq ae^{b/a},$ for $a\geq 1,$ $b>0,$ $d\geq 1,$ and assuming that $\varepsilon<1,$ (44) implies (19).
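
The elementary estimate can be verified in one line. Using $1+x\leq e^{x}$ and $d\geq 1,$

$$
\left(a+b\sqrt{d}\right)^{1/d}
=a^{1/d}\left(1+\frac{b}{a}\sqrt{d}\right)^{1/d}
\leq a^{1/d}\exp\left(\frac{b}{a\sqrt{d}}\right)
\leq a\,e^{b/a},
$$

where the last step uses $1/\sqrt{d}\leq 1$ together with $a^{1/d}\leq a,$ which holds when $a\geq 1$ (assumed here to be the regime in which the estimate is applied).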

5.5 Proof of Proposition 8

On the one hand one has

[TABLE]

and on the other one has similarly

[TABLE]

Hence we get

[TABLE]

due to the strong order of the Euler scheme, with $L_{g}$ being some Lipschitz constant for $g.$

Bibliography

[1] Ankush Agarwal and Sandeep Juneja. Comparing optimal convergence rate of stochastic mesh and least squares method for Bermudan option pricing. In Proceedings of the 2013 Winter Simulation Conference: Simulation: Making Decisions in a Complex World, pages 701–712. IEEE Press, 2013.
[2] Robert Azencott. Densité des diffusions en temps petit: développements asymptotiques. I. In Seminar on probability, XVIII, volume 1059 of Lecture Notes in Math., pages 402–498. Springer, Berlin, 1984.
[3] Vlad Bally, Gilles Pagès, and Jacques Printems. A quantization tree method for pricing and hedging multidimensional American options. Math. Finance, 15(1):119–168, 2005.
[4] Denis Belomestny and John Schoenmakers. Advanced Simulation-Based Methods for Optimal Stopping and Control: With Applications in Finance. Springer, 2018.
[5] M. Broadie and P. Glasserman. A stochastic mesh method for pricing high-dimensional American options. Journal of Computational Finance, 7(4):35–72, 2004.
[6] Emmanuelle Clément, Damien Lamberton, and Philip Protter. An analysis of a least squares regression method for American option pricing. Finance and Stochastics, 6(4):449–471, 2002.
[7] D. Dacunha-Castelle and D. Florens-Zmirou. Estimation of the coefficients of a diffusion from discrete observations. Stochastics, 19(4):263–284, 1986.
[8] Daniéle Florens-Zmirou. On estimating the diffusion coefficient from discrete observations. J. Appl. Probab., 30(4):790–804, 1993.