Semi-tractability of optimal stopping problems via a weighted stochastic mesh algorithm
D. Belomestny, M. Kaledin, J. Schoenmakers

TL;DR
This paper introduces a Weighted Stochastic Mesh algorithm for approximating optimal stopping problems, demonstrating semi-tractability in discrete cases and providing tight complexity bounds in continuous cases, supported by numerical examples.
Contribution
The paper presents a novel Weighted Stochastic Mesh algorithm that achieves semi-tractability for discrete optimal stopping problems and offers the tightest known complexity bounds for continuous problems.
Findings
Discrete case complexity bounded by ε^{-4} log^{d+2}(1/ε)
Continuous case bounds are the tightest known
Numerical example validates theoretical results
Abstract
In this article we propose a Weighted Stochastic Mesh (WSM) algorithm for approximating the value of discrete and continuous time optimal stopping problems. We prove that in the discrete case the WSM algorithm leads to semi-tractability of the corresponding optimal stopping problems in the sense that its complexity is bounded in order by ε^{-4} log^{d+2}(1/ε), with d being the dimension of the underlying Markov chain. Furthermore, we study the WSM approach in the context of continuous time optimal stopping problems and derive the corresponding complexity bounds. Although we cannot prove semi-tractability in this case, our bounds turn out to be the tightest among the bounds known for the existing algorithms in the literature. We illustrate our theoretical findings by a numerical example.
1 Introduction
The theory of optimal stopping is concerned with the problem of choosing a time to take a particular action, in order to maximize an expected reward or minimize an expected cost. Such problems can be found in many areas of statistics, economics, and mathematical finance (e.g. the pricing problem of American options). Primal and dual approaches have been developed in the literature giving rise to Monte Carlo algorithms for high-dimensional discrete time stopping problems. Solving high-dimensional discrete optimal stopping problems is usually based on a backward dynamic programming principle, which is in some sense at odds with the forward nature of Monte Carlo simulation. Much research has focused on the development of fast methods to compute approximations to the optimal value function. Most of these methods are based on some type of regression on Monte Carlo paths, see [4] for an overview. One of the regression algorithms most widely adopted by practitioners is the Longstaff-Schwartz algorithm. It is based on approximating conditional expectations by least-squares regression on a given basis of functions. Longstaff and Schwartz [13] demonstrated the efficiency of their least-squares approach through a number of numerical examples, and in [6] and [17] general convergence properties of the method were established. In particular, it follows from Corollary 3.10 in [17] that for a fixed number of stopping opportunities and a popular choice of polynomial basis functions of degree less than or equal to , the error of estimating the corresponding value function at one point is of order
[TABLE]
where is the number of paths used to perform regression, is related to the smoothness of the corresponding conditional expectation operator, and is the dimension of the underlying state space. On the other hand, the computational cost of the least-squares MC algorithm is of order due to the computation of a (random) pseudo-inverse at every stopping date. After balancing the variance and the approximation errors in (1), one obtains that the complexity of the least-squares approach, that is, the (minimal) number of “elementary” evaluations needed to construct an approximation for the value function with accuracy , is bounded up to a constant not depending on by
[TABLE]
This implies
[TABLE]
Furthermore, if we next want to construct an approximation for a continuous time optimal stopping problem, then we need to let resulting in the complexity bound
[TABLE]
where it is assumed that the error due to the time discretization is of order for some independent of This implies that
[TABLE]
showing that the complexity of the least squares algorithms for continuous optimal stopping problems may even grow faster than . Similar complexity bounds can be derived for other simulation-based approximation algorithms, see [9] for a novel nested-type MC approach with complexity depending polynomially on and exponentially in
We call a problem semi-tractable if there is an algorithm to solve it with complexity satisfying
[TABLE]
Our definition of tractability should be contrasted to the definition in [14] where a problem is said to be (weakly) tractable, if there is an algorithm to solve it with complexity satisfying
[TABLE]
This definition seems to be counter-intuitive as it renders a problem with, for example, an algorithmic complexity of order to be (weakly) tractable while an algorithm with complexity is not. In our setting the dimension is typically fixed and the complexity rate with respect to is of primary importance. In this paper we show that discrete time optimal stopping problems are semi-tractable in the sense of (4). To this end we revisit the mesh method of Broadie and Glasserman [5]. By enhancing it with a suitable regularisation, we prove that under mild conditions the complexity of the resulting WSM (Weighted Stochastic Mesh) algorithm satisfies (4), provided the transition densities of the underlying Markov chain are analytically known or can be well approximated. Our algorithm bears some similarity to the random grid algorithm of Rust [15]. However, Rust [15] studied Markovian decision problems in discrete time with compact state space. Let us also remark that a complete convergence and complexity analysis of the mesh method is still missing in the literature; for some preliminary results see Agarwal and Juneja [1]. In the case of continuous time optimal stopping problems we need not assume that the transition densities are known but can use the Gaussian transition densities of the corresponding Euler scheme. This results in an algorithm which has complexity of order for some constant Although this does not imply semi-tractability of continuous time optimal stopping problems, the proposed algorithm is very simple and its complexity remains provably polynomial in as opposed to the least squares approaches. To compare different algorithms for continuous time optimal stopping problems, we introduce the so-called semi-tractability index
[TABLE]
It turns out that the WSM algorithm has the smallest semi-tractability index among existing algorithms for continuous time optimal stopping problems.
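As a rough numerical illustration of why a bound of the form ε^{-4} log^{d+2}(1/ε) (the discrete-time bound quoted in the Findings above) is benign in the dimension, the following sketch evaluates the ratio log C / log(1/ε) for that bound. The function names are hypothetical, and working with L = log(1/ε) directly is only a device to avoid overflow. For every fixed d the ratio approaches 4 as ε decreases, so the power of 1/ε does not grow with d; only the logarithmic factor does.

```python
import math

def log_complexity(log_inv_eps: float, d: int) -> float:
    """log of eps^{-4} * log^{d+2}(1/eps), written in terms of L = log(1/eps)."""
    return 4.0 * log_inv_eps + (d + 2) * math.log(log_inv_eps)

def effective_exponent(log_inv_eps: float, d: int) -> float:
    """log C / log(1/eps): the power of 1/eps that the bound behaves like."""
    return log_complexity(log_inv_eps, d) / log_inv_eps

if __name__ == "__main__":
    for d in (1, 10, 100):
        # Each row shows the ratio for L = 1e2, 1e4, 1e8 (smaller and smaller eps).
        print(d, [round(effective_exponent(L, d), 4) for L in (1e2, 1e4, 1e8)])
```

The convergence is slow for large d, which reflects the factor log^{d+2}(1/ε), but the limit is the same for every d.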
The paper is organized as follows. A description of the proposed algorithm is given in Section 2. Section 2.2 is devoted to convergence and complexity analysis of our algorithm. In Section 3 we turn to continuous time optimal stopping problems. All proofs are collected in Section 5.
2 Discrete time optimal stopping problems
We begin with the description of the WSM algorithm for discrete time optimal stopping problems. Let us assume a finite set of stopping dates for some natural and let be a Markov chain in adapted to a filtration For a given set of nonnegative reward functions on we then consider the discrete Snell envelope process:
[TABLE]
where stands for the set of -stopping times with values in the set and stands for the -conditional expectation, and the measurable functions exist due to Markovianity of the process
For simplicity and without loss of generality we assume that the Markov chain is time homogeneous with -steps transition density denoted by and one-step density denoted by so that
[TABLE]
Fix some and assume that It is well known that the Snell envelope (6) satisfies the dynamic programming principle,
[TABLE]
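To make the backward recursion concrete, here is a minimal sketch on a finite-state chain, where the conditional expectation is an exact matrix-vector product and no simulation is needed. The finite state space, the function name `snell_envelope` and the toy transition matrix are assumptions made for illustration; the paper works with a chain on a continuous state space.

```python
def snell_envelope(P, rewards):
    """Backward induction V_J = g_J, V_j = max(g_j, E[V_{j+1} | X_j]).

    P       : one-step transition matrix of a time-homogeneous chain, P[s][t]
    rewards : rewards[j][s] = g_j(s) >= 0 for dates j = 0..J
    returns : list V with V[j][s] the value at date j in state s
    """
    n_states = len(P)
    V = [[float(g) for g in rewards[-1]]]          # terminal condition V_J = g_J
    for g in reversed(rewards[:-1]):
        nxt = V[0]
        continuation = [sum(P[s][t] * nxt[t] for t in range(n_states))
                        for s in range(n_states)]
        V.insert(0, [max(g[s], continuation[s]) for s in range(n_states)])
    return V

if __name__ == "__main__":
    # Toy lazy random walk on three states with a put-like reward.
    P = [[0.5, 0.5, 0.0], [0.25, 0.5, 0.25], [0.0, 0.5, 0.5]]
    rewards = [[2.0, 1.0, 0.0]] * 3                # same reward at dates 0, 1, 2
    print(snell_envelope(P, rewards)[0])
```

In the toy example the value at date 0 exceeds the immediate reward in the two states where waiting is worthwhile, and equals it where stopping at once is optimal.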
Next we fix some and define a truncated version of the above dynamic program via
[TABLE]
where Thus, by construction, vanishes outside the ball Also by construction it holds that
[TABLE]
which is easily seen by backward induction. In view of (8) we may write
[TABLE]
Now assume that we have a set of trajectories with simulated according to the one-step transition density and consider the approximation:
[TABLE]
where in view of the Chapman-Kolmogorov equation
[TABLE]
Hence we have approximately
[TABLE]
We thus propose the following algorithm. We start with
[TABLE]
for Once is constructed on the grid for we set
[TABLE]
for By construction, each function vanishes outside the ball Working all the way down to results in the approximation:
[TABLE]
for As such the presented algorithm is closely related to the mesh method of Broadie and Glasserman [5] apart from truncation at level and a special choice of weights.
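The backward recursion just described can be sketched in a few lines for a one-dimensional chain. This is a sketch under stated assumptions, not a transcription of (11): the weights below are self-normalised likelihood ratios in which the marginal density of the next-date nodes is replaced by its Chapman-Kolmogorov mesh average, and all names (`gauss_density`, `simulate_paths`, `mesh_value`, the Gaussian random walk, the radius argument) are hypothetical choices made for illustration.

```python
import math
import random

def gauss_density(x, y, sigma=0.5):
    """One-step transition density p(x, y) of a toy Gaussian random walk."""
    z = (y - x) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def simulate_paths(x0, n_paths, n_dates, sigma=0.5, seed=0):
    """Simulate n_paths trajectories X_0 = x0, X_1, ..., X_{n_dates}."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        path = [x0]
        for _ in range(n_dates):
            path.append(path[-1] + sigma * rng.gauss(0.0, 1.0))
        paths.append(path)
    return paths

def mesh_value(paths, reward, density, radius):
    """Mesh-type backward induction with truncation outside [-radius, radius]."""
    n = len(paths)
    last = len(paths[0]) - 1
    # Terminal values: truncated reward at the last date.
    values = [reward(last, p[last]) if abs(p[last]) <= radius else 0.0 for p in paths]
    for j in range(last - 1, -1, -1):
        nodes_now = [p[j] for p in paths]
        nodes_next = [p[j + 1] for p in paths]
        # Mesh average of p(X_j^m, y): proxy for the marginal density at date j+1.
        marginal = [sum(density(x, y) for x in nodes_now) / n for y in nodes_next]
        new_values = []
        for x in nodes_now:
            if abs(x) > radius:                    # truncated value vanishes here
                new_values.append(0.0)
                continue
            ratios = [density(x, y) / m for y, m in zip(nodes_next, marginal)]
            total = sum(ratios)                    # normalise so weights sum to one
            continuation = sum(r * v for r, v in zip(ratios, values)) / total
            new_values.append(max(reward(j, x), continuation))
        values = new_values
    return sum(values) / n                         # all paths share the start x0

if __name__ == "__main__":
    put = lambda j, x: max(1.0 - x, 0.0)
    print(mesh_value(simulate_paths(1.0, 200, 3), put, gauss_density, 5.0))
```

Each date requires a density evaluation for every pair of mesh nodes, so the work in this sketch grows quadratically in the number of paths.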
2.1 Cost estimation
Let us estimate the cost of carrying out the backward dynamic program (11). One needs to compute for all This can be done at a cost of order where is the cost of evaluating a (typical) function of arguments. In the typical situation is proportional to The evaluation of
[TABLE]
for has a cost of order with being the cost of an elementary numerical operation, which is negligible if So the overall cost of carrying out the backward dynamic program (11) is of order
2.2 Error and complexity analysis
In this section we analyze convergence of the WSM estimate (11) to the solution of the discrete optimal stopping problem (6) for and a fixed as Let us first bound the distance between and
Proposition 1
With
[TABLE]
it holds that
[TABLE]
Proposition 2
Suppose that
[TABLE]
and that
[TABLE]
Suppose further that for some and
[TABLE]
for all One then has
[TABLE]
Next we control the discrepancy between and
Proposition 3
With
[TABLE]
and such that it holds that
[TABLE]
Corollary 4
Under the assumptions of Proposition 2, we have for (17) the estimate
[TABLE]
where the last inequality follows from for any Then by combining (16) with Proposition 3 we obtain the error estimate,
[TABLE]
Proposition 5
Under the assumptions of Proposition 2 the complexity of the WSM algorithm is bounded from above by
[TABLE]
where and are natural constants and stands for the cost of computing the transition density at one point
Corollary 6
For a fixed the discrete time optimal stopping problem (6) with and satisfying (13), (14) and (15) is semi-tractable, provided that the complexity of computing the transition density at one point is at most polynomial in
Different approximation algorithms for discrete time optimal stopping problems can be compared using the semi-tractability index (5). For example, it follows from (3) that the semi-tractability index of the least-squares (LS) approach is equal to Hence it tends to [math] as the smoothness of the problem increases. Moreover from inspection of Theorem 2.4 in [3], we see that the Quantisation Tree (QT) method has semi-tractability index
2.3 Approximation of the transition density
A crucial condition for semi-tractability to hold is availability of the transition density of the chain in closed form. However it can be shown that if a sequence of approximating densities converging to can be constructed in such a way that
[TABLE]
for some and a sequence then under proper assumptions on the growth of and the cost of computing (in fact it should be at most polynomial in ), one can derive a complexity bound satisfying
[TABLE]
To construct a sequence of approximations satisfying the assumption (20), one can use various small-time expansions for transition densities of stochastic processes, see, for example, [2] and [12]. Let us exemplify this type of approximation in the case of one-dimensional diffusion processes of the form:
[TABLE]
where is a bounded function, twice continuously differentiable, with bounded derivatives and is a function with three continuous and bounded derivatives such that there exist two positive constants with Consider a Markov chain defined as a time discretization of that is, for some Under the above conditions the following representation for the (one-step) transition density of the chain is proved in [8] (see also [7] for a more general setting):
[TABLE]
with
[TABLE]
where is a standard Brownian bridge, and
[TABLE]
By expanding the exponent in (21) into Taylor series, we get for small enough
[TABLE]
with
[TABLE]
If is uniformly bounded by a constant , then the above series converges uniformly in and for all small enough. Set
[TABLE]
It obviously holds for and
[TABLE]
uniformly for all Hence the assumption (20) is satisfied with provided that for some depending only on Similarly if then (20) holds. To sample from we can use the well-known acceptance-rejection method, which does not require exact knowledge of the scaling factor .
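The acceptance-rejection step can be sketched as follows. The target used here is a toy perturbed Gaussian chosen for illustration and is not the approximating density of this section; the names `rejection_sample`, `toy_target` and `std_normal_density` are hypothetical. The point of the sketch is that only an unnormalised target and a dominating constant are required.

```python
import math
import random

def std_normal_density(y):
    """Proposal density q: standard normal."""
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)

def toy_target(y):
    """Unnormalised toy target: Gaussian kernel times a bounded perturbation."""
    return math.exp(-0.5 * y * y) * (1.0 + 0.5 * math.sin(y))

def rejection_sample(unnormalised_target, proposal_sampler, proposal_density,
                     bound, n, rng):
    """Draw n samples from the density proportional to unnormalised_target.

    Requires unnormalised_target(y) <= bound * proposal_density(y) for all y.
    The normalising constant of the target is never used.
    """
    samples = []
    while len(samples) < n:
        y = proposal_sampler(rng)
        u = rng.random()
        if u * bound * proposal_density(y) <= unnormalised_target(y):
            samples.append(y)
    return samples

if __name__ == "__main__":
    rng = random.Random(0)
    # toy_target(y) <= 1.5 * exp(-y^2 / 2) = 1.5 * sqrt(2 pi) * q(y)
    bound = 1.5 * math.sqrt(2.0 * math.pi)
    draws = rejection_sample(toy_target, lambda r: r.gauss(0.0, 1.0),
                             std_normal_density, bound, 10000, rng)
    print(sum(draws) / len(draws))
```

For this toy target the exact mean is 0.5·exp(-1/2), which gives a simple check on the sampler.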
3 Continuous time optimal stopping for diffusions
In this section we consider diffusion processes of the form
[TABLE]
where and are Lipschitz continuous and is a -dimensional standard Wiener process on a probability space . As usual, the (augmented) filtration generated by is denoted by We are interested in solving optimal stopping problems of the form:
[TABLE]
where is a given real valued function on and stands for the set of stopping times taking values in . The problem (24) is related to the so-called free boundary problem for the corresponding partial differential equation. Let us introduce the differential operator :
[TABLE]
where
[TABLE]
We denote by (or ) the solution of (23) starting at moment from Denote by a regular solution of the following system of partial differential inequalities:
[TABLE]
then under some mild conditions (see, e.g. [10])
[TABLE]
that is,
With this notation established, it is worth discussing the main issue that we are going to address in this section. Our goal is to estimate at a given point with accuracy less than by an algorithm with complexity which is polynomial in . As already mentioned in the introduction, some well-known algorithms such as the regression ones fail to achieve this goal (at least according to the existing complexity bounds in the literature).
Let us introduce the Snell envelope process:
[TABLE]
where (somewhat more general than in (24)) is a given nonnegative function on In the first step we perform a time discretization by introducing a finite set of stopping dates with and some natural number, and next consider the discretized Snell envelope process:
[TABLE]
where stands for the set of stopping times with values in the set Note that the measurable functions exist due to Markovianity of the process The error due to the time discretization is well studied in the literature. We will rely on the following result which is implied by Thm. 2.1 in [3] for instance.
Proposition 7
Let be Lipschitz continuous and Then one has that
[TABLE]
where the constants depend on the Lipschitz constants for and respectively.
In order to achieve an acceptable discretization error we choose a sufficiently large and then concentrate on the computation of
In the next step we approximate the underlying process using some strong discretization scheme on the time grid yielding an approximation It is assumed that the one step transition densities of this scheme are explicitly known. The simplest and the most popular scheme is the Euler scheme,
[TABLE]
which in general has strong convergence order and the one-step transition density of the chain is given by
[TABLE]
with and Now we will turn to the discrete time optimal stopping problem with possible stopping times . To this end we introduce the discrete time Markov chain adapted to the filtration and (while abusing notation slightly) and consider the discretized Snell envelope process
[TABLE]
where stands for the set of stopping indices with values in and the measurable functions (or ) exist due to Markovianity of the process (or ). The distance between and is controlled by the next proposition.
Proposition 8
There exists a constant depending on the Lipschitz constants of and such that
[TABLE]
Thus, combining Proposition 7 and Proposition 8 yields the following.
Corollary 9
If is constructed by the Euler scheme with time step size where is the number of discretization steps, then under the conditions of Proposition 7 and Proposition 8 we have that
[TABLE]
where stands for inequality up to constant depending on and
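The Euler step and its Gaussian one-step transition density can be sketched in one dimension as follows. The function names and the toy coefficients are assumptions made for illustration; the density is simply the normal density with mean x + a(x)h and variance b(x)²h implied by one Euler step of size h.

```python
import math
import random

def euler_step(x, drift, diffusion, h, rng):
    """One Euler step of dX = drift(X) dt + diffusion(X) dW over a step of size h."""
    return x + drift(x) * h + diffusion(x) * math.sqrt(h) * rng.gauss(0.0, 1.0)

def euler_density(x, y, drift, diffusion, h):
    """Density at y of the Euler step started at x: N(x + a(x) h, b(x)^2 h)."""
    mean = x + drift(x) * h
    var = diffusion(x) ** 2 * h
    return math.exp(-((y - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

if __name__ == "__main__":
    drift = lambda x: 0.05 * x        # toy coefficients, not taken from the paper
    diffusion = lambda x: 0.2 * x
    rng = random.Random(0)
    x_next = euler_step(100.0, drift, diffusion, 0.1, rng)
    print(x_next, euler_density(100.0, x_next, drift, diffusion, 0.1))
```

Because the density is available in closed form, it can be plugged into a mesh-type algorithm in place of the unknown transition density of the diffusion, provided the diffusion coefficient does not vanish at the current state.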
Since the transition densities of the Euler scheme are explicitly known (see (29)), the WSM algorithm can be directly used for constructing an approximation based on the paths of the Markov chain To derive the complexity bounds of the resulting estimate, we shall make the following assumptions.
(AG)
Suppose that is such that
[TABLE]
(AX)
Assume that there exists a constant such that for all
[TABLE]
uniformly in (hence ). This assumption is satisfied under Lipschitz conditions on the coefficients of the SDE (23), and can be proved using the Burkholder-Davis-Gundy inequality and the Gronwall lemma.
(AP)
Assume furthermore that is time homogeneous with transition densities that satisfy the Aronson type inequality: there exist positive constants and such that for any and any it holds that
[TABLE]
This assumption holds if the coefficients in (23) are bounded and is uniformly elliptic.
The next proposition provides complexity bounds for the WSM algorithm in the case of continuous time optimal stopping problems.
Proposition 10
Assume that the assumptions (AG), (AX) and (AP) hold, then
- the cost of computing in (30) for a fixed with precision via the WSM algorithm is bounded above by
[TABLE]
- the cost of computing with an accuracy via the WSM algorithm is bounded by
[TABLE]
The first statement follows directly from Proposition 5 by taking in (19), and Then by setting we obtain (35) (with possibly modified natural constants ).
Discussion
As can be seen from (35),
[TABLE]
and this shows the efficiency of the proposed algorithm compared to the existing algorithms for continuous time optimal stopping problems, at least as far as the semi-tractability index is concerned. Indeed, the only algorithm available in the literature with a provably finite limit of type (36) is the quantization tree algorithm (QTA) of Bally, Pagès, and Printems [3]. By letting the number of stopping times and the quantization number tend to infinity such that the corresponding errors in Thm. 2.4-b in [3] are balanced, we derive the following complexity upper bound
[TABLE]
Hence
4 Numerical experiments
In the following experiments we illustrate the WSM algorithm in the case of continuous time optimal stopping problems. Lower bounds for the WSM algorithm can be obtained using a suboptimal policy computed on an independent set of trajectories. This policy can be constructed either directly via (10) or by using interpolation of the likelihood weights
[TABLE]
The fastest and simplest way to do this is to use nearest neighbour interpolation based on a training set of trajectories; in all experiments below the number of neighbours was set to
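A minimal sketch of nearest neighbour interpolation in one dimension is given below. The function name, the toy data and the choice k = 3 are assumptions made for illustration (the number of neighbours used in the experiments is not reproduced here), and `train_y` stands generically for whatever quantity is known on the training mesh, for example the likelihood weights mentioned above.

```python
def knn_interpolate(train_x, train_y, x, k):
    """Average of train_y over the k training points closest to x."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)

if __name__ == "__main__":
    train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
    train_y = [0.0, 1.0, 4.0, 9.0, 16.0]
    # Evaluate at a point that is not a mesh node, with a hypothetical k = 3.
    print(knn_interpolate(train_x, train_y, 2.1, 3))
```

Sorting all training points costs more than necessary for large meshes; a spatial index would be the usual remedy, at the price of a longer example.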
4.1 An American put on a single asset
In order to illustrate the performance of the WSM algorithm in continuous time, we consider the financial problem of pricing an American put option on a single log-Brownian asset
[TABLE]
with denoting the riskless rate of interest, assumed to be constant, and denoting the constant volatility. The payoff function is given by and a fair price of the option is given by
[TABLE]
No closed-form solution for the price of this option is known, but there are various numerical methods which give accurate approximations to . The parameter values used are . An accurate estimate for the true price obtained via a binomial tree type algorithm is (see [11]). In Figure 1 we show lower bounds due to WSM, the least squares approach of Longstaff and Schwartz [13] (LS) and the value function regression algorithm of Tsitsiklis and Van Roy [16] (VF) as functions of the number of stopping times forming a uniform grid on These lower bounds are constructed using a suboptimal stopping rule based on estimated continuation values evaluated on a new independent set of trajectories. The maximal degree of the polynomials used as basis functions in LS and VF is indicated by the numbers ( and ) in the legend. As can be seen, the WSM lower bounds are more stable when increases. The VF lower bounds seem to diverge as
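A reference price of the kind mentioned above can be obtained from a standard Cox-Ross-Rubinstein binomial tree, sketched below. This is a generic textbook tree, not necessarily the algorithm of [11]; the parameter values in the usage example are hypothetical (the values used in the paper are not reproduced here), and a plain put payoff max(K − s, 0) is assumed.

```python
import math

def american_put_crr(s0, strike, r, sigma, maturity, n_steps):
    """American put price on a Cox-Ross-Rubinstein binomial tree."""
    h = maturity / n_steps
    u = math.exp(sigma * math.sqrt(h))             # up factor
    d = 1.0 / u                                    # down factor
    q = (math.exp(r * h) - d) / (u - d)            # risk-neutral up probability
    disc = math.exp(-r * h)
    # Payoffs at maturity; node k has k up moves and n_steps - k down moves.
    values = [max(strike - s0 * u ** k * d ** (n_steps - k), 0.0)
              for k in range(n_steps + 1)]
    for step in range(n_steps - 1, -1, -1):
        values = [
            max(strike - s0 * u ** k * d ** (step - k),                 # exercise now
                disc * (q * values[k + 1] + (1.0 - q) * values[k]))     # continue
            for k in range(step + 1)
        ]
    return values[0]

if __name__ == "__main__":
    # Hypothetical parameters chosen only for illustration.
    print(american_put_crr(100.0, 100.0, 0.05, 0.2, 1.0, 500))
```

The tree restricts exercise to the grid dates, so refining the grid plays the same role as increasing the number of stopping times in the experiments.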
5 Proofs
5.1 Proof of Proposition 1
For the statement reads
[TABLE]
so it holds in this case. Suppose (12) is true for Then, by using and the fact that vanishes for
[TABLE]
Hence we have by induction,
[TABLE]
5.2 Proof of Proposition 2
Combining the assumptions (32) and (33) yields,
[TABLE]
Using
[TABLE]
we get (note that ),
[TABLE]
for ( for ). Now by (12), i.e. Proposition 1, we get
[TABLE]
whence the estimate (16).
5.3 Proof of Proposition 3
Let us write the sample based backward dynamic program (11) for step in the form,
[TABLE]
by defining the weights
[TABLE]
where is fixed and suppressed. Let us further abbreviate
[TABLE]
for a generic Borel function Using,
[TABLE]
(38), and we thus get
[TABLE]
using that the weights in (39) sum up to one. One thus gets by iterating (40),
[TABLE]
since Let us now introduce
[TABLE]
and consider the generic term
[TABLE]
Due to (9) one has,
[TABLE]
and due to (39) and (42) we may write,
[TABLE]
and so obtain,
[TABLE]
We are now going to estimate
[TABLE]
It holds that
[TABLE]
with
[TABLE]
Now consider the i.i.d. random variables,
[TABLE]
which have zero mean. Then, by Cauchy-Schwarz one has that
[TABLE]
Concerning Term2, let us write
[TABLE]
where is an independent dummy trajectory. We thus have
[TABLE]
where for the random variables
[TABLE]
are i.i.d. and have zero mean. We thus have by Cauchy-Schwarz again,
[TABLE]
Secondly, one has
[TABLE]
Next it follows that
[TABLE]
Further, one obviously has that and since
[TABLE]
By now taking the expectation in (41) and gathering all together we obtain,
[TABLE]
assuming that is taken such that
5.4 Proof of Proposition 5
In order to achieve a required accuracy let us take and large enough such that both error terms in (18) are equal to Hence, we first take
[TABLE]
that is when Then take, with denoting asymptotic equivalence for up to some natural constant,
[TABLE]
Thus, the computational workload (complexity) is given by
[TABLE]
where is a natural constant. Now let us write
[TABLE]
Then, using the elementary estimate for and assuming that (44) implies (19).
5.5 Proof of Proposition 8
On the one hand one has
[TABLE]
and on the other one has similarly
[TABLE]
Hence we get
[TABLE]
due to the strong order of the Euler scheme, with being some Lipschitz constant for
References
[1] Ankush Agarwal and Sandeep Juneja. Comparing optimal convergence rate of stochastic mesh and least squares method for Bermudan option pricing. In Proceedings of the 2013 Winter Simulation Conference: Simulation: Making Decisions in a Complex World, pages 701–712. IEEE Press, 2013.
[2] Robert Azencott. Densité des diffusions en temps petit: développements asymptotiques. I. In Seminar on probability, XVIII, volume 1059 of Lecture Notes in Math., pages 402–498. Springer, Berlin, 1984.
[3] Vlad Bally, Gilles Pagès, and Jacques Printems. A quantization tree method for pricing and hedging multidimensional American options. Math. Finance, 15(1):119–168, 2005.
[4] Denis Belomestny and John Schoenmakers. Advanced Simulation-Based Methods for Optimal Stopping and Control: With Applications in Finance. Springer, 2018.
[5] M. Broadie and P. Glasserman. A stochastic mesh method for pricing high-dimensional American options. Journal of Computational Finance, 7(4):35–72, 2004.
[6] Emmanuelle Clément, Damien Lamberton, and Philip Protter. An analysis of a least squares regression method for American option pricing. Finance and Stochastics, 6(4):449–471, 2002.
[7] D. Dacunha-Castelle and D. Florens-Zmirou. Estimation of the coefficients of a diffusion from discrete observations. Stochastics, 19(4):263–284, 1986.
[8] Daniéle Florens-Zmirou. On estimating the diffusion coefficient from discrete observations. J. Appl. Probab., 30(4):790–804, 1993.
