Branching Particle Pricers with Heston Examples
Michael A. Kouritzin, Anne MacKay

TL;DR
This paper evaluates sequential Monte Carlo methods, especially particle branching algorithms, for efficient path-dependent option pricing under the Heston model, demonstrating significant performance improvements over other resampling techniques.
Contribution
It introduces and advocates for the effective particle branching algorithm within importance-sampling Monte Carlo methods for improved path-dependent option pricing.
Findings
Particle branching algorithms outperform resampling in certain cases.
Explicit solutions and importance sampling enhance simulation efficiency.
Numerical comparisons validate the recommended approach.
Abstract
The use of sequential Monte Carlo within simulation for path-dependent option pricing is proposed and evaluated. Recently, it was shown that explicit solutions and importance sampling are valuable for efficient simulation of spot price and volatility, especially for purposes of path-dependent option pricing. The resulting simulation algorithm is an analog to the weighted particle filtering algorithm that might be improved by resampling or branching. Indeed, some branching algorithms are shown herein to improve pricing performance substantially while some resampling algorithms are shown to be less suitable in certain cases. A historical property is given and explained as the distinguishing feature between the sequential Monte Carlo algorithms that work on path-dependent option pricing and those that do not. In particular, it is recommended to use the so-called effective particle…
| PS1 | PS2 | PS3 | |
| 100 | 100 | 100 | |
| 0.02 | 0.02 | 0.02 | |
| 0.085 | 0.424 | 0.225 | |
| 6.21 | 6.00 | 2.86 | |
| 0.2 | 0.8 | 0.6 | |
| -0.7 | -0.75 | -0.96 | |
| 0.501 | 0.11 | 0.07 | |
| 9 | 3 | 3 | |
| 8.50 | 2.65 | 2.50 |
| PS1 | PS2 | PS3 | |
|---|---|---|---|
| 2 | 6 | 6 | |
| 50 | 50 | 50 | |
| 1.05 | 1.05 | 1.05 | |
| 1.055 | 1.05 | 1.045 | |
| 0.2 | 2 | 1.5 | |
| 2 | 4 | 6 | 8 | |
| PS1 | ||||
| RMSE | 0.0017 | 0.0022 | 0.0026 | 0.0022 |
| Run time | 3.98 | 7.14 | 10.32 | 13.50 |
| PS2 | ||||
| RMSE | 0.4223 | 0.0315 | 0.0177 | 0.0044 |
| Run time | 1.90 | 2.99 | 4.09 | 5.21 |
| PS3 | ||||
| RMSE | 0.2054 | 0.0105 | 0.0047 | 0.0050 |
| Run time | 1.90 | 2.98 | 4.09 | 5.21 |
| PS2 | PS3 | |||
| Branching | Bootstrap | Branching | Bootstrap | |
| 3.0 | 2.0 | 3.0 | 2.0 | |
| 0.05 | 0.05 | 0.05 | 0.10 | |
| 2.0 | 1.0 | 2.0 | 1.0 | |
| 0.10 | 0.10 | 0.10 | 0.10 | |
| 1.0 | 0.5 | 1.0 | 0.5 | |
| 0.05 | 0.05 | 0.10 | 0.10 | |
| 0.5 | 0.25 | 0.8 | 0.5 | |
| 0.02 | 0.02 | 0.10 | 0.10 | |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Branching Particle Pricers with Heston Examples
Michael A. Kouritzin
Michael A. Kouritzin
Department of Mathematical and Statistical Sciences
University of Alberta
Edmonton (Alberta)
Canada T6G 2G1
URL: http://www.math.ualberta.ca/Kouritzin_M.html](mailto:[email protected]%0A)
and
Anne MacKay
Anne MacKay
Department of Mathematics
Université du Québec à Montréal
Montreal (Quebec)
Canada H3C 3P8
University of Alberta and UQAM
Abstract.
The use of sequential Monte Carlo within simulation for path-dependent option pricing is proposed and evaluated. Recently, it was shown that explicit solutions and importance sampling are valuable for efficient simulation of spot price and volatility, especially for purposes of path-dependent option pricing. The resulting simulation algorithm is an analog to the weighted particle filtering algorithm that might be improved by resampling or branching. Indeed, some branching algorithms are shown herein to improve pricing performance substantially while some resampling algorithms are shown to be less suitable in certain cases. A historical property is given and explained as the distinguishing feature between the sequential Monte Carlo algorithms that work on path-dependent option pricing and those that do not. In particular, it is recommended to use the so-called effective particle branching algorithm within importance-sampling Monte Carlo methods for path-dependent option pricing. All recommendations are based upon numeric comparison of option pricing problems in the Heston model.
Key words and phrases:
American Options, Sequential Monte Carlo, Branching Processes, Heston Model, Stochastic Approximation.
Partial funding in support of this work was provided by NSERC discovery grants and FRQNT research support for new academics.
1. Introduction
Several years ago, Heston (1993) introduced an asset model that has withstood the test of time and become widely popular. Its popularity stems from the availability of a closed-form solutions for European option prices and the inclusion of stochastic volatility. It is, in a way, the “Black-Scholes model of the stochastic volatility world” and is widely used to model stock, bond and foreign currency prices. Recently, Kouritzin (2018) introduced explicit weak solutions to the two-dimensional stochastic differential equation (SDE) solutions of the Heston model that are easy to simulate.
Let be a complete filtered (risk-neutral) probability space supporting (scalar) independent standard Brownian motions , . Then, Heston (1993) modeled an asset price and its volatility by
[TABLE]
with parameters , and . This model has an explicit solution (for all ) with respect to when for some . Otherwise, one can still produce an explicit solution with respect to a new probability up until (a stopping time when) the volatility drops too low.
Option prices that do not allow closed-form solutions are often priced by Monte Carlo simulation. Using Kouritzin (2018)’s explicit solutions, it is possible to simulate multiple Heston paths efficiently. For reasons that will become apparent in the next paragraphs, we often refer to each of these simulated paths as particles. Each particle has the desired distribution (the one under which we want to price) under the pricing measure , and the associated likelihood (or Radon-Nikodym derivative) L_{t}^{j}=\frac{P}{\widehat{P}^{j}}\big{|}_{\mathcal{F}_{t}} can be seen as its weight. For this reason, the simulation algorithm for option pricing of Kouritzin (2018) can be thought of as the analog of a weighted particle filter in sequential Monte Carlo methods, producing a weighted empirical measure of the form at each time step. Here, denotes Dirac measure and denotes the path of the -particle’s price and volatility over but held constant from on. The purpose of the present paper is to discuss the use of resampling and branching in the option pricing procedure in much the same manner as resampling and branching are used in sequential Monte Carlo methods.
The weighted particle filter (credited to Handschin (1970), Handschin & Mayne (1969)) also produces an object of the form of as an approximation of the unnormalized filter in problems like tracking and model selection. While the application and the exact likelihood form differ, one can think of the overall object in the same way in the context of Monte Carlo Heston simulation. The particle likelihoods provide relative odds that the simulated paths are good representation of the paths of the underlying model. However, there is a well known problem in particle filtering (or sequential Monte Carlo): If left alone, as increases, many of the simulated particles move away from the underlying model , effectively shrinking the particle system to fewer and fewer relevant particles (i.e. particles that are good representations of the model).
The solution in sequential Monte Carlo was to resample or branch the particles regularly in an unbiased manner so as to kill bad particles while replicating the good ones with high probability. The first and still most popular algorithm, called the bootstrap algorithm, introduced by Gordon et al. (1993), simply redistributes the different particles independently back to their last observed locations according to their relative likelihoods. This algorithm was later improved by the residual resampling of Liu & Chen (1998), stratified resampling of Kitagawa, (1996), combined residual-stratified resampling discussed in Douc & Cappé (2005) and systematic resampling of Carpenter et al. (1999). In all cases, enough resampling was done that the weights could be reset to after resampling. A different approach was taken in Kouritzin (2017b). Here, particles are branched individually rather than resampled collectively and branching only occurs when there is sufficient need. Indeed, the cost of rebalancing the system by resampling and branching is the introduction of noise that immediately degrades estimates. Too much resampling or branching is thus undesirable. With this limited branching, weights are kept after branching. Different branching methods, the residual, combined, dynamic and effective particle branching algorithms, were presented and shown to outperform all of the resampled particle algorithms mentioned above on tracking and model selection (see Kouritzin (2017b)). The simplest of these algorithms, the residual algorithm, has also been analyzed in Kouritzin (2017a).
Herein, we focus on two of these new branching algorithms and introduce them in the Heston simulation framework of Kouritzin (2018). Separately, we show empirically that the bootstrap algorithm is less suitable for pricing options that are strongly path-dependent, mainly because its excessive resampling affects the distribution of path estimates. Conversely, we demonstrate easy deployment and effectiveness of the new branching algorithms and recommend the use of the so-called effective particle branching within simulation-based, path-dependent option pricing problems when for any (for there is no need to consider branching or resampling when ). To do so, we focus on pricing path-dependent options and options allowing for early exercise.
Path-dependent options, including American, Asian and swap options, can be effectively priced using simulation and dynamic programming (DP). Paths of the underlying option prices and other relevant market variables (e.g. stochastic volatility) are first simulated up to maturity of the option. Then, the DP algorithms run backwards in time through each simulation (i.e. particle) comparing the payoff resulting from immediate exercise to the discounted future (risk-neutral) expected payout if the option is not exercised. In particular, American (and other continuously-executable) options are approximated discretely, resulting in Bermudan-style options whose optimal exercise time is determined recursively starting at the option maturity date. A key breakthrough in this approach by Longstaff & Schwartz (2001) and others (see Carriere (1996), Tsitsiklis and Van Roy (2001)) was the realization that this conditional expectation of discounted future payout could be estimated using the simulated particle paths and, further, that the (uncomputable) conditional expectations could be approximated by easily-computable projections. Initially, least-squares regression was used in this projection process. However, this requires solving an often-ill-conditioned system of linear equations. Kouritzin (2018) then proposed replacing regression with one-step stochastic approximation (SA), which can allow more efficient, accurate pricing without numerical stability issues, but whose performance is sensitive to the chosen step gain. To reduce the sensitivity of the algorithm to the selection of the step gain, we recommend a modification of the algorithm based on Polyak and Juditsky (1992). Herein, we further develop this Monte Carlo pricing approach, started by Longstaff & Schwartz (2001), by introducing importance sampling into the approach. Finally, we demonstrate the advantage of resampling through the use of branching algorithms over resampling with interacting algorithms such as the multinomial bootstrap.
The rest of this paper is laid out as follows: In the next section, we recall the Heston model’s weak solutions and present the weighted Heston simulation algorithm in a framework suggestive of resampling or branching. In Section 3, we recall the bootstrap algorithm and explain why it is not suitable for American-style option pricing. We also introduce the branching algorithms and insert them into the weighted Heston algorithm to yield new combined and effective particle branching Heston simulators. In Section 4, we insert these branching simulators into the SA/DP option pricing algorithm. Section 5 consists of our empirical results regarding option pricing. Finally, Section 6 contains our conclusions.
2. Explicit and Weighted Heston Algorithms
Heston (1993) developed a closed form solution for European call options in a specific stochastic volatility model. Broadie & Kaya (2006) then came up with an exact simulation method for the Heston model. Both of these methods are based upon characterizing and evolving the distribution of . Later, Kouritzin (2018) showed that the Heston SDEs were actually explicitly solvable in a way that is very convenient for simulation, which facilitates the pricing of exotic options whose prices cannot be written in closed-form. Specifically, if we let
[TABLE]
for some , then the following result is developed in Kouritzin (2018):
Theorem 1**.**
Let ; ; be given random variables with ; and be independent standard Brownian motions on . Let also
[TABLE]
where for with satisfying . Define
[TABLE]
Then, is a stopping time and is a likelihood (i.e. a non-negative -martingale with respect to for any such that for all ). Moreover, are independent standard Brownian motions and
[TABLE]
on with respect to .
If we take in (2.1) (or equivalently in (2.3)), then
[TABLE]
is called the closest explicit Heston model. Then, Theorem 1 provides a means to produce weighted particles of the desired Heston model until the volatility drops below and to resort to the closest explicit model thereafter.
Remark 1** (Notation).**
We are using for solutions to the closest explicit Heston model under , reserving for the solution to (1.1). Henceforth, we will use and .
It is necessary to stop (at ) before the volatility gets too small. Indeed, when the volatility reaches , the closest explicit and general Heston volatility equations become deterministic
[TABLE]
and it is obvious which solution one has. The model distributions are then singular to each other when .
Suppose are i.i.d. copies of , where are as in Theorem 1, and denotes the (continuous) paths of the particle. Then, by the (measure-valued) strong law of large numbers, we find that
[TABLE]
where the right hand side means the distribution of from (2.7). However, the weights will typically be very uneven so the effective number of particles, approximated here by the ratio N_{T}^{eff}\coloneqq\left(\sum\limits_{k=1}^{N}\widetilde{L}_{T}^{k}\right)^{2}\Big{/}\sum\limits_{k=1}^{N}\left(\widetilde{L}_{T}^{k}\right)^{2}, is much smaller than and the convergence rate is slow (in ). We will introduce resampling and branching to prevent this from happening.
3. Sequential Monte Carlo and Branching Heston Algorithm
The results of the previous sections can be applied to the efficient simulation of Heston paths and their associated likelihood, which are then used to weight each path in option price calculations. In order to avoid the case in which a small number of paths have significantly higher likelihoods than the rest, thus increasing the variance of the estimate and reducing the performance of the simulation algorithm, we resort to regular resampling of the particles.
We first recall the bootstrap algorithm, which will later be shown to be less suitable for American option pricing. We then present two branching algorithms that can improve the performance of the Heston simulation procedure of Kouritzin (2018).
3.1. Bootstrap Algorithm
The main idea behind the bootstrap algorithm is to construct a multinomial distribution using the likelihoods associated with each particle and to sample from this distribution. For clarity, in Algorithm 1, we summarize the bootstrap algorithm as it might be used within the simulation of the Heston model stock prices and volatilities.
Remark 2**.**
The step described in line 6 was purposely vague so as not to detract from the bootstrap algorithm. These Heston simulation details appear below and can be inserted in place of this step. However, we argue and show experimentally that this bootstrap algorithm does not perform as well as other resampling algorithms for American-type option pricing and we recommend not using it.
The problem with the bootstrap algorithm in path-dependent option pricing is that there is too much resampling noise. Suppose that we kept track of each particle through all times. When resampling occurs, a “child particle” moves to a “parent” particle site and one can create historical paths by appending the child’s path to its parent’s. If we just focus on discrete times, then it is shown in Del Moral et al. (2001) that for the bootstrap algorithm,
[TABLE]
meaning the limit is not the (discrete-time) process distribution of the desired Heston model but rather product measure of the marginals, i.e. independence. This means that the historical property (2.8) does not hold and estimating the conditional expectation of future discounted payoff can not be determined correctly from the cross-sectional data when the bootstrap is used, which has a significant impact on path-dependent option pricing. If one mistakenly calculated this estimate with bootstrap, he or she would wind up with a regular expectation instead of a conditional one, which is not useful to many option pricing problems. A good way to correct for this is to switch to the branching algorithms in Kouritzin (2017b).
We note that there is a second (non-bootstrap) resampling algorithm in Del Moral et al. (2001) that does have the required historical property. However, this second algorithm is slower and arguably more complicated than the branching algorithms discussed below.
3.2. Branching Algorithms
Compared to collective resampling algorithms such as the bootstrap one presented above, individual branching improves speed by simplifying decisions. The decision to branch (or kill) a particle (that is, the decision to resample the same particle more than once, or not at all) is based upon the weight of that individual particle and the total weight of all particles, and not on the individual values of the other ones. Our basic branching framework is described in Algorithm 2.
The key steps of Algorithm 2 are described between lines 10 and 27. They serve determine the new number of particles and weights in an unbiased manner. The main idea of the branching algorithm is the following. When the prior weight for particle is extreme (that is, outside of a certain interval around the average weight), we do (limited) residual-style branching, which helps to preserve the process distribution. Particles that are branched result in zero or more particles, which are assigned the average weight , are added at the same location as the parent. In other words, we copy (or kill) the path with extreme prior weight and give the copies, if there are any, the current average weight. When its prior weight is not extreme, a particle is not branched and gets to keep its prior weight. The flexibility in this class of algorithms is how we determine “extreme”.
The resampling parameter determines the size of the interval around the average weight outside of which particles are considered extreme and are branched. This parameter can be obtained using different methods (see below) and controls the amount of branching: If , then there is no branching. If , then every particle is branched. Both of these cases are rarely good. Instead, one looks for a good fixed (combined branching) or even makes depend upon the system of particles (effective particle branching).
When it is necessary to record the whole path and not just its final value, we just append a child’s path to that of its parent. It follows that each child at time has a unique parent at time and grandparent but a particle at time might have no or several children at time . Then, the historical paths (or a discretization thereof) can be used for option pricing. Indeed, in Section 5, we price options using
[TABLE]
The precise mathematical proof will resemble that of the strong law of large numbers in Theorem 5.1 in Kouritzin (2017a) but is left to future work. More information on this branching algorithm framework can be found in Kouritzin (2017b).
One key aspect of these branching algorithms discussed in Kouritzin (2017b) is that the number of particles do not vary wildly. Wild particle number variation could affect performance and cause a method to fail. Notice that the average weight in line 6 is normalized by the initial number of particles (rather than the current), which forces the expected number of future particles given the current to be the initial .
We present below two branching options. The second one is a refinement of the first one, and it should perform better. We mention that the performance of the branching algorithms is tied to how well they control their particle numbers, as well as under what circumstances they will branch more, and direct the reader to Kouritzin (2017b) for more details and other branching algorithms.
3.2.1. Combined Branching
This is the basic algorithm used in Algorithm 2. It uses stratified resampling to help control the number of particles and improve algorithm performance. The are negatively correlated, which is desirable for particle control. The generation of a large number of new particles will be more likely to be compensated after by the generation of a smaller number of new particles, because of the negative correlation between .
In lines 10 to 16 of Algorithm 2, we handle the non-branched particles while moving the ones that will be branched to the front of the line. In line 20, we create the required number of stratified uniform random numbers and in lines 23 to 27, we branch the particles designated for branching. We have so we are actually using a residual technique (see Kouritzin (2017b)). However, generating negatively correlated , rather than independent one as in the Residual branching algorithm of Kouritzin (2017b) reduces the probability of getting mostly large or mostly small uniform random numbers. This reduces the variation in and thus in the number of particles.
Remark 3**.**
In combined branching, we use a suitable fixed . The better values are model dependent and our choices are given within each example in Section 5. For most problems, the optimal proportion of paths that are branched at each time step is between and . For a given problem, can be chosen by minimizing the root-mean-square error (RMSE), or another measure of the difference, between the theoretical and estimated values of quantities that are known theoretically, such as the expectation of or . This minimization can be performed using a stochastic gradient descent algorithm. In many cases, the RMSE will not be very sensitive to . When this is the case, one should choose to manage the trade off between adding branching noise now and re-distributing the particles for better future use.
3.2.2. Effective Particle Branching
In the prior method, a constant was used for simplicity only. If one is in need of fast, accurate option pricing, then there are better choices.
The uneven weights that arise from direct use of Theorem 1 result in an effective number of particles, , that can be estimated. In our setting, the estimated number of effective and non-effective particles are:
[TABLE]
From the second equality, we see that if all are the same so all particles are equally effective and if all but one of the were [math] so there is only one effective particle. (None of the can be zero but they can be arbitrarily close.) Otherwise, it gives us a number somewhere in between that can be interpreted as the effective number of particles. It is very reasonable to anticipate better results when branching either more or fewer particles in the situation there are few effective ones. A first intuition might lead us to the conclusion that it is better to branch more in order to obtain more effective particles immediately. However, if those few particles with high weights happen to be wrong, then we will likely be creating a large number of copies of these “bad” paths. We do not assume either a priori but rather, in the effective particle branching algorithm, we set
[TABLE]
for experimentally determined constants , and let the data decide.
3.3. Weighted and Explicit Heston Simulation
We now turn our attention to line 4 in the basic branching algorithm (this step also appears in line 6 of the bootstrap algorithm), in which are evolved to obtain . Here, we recall how to perform this step using Theorem 1 (see also Kouritzin (2018)).
Defining constants
[TABLE]
we find that (2.2,2.4) can be rewritten as
[TABLE]
which simplifies the simulation of the values at a future time step. Indeed, the stochastic integral in (3.3) is conditionally (given ) Gaussian since and are independent so it can be simulated as a centered normal random variable with variance . Even the likelihood (3.4) avoids stochastic integrals.
There are a number of ways to compute the two deterministic integrals in (3.3) and (3.4). Kouritzin (2018) mentions that Simpson’s rule with is a good choice. In general, the choice of should depend on the Heston’s model parameters, more specifically on the ratio , and on the size of the time steps. Indeed, if is low (or in our case, if is close to 2) a larger , i.e. a finer discretization, should be used in order for the simulated ’s not to drop too close to zero. When is higher, numerical tests have shown that the precision of the estimate is not affected by the use of a smaller (see Section 5.1). When this is the case, it is better to use a lower value for to speed up the algorithm. Of course, if the time steps used in the simulation are longer, a larger should be selected to increase the precision of the deterministic integral. However, since one of the advantages of Algorithm 3 is its speed, we recommend using it to price path-dependent options when the financial market needs to be simulated at short time intervals.
For the presentation of the algorithm, it will be notationally convenient to remove the tildes and to define two more constants
[TABLE]
The resulting procedure to simulate paths of the Heston model using time steps of length 1 is given in Algorithm 3.
Remark 4**.**
Algorithm 3 is presented in its most basic form. Its performance can be further improved through the use of antithetic variates for the ’s. Another way to further speed up the algorithm could be to generate a large sample of Normal random variables and to re-use them in such a way to minimize the dependence thus created. The application of this method is left for future work.
4. Pricing by SA/DP Algorithm
We now turn our attention to pricing American-style options in the Heston model. The risk-neutral (pricing) measure we use going forward is the one denoted by in (2.7). In other words, for any , and we must resort to Algorithm 3, along with branching, to simulate paths of for pricing purposes.
We consider an option that can be exercised only once, at any time before or at maturity. The payoff process of the option represents the amount received by the investor given that the option is exercised at , for 111Since we only consider discrete exercise times, we are technically pricing Bermudan options. For ease of notation, we use a subset of to denote the possible exercise times, without loss of generality. We denote this process by and assume that it satisfies either
[TABLE]
for some non-negative measurable function , where is the running average of the underlying asset price. For example, using the notation , we have that for an American put or for an Asian put with early exercise. Recall from (2.7) that denotes the drift of under the risk-neutral measure, and so discounts the payoff from time to [math].
Remark 5**.**
In numerical implementations, the running average price should be recursively calculated from the price by setting and
[TABLE]
for .
The price at of an American-style option with payoff process , denoted here by is defined as
[TABLE]
where denotes the collection of stopping times taking values in and denotes expectation with respect to . By letting be such that and , we have .222For more details on pricing American options, see for example Chapter 21 of Björk (2009).
American option pricing problems therefore involve calculating for different values , which can be done using Monte Carlo simulations and dynamic programming. This so-called least-squares Monte Carlo method has the advantage of being highly flexible (see e.g. Longstaff & Schwartz (2001), Clément et al., (2002)). Here, we recall the main ideas of the method and present an algorithm that makes use of stochastic approximation techniques to increase execution speed and numerical stability (see also Kouritzin (2018)).
As explained below, a key step of the pricing algorithm relies on the projection of the future payoff. Clément et al., (2002) used the general assumptions discussed below to ensure that this projection can be done correctly. We first note that the procedure relies on the underlying model being a Markov chain and on the payoff being of the form . Therefore, when pricing payoffs depending on the running average, it is necessary to add a third state variable, , to our underlying model. We denote by the state space of the Markov process . Then, the following assumptions are required to use projections to approximate .
- •
Total: there are measurable -valued functions on such that is total333A subset of a Hilbert space is total if its span is the entire space. on for all . 444Clément et al., (2002) also assumed functions such that for all . However, one can replace with the filtration generated by and then these functions exist for our payoff process.
- •
Non-singular: is positive definite for all , where .
The point of these conditions is that we need a suitable collection of functions that can be used for projection of a future payoff. Typically, is an ordering of the basis functions . Different choices of basis functions are possible. Longstaff & Schwartz (2001) mention the weighted Laguerre, Hermite, Legendre, Chebyshev, Gegenbauer and Jacobi polynomials. Kouritzin (2018) also discusses using a modification to the Haar polynomials. Both articles use the weighted Laguerre polynomials in numerical examples, which is what we do in Section 5.
Theoretically, suitable stopping times can be obtained by working backwards from :
[TABLE]
Remark 6**.**
Typically, so and do not affect the recursion in (4.2).
In practice, the conditional expectations and stopping times are not immediately computable. Therefore, we follow Longstaff & Schwartz (2001) and approximate the conditional expectations by projections onto the closed linear span of (see also Clément et al., (2002), Kouritzin (2018)), which leads to the following approximate stopping times, which depends on the number of projection functions used
[TABLE]
The problem is still not solved because this projection, , is computed using coefficients that are not known. However, Longstaff & Schwartz (2001) showed that these coefficients can be estimated using cross-sections of a Monte Carlo particle system via linear regression. Kouritzin (2018) then showed that stochastic approximation can be a favorable alternative to the original least-squares approach. Neither of these works contemplated using the method in a branching particle system, which is what we do here.
In addition, to improve the robustness and the performance of the algorithm, we use the average of the coefficients in the projection. Averaging the coefficients was first proposed by Polyak and Juditsky (1992) and is now a widely used method that can speed up the convergence of the stochastic approximation algorithm. We recall our branching particle system and apply the stochastic approximation algorithm to our setting next.
Hereafter, suppose is as in Section 3.2 so (3.1) holds, where satisfies (2.7) and or , with . Moreover, define the particle-by-particle stopping times
[TABLE]
and suppose
[TABLE]
satisfy a.s. and a.s.555 The algorithms were designed so that these weakly-dependent SLLN’s hold. However, there is future mathematical work to be done along the lines of Kouritzin (2017a) to obtain theoretical convergence results., where
[TABLE]
and
[TABLE]
with \frac{d\widetilde{P}}{dP}\bigg{|}_{\mathcal{F}_{t}}=L_{t}. Now, suppose is defined by and , and by
[TABLE]
for . Then, as explained in Kouritzin (2018) a.s. for any and
[TABLE]
which with help of Clément et al., (2002) yields
[TABLE]
and gives a means to estimate the option value as well as optimal execution . This procedure is described in Algorithm 4.
Algorithm 4 can be improved by using the average of the coefficients to approximate . This modification is done by initializing for all , by adding the following line after Line 11:
[TABLE]
and by calculating the projection in line 15 using . Generally, the exponent in the gain step is equal to 1 when there is no averaging. However, the best value for is generally below with averaging. With the right parameters, averaging the coefficients speeds up convergence of the algorithm. Another important advantage of this modification is that it significantly increases the robustness of the algorithm with respect to the parameters and ; the new algorithm is thus easier to use in practice because convergence is achieved for wider intervals of parameters.
There is still is a potential problem since the projection coefficients depend upon the stopping times and the stopping times depend upon the projection coefficients. Fortunately, the dependence can be decoupled: depends upon and depends upon , which means that we have to work backwards, use the fact and compute prior to . This is reflected in the following algorithm, in which are the chosen basis functions and is a positive constant.
5. Numerical Results
In this section, we illustrate how branching can be used to increase the accuracy and efficiency of Algorithm 3. This algorithm was shown in Kouritzin (2018) to be an interesting tool to price path-dependent options in the Heston model. However, a disadvantage of the simulation method is that it relies on likelihood, or weights, whose variance increase in time, possibly reducing the accuracy of Monte Carlo price estimators.
Here we consider three different parameter sets to identify cases where branching clearly improves the performance of the weighted simulation algorithm. The parameters considered are given in Table 1.777The parameters were chosen to illustrate the efficiency of the branching algorithms rather than for their fit to a specific set of financial data. We show that, for a given number of simulations, branching is often very effective in increasing the precision of price estimates and in decreasing their variance.
To perform the simulations, we discretize the time horizon using time steps per year. We first consider both Asian and European straddle options with maturity , thus using for European options and for Asian options.
In all cases, the use of the explicit weak solution of Theorem 1 necessitates the simulation of weights since is not an integer.
We first price options without early exercise to highlight the effect of branching on the efficiency of the simulation algorithm. We then price American-type options using the algorithm of Section 4 and show that when re-sampling is needed, branching algorithms should be used instead of the bootstrap algorithm.
5.1. Efficiency of the Simulation Algorithms
In this section, we price financial options using the branching algorithms introduced in Section 3.2 in combination with the weighted simulation method (Algorithm 3). The Monte Carlo price estimates are calculated using the weighted simulated price paths and we assume that no early exercise is allowed. In other words, the price estimate of an option with payoff is given by
[TABLE]
Simulations are obtained using Algorithm 3 with Simpson’s rule with time steps per year and sub-intervals (see Table 2 for specific values) to compute the deterministic integrals. The branching parameters (for combined branching) and , (for effective branching), and the stopping parameter were chosen in the following way:
- (1)
Find minimizing . 2. (2)
Using , find (for combined branching) or , (for effective branching) minimizing .
Using the known quantities and allows for efficient calibration of the branching parameters. We note however that, depending on the problem, the objective function is not smooth, since it is based on simulations, and often almost flat around its minimum, which makes it more difficult to identify optimal parameters. However, it is possible to identify intervals of good parameters that can be used in the algorithms.
We note that in the cases of PS2 and PS3, is close to 2, which is the threshold for the Feller condition. Thus, while the Feller condition ensuring that does not hit 0 is satisfied in all cases, we expect to drop much closer to 0 when using PS2 and PS3, compared to PS1. Since the explicit solution underlying the weighted simulation algorithm only holds until gets arbitrarily close to 0, parameter sets that keep away from 0 should be “easier” to simulate from. This explains why for the second and the third parameter set, we need smaller time sub-intervals to simulate (that is, ) and a higher . Indeed, setting helps catch the few paths where the volatility drops too much, causing the weights to explode. When this happens, the weight is set to 0; the particle is effectively dropped from the Monte Carlo estimate, inducing a bias, which we show to be small.
5.1.1. Accuracy of the Price Estimate
There exists situations where branching is crucial to obtaining a fast and accurate pricing algorithm. We illustrate this situation by pricing a European straddle option (for which a closed form is available) using the market parameters of Table 1 and the simulation parameters of Table 2. That is, we calculate the Monte Carlo estimator of the option price using (5.1).
To test the accuracy of the various pricing algorithms, we use the relative RMSE
[TABLE]
where is the real option price calculated using the integral form available for European option prices in the Heston model (see Heston (1993) and Albrecher et al. (2007)). We compute the relative RMSE for simulations.
Figure 1 shows that, for the second and third parameter sets (PS2 and PS3), the relative RMSE of the price estimate is significantly reduced by branching. Branching is however less necessary for the first parameter set (PS1), for which the weighted algorithm 3 alone results in very precise estimates.
As mentioned above, the branching parameters , and were chosen to optimize the performance of the pricing algorithm. Depending on the parameters used, the resulting optimal proportion of branched particles vary. This proportion is much lower for PS1 (under 20%) than for PS2 and PS3 (between 30% and 35%).
The choice of , the number of intervals used to approximate the numerical integrals, depends on the parameters. Table 3 shows the sensitivity of the RMSE of the price estimate to changes in . To isolate the impact of on the RMSE, we perform all calculations without resampling the particles. The results obtained with PS1 are already very precise with ; increasing it does not reduce the RMSE, but slows down the algorithm. This is not true for PS2 and PS3; increasing has a significant impact of the RMSE. We note that in the case of PS2, using could lead to more precise results, while also further increasing the execution time. We also observe that for a fixed , the algorithm is much slower for PS1, since it requires the simulation of Gaussian processes (compared to for PS2 and PS3). However, using allows the run time to remain low.
5.1.2. Performance of the Branching Algorithms
We have shown that for a given number of initial particles, branching can increase the accuracy of the price estimate. This additional step can however slow down the pricing algorithm. In this section, we consider the standard deviation of the price estimate as a function of time. In addition to European options, we also price Asian options with fixed strike , for which there does not exist a closed-form expression. Because of its speed, the weighted simulation algorithm is particularly well-suited for path-dependent options (see also Kouritzin (2018)).
To compare the performance of the algorithms, we consider the standard deviation of the estimate relative to the true option price, which we approximate by computing the price estimate 50 times. The true option price is computed using the semi-analytical formula in the case of the European option. The mean of 50 price estimates obtained using and the weighted algorithm with effective particle branching is considered to be the “true” price of the Asian option.
There are cases in which branching clearly increases the performance of the pricing algorithm. This is true for PS2 and PS3; it can be seen in Figures 2 and 3 that the standard deviation of the price estimator can be reduced much faster when branching is used. That is, although branching does slow down the pricing procedure, the resulting improvement is such that fewer initial particles are needed for the standard deviation of the price estimate to reach a level similar to the one reached without branching for a larger number of initial particles. For example, in the case of PS2 (see Figure 2(B)), the European price estimate using either branching algorithms and has a smaller standard deviation than the non-branched estimate obtained with five times more simulated paths. In this case, pricing with branching is more than 3 times faster. Significant improvements due to branching are also observed for the third parameter set and when pricing Asian options (see Figure 3).
In the case of PS1, the algorithm is already very precise without branching; additional resampling does not seem to be necessary. However, it does not significantly increase the RMSE or the standard deviation of the European option price estimate. Branching does however increase the standard deviation of the Asian option price estimate. This is most likely due to the path-dependent nature of the Asian option payoff; branching can lead to path degeneracy (the surviving particles having few distinct ancestors), which impacts the variance of the price estimate. Although path degeneracy does not appear be a significant problem in our numerical examples in general, the very high performance of the weighted simulation algorithm (without resampling) for PS1 leads to the variance of the price estimate increasing when resampling is added.
The need for resampling is linked to the size of the ratio . Indeed, this ratio is much higher for PS1 (); we could say that the Feller condition is “completely satisfied” and stays well away from 0. Further numerical tests also show that the variance of the likelihood is much lower than in the case of the other two parameter sets. PS2 and PS3 have much smaller ratios ; they are equal to 2.65 and 2.50, respectively. For such parameters, has a higher probability of dropping closer to 0, which increases the variance of the likelihoods and creates the need for more branching.
Our numerical results show that in general, when Algorithm 3 is used, a branching algorithm should be implemented alongside. Effective branching, of which combined branching is a special case, should be prioritized. When necessary, depending on the market parameters considered, branching can be turned off by setting and high enough such that no particle is branched. Improvements due to branching are most significant when the variance of the weights is high; for example, when is close to 2 (as with PS2 and PS3) or when the option to price has a longer maturity, since the variance of the particle system tends to increase with time.
Remark 7**.**
In our setting, the performance of the combined branching algorithm is similar to the one of the effective particle branching one. This is likely due to the fact that, in the cases we study, the number of effective particles remains relatively constant as increases. Thus, using both algorithms results in resampling a similar proportion of the particles at each time step, whether is kept constant (combined branching) or not (effective particle branching). In situations where the distribution of the weights can vary widely from one time step to the other (in particle filtering, for example), the increased flexibility provided by the effective particle branching algorithm can result in a significant performance improvement (see Kouritzin (2017b)). Since it is similar to the combined branching algorithm in terms of complexity of implementation and computation ressources, we still recommend the use of the effective particle branching algorithm, even in cases where its advantage is less obvious.
5.2. Comparison with Bootstrap Algorithm
In this section, we highlight an important advantage of the branching algorithms of Section 3.2 compared to the bootstrap algorithm of Section 3.1: the preservation of the process distribution. To do so, we price Asian options with early exercise, whose price strongly depends on the distribution of the historical paths. We focus on the second and the third sets of parameters (PS2 and PS3) since they are the ones for which the performance of the pricing algorithm is the most improved by resampling.
To price the options, we use the SA/DP algorithm exposed in Section 4 using the first weighted Laguerre polynomials as basis functions for each of volatility, price and average price. We also use the averaged coefficients in order to increase the robustness of the algorithm to the choice of parameters and . In our setting, compared to the well-known least-squares Monte Carlo algorithm, SA/DP allows us to use a large number of basis functions without sacrificing numerical stability and at a lower computing cost than the least-square Monte Carlo algorithm. This leads to more precise price estimates (see Kouritzin (2018)).
As explained in Section 3, the amount of noise introduced by bootstrap resampling can affect the historical law of the particle system, which is problematic when pricing options with early exercise via recursive conditional expectations determined from cross-sectional data, as in Algorithm 4. We price at-the-money Asian call options with fixed strike and early exercise (that is, we consider Bermudan-style options that can be exercised at every discrete time step). To do so, we use Algorithm 4 with and .
We consider options with one year to maturity and use the same market and simulation parameters as in Section 5.1. Figures 4 and 5(A) present the prices obtained using two resampling methods: effective branching and bootstrap resampling (see Algorithm 1). In each case, the empirical distribution presented is estimated using 50 price estimates. These results are compared to a “ground truth price” of 7.67 for PS2 and 6.89 for PS3, indicated by the red dashed line in the figures. These prices were computed without resampling, but with a large number of simulations () and basis functions (), to obtain a very precise result.
Table 4 in the Appendix contains the SA/DP parameters used for all calculations. These parameters control the speed at which the gain step goes to 0 as increases in the stochastic approximation algorithm. In our case, we do not want the gain steps to decrease too fast (that is, should not be too small and should not be too large), as we want each of the paths to have some impact on the estimates . In addition, since we use averaged coefficients to estimate the conditional expectations, we choose , as recommended in Section 4. Nonetheless, if is too large, or if is too small, then we may not average out the noise fast enough and can run out of particles before getting to a good estimate for . In Kouritzin (2018), a stochastic search method was used to find reasonable and . However, the averaging method that we employ herein is far less sensitive to particular choice of and . Within the interval of values for which the gain step decreased at a satisfying rate, we ran the algorithm for different values of and and chose those that minimized for .
Figures 4 and 5 clearly show that branching algorithms should be preferred over multinomial bootstrap when resampling is necessary to price American-type options. Indeed, the prices obtained with effective particle branching are in general closer to the true price than when bootstrap is used. For the second parameter set (PS2), prices obtained with resampling also show a clear convergence towards the true price as increases, while such a convergence is not observed when using bootstrap resampling. The prices obtained with the third parameter set (PS3) are not as clear in terms of convergence; a larger number of simulations is needed to reduce the bias of the price estimate.
It has been shown theoretically in Del Moral et al. (2001) that bootstrap resampling kills the process distribution of the simulated paths as the number of simulations gets very large. Our results support this empirically; the price estimates obtained using bootstrap do not converge as clearly to the true price, and the variance of the estimator is in all cases significantly higher than with effective particle branching. For this reason, branching algorithms, and effective branching in particular, should be preferred when importance sampling is used for pricing options with early exercise. Resampling algorithms based on particle branching are also typically faster than bootstrap algorithms, since less resampling occurs.
6. Conclusions and Future Work
Herein, we show that the simulation method presented by Kouritzin (2018) can be improved by resampling, especially when is close to 2. Since bootstrap resampling affects the process distribution of the particle system, the branching algorithms of Kouritzin (2017b), also presented in Section 3.2 in the context of option pricing, are a better alternative. In particular, effective particle branching offers more flexibility and allows the level of resampling to be a function of the state of the system. We also highlight that the use of stochastic approximation algorithms within the dynamic programming framework for American option pricing is still valid when historical price paths are obtained via branching resampling algorithms. Our use of the average coefficients in the pricing algorithm increases the robustness of the algorithm to the choice of parameters and . This is an important improvement over the algorithm used by Kouritzin (2018), as it increases the stability of the pricing method and makes it easier to use, since less work is needed to find the optimal parameters.
Future work will explore more advanced ways for the algorithms to “learn” the optimal resampling parameters. Additional improvements linked to recycling simulated random variables in order to speed up the simulation algorithm will also be considered.
Appendix A Additional Information on Numerical Results
Table 4 contains the stochastic approximation parameters used to obtain the numerical results of Section 5.2.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Albrecher et al. (2007) H. Albrecher, P. Mayer, W. Schoutens & J. Tistaert (2007) The little Heston trap. Wilmott , 1 , 83–92.
- 2Björk (2009) T. Björk (2009) Arbitrage theory in continuous time Oxford University press.
- 3Broadie & Kaya (2006) M. Broadie & O. Kaya (2006) Exact simulation of stochastic volatility and other affine jump diffusion processes. Operation Research 54 (2), 217–231.
- 4Carpenter et al. (1999) J. Carpenter, P. Clifford & P. Fearnhead (1999) Improved particle filter for nonlinear problems. IEE Proceedings-Radar, Sonar and Navigation , 146 (1), 2–7.
- 5Carriere (1996) J. F. Carriere (1996) Valuation of the early-exercise price for options using simulations and nonparametric regression. Insurance: mathematics and Economics 19 (1), 19–30.
- 6Clément et al., (2002) E. Clément, D. Lamberton & P. Protter (2002) An analysis of a least squares regression method for American option pricing. Finance and Stochastics , 6 (4), 449–471.
- 7Del Moral et al. (2001) P. Del Moral, M. Kouritzin, & L. Miclo (2001) On a class of discrete generation interacting particle systems. Electronic Journal of Probability , 6 .
- 8Douc & Cappé (2005) R. Douc & O. Cappé (2005) Comparison of resampling schemes for particle filtering. In: Image and Signal Processing and Analysis, 2005. ISPA 2005. Proceedings of the 4th International Symposium on , 64–69, IEEE.
