Asynchronous Communications Library for the Parallel-in-Time Solution of Black-Scholes Equation
Qinmeng Zou, Guillaume Gbikpi-Benissan, Frederic Magoules

TL;DR
This paper presents an implementation of asynchronous parallel algorithms for solving the Black-Scholes equation, demonstrating improved efficiency and resource management through the use of the JACK2 library.
Contribution
It introduces an implementation of asynchronous parareal algorithms using JACK2, addressing resource management and convergence detection challenges.
Findings
Asynchronous algorithms improve computational efficiency.
JACK2 effectively manages resources in parallel computations.
Experimental results confirm the method's effectiveness.
Abstract
The advent of asynchronous iterative scheme gives high efficiency to numerical computations. However, it is generally difficult to handle the problems of resource management and convergence detection. This paper uses JACK2, an asynchronous communication kernel library for iterative algorithms, to implement both classical and asynchronous parareal algorithms, especially the latter. We illustrate the measures whereby one can tackle the problems above elegantly for the time-dependent case. Finally, experiments are presented to prove the availability and efficiency of such application.
| Time | |||||
|---|---|---|---|---|---|
| 0.05 | 23.9476 | 23.9426 | 0.0050 | 0.0002 | 0.547 |
| 0.15 | 31.5512 | 31.5477 | 0.0035 | 0.0001 | 1.663 |
| 0.25 | 37.7225 | 37.7192 | 0.0033 | 0.0001 | 2.881 |
| 0.35 | 42.9995 | 42.9960 | 0.0035 | 0.0001 | 3.830 |
| 0.45 | 47.6375 | 47.6339 | 0.0036 | 0.0001 | 4.814 |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Asynchronous Communications Library for the Parallel-in-Time Solution of Black-Scholes Equation
Qinmeng Zou CentraleSupélec, Université Paris-Saclay, France.
Guillaume Gbikpi-Benissan CentraleSupélec, Université Paris-Saclay, France.
Frédéric Magoulès CentraleSupélec, Université Paris-Saclay, France (correspondence, [email protected]).
Abstract
The advent of asynchronous iterative scheme gives high efficiency to numerical computations. However, it is generally difficult to handle the problems of resource management and convergence detection. This paper uses JACK2, an asynchronous communication kernel library for iterative algorithms, to implement both classical and asynchronous parareal algorithms, especially the latter. We illustrate the measures whereby one can tackle the problems above elegantly for the time-dependent case. Finally, experiments are presented to prove the availability and efficiency of such application.
Keywords: asynchronous iterative algorithms; parallel computing; parareal method; time domain decomposition; option pricing
1 Introduction
The parareal algorithm is a time domain decomposition method aiming at solving PDEs over a splitting of the original domain into multiple subdomains, over which the equation could be solved independently. The algorithm is therefore suited for parallel programmin by distributing the computational cost over several processors. The parallelization with respect to the time variable was earlier attempted by Nievergelt [18] and by Miranker and Liniger [17] for the ordinary differential equations. The parareal algorithm is another extension of decomposition methods in the time direction, first introduced in [10] that has grown in popularity as it has been successfully applied to a wide variety of problems. The algorithm consists of two solvers with different time steps, the coarse and the fine, to produce a convergent iterative method for parallel computations. The coarse solver solves the equation sequentially on the coarse time step while the fine solver uses the results from the coarse solution to solve, in parallel, over the fine time steps. The solution can therefore be obtained from independent time subdomain, over which the algorithm will iteratively use a coarse and a fine solver to compute the solution on the entire domain. Several extensions have been proposed following the theoretical analysis. Bal and Maday in [6] slightly modified the algorithm, and proposed an analysis of the convergence properties of the algorithm for parabolic equations.
On the other hand, the asynchronous iterative scheme appeared on the horizon since Chazan and Miranker proposed their pioneering ideas applied to the linear systems [8], leading to a prosperity to the high performance applications, such as sub-structuring methods [16] and optimized Schwarz methods [15]. We are interested in the asynchronous iterations of the parareal method. Compared to the classical parareal algorithm, the asynchronous version gives further attractive properties including high efficiency and flexibility. Nevertheless, the implementation of the asynchronous iterative methods involves huge amount of work besides the theoretical analysis, which leads to a frustrating down time for the programming activities. Fortunately, some libraries have been developed to fill this gap. JACE [5] is a multi-threaded library aiming at the execution of asynchronous iterative algorithms based on Java, followed by a centralized volatility tolerant extension named JaceV [4]. Recently, they have developed some P2P and decentralized versions to enlarge the applicable scope [3, 7]. Additionally, CRAC [9] is another library designed to build the asynchronous applications based on C++. Finally, JACK [14] is an MPI-based C++ library, which provides many advanced properties designed for parallel iterative algorithms. A new version called JACK2 has been released. It gives an improved user-friendly API and several new convergence detection methods with global residual computation.
In this paper we concentrate on the implementation of asynchronous parareal solver based on JACK2 library. Section 2 recalls the mathematical and computational framework of the parareal algorithm. In Section 3, we presents the implementation details of such methods by JACK2. In Section 4, we choose the option pricing models to illustrate the experimental results of our asynchronous solver. Finally, our conclusions and remarks follow in Section 5.
2 Mathematical and Computational Framework
Consider a time-dependent problem
[TABLE]
where is a second-order linear elliptic operator. After a decomposition of the time domain, we arrive at
[TABLE]
where the time domain is defined in a series of , with , . It is easily seen that the problem (1) is also an exact model. Nevertheless, we often choose an accurate enough sub-interval to approximate the original problem as shown in Figure 1.
We note here that the limit of each subproblem satisfies
[TABLE]
which enables us to execute a continuous computation. A well known model proposed in [10] can be described as the following
[TABLE]
where is a coarse propagator, and is a fine propagator. Accordingly, represents a coarse time step, whereas is a fine time step. We notice that this is a synchronous parareal iterative scheme, over which one can come up with the algorithmic model forthwith
Algorithm 1*.*
(Classical Parareal Algorithm)
rank of current processor
for to do
end for
while not convergence do
wait for the update of from processor
send to processor as
end while
Evidently, the first processor has only successor, whereas the last one has none but predecessor.
We now turn to the asynchronous mode. Formally, we view parareal as a two-stage problem which can be formalized carefully using the asynchronous scheme, illustrated as below
[TABLE]
where and , with the conditions , . We assume that each processor keeps updating rather than resting permanently. Furthermore, the elements used by this processor will be renewed from time to time. Similarly, such computational model can be written as
Algorithm 2*.*
(Asynchronous Parareal Algorithm)
rank of current processor
for to do
end for
while not convergence do
if detect from processor then
update
end if
send to processor as
end while
It is seen that Algorithm 2 has no waiting process in the internal loop, whereas Algorithm 1 requires a latency time, which might cause a waste of time on coordination, but gain through the number of iterations.
3 Implementations
3.1 Preprocessing
The parareal algorithm is indeed a special case of domain decomposition methods, since each time frame only depends upon its predecessor, and essentially needed by its successor. We separate the neighbors into the outgoing links and the incoming links, and thus write the code as Listing 1. It is seen that sneighb_rank and rneighb_rank have respectively one element in the communication graph. Take note that the first processor has no predecessor, therefore ; while the last one has no successor, obviously, with .
The key idea behind this project is that there exist two levels over the computation process. The upper object is the instance of Parareal, which provides the interface of JACK2’s power, whereas the lower level is the PDESolver, used to solve the equations. We notice that it is easy to enlarge the number of solvers without changing the code of Parareal. Accordingly, we need two instances of PDESolver in Parareal to obtain respectively the coarse results and the fine results, illustrated in Listing 2.
coarse_vec_U is the solution of coarse_pde, while fine_vec_U is the solution of fine_pde. vec_U is the final solution, which is also the output of Parareal solver. Finally, vec_U0 is the incoming value for both coarse_pde and fine_pde.
3.2 Overview of the Iterations
We now present the iterative process of parareal algorithm. Notice that the beginning parts of Algorithm 1 and Algorithm 2 are the same, we can therefore write in a unified scheme in Listing 3.
Let us mention here that Listing 3 gives us the exact position of vec_U0 and the initial estimation of . Nevertheless, we need to distinguish the synchronous and the asynchronous parareal algorithm afterwards.
3.2.1 Synchronous mode
We follow the scheme of Algorithm 1 except for the convergence detection mode, since there is a rule of thumb to lighten the communication. It is easily seen that the processor will stop updating after iteration. Hence, we can apply a supplementary condition to the judging area in the synchronous mode, with the final code shown as Listing 4.
We note that it is necessary to copy the values of recv_buf to vec_U0 and the values of vec_U to send_buf whenever needed. However, we can also ease the process by setting directly the two vectors as communication buffers. In practice, the choice relates to the mathematical library used throughout the project. Moreover, using the results of Listing 4, it is also shown in Listing 5 that we need to finalize the fine solution and wait for the global termination.
3.2.2 Asynchronous mode
The asynchronous parareal scheme can be implemented similarly with Listing 4 except for the matching conditions. There is no more facile code in practice, and we present the process as Listing 6. From the entry of lconv_flag, JACK2 can interact with the asynchronous process and give back results as res_norm.
4 Experimental Results
In this section, we give the experiments for the asynchronous parareal scheme, based on the Black-scholes equation, which is a well-known time-dependent equation in the domain of option pricing. Consider the following problem
[TABLE]
where is the option price, depending on stock price and time . Volatility and risk-free interest rate are the constant parameters. Then, we employ some trivial variable substitutions to obtain the heat equation
[TABLE]
where is the time to maturity, with initial and boundary conditions
[TABLE]
where , , . Obviously, equation (2) can be applied to the classical and the asynchronous parareal scheme. We note here that the input variables are the current stock price and the exercise price . Finally, we broaden the spatial domain and let the dimension correspond to the precision of our problem.
The mathematical operations are supported by Alinea library [11], which is implemented in C++ for both central processing unit and graphic processing unit devices. It includes several linear algebra operations [1] and numerical linear algebra solvers [13], [2], [12].
The experiments are exercised on a SGI ICE X cluster connected with InfiniBand. Each node consists of two Intel Xeon E5-2670 v3 2.30 GHz CPUs with SGI-MPI 2.14 installed. We assume that , , with 250 sub-intervals. Given , , , several average results are reported in Table 1 with Approximate Option Prices , Exact Option Prices , Absolute Error and Relative Error .
It is seen that the asynchronous parareal algorithm implemented by JACK2 is efficient and accurate.
5 Conclusions
In this paper, we illustrated the implementation of synchronous and asynchronous parareal algorithm using JACK2, an asynchronous communication kernel library for iterative algorithms. We discussed the particularity of such time-dependent problem, which leads to a rather different configuration than the general parallel context. As expected, the experimental results at the end verified the correctness of the asynchronous scheme.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] A.-K. C. Ahamed and F. Magoulès. Fast sparse matrix-vector multiplication on graphics processing unit for finite element analysis. In 14th IEEE Int. Conf. on High Performance Computing and Communications, Liverpool, UK, June 25-27, 2012 . IEEE, 2012.
- 2[2] A.-K. C. Ahamed and F. Magoulès. Iterative methods for sparse linear systems on graphics processing unit. In 14th IEEE Int. Conf. on High Performance Computing and Communications, Liverpool, UK, June 25-27, 2012 . IEEE, 2012.
- 3[3] J. M. Bahi, R. Couturier, and P. Vuillemin. Jace P 2P: an environment for asynchronous computations on peer-to-peer networks. In Proc. of 2006 IEEE Int. Conf. on Cluster Computing , pages 1–10, 2006.
- 4[4] J. M. Bahi, R. Couturier, and P. Vuillemin. Jace V: A programming and execution environment for asynchronous iterative computations on volatile nodes. In Proc. of 7th Int. Conf. VECPAR, Rio de Janeiro, Brazil, June 10-13, 2006 , pages 79–92. Springer, 2007.
- 5[5] J. M. Bahi, S. Domas, and K. Mazouzi. Jace: a Java environment for distributed asynchronous iterative computations. In Proc. of 12th Euromicro Conf. on Para. Dist. and Network-Based Proc., 2004 , pages 350–357, 2004.
- 6[6] G. Bal and Y. Maday. A “parareal” time discretization for non-linear pde’s with application to the pricing of an american put. In L. F. Pavarino and A. Toselli, editors, Recent Developments in Domain Decomposition Methods , pages 189–202. Springer, 2002.
- 7[7] J.-C. Charr, R. Couturier, and D. Laiymani. JACEP 2P-V 2: A fully decentralized and fault tolerant environment for executing parallel iterative asynchronous applications on volatile distributed architectures. In Proc. of 4th Int. Conf. GPC 2009, Geneva, Switzerland, May 4-8, 2009 , pages 446–458. Springer, 2009.
- 8[8] D. Chazan and W. Miranker. Chaotic relaxation. Linear Algebra and its Applications , 2(2):199–222, 1969.
