Thinning and Multilevel Monte Carlo for Piecewise Deterministic (Markov) Processes. Application to a stochastic Morris-Lecar model
Vincent Lemaire (LPSM UMR 8001), Mich\`ele Thieullen (LPSM UMR 8001),, Nicolas Thomas (LPSM UMR 8001)

TL;DR
This paper develops thinning-based approximation and multilevel Monte Carlo methods for Piecewise Deterministic Processes, demonstrating improved efficiency in simulating a stochastic Morris-Lecar neuron model.
Contribution
It introduces a novel MLMC approach using thinning for PDPs and applies it to a complex neuron model, showing enhanced simulation performance.
Findings
MLMC with thinning outperforms classical Monte Carlo in simulations
Strong and weak error estimates for PDPs and PDMPs are established
Application to a 2D Morris-Lecar model demonstrates practical benefits
Abstract
In the first part of this paper we study approximations of trajectories of Piecewise Deter-ministic Processes (PDP) when the flow is not explicit by the thinning method. We also establish a strong error estimate for PDPs as well as a weak error expansion for Piecewise Deterministic Markov Processes (PDMP). These estimates are the building blocks of the Multilevel Monte Carlo (MLMC) method which we study in the second part. The coupling required by MLMC is based on the thinning procedure. In the third part we apply these results to a 2-dimensional Morris-Lecar model with stochastic ion channels. In the range of our simulations the MLMC estimator outperforms the classical Monte Carlo one.
| , | |||
|---|---|---|---|
|
|
|||
| time (sec) | cost | |||||||
|---|---|---|---|---|---|---|---|---|
| 1 | 5.00e-01 | 4.32e-01 | 2.34e-01 | 1.52e-01 | 3.10e-01 | 2.16e+03 | 6.30e-02 | 3.43e+04 |
| 2 | 2.50e-01 | 2.59e-01 | 1.69e-01 | 3.87e-02 | 1.55e+00 | 8.47e+03 | 3.15e-02 | 2.69e+05 |
| 3 | 1.25e-01 | 1.17e-01 | 6.25e-02 | 9.78e-03 | 8.80e+00 | 3.34e+04 | 1.58e-02 | 2.12e+06 |
| 4 | 6.25e-02 | 5.67e-02 | 2.73e-02 | 2.47e-03 | 5.62e+01 | 1.32e+05 | 7.88e-03 | 1.68e+07 |
| 5 | 3.12e-02 | 2.50e-02 | -1.78e-03 | 6.21e-04 | 3.93e+02 | 5.24e+05 | 3.94e-03 | 1.33e+08 |
| time (sec) | cost | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 5.00e-01 | 3.89e-01 | 1.14e-01 | 1.38e-01 | 3.62e-01 | 2 | 2 | 0.1 | 2.60e+03 | 2.82e+04 |
| 2 | 2.50e-01 | 2.29e-01 | 1.19e-01 | 3.83e-02 | 1.44e+00 | 2 | 4 | 0.1 | 1.04e+04 | 1.16e+05 |
| 3 | 1.25e-01 | 1.21e-01 | 6.24e-02 | 1.07e-02 | 5.76e+00 | 2 | 7 | 0.1 | 4.22e+04 | 4.85e+05 |
| 4 | 6.25e-02 | 5.91e-02 | 1.38e-02 | 3.30e-03 | 2.69e+01 | 3 | 4 | 0.1 | 1.90e+05 | 2.37e+06 |
| 5 | 3.12e-02 | 3.47e-02 | -1.39e-02 | 1.01e-03 | 1.08e+02 | 3 | 6 | 0.1 | 7.71e+05 | 9.99e+06 |
| time (sec) | cost | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 5.00e-01 | 4.28e-01 | 1.98e-01 | 1.44e-01 | 3.13e-01 | 2 | 2 | 0.1 | 2.38e+03 | 2.50e+04 |
| 2 | 2.50e-01 | 2.47e-01 | 1.55e-01 | 3.72e-02 | 1.26e+00 | 2 | 3 | 0.1 | 9.46e+03 | 1.00e+05 |
| 3 | 1.25e-01 | 1.36e-01 | 8.90e-02 | 1.05e-02 | 5.00e+00 | 2 | 6 | 0.1 | 3.80e+04 | 4.11e+05 |
| 4 | 6.25e-02 | 6.22e-02 | 2.15e-02 | 3.41e-03 | 2.09e+01 | 3 | 4 | 0.1 | 1.58e+05 | 1.75e+06 |
| 5 | 3.12e-02 | 3.17e-02 | 6.07e-03 | 9.71e-04 | 8.35e+01 | 3 | 5 | 0.1 | 6.30e+05 | 7.02e+06 |
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsProbability and Risk Models · Stochastic processes and financial applications · Markov Chains and Monte Carlo Methods
Thinning and Multilevel Monte Carlo for Piecewise Deterministic (Markov) Processes.
Application to a stochastic Morris-Lecar model.
Vincent Lemaire [email protected] Michèle Thieullen11footnotemark: 1 [email protected] Nicolas Thomas11footnotemark: 1 [email protected] Laboratoire de Probabilités, Statistique et Modélisation (LPSM), UMR CNRS 8001, Sorbonne Université-Campus Pierre et Marie Curie, Case 158, 4 place Jussieu, F-75252 Paris Cedex 5, France
Abstract
In the first part of this paper we study approximations of trajectories of Piecewise Deterministic Processes (PDP) when the flow is not explicit by the thinning method. We also establish a strong error estimate for PDPs as well as a weak error expansion for Piecewise Deterministic Markov Processes (PDMP). These estimates are the building blocks of the Multilevel Monte Carlo (MLMC) method which we study in the second part. The coupling required by MLMC is based on the thinning procedure. In the third part we apply these results to a 2-dimensional Morris-Lecar model with stochastic ion channels. In the range of our simulations the MLMC estimator outperforms the classical Monte Carlo one.
Keywords: Piecewise Deterministic (Markov) Processes, Multilevel Monte Carlo, Thinning, Strong error estimate, Weak error expansion, Morris-Lecar model.
Mathematics Subject Classification: 65C05, 65C20, 60G55, 60J25, 68U20
1 Introduction
In this paper we are interested in the approximation of the trajectories of PDPs. We establish strong error estimates for a PDP and a weak error expansion for a PDMP. Then we study the application of the Multilevel Monte Carlo (MLMC) method in order to approximate expectations of functional of PDMPs. Our motivation comes from Neuroscience where the whole class of stochastic conductance-based neuron models can be interpreted as PDMPs. The response of a neuron to a stimulus, called neural coding, is considered as a relevant information to understand the functional properties of such excitable cells. Thus many quantities of interest such as mean first spike latency, mean interspike intervals and mean firing rate can be modelled as expectations of functionals of PDMPs.
PDPs have been introduced by Davis in [5] as a general class of stochastic processes characterized by a deterministic evolution between two successive random times. In the case where the deterministic evolution part follows a family of Ordinary Differential Equations (ODEs) the corresponding PDP enjoys the Markov property and is called a PDMP. The distribution of a PDMP is thus determined by three parameters called the characteristics of the PDMP: a family of vector fields, a jump rate (intensity function) and a transition measure.
We consider first a general PDP which is not necessarily Markov on a finite time interval for which the flow is not explicitly solvable. Approximating its flows by the classical Euler scheme and using our previous work [22], we build a thinning algorithm which provides us with an exact simulation of an approximation of that we denote . The process is a PDP constructed by thinning of a homogeneous Poisson process which enjoys explicitly solvable flows.
Actually this thinning construction provides a whole family of approximations indexed by the time step of the Euler scheme. We prove that for any real valued smooth function the following strong estimate holds
[TABLE]
Moreover if is a PDMP the following weak error expansion holds
[TABLE]
The estimate (1) is mainly based on the construction of the couple and on the fact that the Euler scheme is of order 1 this is why it is valid for a general PDP and its Euler scheme. On the contrary, the estimate (2) relies on properties which are specific to PDMPs such as the Feynman-Kac formula.
The MLMC method relies simultaneously on estimates (1) and (2) that is why we study its application to the PDMP framework instead of the more general PDP one. MLMC extends the classical Monte Carlo (MC) method which is a very general approach to estimate expectations using stochastic simulations. The complexity (i.e the number of operations necessary in the simulation) associated to a MC estimation can be prohibitive especially when the complexity of an individual random sample is very high. MLMC relies on repeated independent random samplings taken on different levels of accuracy which differs from the classical MC method. MLMC can then greatly reduces the complexity of the classical MC by performing most simulations with low accuracy but with low complexity and only few simulations with high accuracy at high complexity. MLMC have been introduced by S. Heinrich in [18] and developed by M. Giles in [12]. The MLMC estimator has been efficiently used in various fields of numerical probability such as SDEs [12], Markov chains [1], [2], [14], Lévy processes [10], jump diffusions [28], [7], [8] or nested Monte Carlo [21], [13]. See [11] for more references. To the best of our knowledge, application of MLMC to PDMPs has not been considered.
For the sake of clarity, we describe here the general improvement of MLMC. We are interested in the estimation of where is a real valued square integrable random variable on a probability space . When can be simulated exactly the classical MC estimator with independent random variables identically distributed as , provides an unbiased estimator. The associated L2 - error satisfies . If we quantify the precision by the L2 - error, then a user-prescribed precision is achieved for so that in this case the global complexity is of order .
Assume now that cannot be simulated exactly (or cannot be simulated at a reasonable cost) and that we can build a family of real valued random variables on which converges weakly and strongly to as in the following sense
[TABLE]
and
[TABLE]
Assume moreover that for the random variable can be simulated at a reasonable complexity (the complexity increases as ). The classical MC estimator now consists in a sequence of random variables
[TABLE]
where are independent random variables identically distributed as . The bias and the variance of the estimator (5) are respectively given by and . From the strong estimate (4) we have that as so that is asymptotically a constant independent of . If as above we quantify the precision by the L2 - error and use that , we obtain that the estimator (5) achieves a user-prescribed precision for and so that the global complexity of the estimator is now .
The MLMC method takes advantage of the estimate (4) in order to reduce the global complexity. Let us fix and consider for a geometrically decreasing sequence where for fixed and . The indexes are called the levels of the MLMC and the complexity of increases as the level increases. Thanks to the weak expansion (3), the quantity approximates . Using the linearity of the expectation the quantity can be decomposed over the levels as follows
[TABLE]
For each level , a classical MC estimator is used to approximate and . At each level, a number of samples are required and the key point is that the random variables and are assumed to be correlated in order to make the variance of small. Considering at each level independent couples of correlated random variables, the MLMC estimator then reads
[TABLE]
where is a sequence of independent and identically distributed random variables distributed as and for are independent sequences of independent copies of and independent of . It is known, see [12] or [21], that given a precision and provided that the family satisfies the strong and weak error estimates (4) and (3), the multilevel estimator (7) achieves a precision with a global complexity of order if , if and if . This complexity result shows the importance of the parameter . Finally, let us mention that in the case it possible to build an unbiased multilevel estimator, see [15].
Estimates (1) and (2) suggest to investigate the use of the MLMC method in the PDMP framework with and . Letting and for and a smooth function, we define a MLMC estimator of just as in (7) (noted in the paper) where the processes involved at the level are correlated by thinning. Since these processes are constructed using two different time steps, the probability of accepting a proposed jump time differs from one process to the other. Moreover the discrete components of the post-jump locations may also be different. This results in the presence of the term in the estimate (1). In order to improve the convergence rate (to increase the parameter ) in (1), we show that for a given PDMP we have the following auxiliary representation
[TABLE]
The PDMP and its Euler scheme are such that their discrete components jump at the same times and in the same state. is a process which depends on . The representation (8) is inspired by the change of probability introduced in [28] and is actually valid for a general PDP (Proposition 2.2) so that where is the Euler scheme corresponding to and is a process which depends on . Letting and we define a second MLMC estimator (noted ) where now the discrete components of the Euler schemes involved at the level always jump in the same states and at the same times. To sum up, the first MLMC estimator we consider () derives from (6) where the corrective term at level is whereas the corrective term of the second estimator () is . For readability, we no longer write the dependence of the approximations on the time step. For the processes and ) we show the following strong estimate
[TABLE]
so that we end up with and the complexity goes from a to a .
As an application we consider the PDMP version of the 2-dimensional Morris-Lecar model, see [25], which takes into account the precise description of the ionic channels and in which the flows are not explicit. Let us mention [3] for the application of quantitative bounds for the long time behavior of PDMPs to a stochastic 3-dimensional Morris-Lecar model. The original deterministic Morris-Lecar model has been introduced in [23] to account for various oscillating states in the barnacle giant muscle fiber. Because of its low dimension, this model is among the favourite conductance-based models in computational Neuroscience. Furthermore, this model is particularly interesting because it reproduces some of the main features of excitable cells response such as the shape, amplitude and threshold of the action potential, the refractory period. We compare the classical MC and the MLMC estimators on the 2-dimensional stochastic Morris-Lecar model to estimate the mean value of the membrane potential at fixed time. It turns out that in the range of our simulations the MLMC estimator outperforms the MC one. It suggests that MLMC estimators can be used successfully in the framework of PDMPs.
As mentioned above, the quantities of interest such as mean first spike latency, mean interspike intervals and mean firing rate can be modelled as expectations of path-dependent functional of PDMPs. This setting can then be considered as a natural extension of this work.
The paper is organised as follows. In section 2, we construct a general PDP by thinning and we give a representation of its distribution in term of the thinning data (Proposition 1). In section 3, we establish strong error estimates (Theorems 1-2). In section 4, we establish a weak error expansion (Theorem 3). In section 5, we compare the efficiency of the classical and the multilevel Monte Carlo estimators on the 2-dimensional stochastic Morris-Lecar model.
2 Piecewise Deterministic Process by thinning
2.1 Construction
In this section we introduce the setting and recall some results on the thinning method from our previous paper [22]. Let where is a finite or countable set and . A piecewise deterministic process (PDP) is defined from the following characteristics
- •
a family of functions such that for all ,
- •
a measurable function ,
- •
a transition measure .
We denote by a generic element of . We only consider PDPs with continuous -component so that for and , we write
[TABLE]
If we write , then it holds that
[TABLE]
Our results do not depend on the dimension of the variable in so we restrict ourself to () for the readability. We work under the following assumption
Assumption 2.1**.**
There exists such that, for all , .
In [22] we considered a general upper bound . In the present paper is constant (see Assumption 2.1). Let be a probability space on which we define
an homogeneous Poisson process with intensity (given in Assumption 2.1) whose successive jump times are denoted . We set . 2. 2.
two sequences of iid random variables with uniform distribution on , and independent of each other and independent of .
Given we construct iteratively the sequence of jump times and post-jump locations of the -valued PDP that we want to obtain in the end using its characteristics . Let be fixed and let . We construct by thinning of , that is
[TABLE]
where
[TABLE]
We denote by the cardinal of (which may be infinite) and we set . For we introduce the functions defined on by
[TABLE]
By convention, we set . We also introduce the function defined by
[TABLE]
For all , is the inverse of the cumulative distribution function of (see for example [9]). Then, we construct from the uniform random variable and the function as follows
[TABLE]
Thus, the distribution of given is or in view of (9),
[TABLE]
For , assume that is constructed. Then, we construct by thinning of conditionally to , that is
[TABLE]
where
[TABLE]
Then, we construct using the uniform random variable and the function as follows
[TABLE]
We define the PDP for all from the process by
[TABLE]
Thus, and . We also define the counting process associated to the jump times .
2.2 Approximation of a PDP
In applications we may not know explicitly the functions . In this case, we use a numerical scheme approximating . In this paper, we consider schemes such that there exits positive constants and independent of and such that
[TABLE]
To the family we can associate a PDP constructed as above that we denote . We emphasize that there is a positive probability that and jump at different times and/or in different states even if they are both constructed from the same data , and . However if the characteristics of a PDP are such that and depend only on , that is and for all , then its embedded Markov chain is such that is an autonomous Markov chain with kernel and is a counting process with intensity . In particular, both and do not depend on . The particular form of the characteristics and implies that the PDP and its approximation are correlated via the same process . In other words, these processes always jump exactly at the same times and their -component always jump in the same states. Such processes are easier theoretically as well as numerically than the general case. They will be useful for us in the sequel.
The following lemma (which is important for several proofs below) gives a direct consequence of the estimate (14).
Lemma 2.1**.**
Let and satisfying (14). Let be an increasing sequence of non-negative real numbers with and let be a sequence of -valued components. For a given let us define iteratively the sequences and as follows
[TABLE]
Then, for all we have
[TABLE]
where and are positive constants independent of .
Proof of Lemma 2.1.
Let . From the estimate (14), we have for all
[TABLE]
and therefore
[TABLE]
By summing up these inequalities for and since we obtain
[TABLE]
∎
2.3 Application to the construction of a PDMP and its associated Euler scheme
In this section we define a PDMP and its associated Euler scheme from the construction of the section 2.1. For all , we consider a family of vector fields satisfying
Assumption 2.2**.**
For all , the function is bounded and Lipschitz with constant independent of .
If we choose in the above construction where for all , we denote by the unique solution of the ordinary differential equation (ODE)
[TABLE]
then the corresponding PDP is Markov since satisfies the semi-group property which reads for all and for all . In this case, the process is a piecewise deterministic Markov process (see [6] or [20]).
Let . We approximate the solution of (15) by the Euler scheme with time step . First, we define the Euler subdivision of with time step , noted , by .
Then, for all , we define the sequence , the classical Euler scheme, iteratively by
[TABLE]
to emphasize its dependence on the initial condition. Finally, for all , we set
[TABLE]
We construct the approximating process as follows. Its continuous component starts from at time 0 and follows the flow until the first jump time that we construct by (10) and (11) of section 2.1 where we replace by . At time the continuous component of is equal to since there is no jump in the continuous component. The discrete component jumps to . We iterate this procedure with the new flow until the next jump time given by (10) and (11) with and so on. We proceed by iteration to construct on .
Consequently, the discretisation grid for on the interval is random and is formed by the points for and . This differs from the SDE case where the classical grid is fixed.
By classical results of numerical analysis (see [17] for example), the continuous Euler scheme (16) (also called Euler polygon) satisfies estimate (14). If we choose in the above construction then the corresponding PDP is not Markov since the functions do not satisfy the semi-group property (see [20]).
2.4 Thinning representation for the marginal distribution of a PDP
The sequence is an -valued Markov chain with respect to its natural filtration and with kernel defined by
[TABLE]
For , the law of the random variable given admits the density given for by
[TABLE]
Classically the marginal distribution of is expressed using (13), the intensity via (18) and the kernel (see (17)). Indeed for fixed and for any bounded measurable function we can write,
[TABLE]
where and times, that is
[TABLE]
However since we have constructed by thinning, we would prefer to express the distribution of using the upper bound , the Poisson process and the sequences , .
Proposition 2.1**.**
Let be a PDP with characteristics constructed in section 2.1 and let . Then
[TABLE]
The following proposition and its corollaries will be useful in section 3. In their statements and are PDPs constructed in section 2.1 using the same data , , and the same initial point but with different sets of characteristics.
The following results are inspired by the change of probability introduced in [28] where the authors are interested in the application of the MLMC to jump-diffusion SDEs with state-dependent intensity. In our case, we need a change of probability which guarantees not only that the processes jump at the same times but also in the same states.
Proposition 2.2**.**
Let us denote by ( resp. ) the characteristics of (resp. ). Let us assume that and depend only on , that is always positive and for all . For all integer , let us define on the event
[TABLE]
the product being equal to if and for all
[TABLE]
Then, for all we have
[TABLE]
Corollary 2.1**.**
Under the assumptions of Proposition 2.2, setting , we have
[TABLE]
Remark 2.1**.**
Proposition 2.2 looks like a Girsanov theorem (see [26]) however we do not use the martingale theory here.
Remark 2.2**.**
We have chosen to state Proposition 2.2 with a PDP whose intensity and transition measure only depend on for readability purposes. Actually the arguments of the proof are valid for non homogeneous intensity and transition measure of the form and for . A possible choice of such characteristics is and for a given function. This remark will be implemented in section 5.4.
Corollary 2.2**.**
Let (resp. ) be the set of characteristics of (resp. . We assume that is always positive and that for all . Let be the sequence defined by and for . For all integer , let us define on the event ,
[TABLE]
the products being equal to if and for all
[TABLE]
Then, for all we have
[TABLE]
Proof of Proposition 2.1.
It holds that . Then
[TABLE]
The set is equivalent to the following
-
,
-
among the times exactly are accepted by the thinning method they are the , all the others are rejected.
We proceed by induction starting from the fact that all the are rejected which corresponds to the event
[TABLE]
The random variable depends on where by construction , which implies that depend on . Thus is independent of all the other random variables of thinning that are present in . The conditional expectation of w.r.t. the vector is therefore an expectation indexed by this vector as parameters. Since the law of is for all we obtain for ,
[TABLE]
with
[TABLE]
In (19) the random variables are independent of the vector . Conditioning by this vector we obtain
[TABLE]
We can iterate on the latter form by first conditioning by all the other r.v. and then conditioning by all the remaining ones and so on. However the terms that appear do not have the same structure since the correspond to a rejection for whereas corresponds to an acceptation. So that the next step yields
[TABLE]
where we write for simplicity keeping in mind that .
Moreover the previous arguments apply to and provide
[TABLE]
∎
We prove below Proposition 2.2. The other statements can be proved analogously.
Proof of Proposition 2.2.
By assumption the (jump) characteristics of depend only on . Let . Applying the same arguments as in (21) to and using the definitions of and we obtain,
[TABLE]
We iterate the previous argument based on the use of (21) and we use the definition of to obtain
[TABLE]
where for short and . Comparing the latter expression to (20) and using an induction we conclude that
[TABLE]
It remains to sum up on and . ∎
3 Strong error estimates
In this section we are interested in strong error estimates. Below, we state the main assumptions and theorems of this section, the proofs are given in sections 3.2, 3.3 respectively.
Assumption 3.1**.**
For all and for all , the functions and are Lipschitz with constants , respectively independent of .
Theorem 3.1**.**
Let and satisfying (14) and let and be the corresponding PDPs constructed in section 2.1 with for some . Assume that is finite and that and satisfy Assumption 3.1. Then, for all bounded functions such that for all the function is -Lipschitz where is positive and independent of , there exists constants and independent of the time step such that
[TABLE]
Remark 3.1**.**
When the numerical scheme is of order , which means we have .
Assumption 3.2**.**
There exist positive constants , , such that for all , and .
Theorem 3.2**.**
Let and satisfying (14) and let and be the corresponding PDPs constructed in section 2.1 with for some . Let and be defined as in Corollary 2.1. Under assumptions 3.1 and 3.2 and for all bounded functions such that for all the function is -Lipschitz (), there exists a positive constant independent of the time step such that
[TABLE]
where has been defined in Corollary 2.1.
We now introduce the random variable which will play an important role in the strong error estimate of Theorem 3.1 as well as in the identification of the coefficient in the weak error expansion in section 4 (see the proof of Theorem 4.1 in section 4.2).
Definition 3.1**.**
Let us define .
The random variable enables us to partition the trajectories of the couple in a sense that we precise now. Consider the event
[TABLE]
where and denote the sequences of jump times of and . On this event the trajectories of the discrete time processes and are equal for all such that (or equivalently ). Moreover the complement i.e contains the trajectories for which and differ on (there exits such that or ).
3.1 Preliminary lemmas
In this section we start with two lemmas which will be useful to prove Theorems 3.2 and 3.3.
Lemma 3.1**.**
Let be a finite set. We denote by the cardinal of and for we denote by its elements. Let and be two probabilities on . Let and for all . By convention, we set . Let and be two -valued random variables defined by
[TABLE]
where , and for all . Then, we have
[TABLE]
Proof of Lemma 3.1.
By definition of and and since the intervals are disjoints for , we have
[TABLE]
Moreover, for all , we have
[TABLE]
Thus, denoting by the positive part of and using that , we obtain
[TABLE]
Adding and subtracting in the the above sum yields
[TABLE]
The first sum above is a telescopic sum. Since and , we have .
∎
Lemma 3.2**.**
Let and be two real-valued sequences. For all , we have
[TABLE]
Proof of Lemma 3.2.
By induction. ∎
3.2 Proof of Theorem 3.1
First, we write
[TABLE]
where is defined in Definition 3.1. The order of the term is the order of the probability that the discrete processes and differ on . The order of the term is given by the order of the Euler scheme squared because the discrete processes and are equal on . In the following we prove that and that .
Step 1: estimation of . The function being bounded we have where . Moreover, for , . Hence
[TABLE]
where
[TABLE]
We start with . First note that, for , and that on the event , we have , so that We emphasize that it makes no difference in the rest of the proof if we choose . Since , we can rewrite as follows
[TABLE]
By construction we have and . The random variable depends on the vector which is independent of . Conditioning by this vector in (24) and applying Lemma 3.1 yields
[TABLE]
From the definition of (see (12)), the triangle inequality and since is -Lipschitz, we have . Since we are on the event , the application of Lemma 2.1 yields . Thus where is a constant independent of . Moreover, and so that . From the definition of (see (23)), we can write
[TABLE]
The second equality above follows since and . We only treat the term , the term can be treated similarly by interchanging the role of and . Just as in the previous case, we can rewrite as follows
[TABLE]
In (25) we have . The random variable depends on which is independent of . Conditioning by this vector in (25) yields
[TABLE]
Using the Lipschitz continuity of then Lemma 2.1 we get that where is a constant independent of . Concerning the term , we will end with the estimate . We conclude in the same way as in the estimation of above that .
Step 2: estimation of . Note that for we have where we can interchange the role of and . Thus, using the partition , we have
[TABLE]
The application of the Lipschitz continuity of and of Lemma 2.1 yields
[TABLE]
Then, we have where is a constant independent of . Since , we conclude that .
∎
3.3 Proof of Theorem 3.2
First we reorder the terms in . We write where
[TABLE]
Likewise we reorder the terms in writing where and are defined as (26) and (27) replacing and by and . Since the processes and do not depend on or , the term is the same in and . To prove Theorem 3.2, let us decompose the problem and write
[TABLE]
so that
[TABLE]
In the following we show that and that .
Step 1: estimation of . The function being bounded we have where is a positive constant. Moreover, for all , we have and . Thus, and using the definition of and (see (26), (27) and (28)) we can write
[TABLE]
We set and . To provide the desired estimate for , we proceed as follows. First, we work by to determine (random) bounds for and from which we deduce a (random) bound for . Finally, we take the expectation. We start with . For all and for all we have, from Assumption 2.1, that and . Then, using Lemma 3.2 (twice) we have
[TABLE]
Using the Lipschitz continuity of and Lemma 2.1, we find that, for all and ,
[TABLE]
Moreover, for all we have so that where is a positive constant independent of . Finally, since we have
[TABLE]
Now, consider . Note that from Assumption 2.1 we have . We use the same type of arguments as for . That is, we successively use Lemma 3.2, the Lipschitz continuity of and Lemma 2.1 to obtain
[TABLE]
where is a positive constant independent of . Then, we derive from the previous estimates (29) and (30) that
[TABLE]
where and . Finally, we have . Since we conclude that .
Step 2: estimation of . Recall that and . Then, using the Lipschitz continuity of , Lemma 2.1 and since we get
[TABLE]
Moreover, so that where is a positive constant independent of and . Since we conclude that .
∎
4 Weak error expansion
In this section we are interested in a weak error expansion for the PDMP of section 2.3 and its associated Euler scheme . First of all, we recall from [5] that the generator of the process which acts on functions defined on is given by
[TABLE]
where for notational convenience we have set , and for all . Below, we state the assumptions and the main theorem of this section. Its proof which is inspired by [27] (see also [24] or [16]) is delayed in section 4.2.
Assumption 4.1**.**
For all and for all , the functions , and are bounded and twice continuously differentiable with bounded derivatives.
Assumption 4.2**.**
The solution of the integro differential equation
[TABLE]
with a bounded function and given by (31) is such that for all , the function is bounded and two times differentiable with bounded derivatives. Moreover the second derivatives of are uniformly Lipschitz in .
Theorem 4.1**.**
Let be a PDMP and its approximation constructed in section 2.3 with for some . Under assumptions 4.1. and 4.2. for any bounded function there exists a constant independent of such that
[TABLE]
Remark 4.1**.**
If is a PDMP whose characteristics satisfy the assumptions of Proposition 2.2 and is its approximation we deduce from Theorem 4.1 that
[TABLE]
4.1 Further results on PDMPs: Itô and Feynman-Kac formulas
Definition 4.1**.**
Let us define the following operators which act on functions defined on .
[TABLE]
[TABLE]
From Definition 4.1, the generator defined by (31) reads . We introduce the random counting measure associated to the PDMP defined by for and for . The compensator of , noted , is given from [5] by
[TABLE]
Hence, is a martingale with respect to the filtration generated by noted . Similarly, we introduce , , and to be the same objects as above but corresponding to the approximation . The fact that is the compensator of and that is a martingale derives from arguments of the marked point processes theory, see [4].
Definition 4.2**.**
Let us define the following operators which act on functions defined on .
[TABLE]
[TABLE]
Remark 4.2**.**
For all functions defined on , , so that .
The next theorem provides Itô formulas for the PDMP and its approximation . For all , we set if for some and for some .
Theorem 4.2**.**
Let and be a PDMP and its approximation respectively constructed in section 2.3 with for some . For all bounded functions continuously differentiable with bounded derivatives, we have
[TABLE]
where is a true -martingale, and
[TABLE]
where, is a true -martingale.
Proof of Theorem 4.2.
The proof of (35) is given in [5]. We prove (36) following the same arguments. Since , we have
[TABLE]
Consider the above sum. As in [5], we write, on the event , that
[TABLE]
For all , we decompose the increment as a sum of increments on the intervals . Without loss of generality we are led to consider increments of the form for some , and for all where we recall that is defined by (16). The function is smooth enough to write
[TABLE]
Then, the above arguments together with definition 4.2 yields
[TABLE]
∎
The following theorem gives us a way to represent the solution of the integro-differential equation (32) as the conditional expected value of a functional of the terminal value of the PDMP . It plays a key role in the proof of Theorem 4.1.
Theorem 4.3** (PDMP’s Feynman-Kac formula [6]).**
Let be a bounded function. Then the integro-differential equation (32) has a unique solution given by
[TABLE]
4.2 Proof of Theorem 4.1
We provide a proof in two steps. First, we give an appropriate representation of the weak error . Then, we use this representation to identify the coefficient in (33).
Step 1: Representing . Let denote the solution of (32). From Theorem 4.3 we can write . Then, the application of the Itô formula (36) to at time yields
[TABLE]
Since is a true martingale, we obtain
[TABLE]
For we have (see Definition 4.2). From the regularity of , and (see assumptions 4.1 and 4.2), the functions , and are smooth enough to apply the Itô formula (36) between and respectively. This yields
[TABLE]
[TABLE]
[TABLE]
Moreover, since for , we have
[TABLE]
so that
[TABLE]
where,
[TABLE]
Since , the first term in the above equality is 0 by Theorem 4.3. By using Fubini’s theorem and the fact that and are true martingales, we obtain
[TABLE]
Moreover, since is a -martingale, we have
[TABLE]
Collecting the previous results, we obtain We can compute an explicit form of in term of , , , and their derivatives. Indeed, is given by (37), and we have
[TABLE]
The application of the Taylor formula to the functions , , , , , , and at the order 0 around yields Setting and recalling that for , and that , we obtain
[TABLE]
Consider the expectation in the right-hand side of the above equality. We decompose the integral into a (finite) sum of integrals on the intervals where is constant. Without loss of generality, we are led to consider integrals of the form for some , and a bounded constant. We have moreover adding and subtracting in the numerator of yields
[TABLE]
Since is bounded we deduce that . Since is assumed bounded and , the above arguments yields the following representation
[TABLE]
Step 2: From the representation (38) to the expansion at the order one. In this step, we show that . First, we introduce the random variables and defined by and and write
[TABLE]
where is defined in Definition 3.1. Since is bounded and (see the proof of Theorem 3.1), we have Now, recall from (22) that, on the event , we have and for all such that . Thus, for all and for all we have and . Consequently, on the event we have
[TABLE]
From the regularity assumptions 4.1 and 4.2, the function is uniformly Lipschitz in with constant as sum and product of bounded Lipschitz functions. Thus, from this Lipschitz property and the application of Lemma 2.1, we get
[TABLE]
From the above inequality, we find that . Since and we conclude that . We have shown that . Secondly, from the regularity assumptions 4.1 and 4.2, the function is uniformly Lipschitz in . Moreover, for all there exits such that both and belong to the same interval so that and . Thus, from the Lipschitz continuity of , from the fact that and since is uniformly bounded in we have where is a constant independent of . Then, we obtain from which we deduce that Finally, the weak error expansion reads
[TABLE]
∎
5 Numerical experiment
In this section, we use the theoretical results above to apply the MLMC method to the PDMP 2-dimensional Morris-Lecar (shortened PDMP 2d-ML).
5.1 The PDMP 2-dimensional Morris-Lecar
The deterministic Morris-Lecar model has been introduced in 1981 by Catherine Morris and Harold Lecar in [23] to explain the dynamics of the barnacle muscle fiber. This model belongs to the family of conductance-based models (just as the Hodgkin-Huxley model [19]) and takes the following form
[TABLE]
where , , , , .
In this section we consider the PDMP version of (39) that we denote by , , whose characteristics are given by
- •
f(\theta,\nu)=\frac{1}{C}\Big{(}I-g_{\text{Leak}}(\nu-V_{\text{Leak}})-g_{\text{Ca}}M_{\infty}(\nu)(\nu-V_{\text{Ca}})-g_{\text{K}}\frac{\theta}{N_{\text{K}}}(\nu-V_{\text{K}})\Big{)},
- •
,
- •
Q\Big{(}(\theta,\nu),\{\theta+1\}\Big{)}=\frac{(N_{\text{K}}-\theta)\alpha_{\text{K}}(\nu)}{\lambda(\theta,\nu)}, Q\Big{(}(\theta,\nu),\{\theta-1\}\Big{)}=\frac{\theta\beta_{\text{K}}(\nu)}{\lambda(\theta,\nu)}.
The state space of the model is where stands for the number of potassium gates. The values of the parameters used in the simulations are , , , , , , , , , , , , , .
5.2 Classical and Multilevel Monte Carlo estimators
In this section we introduce the classical and multilevel Monte Carlo estimators in order to estimate the quantity where is the PDMP 2d-ML and for so that gives the value of the membrane potential at time . Note that other possible choices are or for some . In those cases, the quantity gives the moments of the membrane potential or the number of open gates at time so that we can compute statistics on these biological variables.
Let . In the sequel it will be convenient to emphasize the dependence of the Euler scheme on a time step . We introduce a family of random variables defined by where for a given the corresponding PDP is constructed as in section 2.3 with time step . In particular, the processes for are correlated through the same randomness , and . We build a classical Monte Carlo estimator of based on the family as follows
[TABLE]
where is an i.i.d sequence of random variables distributed like . The parameters and have to be determined. We build a multilevel Monte Carlo estimator based on the family as follows
[TABLE]
where for are independent sequences of independent copies of the couple and independent of the i.i.d sequence . The parameter is a free parameter that we fix in section 5.4. The parameters , , and with have to be determined, then we set , .
We also set where is defined as in Proposition 2.2 with an intensity and a kernel that will be specified in section 5.4 and let be such that for all . By Proposition 2.2, we have and for . Consequently, we build likewise a multilevel estimator based on the family .
The complexity of the classical Monte Carlo estimator depends on the parameters and the one of the multilevel estimators and depends on . In order to compare those estimators we proceed as in [21] (see also [24]), that is to say, for each estimator we determine the parameters which minimize the global complexity (or cost) subject to the constraint that the resulting L2-error must be lower than a prescribed .
As in [21], we call , , , and the structural parameters associated to the family and . We know theoretically from Theorem 3.1 (strong estimate) and Theorem 4.1 (weak expansion) that whereas , and are not explicit (we explain how we estimate them in section 5.3). Moreover, the structural parameters , , , and associated to and are such that , (see (34)), (see Theorem 3.2) and , are not explicit.
The classical and the multilevel estimators defined above are linear and of Monte Carlo type in the sense described in [21]. The optimal parameters of those estimators are then expressed in term of the corresponding structural parameters as follows (see [21] or [24]). For a user prescribed , the classical Monte Carlo parameters and are
[TABLE]
where . The parameters of the estimator are given in Table 1 where for with the convention . The parameters of are given in a similar way using , and . Finally, the parameter is determined as in [21] section 5.1.
5.3 Methodology
We compare the classical and the multilevel Monte Carlo estimators in term of precision, CPU-time and complexity. The precision of an estimator is defined by the L2-error also known as the Root Mean Square Error (RMSE). The CPU-time represents the time needed to compute one realisation of an estimator. The complexity is defined as the number of time steps involved in the simulation of an estimator. Let denote the estimator (40) or (41). We estimate the bias of by
[TABLE]
where are independent replications of the estimator. We estimate the variance of by
[TABLE]
where are independent replications of the empirical variance of . In the case where is the crude Monte Carlo estimator we set
[TABLE]
If is the MLMC estimator, we set
[TABLE]
where and for , . Then, we define the empirical RMSE by
[TABLE]
The numerical computation of (43) for both estimators (40) and (41) requires the computation of the optimal parameters given by (42) and in table 1 of section 5.2 which are expressed in term of the structural parameters , and . Moreover the computation of the bias requires the value . Since there is no closed formula for the mean and variance of we estimate them using a crude Monte Carlo estimator with and . The constants and are not explicit, we use the same estimator of as in [21] section , that is
[TABLE]
and we use the following estimator of
[TABLE]
The estimator of is obtained writing the weak error expansion for the two time steps and , summing and neglecting the term. In (44) we use and in (45), we use and the expectations are estimated using a classical Monte Carlo of size on . We emphasize that we interested in the order of and so that we do not need a precise estimation here.
5.4 Numerical results
In this section we first illustrate the results of Theorems 3.1 and 3.2 on the Morris-Lecar PDMP, then we compare the MC and MLMC estimators. The simulations were carried out on a computer with a processor Intel Core i5-4300U CPU @ 1.90GHz 4. The code is written in C++ language. We implement the estimator (see section 5.2) for the following choices of the parameters .
Case 1: and \tilde{Q}\Big{(}\theta,\{\theta+1\}\Big{)}=\frac{N_{\text{K}}-\theta}{N_{\text{K}}}, \tilde{Q}\Big{(}\theta,\{\theta-1\}\Big{)}=\frac{\theta}{N_{\text{K}}}.
Case 2: and where denotes the first component of the solution of (39).
Cases 1 and 2 correspond to the application of Proposition 2.2. Based on Corollary 2.2 we also consider the following case.
Case 3: Consider the quantity where and are PDPs with characteristics and respectively. By Corollary 2.2, we have where is a PDP whose discrete component jumps in the same states and at the times as the discrete component of do and is the corresponding corrective process. Thus, we consider the quantity instead of .
The case 3 implies to use the following MLMC estimator which is slightly different from (41).
[TABLE]
where for are independent sequences of independent copies of the couple where is a PDP whose discrete component jumps in the same states and at the same times as the Euler scheme with time step do, whose deterministic motions are given by the approximate flows with time step and is the corresponding corrective process (see Corollary 2.2).
The figure 2 confirms numerically that and that for the cases 1,2 and 3 (see Theorems 3.1 and 3.2 respectively). Indeed, for (see figure 2(a)), we observe that the curve corresponding to the decay of as increases is approximately parallel to a line of slope -1 and that the curves corresponding to the decay of in the cases 1,2 and 3 are parallel to a line of slope -2. We also see that the curves corresponding to the cases 2 and 3 are approximately similar and that for some value of those curves go below the one corresponding to . The curve corresponding to the case 1 is always above all the other ones, this indicates that the L2-error (or the variance) in the case 1 is too big (w.r.t the others) and that is why we do do not consider this case in the sequel. As increases (see figures 2(b) and 2(c)), the theoretical order of the numerical schemes is still observed. However, for , a slight difference begin to emerge between the cases 2 and 3 (the case 3 being better) and this difference is accentuated for so that we do not represent the case 2.
For the Monte Carlo simulations we set , and the time step involved in the first level of the MLMC is set to . We choose this value for because it represents (on average) the size of an interval of two successive jump times of the auxiliary Poisson process . The estimation of the true value and variance leads and Var. Note that where is the deterministic membrane potential solution of (39) so that there is an offset between the deterministic potential and the mean of the stochastic potential. We replicate 100 times the simulation of the classical and multilevel estimators to compute the empirical RMSE so that in (43).
The results of the Monte Carlo simulations are shown in tables 2 for the classical Monte Carlo estimator and in tables 3 and 4 for the multilevel estimators and (case 3). As an example, the first line of table 3 reads as follows: for a user prescribed , the MLMC estimator is implemented with levels, the time step at the first level is , this time step is refined by a factor with at each levels and the sample size is . For such parameters, the numerical complexity of the estimator is , the empirical RMSE and the computational time of one realisation of is seconds. We also reported the empirical bias and the empirical variance in view of (43).
The results indicate that the MLMC outperforms the classical MC. More precisely, for small values of (i.e ) the complexity and the CPU-time of the classical and the multilevel MC estimators are of the same order. As decreases (i.e as increases) the difference in complexity and CPU-time between classical and multilevel MC increases. Indeed, for the complexity of the estimator is approximately 13 times superior to the one of and 19 times superior to the one of . The same fact appears when we look at the complexity ratio of the estimators and (i.e Cost()/Cost()) as decreases. However, the difference between the complexity of these two MLMC estimators increases more slowly than the one between a MC and a MLMC estimator. Recall that the computational benefit of the MLMC over the MC grows as the prescribed decreases.
Both classical and multilevel estimators provide an empirical RMSE which is close to the prescribed precision (see tables 2, 3 and 4). We can conclude that the choice of the parameters is well adapted. For the readability, figures 3(a), 3(b) show the ratios of the complexities and the CPU-times of the three estimators , and as a function of .
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] D.F. Anderson and D.J. Higham. Multilevel Monte Carlo for continuous time Markov chains, with applications in biochemical kinetics. Multiscale Model. Simul. , 10(1):146–179, 2012.
- 2[2] D.F. Anderson, D.J. Higham, and Y. Sun. Complexity of Multilevel Monte Carlo tau-leaping. SIAM Journal on Numerical Analysis , 52(6):3106–3127, 2014.
- 3[3] M. Benaïm, S. Le Borgne, F. Malrieu, and P-A. Zitt. Quantitative ergodicity for some switched dynamical systems. Electron. Commun. Probab. , 17:14 pp., 2012.
- 4[4] P. Brémaud. Point Processes and Queues, Martingale Dynamics . Springer-Verlag New York Inc, 1981.
- 5[5] M.H.A. Davis. Piecewise-deterministic Markov processes: A general class of non-diffusion stochastic models. Journal of the Royal statistical Society , 46:353–388, 1984.
- 6[6] M.H.A. Davis. Markov Models and Optimization . Chapman and Hall, London, 1993.
- 7[7] S. Dereich. Multilevel Monte Carlo algorithms for Lévy-driven SD Es with Gaussian correction. The Journal of Applied Probability , 21(1):283–311, 2011.
- 8[8] S. Dereich and F. Heidenreich. A Multilevel Monte Carlo algorithm for Lévy-driven Stochastic Differential Equations. Stochastic Processes and their Applications , 121(7):1565–1587, 2011.
