Thinning and Multilevel Monte Carlo for Piecewise Deterministic (Markov) Processes. Application to a stochastic Morris-Lecar model

Vincent Lemaire (LPSM UMR 8001); Michèle Thieullen (LPSM UMR 8001); Nicolas Thomas (LPSM UMR 8001)

arXiv:1812.08431·math.PR·February 10, 2022


TL;DR

This paper develops thinning-based approximation and multilevel Monte Carlo methods for Piecewise Deterministic Processes, demonstrating improved efficiency in simulating a stochastic Morris-Lecar neuron model.

Contribution

It introduces a novel MLMC approach using thinning for PDPs and applies it to a complex neuron model, showing enhanced simulation performance.

Findings

1. MLMC with thinning outperforms classical Monte Carlo in simulations

2. Strong and weak error estimates for PDPs and PDMPs are established

3. Application to a 2D Morris-Lecar model demonstrates practical benefits

Abstract

In the first part of this paper we study approximations of trajectories of Piecewise Deterministic Processes (PDP) when the flow is not explicit by the thinning method. We also establish a strong error estimate for PDPs as well as a weak error expansion for Piecewise Deterministic Markov Processes (PDMP). These estimates are the building blocks of the Multilevel Monte Carlo (MLMC) method which we study in the second part. The coupling required by MLMC is based on the thinning procedure. In the third part we apply these results to a 2-dimensional Morris-Lecar model with stochastic ion channels. In the range of our simulations the MLMC estimator outperforms the classical Monte Carlo one.

Tables (4)

Table 1: Optimal parameters for the MLMC estimator (41).

Parameter	Optimal value
$L$	$\left\lceil 1+\frac{\log(|c_{1}|^{\frac{1}{\alpha}}h^{*})}{\log(M)}+\frac{\log(A/\epsilon)}{\alpha\log(M)}\right\rceil$, $A=\sqrt{1+2\alpha}$
$q$	$q_{1}=\mu^{*}(1+\rho(h^{*})^{\frac{\beta}{2}})$; $q_{j}=\mu^{*}\rho(h^{*})^{\frac{\beta}{2}}\left(\frac{n_{j-1}^{-\frac{\beta}{2}}+n_{j}^{-\frac{\beta}{2}}}{\sqrt{n_{j-1}+n_{j}}}\right)$, $j=2,\dots,L$; $\mu^{*}=1/\sum_{1\leq j\leq L}q_{j}$
$N$	$\left(1+\frac{1}{2\alpha}\right)\frac{\text{Var}(X)\left(1+\rho(h^{*})^{\frac{\beta}{2}}\sum_{j=1}^{L}(n_{j-1}^{-\frac{\beta}{2}}+n_{j}^{-\frac{\beta}{2}})\sqrt{n_{j-1}+n_{j}}\right)^{2}}{\epsilon^{2}\sum_{j=1}^{L}q_{j}(n_{j-1}+n_{j})}$
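As a quick sanity check of the formula for $L$ in Table 1, the sketch below evaluates it in Python. The helper name `optimal_levels` is ours, and we assume $\alpha=1$ (the value suggested for PDMPs in the introduction) and read the $h$ column of Table 3 as $h^{*}$. Under those assumptions the computed depths can be compared with the $L$ column of Table 3.

```python
import math


def optimal_levels(eps, c1, h_star, M, alpha=1.0):
    """Number of levels L from Table 1 (hypothetical helper name)."""
    A = math.sqrt(1.0 + 2.0 * alpha)
    value = (
        1.0
        + math.log(abs(c1) ** (1.0 / alpha) * h_star) / math.log(M)
        + math.log(A / eps) / (alpha * math.log(M))
    )
    return math.ceil(value)


# (eps, M) pairs read from Table 3, with c1 = 4.58 and h* = 0.1.
rows = [(2.0 ** -1, 2), (2.0 ** -2, 4), (2.0 ** -3, 7), (2.0 ** -4, 4), (2.0 ** -5, 6)]
levels = [optimal_levels(eps, 4.58, 0.1, M) for eps, M in rows]
print(levels)  # compare with the L column of Table 3
```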

Table 2: Results and parameters of the Monte Carlo estimator $Y^{\text{MC}}$. Estimated values of the structural parameters: $c_{1}=4.58$, $V_{1}=7.25$.

$k$	$ϵ = 2^{- k}$	${\hat{ϵ}}_{100}$	${\hat{b}}_{100}$	${\hat{v}}_{100}$	time (sec)	$N$	$h$	cost
1	5.00e-01	4.32e-01	2.34e-01	1.52e-01	3.10e-01	2.16e+03	6.30e-02	3.43e+04
2	2.50e-01	2.59e-01	1.69e-01	3.87e-02	1.55e+00	8.47e+03	3.15e-02	2.69e+05
3	1.25e-01	1.17e-01	6.25e-02	9.78e-03	8.80e+00	3.34e+04	1.58e-02	2.12e+06
4	6.25e-02	5.67e-02	2.73e-02	2.47e-03	5.62e+01	1.32e+05	7.88e-03	1.68e+07
5	3.12e-02	2.50e-02	-1.78e-03	6.21e-04	3.93e+02	5.24e+05	3.94e-03	1.33e+08

Table 3: Results and parameters of the Multilevel Monte Carlo estimator $Y^{\text{MLMC}}$. Estimated values of the structural parameters: $c_{1}=4.58$, $V_{1}=7.25$.

$k$	$ϵ = 2^{- k}$	${\hat{ϵ}}_{100}$	${\hat{b}}_{100}$	${\hat{v}}_{100}$	time (sec)	$L$	$M$	$h$	$N$	cost
1	5.00e-01	3.89e-01	1.14e-01	1.38e-01	3.62e-01	2	2	0.1	2.60e+03	2.82e+04
2	2.50e-01	2.29e-01	1.19e-01	3.83e-02	1.44e+00	2	4	0.1	1.04e+04	1.16e+05
3	1.25e-01	1.21e-01	6.24e-02	1.07e-02	5.76e+00	2	7	0.1	4.22e+04	4.85e+05
4	6.25e-02	5.91e-02	1.38e-02	3.30e-03	2.69e+01	3	4	0.1	1.90e+05	2.37e+06
5	3.12e-02	3.47e-02	-1.39e-02	1.01e-03	1.08e+02	3	6	0.1	7.71e+05	9.99e+06

Table 4: Results and parameters of the Multilevel Monte Carlo estimator $\tilde{Y}^{\text{MLMC}}$ (case 3). Estimated values of the structural parameters: $\tilde{c}_{1}=3.91$, $\tilde{V}_{1}=34.1$.

$k$	$ϵ = 2^{- k}$	${\hat{ϵ}}_{100}$	${\hat{b}}_{100}$	${\hat{v}}_{100}$	time (sec)	$L$	$M$	$h$	$N$	cost
1	5.00e-01	4.28e-01	1.98e-01	1.44e-01	3.13e-01	2	2	0.1	2.38e+03	2.50e+04
2	2.50e-01	2.47e-01	1.55e-01	3.72e-02	1.26e+00	2	3	0.1	9.46e+03	1.00e+05
3	1.25e-01	1.36e-01	8.90e-02	1.05e-02	5.00e+00	2	6	0.1	3.80e+04	4.11e+05
4	6.25e-02	6.22e-02	2.15e-02	3.41e-03	2.09e+01	3	4	0.1	1.58e+05	1.75e+06
5	3.12e-02	3.17e-02	6.07e-03	9.71e-04	8.35e+01	3	5	0.1	6.30e+05	7.02e+06

Equations (287)

$$\exists V_{1}>0,\ V_{2}>0,\quad \mathbb{E}\left[|F(\overline{x}_{T})-F(x_{T})|^{2}\right]\leq V_{1}h+V_{2}h^{2}.$$

$$\exists c_{1}>0,\quad \mathbb{E}[F(\overline{x}_{T})]-\mathbb{E}[F(x_{T})]=c_{1}h+o(h^{2}).$$

$$\exists c_{1}>0,\ \alpha>0,\quad \mathbb{E}[X_{h}]-\mathbb{E}[X]=c_{1}h^{\alpha}+o(h^{2\alpha}),$$

$$\exists V_{1}>0,\ \beta>0,\quad \mathbb{E}\left[|X_{h}-X|^{2}\right]\leq V_{1}h^{\beta}.$$

$$Y=\frac{1}{N}\sum_{k=1}^{N}X_{h}^{k},$$

$$\mathbb{E}[X_{h_{L}}]=\mathbb{E}[X_{h^{*}}]+\sum_{l=2}^{L}\mathbb{E}[X_{h_{l}}-X_{h_{l-1}}].$$

$$Y=\frac{1}{N_{1}}\sum_{k=1}^{N_{1}}X_{h^{*}}^{k}+\sum_{l=2}^{L}\frac{1}{N_{l}}\sum_{k=1}^{N_{l}}\left(X_{h_{l}}^{k}-X_{h_{l-1}}^{k}\right),$$

$$\mathbb{E}[F(x_{T})]=\mathbb{E}[F(\tilde{x}_{T})\tilde{R}_{T}].$$

$$\exists\tilde{V}_{1}>0,\quad \mathbb{E}\left[|F(\underline{\tilde{x}}_{T})\underline{\tilde{R}}_{T}-F(\tilde{x}_{T})\tilde{R}_{T}|^{2}\right]\leq\tilde{V}_{1}h^{2},$$

$$Q(x,A\times B)=Q(x,A)\delta_{\nu}(B).$$

$$Q((\theta_{x},\Phi_{\theta_{x}}(t,\nu_{x})),d\theta\,d\nu)=Q((\theta_{x},\Phi_{\theta_{x}}(t,\nu_{x})),d\theta)\,\delta_{\Phi_{\theta_{x}}(t,\nu_{x})}(d\nu).$$

$$T_{1}:=T^{*}_{\tau_{1}},$$

$$\tau_{1}:=\inf\left\{k>0:U_{k}\lambda^{*}\leq\lambda(\theta_{0},\Phi_{\theta_{0}}(T^{*}_{k},\nu_{0}))\right\}.$$

$$a_{j}(x):=\sum_{i=1}^{j}Q(x,\{k_{i}\}),\quad\forall x\in E.$$

$$H(x,u):=\sum_{i=1}^{|\Theta|}k_{i}\mathds{1}_{a_{i-1}(x)<u\leq a_{i}(x)},\quad\forall x\in E,\ \forall u\in[0,1].$$

$$(\theta_{1},\nu_{1})$$

$$\sum_{k\in\Theta}Q((\theta_{0},\Phi_{\theta_{0}}(T^{*}_{\tau_{1}},\nu_{0})),\{k\})\,\delta_{(k,\Phi_{\theta_{0}}(T^{*}_{\tau_{1}},\nu_{0}))}.$$

$$T_{n}:=T^{*}_{\tau_{n}},$$

$$\tau_{n}:=\inf\left\{k>\tau_{n-1}:U_{k}\lambda^{*}\leq\lambda(\theta_{n-1},\Phi_{\theta_{n-1}}(T^{*}_{k}-T^{*}_{\tau_{n-1}},\nu_{n-1}))\right\}.$$

$$(\theta_{n},\nu_{n})$$

$$x_{t}:=(\theta_{n},\Phi_{\theta_{n}}(t-T_{n},\nu_{n})),\quad t\in[T_{n},T_{n+1}[.$$

$$\sup_{t\in[0,T]}|\Phi_{\theta}(t,\nu_{1})-\overline{\Phi}_{\theta}(t,\nu_{2})|\leq e^{C_{1}T}|\nu_{1}-\nu_{2}|+C_{2}h,\quad\forall\theta\in\Theta,\ \forall(\nu_{1},\nu_{2})\in\mathbb{R}^{2}.$$

$$\left\{\begin{array}{l}\beta_{n}=\Phi_{\alpha_{n-1}}(t_{n}-t_{n-1},\beta_{n-1}),\\ \beta_{0}=\nu,\end{array}\right.\quad\text{and}\quad\left\{\begin{array}{l}\overline{\beta}_{n}=\overline{\Phi}_{\alpha_{n-1}}(t_{n}-t_{n-1},\overline{\beta}_{n-1}),\\ \overline{\beta}_{0}=\nu.\end{array}\right.$$

$$|\overline{\beta}_{n}-\beta_{n}|\leq e^{C_{1}t_{n}}nC_{2}h,$$

$$|\overline{\beta}_{k}-\beta_{k}|\leq e^{C_{1}(t_{k}-t_{k-1})}|\overline{\beta}_{k-1}-\beta_{k-1}|+C_{2}h,$$

$$e^{-C_{1}t_{k}}|\overline{\beta}_{k}-\beta_{k}|\leq e^{-C_{1}t_{k-1}}|\overline{\beta}_{k-1}-\beta_{k-1}|+C_{2}h.$$

$$|\overline{\beta}_{n}-\beta_{n}|\leq e^{C_{1}t_{n}}nC_{2}h.$$

$$\left\{\begin{array}{l}\frac{dy(t)}{dt}=f_{\theta}(y(t)),\\ y(0)=\nu,\end{array}\right.$$

$$\left\{\begin{array}{l}\overline{y}_{i+1}(x)=\overline{y}_{i}(x)+hf_{\theta}(\overline{y}_{i}(x)),\\ \overline{y}_{0}(x)=\nu,\end{array}\right.$$

$$\overline{\phi}_{\theta}(t,\nu):=\overline{y}_{i}(x)+(t-\overline{t}_{i})f_{\theta}(\overline{y}_{i}(x)),\quad\forall t\in[\overline{t}_{i},\overline{t}_{i+1}].$$


Taxonomy

Topics: Probability and Risk Models · Stochastic processes and financial applications · Markov Chains and Monte Carlo Methods

Full text

Thinning and Multilevel Monte Carlo for Piecewise Deterministic (Markov) Processes.

Application to a stochastic Morris-Lecar model.

Vincent Lemaire, Michèle Thieullen, Nicolas Thomas

Laboratoire de Probabilités, Statistique et Modélisation (LPSM), UMR CNRS 8001, Sorbonne Université-Campus Pierre et Marie Curie, Case 158, 4 place Jussieu, F-75252 Paris Cedex 5, France

Abstract

In the first part of this paper we study approximations of trajectories of Piecewise Deterministic Processes (PDP) when the flow is not explicit by the thinning method. We also establish a strong error estimate for PDPs as well as a weak error expansion for Piecewise Deterministic Markov Processes (PDMP). These estimates are the building blocks of the Multilevel Monte Carlo (MLMC) method which we study in the second part. The coupling required by MLMC is based on the thinning procedure. In the third part we apply these results to a 2-dimensional Morris-Lecar model with stochastic ion channels. In the range of our simulations the MLMC estimator outperforms the classical Monte Carlo one.

Keywords: Piecewise Deterministic (Markov) Processes, Multilevel Monte Carlo, Thinning, Strong error estimate, Weak error expansion, Morris-Lecar model.

Mathematics Subject Classification: 65C05, 65C20, 60G55, 60J25, 68U20

1 Introduction

In this paper we are interested in the approximation of the trajectories of PDPs. We establish strong error estimates for a PDP and a weak error expansion for a PDMP. Then we study the application of the Multilevel Monte Carlo (MLMC) method in order to approximate expectations of functionals of PDMPs. Our motivation comes from Neuroscience where the whole class of stochastic conductance-based neuron models can be interpreted as PDMPs. The response of a neuron to a stimulus, called neural coding, is considered relevant information for understanding the functional properties of such excitable cells. Thus many quantities of interest such as mean first spike latency, mean interspike intervals and mean firing rate can be modelled as expectations of functionals of PDMPs.

PDPs have been introduced by Davis in [5] as a general class of stochastic processes characterized by a deterministic evolution between two successive random times. In the case where the deterministic evolution part follows a family of Ordinary Differential Equations (ODEs) the corresponding PDP enjoys the Markov property and is called a PDMP. The distribution of a PDMP is thus determined by three parameters called the characteristics of the PDMP: a family of vector fields, a jump rate (intensity function) and a transition measure.

We consider first a general PDP $(x_{t})$ which is not necessarily Markov on a finite time interval $[0,T]$ for which the flow is not explicitly solvable. Approximating its flows by the classical Euler scheme and using our previous work [22], we build a thinning algorithm which provides us with an exact simulation of an approximation of $(x_{t})$ that we denote $(\overline{x}_{t})$ . The process $(\overline{x}_{t})$ is a PDP constructed by thinning of a homogeneous Poisson process which enjoys explicitly solvable flows.

Actually this thinning construction provides a whole family of approximations indexed by the time step $h>0$ of the Euler scheme. We prove that for any real valued smooth function $F$ the following strong estimate holds

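$$\exists V_{1}>0,\ V_{2}>0,\quad \mathbb{E}\left[|F(\overline{x}_{T})-F(x_{T})|^{2}\right]\leq V_{1}h+V_{2}h^{2}. \tag{1}$$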

Moreover if $(x_{t})$ is a PDMP the following weak error expansion holds

$$\exists c_{1}>0,\quad \mathbb{E}[F(\overline{x}_{T})]-\mathbb{E}[F(x_{T})]=c_{1}h+o(h^{2}). \tag{2}$$

The estimate (1) is mainly based on the construction of the couple $(x_{t},\overline{x}_{t})$ and on the fact that the Euler scheme is of order 1. This is why it is valid for a general PDP and its Euler scheme. On the contrary, the estimate (2) relies on properties which are specific to PDMPs such as the Feynman-Kac formula.

The MLMC method relies simultaneously on estimates (1) and (2), which is why we study its application to the PDMP framework instead of the more general PDP one. MLMC extends the classical Monte Carlo (MC) method, which is a very general approach to estimate expectations using stochastic simulations. The complexity (i.e., the number of operations necessary in the simulation) associated with an MC estimation can be prohibitive, especially when the complexity of an individual random sample is very high. Unlike the classical MC method, MLMC relies on repeated independent random samplings taken on different levels of accuracy. MLMC can then greatly reduce the complexity of the classical MC by performing most simulations with low accuracy but with low complexity and only few simulations with high accuracy at high complexity. MLMC was introduced by S. Heinrich in [18] and developed by M. Giles in [12]. The MLMC estimator has been efficiently used in various fields of numerical probability such as SDEs [12], Markov chains [1], [2], [14], Lévy processes [10], jump diffusions [28], [7], [8] or nested Monte Carlo [21], [13]. See [11] for more references. To the best of our knowledge, application of MLMC to PDMPs has not been considered.

For the sake of clarity, we describe here the general improvement of MLMC. We are interested in the estimation of $\mathbb{E}[X]$ where $X$ is a real valued square integrable random variable on a probability space $\left(\Omega,\mathcal{F},\mathbb{P}\right)$. When $X$ can be simulated exactly, the classical MC estimator $Y=(1/N)\sum_{k=1}^{N}X^{k}$, with $X^{k},k\geq 1$ independent random variables identically distributed as $X$, provides an unbiased estimator. The associated $L^{2}$-error satisfies $\parallel Y-\mathbb{E}[X]\parallel_{2}^{2}=\text{Var}(Y)=\frac{1}{N}\text{Var}(X)$. If we quantify the precision by the $L^{2}$-error, then a user-prescribed precision $\epsilon^{2}>0$ is achieved for $N=O(\epsilon^{-2})$ so that in this case the global complexity is of order $O(\epsilon^{-2})$.

Assume now that $X$ cannot be simulated exactly (or cannot be simulated at a reasonable cost) and that we can build a family of real valued random variables $(X_{h},h>0)$ on $\left(\Omega,\mathcal{F},\mathbb{P}\right)$ which converges weakly and strongly to $X$ as $h\rightarrow 0$ in the following sense

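$$\exists c_{1}>0,\ \alpha>0,\quad \mathbb{E}[X_{h}]-\mathbb{E}[X]=c_{1}h^{\alpha}+o(h^{2\alpha}), \tag{3}$$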

and

$$\exists V_{1}>0,\ \beta>0,\quad \mathbb{E}\left[|X_{h}-X|^{2}\right]\leq V_{1}h^{\beta}. \tag{4}$$

Assume moreover that for $h>0$ the random variable $X_{h}$ can be simulated at a reasonable complexity (the complexity increases as $h\rightarrow 0$ ). The classical MC estimator now consists in a sequence of random variables

$$Y=\frac{1}{N}\sum_{k=1}^{N}X_{h}^{k}, \tag{5}$$

where $X^{k}_{h},k\geq 1$ are independent random variables identically distributed as $X_{h}$. The bias and the variance of the estimator (5) are respectively given by $\mathbb{E}[Y]-\mathbb{E}[X]=\mathbb{E}[X_{h}]-\mathbb{E}[X]\simeq c_{1}h^{\alpha}$ and $\text{Var}(Y)=\frac{1}{N}\text{Var}(X_{h})$. From the strong estimate (4) we have that $\text{Var}(X_{h})\rightarrow\text{Var}(X)$ as $h\rightarrow 0$ so that $\text{Var}(X_{h})$ is asymptotically a constant independent of $h$. If as above we quantify the precision by the $L^{2}$-error and use that $\parallel Y-\mathbb{E}[X]\parallel_{2}^{2}=(\mathbb{E}[Y]-\mathbb{E}[X])^{2}+\text{Var}(Y)$, we obtain that the estimator (5) achieves a user-prescribed precision $\epsilon^{2}>0$ for $h=O(\epsilon^{1/\alpha})$ and $N=O(\epsilon^{-2})$ so that the global complexity of the estimator is now $O(\epsilon^{-2-\frac{1}{\alpha}})$.
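The scaling $O(\epsilon^{-2-\frac{1}{\alpha}})$ can be checked with a few lines of Python. The sketch below is only an illustration: the function name `mc_parameters`, the equal split of $\epsilon^{2}$ between squared bias and variance, and the cost model "cost of one sample proportional to $1/h$" are our assumptions, not choices taken from the paper. With $\alpha=1$, halving $\epsilon$ should multiply the cost by $2^{3}=8$, which is roughly the growth seen from one row to the next in the cost column of Table 2.

```python
import math


def mc_parameters(eps, c1, alpha, var_x):
    """Pick (h, N) so that squared bias and variance each use half of eps^2.

    The equal split and the cost model N / h are illustrative assumptions.
    """
    h = (eps / (math.sqrt(2.0) * abs(c1))) ** (1.0 / alpha)  # (c1 h^alpha)^2 = eps^2 / 2
    n = math.ceil(2.0 * var_x / eps ** 2)                    # Var(X) / N <= eps^2 / 2
    return h, n, n / h


for k in range(1, 6):
    h, n, cost = mc_parameters(2.0 ** -k, c1=1.0, alpha=1.0, var_x=1.0)
    print(k, h, n, cost)
```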

The MLMC method takes advantage of the estimate (4) in order to reduce the global complexity. Let us fix $L\geq 2$ and consider for $l\in\{1,\ldots,L\}$ a geometrically decreasing sequence $(h_{l},1\leq l\leq L)$ where $h_{l}=h^{*}M^{-(l-1)}$ for fixed $h^{*}>0$ and $M>1$ . The indexes $l$ are called the levels of the MLMC and the complexity of $X_{h_{l}}$ increases as the level increases. Thanks to the weak expansion (3), the quantity $\mathbb{E}[X_{h_{L}}]$ approximates $\mathbb{E}[X]$ . Using the linearity of the expectation the quantity $\mathbb{E}[X_{h_{L}}]$ can be decomposed over the levels $l\in\{1,\ldots,L\}$ as follows

$$\mathbb{E}[X_{h_{L}}]=\mathbb{E}[X_{h^{*}}]+\sum_{l=2}^{L}\mathbb{E}[X_{h_{l}}-X_{h_{l-1}}]. \tag{6}$$

For each level $l\in\{1,\ldots,L\}$ , a classical MC estimator is used to approximate $\mathbb{E}[X_{h_{l}}-X_{h_{l-1}}]$ and $\mathbb{E}[X_{h^{*}}]$ . At each level, a number $N_{l}\geq 1$ of samples are required and the key point is that the random variables $X_{h_{l}}$ and $X_{h_{l-1}}$ are assumed to be correlated in order to make the variance of $X_{h_{l}}-X_{h_{l-1}}$ small. Considering at each level $l=2,\ldots,L$ independent couples $(X_{h_{l}},X_{h_{l-1}})$ of correlated random variables, the MLMC estimator then reads

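$$Y=\frac{1}{N_{1}}\sum_{k=1}^{N_{1}}X_{h^{*}}^{k}+\sum_{l=2}^{L}\frac{1}{N_{l}}\sum_{k=1}^{N_{l}}\left(X_{h_{l}}^{k}-X_{h_{l-1}}^{k}\right), \tag{7}$$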

where $(X_{h^{*}}^{k},k\geq 1)$ is a sequence of independent and identically distributed random variables distributed as $X_{h^{*}}$ and $\left((X^{k}_{h_{l}},X^{k}_{h_{l-1}}),k\geq 1\right)$ for $l=2,\ldots,L$ are independent sequences of independent copies of $(X_{h_{l}},X_{h_{l-1}})$ and independent of $(X_{h^{*}}^{k})$. It is known, see [12] or [21], that given a precision $\epsilon>0$ and provided that the family $(X_{h},h>0)$ satisfies the strong and weak error estimates (4) and (3), the multilevel estimator (7) achieves a precision $\parallel Y-\mathbb{E}[X]\parallel_{2}^{2}=\epsilon^{2}$ with a global complexity of order $O(\epsilon^{-2})$ if $\beta>1$, $O(\epsilon^{-2}(\log(\epsilon))^{2})$ if $\beta=1$ and $O(\epsilon^{-2-(1-\beta)/\alpha})$ if $\beta<1$. This complexity result shows the importance of the parameter $\beta$. Finally, let us mention that in the case $\beta>1$ it is possible to build an unbiased multilevel estimator, see [15].
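The structure of the multilevel estimator, a coarse average plus independently sampled correction terms on the grid $h_{l}=h^{*}M^{-(l-1)}$, can be sketched generically as follows. This is a minimal illustration, not the implementation used for the experiments: the function names and the toy family $X_{h}=Z+hZ^{2}$ with $Z$ standard normal (for which $\mathbb{E}[X_{h}]=h$, so $\alpha=1$, and $\beta=2$) are our assumptions, chosen only because the coupling is trivial (both levels reuse the same $Z$).

```python
import random


def mlmc_estimate(sample_single, sample_pair, h_star, M, L, n_samples, rng):
    """Multilevel estimator with steps h_l = h_star * M**-(l-1); n_samples[l-1] = N_l."""
    n1 = n_samples[0]
    estimate = sum(sample_single(h_star, rng) for _ in range(n1)) / n1
    for level in range(2, L + 1):
        h_fine = h_star * M ** (-(level - 1))
        h_coarse = h_star * M ** (-(level - 2))
        n_l = n_samples[level - 1]
        correction = 0.0
        for _ in range(n_l):
            # the two components must be correlated so the difference has small variance
            x_fine, x_coarse = sample_pair(h_fine, h_coarse, rng)
            correction += x_fine - x_coarse
        estimate += correction / n_l
    return estimate


def toy_single(h, rng):
    z = rng.gauss(0.0, 1.0)
    return z + h * z * z


def toy_pair(h_fine, h_coarse, rng):
    z = rng.gauss(0.0, 1.0)  # shared randomness couples the two levels
    return z + h_fine * z * z, z + h_coarse * z * z


rng = random.Random(0)
y = mlmc_estimate(toy_single, toy_pair, h_star=0.5, M=2, L=4,
                  n_samples=[4000, 1000, 1000, 1000], rng=rng)
print(y)  # estimates E[X_{h_L}] = h_L = 0.0625, up to Monte Carlo noise
```

Most samples are spent on the cheap coarse level, while the correction levels need fewer samples because the variance of each difference shrinks with the step size.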

Estimates (1) and (2) suggest investigating the use of the MLMC method in the PDMP framework with $\beta=1$ and $\alpha=1$. Letting $X=F(x_{T})$ and $X_{h}=F(\overline{x}_{T})$ for $h>0$ and $F$ a smooth function, we define an MLMC estimator of $\mathbb{E}[F(x_{T})]$ just as in (7) (denoted $Y^{\text{MLMC}}$ in the paper) where the processes involved at the level $l$ are correlated by thinning. Since these processes are constructed using two different time steps, the probability of accepting a proposed jump time differs from one process to the other. Moreover the discrete components of the post-jump locations may also be different. This results in the presence of the term $V_{1}h$ in the estimate (1). In order to improve the convergence rate (to increase the parameter $\beta$) in (1), we show that for a given PDMP $(x_{t})$ we have the following auxiliary representation

$$\mathbb{E}[F(x_{T})]=\mathbb{E}[F(\tilde{x}_{T})\tilde{R}_{T}]. \tag{8}$$

The PDMP $(\tilde{x}_{t})$ and its Euler scheme are such that their discrete components jump at the same times and in the same state. $(\tilde{R}_{t})$ is a process which depends on $(\tilde{x}_{t},t\in[0,T])$. The representation (8) is inspired by the change of probability introduced in [28] and is actually valid for a general PDP (Proposition 2.2) so that $\mathbb{E}[F(\overline{x}_{T})]=\mathbb{E}[F(\underline{\tilde{x}}_{T})\underline{\tilde{R}}_{T}]$ where $(\underline{\tilde{x}}_{t})$ is the Euler scheme corresponding to $(\tilde{x}_{t})$ and $(\underline{\tilde{R}}_{t})$ is a process which depends on $(\underline{\tilde{x}}_{t},t\in[0,T])$. Letting $X=F(\tilde{x}_{T})\tilde{R}_{T}$ and $X_{h}=F(\underline{\tilde{x}}_{T})\underline{\tilde{R}}_{T}$ we define a second MLMC estimator (denoted $\tilde{Y}^{\text{MLMC}}$) where now the discrete components of the Euler schemes $(\underline{\tilde{x}}_{t})$ involved at the level $l$ always jump in the same states and at the same times. To sum up, the first MLMC estimator we consider ($Y^{\text{MLMC}}$) derives from (6) where the corrective term at level $l$ is $\mathbb{E}[F(\overline{x}_{T}^{h_{l}})-F(\overline{x}_{T}^{h_{l-1}})]$ whereas the corrective term of the second estimator ($\tilde{Y}^{\text{MLMC}}$) is $\mathbb{E}[F(\underline{\tilde{x}}_{T}^{h_{l}})\underline{\tilde{R}}_{T}^{h_{l}}-F(\underline{\tilde{x}}_{T}^{h_{l-1}})\underline{\tilde{R}}_{T}^{h_{l-1}}]$. For readability, we no longer write the dependence of the approximations on the time step. For the processes $(F(\underline{\tilde{x}}_{t})\underline{\tilde{R}}_{t})$ and $(F(\tilde{x}_{t})\tilde{R}_{t})$ we show the following strong estimate

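$$\exists\tilde{V}_{1}>0,\quad \mathbb{E}\left[|F(\underline{\tilde{x}}_{T})\underline{\tilde{R}}_{T}-F(\tilde{x}_{T})\tilde{R}_{T}|^{2}\right]\leq\tilde{V}_{1}h^{2},$$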

so that we end up with $\beta=2$ and the complexity decreases from $O(\epsilon^{-2}(\log(\epsilon))^{2})$ to $O(\epsilon^{-2})$.

As an application we consider the PDMP version of the 2-dimensional Morris-Lecar model, see [25], which takes into account the precise description of the ionic channels and in which the flows are not explicit. Let us mention [3] for the application of quantitative bounds for the long time behavior of PDMPs to a stochastic 3-dimensional Morris-Lecar model. The original deterministic Morris-Lecar model was introduced in [23] to account for various oscillating states in the barnacle giant muscle fiber. Because of its low dimension, this model is among the favourite conductance-based models in computational Neuroscience. Furthermore, this model is particularly interesting because it reproduces some of the main features of the response of excitable cells, such as the shape, amplitude and threshold of the action potential and the refractory period. We compare the classical MC and the MLMC estimators on the 2-dimensional stochastic Morris-Lecar model to estimate the mean value of the membrane potential at a fixed time. It turns out that in the range of our simulations the MLMC estimator outperforms the MC one. This suggests that MLMC estimators can be used successfully in the framework of PDMPs.

As mentioned above, quantities of interest such as mean first spike latency, mean interspike intervals and mean firing rate can be modelled as expectations of path-dependent functionals of PDMPs. This setting can then be considered as a natural extension of this work.

The paper is organised as follows. In section 2, we construct a general PDP by thinning and we give a representation of its distribution in terms of the thinning data (Proposition 1). In section 3, we establish strong error estimates (Theorems 1-2). In section 4, we establish a weak error expansion (Theorem 3). In section 5, we compare the efficiency of the classical and the multilevel Monte Carlo estimators on the 2-dimensional stochastic Morris-Lecar model.

2 Piecewise Deterministic Process by thinning

2.1 Construction

In this section we introduce the setting and recall some results on the thinning method from our previous paper [22]. Let $E:=\Theta\times\mathbb{R}^{d}$ where $\Theta$ is a finite or countable set and $d\geq 1$ . A piecewise deterministic process (PDP) is defined from the following characteristics

• a family of functions $\left(\Phi_{\theta}\right)_{\theta\in\Theta}$ such that $\Phi_{\theta}:\mathbb{R}_{+}\times\mathbb{R}^{d}\rightarrow\mathbb{R}^{d}$ for all $\theta\in\Theta$,

• a measurable function $\lambda:E\rightarrow]0,+\infty[$,

• a transition measure $Q:E\times\mathcal{B}(E)\rightarrow[0,1]$.

We denote by $x=(\theta,\nu)$ a generic element of $E$ . We only consider PDPs with continuous $\nu$ -component so that for $A\in\mathcal{B}(\Theta)$ and $B\in\mathcal{B}(\mathbb{R}^{d})$ , we write

$$Q(x,A\times B)=Q(x,A)\delta_{\nu}(B).$$

If we write $x=(\theta_{x},\nu_{x})$ , then it holds that

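$$Q((\theta_{x},\Phi_{\theta_{x}}(t,\nu_{x})),d\theta\,d\nu)=Q((\theta_{x},\Phi_{\theta_{x}}(t,\nu_{x})),d\theta)\,\delta_{\Phi_{\theta_{x}}(t,\nu_{x})}(d\nu).$$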

Our results do not depend on the dimension of the variable in $\mathbb{R}^{d}$ so we restrict ourselves to $\mathbb{R}$ ($d=1$) for readability. We work under the following assumption

Assumption 2.1.

There exists $\lambda^{*}<+\infty$ such that, for all $x\in E$ , $\lambda(x)\leq\lambda^{*}$ .

In [22] we considered a general upper bound $\lambda^{*}$ . In the present paper $\lambda^{*}$ is constant (see Assumption 2.1). Let $\left(\Omega,\mathcal{F},\mathbb{P}\right)$ be a probability space on which we define

1. a homogeneous Poisson process $(N^{*}_{t},t\geq 0)$ with intensity $\lambda^{*}$ (given in Assumption 2.1) whose successive jump times are denoted $\left(T^{*}_{k},k\geq 1\right)$. We set $T^{*}_{0}=0$.

2. two sequences of iid random variables with uniform distribution on $[0,1]$, $(U_{k},k\geq 1)$ and $(V_{k},k\geq 1)$, independent of each other and independent of $\left(T^{*}_{k},k\geq 1\right)$.

Given $T>0$ we construct iteratively the sequence of jump times and post-jump locations $(T_{n},(\theta_{n},\nu_{n}),n\geq 0)$ of the $E$ -valued PDP $(x_{t},t\in[0,T])$ that we want to obtain in the end using its characteristics $\left(\Phi,\lambda,Q\right)$ . Let $(\theta_{0},\nu_{0})\in E$ be fixed and let $T_{0}=0$ . We construct $T_{1}$ by thinning of $(T^{*}_{k})$ , that is

$$T_{1}:=T^{*}_{\tau_{1}}, \tag{10}$$

where

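$$\tau_{1}:=\inf\left\{k>0:U_{k}\lambda^{*}\leq\lambda(\theta_{0},\Phi_{\theta_{0}}(T^{*}_{k},\nu_{0}))\right\}. \tag{11}$$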

We denote by $|\Theta|$ the cardinal of $\Theta$ (which may be infinite) and we set $\Theta=\{k_{1},\ldots,k_{|\Theta|}\}$ . For $j\in\{1,\ldots,|\Theta|\}$ we introduce the functions $a_{j}$ defined on $E$ by

$$a_{j}(x):=\sum_{i=1}^{j}Q(x,\{k_{i}\}),\quad\forall x\in E.$$

By convention, we set $a_{0}:=0$ . We also introduce the function $H$ defined by

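$$H(x,u):=\sum_{i=1}^{|\Theta|}k_{i}\mathds{1}_{a_{i-1}(x)<u\leq a_{i}(x)},\quad\forall x\in E,\ \forall u\in[0,1].$$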

For all $x\in E$ , $H(x,.)$ is the inverse of the cumulative distribution function of $Q(x,.)$ (see for example [9]). Then, we construct $(\theta_{1},\nu_{1})$ from the uniform random variable $V_{1}$ and the function $H$ as follows

[TABLE]

Thus, the distribution of $(\theta_{1},\nu_{1})$ given $\left(\tau_{1},(T^{*}_{k})_{k\leq\tau_{1}}\right)$ is $Q((\theta_{0},\Phi_{\theta_{0}}(T^{*}_{\tau_{1}},\nu_{0})),.)$ or in view of (9),

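$$\sum_{k\in\Theta}Q((\theta_{0},\Phi_{\theta_{0}}(T^{*}_{\tau_{1}},\nu_{0})),\{k\})\,\delta_{(k,\Phi_{\theta_{0}}(T^{*}_{\tau_{1}},\nu_{0}))}.$$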
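The role of the function $H$, the generalized inverse of the cumulative distribution function of $Q(x,.)$ on the discrete component, can be illustrated by a short Python sketch. The function name `inverse_cdf` and its argument names are ours: `states` plays the role of the enumeration $k_{1},k_{2},\ldots$ and `probs` the role of the weights $Q(x,\{k_{i}\})$ at a fixed $x$, and we assume a finite $\Theta$.

```python
def inverse_cdf(states, probs, u):
    """Return the state k_i such that a_{i-1} < u <= a_i, where a_i = p_1 + ... + p_i."""
    lower = 0.0
    for k, p in zip(states, probs):
        upper = lower + p
        if lower < u <= upper:
            return k
        lower = upper
    # u == 0, or rounding in the cumulative sums: fall back to the last state
    return states[-1]


states = ["closed", "open", "inactivated"]  # hypothetical labels for Theta
probs = [0.25, 0.5, 0.25]
for u in (0.1, 0.25, 0.6, 0.9):
    print(u, inverse_cdf(states, probs, u))
```

Feeding a uniform random variable on $[0,1]$ into this function returns a state distributed according to `probs`, which is how the discrete post-jump component is drawn from $V_{1}$.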

For $n>1$ , assume that $\left(\tau_{n-1},(T^{*}_{k})_{k\leq\tau_{n-1}},(\theta_{n-1},\nu_{n-1})\right)$ is constructed. Then, we construct $T_{n}$ by thinning of $(T^{*}_{k})$ conditionally to $\left(\tau_{n-1},(T^{*}_{k})_{k\leq\tau_{n-1}},(\theta_{n-1},\nu_{n-1})\right)$ , that is

$$T_{n}:=T^{*}_{\tau_{n}},$$

where

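$$\tau_{n}:=\inf\left\{k>\tau_{n-1}:U_{k}\lambda^{*}\leq\lambda(\theta_{n-1},\Phi_{\theta_{n-1}}(T^{*}_{k}-T^{*}_{\tau_{n-1}},\nu_{n-1}))\right\}.$$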

Then, we construct $(\theta_{n},\nu_{n})$ using the uniform random variable $V_{n}$ and the function $H$ as follows

[TABLE]

We define the PDP $x_{t}$ for all $t\in[0,T]$ from the process $(T_{n},(\theta_{n},\nu_{n}))$ by

$$x_{t}:=(\theta_{n},\Phi_{\theta_{n}}(t-T_{n},\nu_{n})),\quad t\in[T_{n},T_{n+1}[.$$

Thus, $x_{T_{n}}=(\theta_{n},\nu_{n})$ and $x^{-}_{T_{n}}=(\theta_{n-1},\nu_{n})$ . We also define the counting process associated to the jump times $N_{t}:=\sum_{n\geq 1}\mathds{1}_{T_{n}\leq t}$ .
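The whole construction of this section can be summarised by the following Python sketch, given here only as an illustration under stated assumptions. All names (`simulate_pdp`, `state_at`, the toy characteristics) are ours. The toy model has $\Theta=\{0,1\}$, explicit flows relaxing towards 0 or 1, a jump rate bounded by $\lambda^{*}=2$ and a kernel that flips the discrete component. For brevity a single random generator produces the proposals, the $U_{k}$ and the $V_{n}$, which gives the right distribution but is not the bookkeeping one would use to couple two processes.

```python
import math
import random


def inverse_cdf(states, probs, u):
    """H(x, u) for a finite set of states with weights probs."""
    lower = 0.0
    for k, p in zip(states, probs):
        upper = lower + p
        if lower < u <= upper:
            return k
        lower = upper
    return states[-1]


def simulate_pdp(flow, rate, kernel, lam_star, theta0, nu0, T, rng):
    """Return [(T_n, theta_n, nu_n)], the jump times and post-jump locations on [0, T]."""
    t_n, theta, nu = 0.0, theta0, nu0
    skeleton = [(t_n, theta, nu)]
    t_star = rng.expovariate(lam_star)            # first proposal T*_1
    while t_star <= T:
        u = rng.random()                          # U_k
        pos = flow(theta, t_star - t_n, nu)       # Phi_theta(T*_k - T_n, nu_n)
        if u * lam_star <= rate(theta, pos):      # proposal accepted
            states, probs = kernel(theta, pos)
            theta = inverse_cdf(states, probs, rng.random())  # uses V_n
            t_n, nu = t_star, pos                 # the continuous component does not jump
            skeleton.append((t_n, theta, nu))
        t_star += rng.expovariate(lam_star)       # next proposal
    return skeleton


def state_at(skeleton, flow, t):
    """x_t = (theta_n, Phi_theta_n(t - T_n, nu_n)) for T_n <= t < T_{n+1}."""
    t_n, theta, nu = [s for s in skeleton if s[0] <= t][-1]
    return theta, flow(theta, t - t_n, nu)


# Toy characteristics (assumptions made for this illustration only).
def toy_flow(theta, t, nu):
    return nu * math.exp(-t) if theta == 0 else 1.0 - (1.0 - nu) * math.exp(-t)


def toy_rate(theta, nu):
    return 1.0 + nu                               # stays in ]0, 2] while nu is in [0, 1]


def toy_kernel(theta, nu):
    return [0, 1], ([0.0, 1.0] if theta == 0 else [1.0, 0.0])


path = simulate_pdp(toy_flow, toy_rate, toy_kernel, 2.0, 0, 0.5, 10.0, random.Random(1))
print(len(path) - 1, "jumps; x_T =", state_at(path, toy_flow, 10.0))
```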

2.2 Approximation of a PDP

In applications we may not know explicitly the functions $\Phi_{\theta}$. In this case, we use a numerical scheme $\overline{\Phi}_{\theta}$ approximating $\Phi_{\theta}$. In this paper, we consider schemes such that there exist positive constants $C_{1}$ and $C_{2}$ independent of $h$ and $\theta$ such that

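$$\sup_{t\in[0,T]}|\Phi_{\theta}(t,\nu_{1})-\overline{\Phi}_{\theta}(t,\nu_{2})|\leq e^{C_{1}T}|\nu_{1}-\nu_{2}|+C_{2}h,\quad\forall\theta\in\Theta,\ \forall(\nu_{1},\nu_{2})\in\mathbb{R}^{2}. \tag{14}$$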

To the family $(\overline{\Phi}_{\theta})$ we can associate a PDP constructed as above that we denote $(\overline{x}_{t})$. We emphasize that there is a positive probability that $(x_{t})$ and $(\overline{x}_{t})$ jump at different times and/or in different states even if they are both constructed from the same data $(N^{*}_{t})$, $(U_{k})$ and $(V_{k})$. However, if the characteristics $(\Phi,\tilde{\lambda},\tilde{Q})$ of a PDP $(\tilde{x}_{t})$ are such that $\tilde{\lambda}$ and $\tilde{Q}$ depend only on $\theta$, that is $\tilde{\lambda}(x)=\tilde{\lambda}(\theta)$ and $\tilde{Q}(x,.)=\tilde{Q}(\theta,.)$ for all $x=(\theta,\nu)\in E$, then its embedded Markov chain $(\tilde{T}_{n},(\tilde{\theta}_{n},\tilde{\nu}_{n}),n\geq 0)$ is such that $(\tilde{\theta}_{n},n\geq 0)$ is an autonomous Markov chain with kernel $\tilde{Q}$ and $(\tilde{T}_{n},n\geq 0)$ is a counting process with intensity $\tilde{\lambda}_{t}=\sum_{n\geq 0}\tilde{\lambda}(\tilde{\theta}_{n})\mathds{1}_{\tilde{T}_{n}\leq t<\tilde{T}_{n+1}}$. In particular, neither $(\tilde{\theta}_{n})$ nor $(\tilde{\tau}_{n})$ depends on $\Phi$. The particular form of the characteristics $\tilde{\lambda}$ and $\tilde{Q}$ implies that the PDP $(\tilde{x}_{t})$ and its approximation $(\underline{\tilde{x}_{t}})$ are correlated via the same process $(\tilde{\tau}_{n},\tilde{\theta}_{n})$. In other words, these processes always jump exactly at the same times and their $\theta$-components always jump in the same states. Such processes $(\tilde{x}_{t})$ are easier to handle theoretically as well as numerically than the general case. They will be useful for us in the sequel.

The following lemma (which is important for several proofs below) gives a direct consequence of the estimate (14).

Lemma 2.1.

Let $(\Phi_{\theta})$ and $(\overline{\Phi}_{\theta})$ satisfy (14). Let $(t_{n},n\geq 0)$ be an increasing sequence of non-negative real numbers with $t_{0}=0$ and let $(\alpha_{n},n\geq 0)$ be a sequence of $\Theta$-valued components. For a given $\nu\in\mathbb{R}$ let us define iteratively the sequences $(\beta_{n},n\geq 0)$ and $(\overline{\beta}_{n},n\geq 0)$ as follows

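$$\left\{\begin{array}{l}\beta_{n}=\Phi_{\alpha_{n-1}}(t_{n}-t_{n-1},\beta_{n-1}),\\ \beta_{0}=\nu,\end{array}\right.\quad\text{and}\quad\left\{\begin{array}{l}\overline{\beta}_{n}=\overline{\Phi}_{\alpha_{n-1}}(t_{n}-t_{n-1},\overline{\beta}_{n-1}),\\ \overline{\beta}_{0}=\nu.\end{array}\right.$$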

Then, for all $n\geq 1$ we have

$$|\overline{\beta}_{n}-\beta_{n}|\leq e^{C_{1}t_{n}}nC_{2}h,$$

where $C_{1}$ and $C_{2}$ are positive constants independent of $h$ .

Proof of Lemma 2.1.

Let $n\geq 1$ . From the estimate (14), we have for all $k\leq n$

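$$|\overline{\beta}_{k}-\beta_{k}|\leq e^{C_{1}(t_{k}-t_{k-1})}|\overline{\beta}_{k-1}-\beta_{k-1}|+C_{2}h,$$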

and therefore

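$$e^{-C_{1}t_{k}}|\overline{\beta}_{k}-\beta_{k}|\leq e^{-C_{1}t_{k-1}}|\overline{\beta}_{k-1}-\beta_{k-1}|+C_{2}h.$$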

By summing up these inequalities for $1\leq k\leq n$ and since $\beta_{0}=\overline{\beta}_{0}$ we obtain

$$|\overline{\beta}_{n}-\beta_{n}|\leq e^{C_{1}t_{n}}nC_{2}h.$$

∎

2.3 Application to the construction of a PDMP and its associated Euler scheme

In this section we define a PDMP and its associated Euler scheme from the construction of section 2.1. We consider a family of vector fields $(f_{\theta},\theta\in\Theta)$ satisfying

Assumption 2.2.

For all $\theta\in\Theta$ , the function $f_{\theta}:\mathbb{R}\rightarrow\mathbb{R}$ is bounded and Lipschitz with constant $L$ independent of $\theta$ .

If we choose $\Phi_{\theta}=\phi_{\theta}$ in the above construction where for all $x=(\theta,\nu)\in E$ , we denote by $(\phi_{\theta}(t,\nu),t\geq 0)$ the unique solution of the ordinary differential equation (ODE)

$$\frac{dy(t)}{dt}=f_{\theta}(y(t)),\qquad y(0)=\nu,\tag{15}$$

then the corresponding PDP is Markov since $\phi$ satisfies the semi-group property which reads $\phi_{\theta}(t+s,\nu)=\phi_{\theta}(t,\phi_{\theta}(s,\nu))$ for all $t,s\geq 0$ and for all $(\theta,\nu)\in E$ . In this case, the process $(x_{t})$ is a piecewise deterministic Markov process (see [6] or [20]).

Let $h>0$. We approximate the solution of (15) by the Euler scheme with time step $h$. First, we define the Euler subdivision of $[0,+\infty[$ with time step $h$, denoted by $(\overline{t}_{i},i\geq 0)$, by $\overline{t}_{i}:=ih$.

Then, for all $x=(\theta,\nu)\in E$, we define the classical Euler scheme $(\overline{y}_{i}(x),i\geq 0)$ iteratively by

$$\overline{y}_{i+1}(x)=\overline{y}_{i}(x)+hf_{\theta}(\overline{y}_{i}(x)),\qquad\overline{y}_{0}(x)=\nu,$$

where the argument $x$ emphasizes its dependence on the initial condition. Finally, for all $x=(\theta,\nu)\in E$, we set

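$$\overline{\phi}_{\theta}(t,\nu):=\overline{y}_{i}(x)+(t-\overline{t}_{i})f_{\theta}(\overline{y}_{i}(x)),\qquad t\in[\overline{t}_{i},\overline{t}_{i+1}].\tag{16}$$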

We construct the approximating process $(\overline{x}_{t})$ as follows. Its continuous component starts from $\nu_{0}$ at time 0 and follows the flow $\overline{\phi}_{\theta_{0}}(t,\nu_{0})$ until the first jump time $\overline{T}_{1}$ that we construct by (10) and (11) of section 2.1 where we replace $\Phi_{\theta_{0}}(T^{*}_{k},\nu_{0})$ by $\overline{\phi}_{\theta_{0}}(T^{*}_{k},\nu_{0})$ . At time $\overline{T}_{1}$ the continuous component of $\overline{x}_{\overline{T}_{1}}$ is equal to $\overline{\phi}_{\theta_{0}}(\overline{T}_{1},\nu_{0}):=\overline{\nu}_{1}$ since there is no jump in the continuous component. The discrete component jumps to $\overline{\theta}_{1}$ . We iterate this procedure with the new flow $\overline{\phi}_{\overline{\theta}_{1}}(t-\overline{T}_{1},\overline{\nu}_{1})$ until the next jump time $\overline{T}_{2}$ given by (10) and (11) with $\overline{\phi}_{\overline{\theta}_{1}}(T^{*}_{k}-\overline{T}_{1},\overline{\nu}_{1})$ and so on. We proceed by iteration to construct $(\overline{x}_{t})$ on $[0,T]$ .

Consequently, the discretisation grid for $(\overline{x}_{t})$ on the interval $[0,T]$ is random and is formed by the points $\overline{T}_{n}+kh$ for $n=0,\ldots,\overline{N}_{T}$ and $k=0,\ldots,\lfloor(\overline{T}_{n+1}\wedge T-\overline{T}_{n})/h\rfloor$ . This differs from the SDE case where the classical grid is fixed.
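The construction above can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' implementation: it draws candidate times from a Poisson process of rate $\lambda^{*}$, accepts a candidate when $U_{k}\lambda^{*}\leq\lambda(\theta,\overline{\phi}_{\theta}(\cdot,\nu))$, and restarts the Euler grid at each accepted jump time. The vector field $f_{\theta}(\nu)=\theta-\nu$, the intensity, the post-jump rule on $\Theta=\{0,1\}$ and all function names are assumptions chosen for the example.

```python
import random

def euler_flow(theta, t, nu, h=0.05):
    # Euler polygon for the toy vector field f_theta(nu) = theta - nu (an assumption);
    # t is the time elapsed since the last jump, so the grid restarts at each jump.
    n_full = int(t // h)
    for _ in range(n_full):
        nu = nu + h * (theta - nu)
    return nu + (t - n_full * h) * (theta - nu)

def simulate_by_thinning(flow, lam, post_jump, theta0, nu0, T, lam_star, rng):
    """Return (theta_T, nu_T, number of jumps) for one path on [0, T]."""
    theta, nu, t_jump = theta0, nu0, 0.0      # last post-jump state and jump time
    t_star, n_jumps = 0.0, 0                  # current candidate time T*_k
    while True:
        t_star += rng.expovariate(lam_star)   # next point of the Poisson process N*
        if t_star > T:
            break
        u = rng.random()                      # U_k, drawn for every candidate
        nu_cand = flow(theta, t_star - t_jump, nu)
        if u * lam_star <= lam(theta, nu_cand):               # candidate accepted
            theta = post_jump(theta, nu_cand, rng.random())   # V_k fed to the jump rule
            nu, t_jump = nu_cand, t_star      # the continuous component does not jump
            n_jumps += 1
    return theta, flow(theta, T - t_jump, nu), n_jumps

rng = random.Random(0)
flip = lambda theta, nu, v: 1 - theta         # toy jump rule on Theta = {0, 1}
lam = lambda theta, nu: 1.0 + nu              # toy intensity, bounded by lam_star = 3
print(simulate_by_thinning(euler_flow, lam, flip, 0, 0.0, 1.0, 3.0, rng))
```

Passing an exact flow instead of `euler_flow`, with the same random generator state, gives the coupled process built from the same data $(N^{*}_{t})$, $(U_{k})$, $(V_{k})$ as long as both paths accept the same candidates.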

By classical results of numerical analysis (see [17] for example), the continuous Euler scheme (16) (also called Euler polygon) satisfies estimate (14). If we choose $\Phi_{\theta}=\overline{\phi}_{\theta}$ in the above construction then the corresponding PDP $(\overline{x}_{t})$ is not Markov since the functions $\overline{\phi}_{\theta}(.,\nu)$ do not satisfy the semi-group property (see [20]).
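The failure of the semi-group property for the Euler polygon is easy to see on a small example. The sketch below uses the illustrative vector field $f(y)=-y$ (an assumption; it is not bounded, but this does not matter for the arithmetic) and compares one flow over a time $0.5$ with two successive flows over $0.25$, both with step $h=0.5$.

```python
def euler_polygon(f, t, nu, h):
    # Continuous Euler scheme at time t >= 0, started from nu at time 0.
    n_full = int(t // h)
    for _ in range(n_full):
        nu = nu + h * f(nu)
    return nu + (t - n_full * h) * f(nu)

f = lambda y: -y   # illustrative vector field (assumption)
h = 0.5
one_shot = euler_polygon(f, 0.5, 1.0, h)
two_legs = euler_polygon(f, 0.25, euler_polygon(f, 0.25, 1.0, h), h)
print(one_shot, two_legs)   # the two values differ
```

Restarting the scheme at an intermediate time changes the grid, hence the value, which is why the approximating PDP loses the Markov property.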

2.4 Thinning representation for the marginal distribution of a PDP

The sequence $(T_{n},(\theta_{n},\nu_{n}),n\geq 0)$ is an $\mathbb{R}_{+}\times E$ -valued Markov chain with respect to its natural filtration $\mathcal{F}_{n}$ and with kernel $K$ defined by

[TABLE]

For $n\geq 1$, the law of the random variable $T_{n}-T_{n-1}$ given $\mathcal{F}_{n-1}$ admits the density given for $t\geq 0$ by

[TABLE]

Classically the marginal distribution of $x_{t}$ is expressed using (13), the intensity $\lambda$ via (18) and the kernel $K$ (see (17)). Indeed for fixed $x_{0}=x\in E$ and for any bounded measurable function $g$ we can write,

[TABLE]

where $K^{0}:=\delta$ and $K^{n}=K\circ\ldots\circ K$ $n$ times, that is

[TABLE]

However since we have constructed $(x_{t})$ by thinning, we would prefer to express the distribution of $x_{t}$ using the upper bound $\lambda^{*}$ , the Poisson process $(N^{*}_{t},t\geq 0)$ and the sequences $(U_{k},k\in{\mathbb{N}})$ , $(V_{k},k\in{\mathbb{N}})$ .

Proposition 2.1.

Let $(x_{t},t\in[0,T])$ be a PDP with characteristics $(\Phi,\lambda,Q)$ constructed in section 2.1 and let $n\in\mathbb{N}$ . Then

[TABLE]

The following proposition and its corollaries will be useful in section 3. In their statements $(x_{t},t\in[0,T])$ and $(\tilde{x}_{t},t\in[0,T])$ are PDPs constructed in section 2.1 using the same data $(N^{*}_{t})$ , $(U_{k})$ , $(V_{k})$ and the same initial point $x\in E$ but with different sets of characteristics.

The following results are inspired by the change of probability introduced in [28] where the authors are interested in the application of the MLMC to jump-diffusion SDEs with state-dependent intensity. In our case, we need a change of probability which guarantees not only that the processes jump at the same times but also in the same states.

Proposition 2.2.

Let us denote by $(\Phi,\lambda,Q)$ (resp. $(\Phi,\tilde{\lambda},\tilde{Q})$) the characteristics of $(x_{t})$ (resp. $(\tilde{x}_{t})$). Let us assume that $\tilde{\lambda}$ and $\tilde{Q}$ depend only on $\theta$, that $\tilde{Q}$ is always positive and that $0<\tilde{\lambda}(\theta)<\lambda^{*}$ for all $\theta\in\Theta$. For every integer $n$, let us define on the event $\{\tilde{N}_{t}=n\}$,

[TABLE]

the product being equal to $1$ if $\tilde{\tau}_{n}=N_{t}^{*}$ and for all $1\leq\ell\leq n-1,$

[TABLE]

Then, for all $n\geq 0$ we have

[TABLE]

Corollary 2.1.

Under the assumptions of Proposition 2.2, setting $\tilde{R}_{t}=\tilde{R}_{\tilde{N}_{t}}$ , we have

[TABLE]
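The change of probability behind Proposition 2.2 and Corollary 2.1 can be illustrated by a small importance-sampling sketch. The auxiliary process is simulated by thinning with characteristics depending only on $\theta$, and a weight is accumulated candidate by candidate. The per-candidate factors used below (ratio of acceptance probabilities times ratio of transition probabilities on an accepted candidate, ratio of rejection probabilities otherwise) are an assumption inferred from the structure of the bounds used in section 3.3, not a transcription of the formula for $\tilde{R}_{t}$. The state space $\Theta=\{0,1,2\}$, the flow, $\lambda$, $Q$, $\tilde{\lambda}$, $\tilde{Q}$ and all names are hypothetical.

```python
import math
import random

LAM_STAR = 3.0   # assumed upper bound lambda*

def flow(theta, t, nu):
    # Exact flow of the toy ODE nu' = theta - nu.
    return theta + (nu - theta) * math.exp(-t)

def lam(theta, nu):
    # Toy state-dependent intensity with values in [0.5, 1.5].
    return 1.0 + 0.5 * math.cos(nu)

def Q(theta, nu, new):
    # Toy state-dependent transition probability on Theta = {0, 1, 2}.
    p_next = 0.5 + 0.3 * math.tanh(nu)
    return p_next if new == (theta + 1) % 3 else 1.0 - p_next

def lam_tilde(theta):
    return 1.0   # auxiliary intensity, depends on theta only

def Q_tilde(theta, new):
    return 0.5   # auxiliary kernel: uniform over the two other states

def weighted_path(lam, Q, T, rng):
    """Simulate the auxiliary PDP on [0, T]; return (theta_T, nu_T, weight)."""
    theta, nu, t_jump, t_star, weight = 0, 0.0, 0.0, 0.0, 1.0
    while True:
        t_star += rng.expovariate(LAM_STAR)
        if t_star > T:
            return theta, flow(theta, T - t_jump, nu), weight
        nu_c = flow(theta, t_star - t_jump, nu)
        rate, rate_t = lam(theta, nu_c), lam_tilde(theta)
        if rng.random() * LAM_STAR <= rate_t:      # accepted under lam_tilde
            new = (theta + 1) % 3 if rng.random() < 0.5 else (theta + 2) % 3
            weight *= (rate / rate_t) * (Q(theta, nu_c, new) / Q_tilde(theta, new))
            theta, nu, t_jump = new, nu_c, t_star
        else:                                      # rejected under lam_tilde
            weight *= (1.0 - rate / LAM_STAR) / (1.0 - rate_t / LAM_STAR)

rng = random.Random(0)
weights = [weighted_path(lam, Q, 1.0, rng)[2] for _ in range(5000)]
print(sum(weights) / len(weights))   # sample mean of the weight
```

With these factors the weight has expectation one, so its sample mean is a quick sanity check; averaging `g(theta_T, nu_T) * weight` then estimates the expectation under the original characteristics in this toy setting.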

Remark 2.1.

Proposition 2.2 resembles a Girsanov theorem (see [26]); however, we do not use martingale theory here.

Remark 2.2.

For readability, we have chosen to state Proposition 2.2 with a PDP $(\tilde{x}_{t})$ whose intensity and transition measure depend only on $\theta$. Actually, the arguments of the proof remain valid for a non-homogeneous intensity and transition measure of the form $\tilde{\lambda}(x,t)$ and $\tilde{Q}((x,t),dy)$ for $x=(\theta,\nu)\in E$. A possible choice of such characteristics is $\tilde{\lambda}(x,t)=\lambda(\theta,\tilde{\Phi}_{\theta}(t,\nu))$ and $\tilde{Q}((x,t),dy)=Q((\theta,\tilde{\Phi}_{\theta}(t,\nu)),dy)$ for $\tilde{\Phi}$ a given function. This remark will be implemented in section 5.4.

Corollary 2.2.

Let $(\Phi,\lambda,Q)$ (resp. $(\tilde{\Phi},\lambda,Q)$) be the set of characteristics of $(x_{t})$ (resp. $(\tilde{x}_{t})$). We assume that $Q$ is always positive and that $0<\lambda(x)<\lambda^{*}$ for all $x\in E$. Let $(\mu_{n})$ be the sequence defined by $\mu_{0}=\nu$ and $\mu_{n}=\tilde{\Phi}_{\theta_{n-1}}(T_{n}-T_{n-1},\mu_{n-1})$ for $n\geq 1$. For every integer $n$, let us define on the event $\{N_{t}=n\}$,

[TABLE]

the products being equal to $1$ if $\tau_{n}=N_{t}^{*}$ and for all $1\leq\ell\leq n-1,$

[TABLE]

Then, for all $n\geq 0$ we have

[TABLE]

Proof of Proposition 2.1.

It holds that $\{N_{t}=n,\tau_{i}=p_{i},\,1\leq i\leq n\}\subset\{N_{t}^{*}\geq p_{n}\}$ . Then

[TABLE]

The event $\{N_{t}=n,\tau_{i}=p_{i},\,1\leq i\leq n,N_{t}^{*}=m\}$ is equivalent to the conjunction of the following two conditions:

$N_{t}^{*}=m$,
among the times $T^{*}_{\ell},1\leq\ell\leq m$, exactly $n$ are accepted by the thinning method, namely the $T^{*}_{p_{i}},1\leq i\leq n$, and all the others are rejected.

We proceed by induction starting from the fact that all the $T^{*}_{q},\,p_{n}+1\leq q\leq m$ are rejected which corresponds to the event

[TABLE]

The random variable $\mathds{1}_{\{\tau_{i}=p_{i},\,1\leq i\leq n\}}$ depends on $(\theta_{\ell},\nu_{\ell},1\leq\ell\leq n-1,T^{*}_{i},1\leq i\leq p_{n},U_{j},1\leq j\leq p_{n})$ where by construction $\nu_{\ell}=\phi_{\theta_{\ell-1}}(T^{*}_{p_{\ell}}-T^{*}_{p_{\ell-1}},\nu_{\ell-1})$ , $\theta_{\ell}=H((\theta_{\ell-1},\nu_{\ell}),V_{\ell})$ which implies that $(\theta_{\ell},\nu_{\ell},1\leq\ell\leq n-1)$ depend on $(T^{*}_{i},1\leq i\leq p_{n-1},U_{j},1\leq j\leq p_{n-1},V_{k},1\leq k\leq n-1)$ . Thus $V_{n}$ is independent of all the other random variables of thinning that are present in $g(x_{t})\mathds{1}_{\{N_{t}=n,\tau_{i}=p_{i},\,1\leq i\leq n,\,N_{t}^{*}=m\}}$ . The conditional expectation of $g(x_{t})\mathds{1}_{\{N_{t}=n,\tau_{i}=p_{i},\,1\leq i\leq n,N_{t}^{*}=m\}}$ w.r.t. the vector $(T^{*}_{i},1\leq i\leq m+1,U_{j},1\leq j\leq m,V_{k},1\leq k\leq n-1)$ is therefore an expectation indexed by this vector as parameters. Since the law of $H(x,V_{n})$ is $Q(x,\cdot)$ for all $x\in E$ we obtain for $p_{1}<p_{2}<...<p_{n}\leq m$ ,

[TABLE]

with

[TABLE]

In (19) the random variables $(U_{q},\,p_{n}+1\leq q\leq m)$ are independent of the vector $(T^{*}_{i},1\leq i\leq m+1,U_{j},1\leq j\leq p_{n},V_{k},1\leq k\leq n-1)$ . Conditioning by this vector we obtain

[TABLE]

We can iterate on the latter form by first conditioning $V_{n-1}$ on all the other random variables, then conditioning $(U_{q},\,p_{n-1}+1\leq q\leq p_{n})$ on all the remaining ones, and so on. However, the terms that appear do not have the same structure, since the $U_{q}$ correspond to a rejection for $p_{n-1}+1\leq q\leq p_{n}-1$ whereas $U_{p_{n}}$ corresponds to an acceptance. The next step therefore yields

[TABLE]

where we write $\nu_{n}$ for simplicity keeping in mind that $\nu_{n}=\Phi_{\theta_{n-1}}(T^{*}_{p_{n}}-T^{*}_{p_{n-1}},\nu_{n-1})=\Phi_{\theta_{n-1}}(T^{*}_{p_{n}}-T^{*}_{p_{n-1}},\Phi_{\theta_{n-2}}(T^{*}_{p_{n-1}}-T^{*}_{p_{n-2}},\nu_{n-2}))=\Phi_{\alpha}(T^{*}_{p_{n}}-T^{*}_{p_{n-1}},\Phi_{\theta_{n-2}}(T^{*}_{p_{n-1}}-T^{*}_{p_{n-2}},\nu_{n-2}))$ .

Moreover the previous arguments apply to $\mathbb{E}(g(x_{t})f(\theta_{i},\nu_{i},1\leq i\leq n-1,\theta_{n},\nu_{n},T^{*}_{k},1\leq k\leq m)\,\mathds{1}_{\{N_{t}=n,\tau_{i}=p_{i},\,1\leq i\leq n,\,N_{t}^{*}=m\}})$ and provide

[TABLE]

∎

We prove below Proposition 2.2. The other statements can be proved analogously.

Proof of Proposition 2.2.

By assumption the (jump) characteristics $(\tilde{\lambda},\tilde{Q})$ of $(\tilde{x}_{t})$ depend only on $\theta$ . Let $p_{1}<p_{2}<...<p_{n}\leq m$ . Applying the same arguments as in (21) to $(\tilde{x}_{t})$ and using the definitions of $\tilde{Z}_{\ell},\,0\leq\ell\leq n$ and $\tilde{R}_{n}$ we obtain,

[TABLE]

We iterate the previous argument based on the use of (21) and we use the definition of $\tilde{Z}_{n-1}$ to obtain

[TABLE]

where for short $\tilde{\nu}_{n}=\phi_{\alpha}(T^{*}_{p_{n}}-T^{*}_{p_{n-1}},\tilde{\nu}_{n-1})$ and $\tilde{\nu}_{n-1}=\phi_{\tilde{\theta}_{n-2}}(T^{*}_{p_{n-1}}-T^{*}_{p_{n-2}},\tilde{\nu}_{n-2})$ . Comparing the latter expression to (20) and using an induction we conclude that

[TABLE]

It remains to sum up on $p_{i},1\leq i\leq n$ and $m$ . ∎

3 Strong error estimates

In this section we are interested in strong error estimates. Below, we state the main assumptions and theorems of this section; the proofs are given in sections 3.2 and 3.3 respectively.

Assumption 3.1.

For all $\theta\in\Theta$ and for all $A\in\mathcal{B}(\Theta)$ , the functions $\nu\mapsto\lambda(\theta,\nu)$ and $\nu\mapsto Q((\theta,\nu),A)$ are Lipschitz with constants $L_{\lambda}>0$ , $L_{Q}>0$ respectively independent of $\theta$ .

Theorem 3.1.

Let $\Phi_{\theta}$ and $\overline{\Phi}_{\theta}$ satisfy (14) and let $(x_{t},t\in[0,T])$ and $(\overline{x}_{t},t\in[0,T])$ be the corresponding PDPs constructed in section 2.1 with $x_{0}=\overline{x}_{0}=x$ for some $x\in E$. Assume that $\Theta$ is finite and that $\lambda$ and $Q$ satisfy Assumption 3.1. Then, for all bounded functions $F:E\rightarrow\mathbb{R}$ such that for all $\theta\in\Theta$ the function $\nu\mapsto F(\theta,\nu)$ is $L_{F}$-Lipschitz, where $L_{F}$ is positive and independent of $\theta$, there exist constants $V_{1}>0$ and $V_{2}>0$ independent of the time step $h$ such that

[TABLE]

Remark 3.1.

When the numerical scheme $\overline{\Phi}_{\theta}$ is of order $p\geq 1$ , which means $\sup_{t\in[0,T]}|\Phi_{\theta}(t,\nu_{1})-\overline{\Phi}_{\theta}(t,\nu_{2})|\leq e^{C_{1}T}|\nu_{1}-\nu_{2}|+C_{2}h^{p}$ we have $\mathbb{E}\left[|F(\overline{x}_{T})-F(x_{T})|^{2}\right]\leq V_{1}h^{p}+V_{2}h^{2p}$ .

Assumption 3.2.

There exist positive constants $\rho$ , $\tilde{\lambda}_{\min}$ , $\tilde{\lambda}_{\max}$ such that for all $(i,j)\in\Theta^{2}$ , $\rho\leq\tilde{Q}(i,j)$ and $\tilde{\lambda}_{\min}\leq\tilde{\lambda}(i)\leq\tilde{\lambda}_{\max}<\lambda^{*}$ .

Theorem 3.2.

Let $\Phi_{\theta}$ and $\overline{\Phi}_{\theta}$ satisfy (14) and let $(\tilde{x}_{t},t\in[0,T])$ and $(\underline{\tilde{x}}_{t},t\in[0,T])$ be the corresponding PDPs constructed in section 2.1 with $\underline{\tilde{x}}_{0}=\tilde{x}_{0}=x$ for some $x\in E$. Let $(\tilde{R}_{t},t\in[0,T])$ and $(\underline{\tilde{R}}_{t},t\in[0,T])$ be defined as in Corollary 2.1. Under Assumptions 3.1 and 3.2, and for all bounded functions $F:E\rightarrow\mathbb{R}$ such that for all $\theta\in\Theta$ the function $\nu\mapsto F(\theta,\nu)$ is $L_{F}$-Lipschitz ($L_{F}>0$), there exists a positive constant $\tilde{V}_{1}$ independent of the time step $h$ such that

[TABLE]

where $\tilde{R}_{T}$ has been defined in Corollary 2.1.

We now introduce the random variable $\overline{\tau}^{\dagger}$ which will play an important role in the strong error estimate of Theorem 3.1 as well as in the identification of the coefficient $c_{1}$ in the weak error expansion in section 4 (see the proof of Theorem 4.1 in section 4.2).

Definition 3.1.

Let us define $\overline{\tau}^{\dagger}:=\inf\left\{k>0:(\tau_{k},\theta_{k})\neq(\overline{\tau}_{k},\overline{\theta}_{k})\right\}$ .

The random variable $\overline{\tau}^{\dagger}$ enables us to partition the trajectories of the couple $(x_{t},\overline{x}_{t})$ in a sense that we now make precise. Consider the event

$$\left\{\min(T_{\overline{\tau}^{\dagger}},\overline{T}_{\overline{\tau}^{\dagger}})>T\right\},$$

where $(T_{n})$ and $(\overline{T}_{n})$ denote the sequences of jump times of $(x_{t})$ and $(\overline{x}_{t})$. On this event $\{\min(T_{\overline{\tau}^{\dagger}},\overline{T}_{\overline{\tau}^{\dagger}})>T\}$ the trajectories of the discrete time processes $(T_{n},\theta_{n})$ and $(\overline{T}_{n},\overline{\theta}_{n})$ are equal for all $n$ such that $T_{n}\in[0,T]$ (or equivalently $\overline{T}_{n}\in[0,T]$). Moreover, the complement, i.e. $\{\min(T_{\overline{\tau}^{\dagger}},\overline{T}_{\overline{\tau}^{\dagger}})\leq T\}$, contains the trajectories for which $(T_{n},\theta_{n})$ and $(\overline{T}_{n},\overline{\theta}_{n})$ differ on $[0,T]$ (there exists $n\leq N_{T}\vee\overline{N}_{T}$ such that $T_{n}\neq\overline{T}_{n}$ or $\theta_{n}\neq\overline{\theta}_{n}$).
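To make the role of $\overline{\tau}^{\dagger}$ concrete, the sketch below runs a PDP with an exact flow and its Euler counterpart on shared candidate times and shared uniforms, and reports how often their discrete skeletons separate before time $T$. It is a toy experiment under stated assumptions: $\Theta=\{0,1\}$ with a deterministic flip at each jump (so only the acceptance step can separate the two paths), the vector field $f_{\theta}(\nu)=\theta-\nu$, the intensity $\lambda(\theta,\nu)=1+\nu$, $\lambda^{*}=3$, and hypothetical function names.

```python
import math
import random

def exact_flow(theta, t, nu):
    # Exact flow of the toy ODE nu' = theta - nu.
    return theta + (nu - theta) * math.exp(-t)

def make_euler_flow(h):
    def flow(theta, t, nu):
        # Euler polygon with step h, grid restarted at the last jump time.
        n_full = int(t // h)
        for _ in range(n_full):
            nu = nu + h * (theta - nu)
        return nu + (t - n_full * h) * (theta - nu)
    return flow

def skeletons_agree(flow_a, flow_b, lam, T, lam_star, rng):
    """True if both thinned paths accept exactly the same candidates on [0, T]."""
    theta, nu_a, nu_b, t_jump, t_star = 0, 0.0, 0.0, 0.0, 0.0
    while True:
        t_star += rng.expovariate(lam_star)   # shared candidate time T*_k
        if t_star > T:
            return True
        u = rng.random()                      # shared uniform U_k
        cand_a = flow_a(theta, t_star - t_jump, nu_a)
        cand_b = flow_b(theta, t_star - t_jump, nu_b)
        acc_a = u * lam_star <= lam(theta, cand_a)
        acc_b = u * lam_star <= lam(theta, cand_b)
        if acc_a != acc_b:                    # first index where the skeletons differ
            return False
        if acc_a:
            theta, nu_a, nu_b, t_jump = 1 - theta, cand_a, cand_b, t_star

def decoupling_frequency(h, n_paths=2000, seed=0):
    rng = random.Random(seed)
    lam = lambda theta, nu: 1.0 + nu
    euler = make_euler_flow(h)
    misses = sum(not skeletons_agree(exact_flow, euler, lam, 1.0, 3.0, rng)
                 for _ in range(n_paths))
    return misses / n_paths

for h in (0.2, 0.1, 0.05):
    print(h, decoupling_frequency(h))
```

The printed frequencies are Monte Carlo estimates of the probability of the complement event in this toy model, so they carry sampling noise.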

3.1 Preliminary lemmas

We start with two lemmas which will be useful in the proofs of Theorems 3.1 and 3.2.

Lemma 3.1.

Let $K$ be a finite set. We denote by $|K|$ the cardinality of $K$ and for $i=1,\ldots,|K|$ we denote by $k_{i}$ its elements. Let $(p_{i},1\leq i\leq|K|)$ and $(\overline{p}_{i},1\leq i\leq|K|)$ be two probability distributions on $K$. Let $a_{j}:=\sum_{i=1}^{j}p_{i}$ and $\overline{a}_{j}:=\sum_{i=1}^{j}\overline{p}_{i}$ for all $j\in\{1,\ldots,|K|\}$. By convention, we set $a_{0}=\overline{a}_{0}:=0$. Let $X$ and $\overline{X}$ be two $K$-valued random variables defined by

$$X:=G(U),\qquad\overline{X}:=\overline{G}(U),$$

where $U\sim\mathcal{U}([0,1])$ , $G(u)=\sum_{j=1}^{|K|}k_{j}\mathds{1}_{a_{j-1}<u\leq a_{j}}$ and $\overline{G}(u)=\sum_{j=1}^{|K|}k_{j}\mathds{1}_{\overline{a}_{j-1}<u\leq\overline{a}_{j}}$ for all $u\in[0,1]$ . Then, we have

$$\mathbb{P}(X\neq\overline{X})\leq\sum_{j=1}^{|K|-1}|a_{j}-\overline{a}_{j}|.$$

Proof of Lemma 3.1.

By definition of $X$ and $\overline{X}$ and since the intervals $]a_{j-1},a_{j}]\cap]\overline{a}_{j-1},\overline{a}_{j}]$ are disjoint for $j=1,\ldots,|K|$, we have

[TABLE]

Moreover, for all $1\leq j\leq|K|$ , we have

[TABLE]

Thus, denoting by $x^{+}:=\max(x,0)$ the positive part of $x\in\mathbb{R}$ and using that $x^{+}\geq x$ , we obtain

[TABLE]

Adding and subtracting $a_{j}\vee\overline{a}_{j}$ in the above sum yields

[TABLE]

The first sum above is a telescoping sum. Since $a_{|K|}=\overline{a}_{|K|}=1$ and $a_{0}=\overline{a}_{0}=0$, we have $\mathbb{P}(X=\overline{X})\geq 1-\sum_{j=1}^{|K|-1}|a_{j}-\overline{a}_{j}|$, which is the claim.

∎
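The coupling of Lemma 3.1 can be checked by direct computation: the probability that $G(U)$ and $\overline{G}(U)$ coincide is the total length of the overlaps $]a_{j-1},a_{j}]\cap]\overline{a}_{j-1},\overline{a}_{j}]$. The following sketch computes this quantity exactly and compares it with the bound $\sum_{j=1}^{|K|-1}|a_{j}-\overline{a}_{j}|$; the function name and the example distributions are illustrative choices.

```python
from itertools import accumulate

def coupling_mismatch(p, p_bar):
    """Return (P(X != X_bar), bound) for the inverse-CDF coupling with a common uniform."""
    a = [0.0] + list(accumulate(p))          # a_0, ..., a_|K|
    a_bar = [0.0] + list(accumulate(p_bar))  # a_bar_0, ..., a_bar_|K|
    # Length of ]a_{j-1}, a_j] intersected with ]a_bar_{j-1}, a_bar_j], summed over j.
    agree = sum(max(0.0, min(a[j], a_bar[j]) - max(a[j - 1], a_bar[j - 1]))
                for j in range(1, len(a)))
    bound = sum(abs(a[j] - a_bar[j]) for j in range(1, len(a) - 1))
    return 1.0 - agree, bound

mismatch, bound = coupling_mismatch([0.2, 0.3, 0.5], [0.25, 0.25, 0.5])
print(mismatch, bound)
```

In this example the two cumulative vectors differ only at $j=1$, and the mismatch probability equals the bound up to rounding, so the inequality is attained.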

Lemma 3.2.

Let $(a_{n},n\geq 1)$ and $(b_{n},n\geq 1)$ be two real-valued sequences. For all $n\geq 1$ , we have

[TABLE]

Proof of Lemma 3.2.

By induction. ∎

3.2 Proof of Theorem 3.1

First, we write

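$$\mathbb{E}\left[|F(\overline{x}_{T})-F(x_{T})|^{2}\right]=\mathbb{E}\left[\mathds{1}_{\min(T_{\overline{\tau}^{\dagger}},\overline{T}_{\overline{\tau}^{\dagger}})\leq T}|F(\overline{x}_{T})-F(x_{T})|^{2}\right]+\mathbb{E}\left[\mathds{1}_{\min(T_{\overline{\tau}^{\dagger}},\overline{T}_{\overline{\tau}^{\dagger}})>T}|F(\overline{x}_{T})-F(x_{T})|^{2}\right]=:\overline{P}+\overline{D},$$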

where $\overline{\tau}^{\dagger}$ is defined in Definition 3.1. The order of the term $\overline{P}$ is the order of the probability that the discrete processes $(T_{n},\theta_{n})$ and $(\overline{T}_{n},\overline{\theta}_{n})$ differ on $[0,T]$ . The order of the term $\overline{D}$ is given by the order of the Euler scheme squared because the discrete processes $(T_{n},\theta_{n})$ and $(\overline{T}_{n},\overline{\theta}_{n})$ are equal on $[0,T]$ . In the following we prove that $\overline{P}=O(h)$ and that $\overline{D}=O(h^{2})$ .

Step 1: estimation of $\overline{P}$. The function $F$ being bounded, we have $\overline{P}\leq 4M_{F}^{2}\mathbb{P}\left(\min(T_{\overline{\tau}^{\dagger}},\overline{T}_{\overline{\tau}^{\dagger}})\leq T\right)$ where $M_{F}>0$ is an upper bound of $|F|$. Moreover, for $k\geq 1$, $\left\{\overline{\tau}^{\dagger}=k\right\}=\left\{\overline{\tau}^{\dagger}>k-1\right\}\bigcap\left\{\left(\tau_{k},\theta_{k}\right)\neq\left(\overline{\tau}_{k},\overline{\theta}_{k}\right)\right\}$. Hence

[TABLE]

where

[TABLE]

We start with $\overline{J}_{k}$ . First note that, for $k\geq 1$ , $\{\tau_{k}=\overline{\tau}_{k}\}=\{T_{k}=\overline{T}_{k}\}$ and that on the event $\{T_{k}=\overline{T}_{k}\}$ , we have $\min(T_{k},\overline{T}_{k})=T_{k}$ , so that $\overline{J}_{k}=\mathbb{E}\left[\mathds{1}_{T_{k}\leq T}\mathds{1}_{\overline{\tau}^{\dagger}>k-1}\mathds{1}_{\tau_{k}=\overline{\tau}_{k}}\mathds{1}_{\theta_{k}\neq\overline{\theta}_{k}}\right].$ We emphasize that it makes no difference in the rest of the proof if we choose $\min(T_{k},\overline{T}_{k})=\overline{T}_{k}$ . Since $\{\overline{\tau}^{\dagger}>k-1\}=\bigcap_{i=0}^{k-1}\{(\tau_{i},\theta_{i})=(\overline{\tau}_{i},\overline{\theta}_{i})\}$ , we can rewrite $\overline{J}_{k}$ as follows

[TABLE]

By construction we have $\theta_{k}=H((\theta_{k-1},\nu_{k}),V_{k})$ and $\overline{\theta}_{k}=H((\overline{\theta}_{k-1},\overline{\nu}_{k}),V_{k})$ . The random variable $\mathds{1}_{\{\tau_{i}=\overline{\tau}_{i}=p_{i},1\leq i\leq k\}}\mathds{1}_{\{\theta_{i}=\overline{\theta}_{i}=\alpha_{i},1\leq i\leq k-1\}}\mathds{1}_{T^{*}_{p_{k}}\leq T}$ depends on the vector $(U_{i},1\leq i\leq p_{k},T^{*}_{j},1\leq j\leq p_{k},V_{q},1\leq q\leq k-1)$ which is independent of $V_{k}$ . Conditioning by this vector in (24) and applying Lemma 3.1 yields

[TABLE]

From the definition of $a_{j}$ (see (12)), the triangle inequality and since $Q$ is $L_{Q}$ -Lipschitz, we have $\sum_{j=1}^{|\Theta|-1}\left|a_{j}(\alpha_{k-1},\overline{\nu}_{k})-a_{j}(\alpha_{k-1},\nu_{k})\right|\leq\frac{\left(|\Theta|-1\right)|\Theta|}{2}L_{Q}|\overline{\nu}_{k}-\nu_{k}|$ . Since we are on the event $\{\tau_{i}=\overline{\tau}_{i}=p_{i},1\leq i\leq k\}\bigcap\{\theta_{i}=\overline{\theta}_{i}=\alpha_{i},1\leq i\leq k-1\}$ , the application of Lemma 2.1 yields $|\overline{\nu}_{k}-\nu_{k}|\leq e^{LT^{*}_{p_{k}}}kCh$ . Thus $\overline{J}_{k}\leq C_{1}h\mathbb{E}[\mathds{1}_{T_{k}\leq T}k]$ where $C_{1}$ is a constant independent of $h$ . Moreover, $\sum_{k\geq 1}\mathds{1}_{T_{k}\leq T}k=\sum_{k=1}^{N_{T}}k\leq N_{T}^{2}$ and $\mathbb{E}[N_{T}^{2}]\leq\mathbb{E}[(N^{*}_{T})^{2}]<+\infty$ so that $\sum_{k\geq 1}\overline{J}_{k}=O(h)$ . From the definition of $\overline{I}_{k}$ (see (23)), we can write

[TABLE]

The second equality above follows since $\{\tau_{k}<\overline{\tau}_{k}\}=\{T_{k}<\overline{T}_{k}\}$ and $\{\tau_{k}>\overline{\tau}_{k}\}=\{T_{k}>\overline{T}_{k}\}$ . We only treat the term $\overline{I}^{(1)}_{k}$ , the term $\overline{I}^{(2)}_{k}$ can be treated similarly by interchanging the role of $(\tau_{k},T_{k})$ and $(\overline{\tau}_{k},\overline{T}_{k})$ . Just as in the previous case, we can rewrite $\overline{I}^{(1)}_{k}$ as follows

[TABLE]

In (25) we have $\{\tau_{k}=p_{k}\}\cap\{p_{k}<\overline{\tau}_{k}\}\subseteq\{\lambda(\alpha_{k-1},\overline{\Phi}_{\alpha_{k-1}}(T^{*}_{p_{k}}-T^{*}_{p_{k-1}},\overline{\nu}_{k-1}))<U_{p_{k}}\lambda^{*}\leq\lambda(\alpha_{k-1},\Phi_{\alpha_{k-1}}(T^{*}_{p_{k}}-T^{*}_{p_{k-1}},\nu_{k-1}))\}$ . The random variable $\mathds{1}_{\{\tau_{i}=\overline{\tau}_{i}=p_{i},1\leq i\leq k-1\}}$ $\mathds{1}_{\{\theta_{i}=\overline{\theta}_{i}=\alpha_{i},1\leq i\leq k-1\}}$ $\mathds{1}_{T^{*}_{p_{k}}\leq T}$ depends on $(U_{i},1\leq i\leq p_{k-1},T^{*}_{j},1\leq j\leq p_{k},V_{q},1\leq q\leq k-1)$ which is independent of $U_{p_{k}}$ . Conditioning by this vector in (25) yields

[TABLE]

Using the Lipschitz continuity of $\lambda$ and then Lemma 2.1, we get that $\overline{I}^{(1)}_{k}\leq C_{2}h\mathbb{E}[\mathds{1}_{T_{k}\leq T}k]$ where $C_{2}$ is a constant independent of $h$. Concerning the term $\overline{I}^{(2)}_{k}$, we end up with the estimate $\overline{I}^{(2)}_{k}\leq C_{2}h\mathbb{E}[\mathds{1}_{\overline{T}_{k}\leq T}k]$. We conclude in the same way as in the estimation of $\overline{J}_{k}$ above that $\sum_{k\geq 1}\overline{I}_{k}=O(h)$.

Step 2: estimation of $\overline{D}$ . Note that for $n\geq 0$ we have $\{N_{T}=n\}\cap\{\min(T_{\overline{\tau}^{\dagger}},\overline{T}_{\overline{\tau}^{\dagger}})>T\}=\{N_{T}=n\}\cap\{\overline{N}_{T}=n\}\cap\{\overline{\tau}^{\dagger}>n\},$ where we can interchange the role of $\{N_{T}=n\}$ and $\{\overline{N}_{T}=n\}$ . Thus, using the partition $\{N_{T}=n,n\geq 0\}$ , we have

[TABLE]

The application of the Lipschitz continuity of $F$ and of Lemma 2.1 yields

[TABLE]

Then, we have $\overline{D}\leq C_{3}h^{2}\sum_{n\geq 0}\mathbb{E}\left[\mathds{1}_{N_{T}=n}(n+1)^{2}\right]$ where $C_{3}$ is a constant independent of $h$ . Since $\sum_{n\geq 0}\mathbb{E}\left[\mathds{1}_{N_{T}=n}(n+1)^{2}\right]=\mathbb{E}[(N_{T}+1)^{2}]\leq\mathbb{E}[(N^{*}_{T}+1)^{2}]<+\infty$ , we conclude that $\overline{D}=O(h^{2})$ .

∎

3.3 Proof of Theorem 3.2

First we reorder the terms in $\tilde{R}_{T}$ . We write $\tilde{R}_{T}=\tilde{\mathsf{Q}}_{T}\tilde{\mathsf{S}}_{T}\tilde{\mathsf{H}}_{T}$ where

[TABLE]

Likewise we reorder the terms in $\underline{\tilde{R}}_{T}$ writing $\underline{\tilde{R}}_{T}=\underline{\tilde{\mathsf{Q}}}_{T}\underline{\tilde{\mathsf{S}}}_{T}\tilde{\mathsf{H}}_{T}$ where $\underline{\tilde{\mathsf{Q}}}_{T}$ and $\underline{\tilde{\mathsf{S}}}_{T}$ are defined as (26) and (27) replacing $\tilde{x}$ and $\Phi$ by $\underline{\tilde{x}}$ and $\overline{\Phi}$ . Since the processes $(\tilde{\theta}_{n})$ and $(\tilde{\tau}_{n})$ do not depend on $\Phi$ or $\overline{\Phi}$ , the term $\tilde{\mathsf{H}}$ is the same in $\tilde{R}$ and $\underline{\tilde{R}}$ . To prove Theorem 3.2, let us decompose the problem and write

[TABLE]

so that

[TABLE]

In the following we show that $\overline{C}=O(h^{2})$ and that $\overline{D}=O(h^{2})$ .

Step 1: estimation of $\overline{C}$. The function $F$ being bounded, we have $\overline{C}\leq M_{F}^{2}\mathbb{E}\left[|\underline{\tilde{R}}_{T}-\tilde{R}_{T}|^{2}\right]$ where $M_{F}>0$ is an upper bound of $|F|$. Moreover, for all $\theta\in\Theta$, we have $(1-\tilde{\lambda}(\theta)/\lambda^{*})^{-1}\leq(1-\tilde{\lambda}_{\text{max}}/\lambda^{*})^{-1}$ and $(\tilde{\lambda}(\theta)/\lambda^{*})^{-1}\leq(\tilde{\lambda}_{\text{min}}/\lambda^{*})^{-1}$. Thus, $\tilde{\mathsf{H}}_{T}\leq\left(\frac{\tilde{\lambda}_{\text{min}}}{\lambda^{*}}(1-\frac{\tilde{\lambda}_{\text{max}}}{\lambda^{*}})\right)^{-N^{*}_{T}}$ and using the definition of $\tilde{R}$ and $\underline{\tilde{R}}$ (see (26), (27) and (28)) we can write

[TABLE]

We set $\overline{J}=|\underline{\tilde{\mathsf{Q}}}_{T}-\tilde{\mathsf{Q}}_{T}|\tilde{\mathsf{S}}_{T}$ and $\overline{I}=|\underline{\tilde{\mathsf{S}}}_{T}-\tilde{\mathsf{S}}_{T}|\underline{\tilde{\mathsf{Q}}}_{T}$. To provide the desired estimate for $\overline{C}$, we proceed as follows. First, we work pathwise ($\omega$ by $\omega$) to determine (random) bounds for $\overline{J}$ and $\overline{I}$, from which we deduce a (random) bound for $|\underline{\tilde{R}}_{T}-\tilde{R}_{T}|$. Finally, we take the expectation. We start with $\overline{I}$. For all $(\theta,\nu)\in E$ and for all $t\geq 0$ we have, from Assumption 2.1, that $1-\lambda(\theta,\Phi_{\theta}(t,\nu))/\lambda^{*}\leq 1$ and $\lambda(\theta,\Phi_{\theta}(t,\nu))/\lambda^{*}\leq 1$. Then, using Lemma 3.2 (twice) we have

[TABLE]

Using the Lipschitz continuity of $\lambda$ and Lemma 2.1, we find that, for all $l=1,\ldots,\tilde{N}_{T}+1$ and $k=\tilde{\tau}_{l-1}+1,\ldots,\tilde{\tau}_{l}\wedge N^{*}_{T}$ ,

[TABLE]

Moreover, for all $l=1,\ldots,\tilde{N}_{T}+1$ we have $\tilde{\tau}_{l}\wedge N^{*}_{T}-\tilde{\tau}_{l-1}\leq N^{*}_{T}$ so that $|\underline{\tilde{\mathsf{S}}}_{T}-\tilde{\mathsf{S}}_{T}|\leq N^{*}_{T}(N^{*}_{T}+1)^{2}C_{1}h$ where $C_{1}$ is a positive constant independent of $h$ . Finally, since $\underline{\tilde{\mathsf{Q}}}_{T}\leq\rho^{-N^{*}_{T}}$ we have

[TABLE]

Now, consider $\overline{J}$ . Note that from Assumption 2.1 we have $\tilde{\mathsf{S}}_{T}\leq 1$ . We use the same type of arguments as for $\overline{I}$ . That is, we successively use Lemma 3.2, the Lipschitz continuity of $Q$ and Lemma 2.1 to obtain

[TABLE]

where $C_{2}$ is a positive constant independent of $h$ . Then, we derive from the previous estimates (29) and (30) that

[TABLE]

where $\Xi_{1}(n)=\left(\rho\frac{\tilde{\lambda}_{\text{min}}}{\lambda^{*}}(1-\frac{\tilde{\lambda}_{\text{max}}}{\lambda^{*}})\right)^{-n}n(n+1)^{2}$ and $C_{3}=\max(C_{1},C_{2})$ . Finally, we have $\mathbb{E}[|\underline{\tilde{R}}_{T}-\tilde{R}_{T}|^{2}]\leq C_{3}h^{2}\mathbb{E}[\Xi_{1}(N^{*}_{T})^{2}]$ . Since $\mathbb{E}[\Xi_{1}(N^{*}_{T})^{2}]<+\infty$ we conclude that $\overline{C}=O(h^{2})$ .

Step 2: estimation of $\overline{D}$ . Recall that $\tilde{x}_{T}=(\tilde{\theta}_{\tilde{N}_{T}},\Phi_{\tilde{\theta}_{\tilde{N}_{T}}}(T-\tilde{T}_{\tilde{N}_{T}},\tilde{\nu}_{\tilde{N}_{T}}))$ and $\underline{\tilde{x}}_{T}=(\tilde{\theta}_{\tilde{N}_{T}},\overline{\Phi}_{\tilde{\theta}_{\tilde{N}_{T}}}(T-\tilde{T}_{\tilde{N}_{T}},\underline{\tilde{\nu}}_{\tilde{N}_{T}}))$ . Then, using the Lipschitz continuity of $F$ , Lemma 2.1 and since $\tilde{N}_{T}\leq N^{*}_{T}$ we get

[TABLE]

Moreover, $|\underline{\tilde{R}}_{T}|\leq\left(\rho\frac{\tilde{\lambda}_{\text{min}}}{\lambda^{*}}(1-\frac{\tilde{\lambda}_{\text{max}}}{\lambda^{*}})\right)^{-N^{*}_{T}}$ so that $\overline{D}\leq C_{4}h^{2}\mathbb{E}[\Xi_{2}(N^{*}_{T})^{2}]$ where $C_{4}$ is a positive constant independent of $h$ and $\Xi_{2}(n)=(n+1)\left(\rho\frac{\tilde{\lambda}_{\text{min}}}{\lambda^{*}}(1-\frac{\tilde{\lambda}_{\text{max}}}{\lambda^{*}})\right)^{-n}$ . Since $\mathbb{E}[\Xi_{2}(N^{*}_{T})^{2}]<+\infty$ we conclude that $\overline{D}=O(h^{2})$ .

∎

4 Weak error expansion

In this section we are interested in a weak error expansion for the PDMP $(x_{t})$ of section 2.3 and its associated Euler scheme $(\overline{x}_{t})$ . First of all, we recall from [5] that the generator $\mathcal{A}$ of the process $(t,x_{t})$ which acts on functions $g$ defined on $\mathbb{R}_{+}\times E$ is given by

[TABLE]

where for notational convenience we have set $\partial_{\nu}g(t,x):=\frac{\partial g}{\partial\nu}(t,\theta,\nu)$, $\partial_{t}g(t,x):=\frac{\partial g}{\partial t}(t,x)$ and $f(x)=f_{\theta}(\nu)$ for all $x=(\theta,\nu)\in E$. Below, we state the assumptions and the main theorem of this section. Its proof, which is inspired by [27] (see also [24] or [16]), is deferred to section 4.2.

Assumption 4.1.

For all $\theta\in\Theta$ and for all $A\in\mathcal{B}(\Theta)$ , the functions $\nu\mapsto Q\left((\theta,\nu),A\right)$ , $\nu\mapsto\lambda\left(\theta,\nu\right)$ and $\nu\mapsto f_{\theta}\left(\nu\right)$ are bounded and twice continuously differentiable with bounded derivatives.

Assumption 4.2.

The solution $u$ of the integro-differential equation

$$\begin{cases}\mathcal{A}u(t,x)=0,&(t,x)\in[0,T[\times E,\\ u(T,x)=F(x),&x\in E,\end{cases}\tag{32}$$

with $F:E\rightarrow\mathbb{R}$ a bounded function and $\mathcal{A}$ given by (31) is such that for all $\theta\in\Theta$, the function $(t,\nu)\mapsto u(t,\theta,\nu)$ is bounded and twice differentiable with bounded derivatives. Moreover, the second derivatives of $(t,\nu)\mapsto u(t,\theta,\nu)$ are uniformly Lipschitz in $\theta$.

Theorem 4.1.

Let $(x_{t},t\in[0,T])$ be a PDMP and $(\overline{x}_{t},t\in[0,T])$ its approximation constructed in section 2.3 with $x_{0}=\overline{x}_{0}=x$ for some $x\in E$. Under Assumptions 4.1 and 4.2, for any bounded function $F:E\rightarrow\mathbb{R}$ there exists a constant $c_{1}$ independent of $h$ such that

$$\mathbb{E}[F(\overline{x}_{T})]-\mathbb{E}[F(x_{T})]=hc_{1}+O(h^{2}).\tag{33}$$
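A first-order expansion of this type can be observed numerically in the degenerate case without jumps ($\lambda\equiv 0$), where the PDMP reduces to the ODE flow and the weak error is the deterministic Euler error. The sketch below is only such a sanity check, under assumptions made for the example: the vector field $f(\nu)=\cos(\nu)$, which is bounded and Lipschitz and whose exact solution from $\nu=0$ is the Gudermannian function, the observable $F(\theta,\nu)=\nu$, and hypothetical function names.

```python
import math

def euler_terminal(n_steps, T=1.0, nu=0.0):
    # Euler scheme for nu' = cos(nu) with step h = T / n_steps.
    h = T / n_steps
    for _ in range(n_steps):
        nu = nu + h * math.cos(nu)
    return nu

def exact_terminal(T=1.0):
    # Gudermannian function: exact solution of nu' = cos(nu), nu(0) = 0.
    return 2.0 * math.atan(math.tanh(T / 2.0))

for n in (100, 200, 400, 800):
    h = 1.0 / n
    print(n, (euler_terminal(n) - exact_terminal()) / h)   # error divided by h
```

The ratio error$/h$ settles to a constant as $h$ decreases, which plays the role of $c_{1}$ in this jump-free toy case; with jumps the same diagnostic requires Monte Carlo estimates of both expectations.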

Remark 4.1.

If $(\tilde{x}_{t})$ is a PDMP whose characteristics $\tilde{\lambda},\tilde{Q}$ satisfy the assumptions of Proposition 2.2 and $(\underline{\tilde{x}}_{t})$ is its approximation we deduce from Theorem 4.1 that

[TABLE]

4.1 Further results on PDMPs: Itô and Feynman-Kac formulas

Definition 4.1.

Let us define the following operators which act on functions $g$ defined on $\mathbb{R}_{+}\times E$ .

[TABLE]

From Definition 4.1, the generator $\mathcal{A}$ defined by (31) reads $\mathcal{A}g(t,x)=\mathcal{T}g(t,x)+\mathcal{S}g(t,x)$. We introduce the random counting measure $p$ associated to the PDMP $(x_{t})$ defined by $p([0,t]\times A):=\sum_{n\geq 1}\mathds{1}_{T_{n}\leq t}\mathds{1}_{Y_{n}\in A}$ for $t\in[0,T]$ and for $A\in\mathcal{B}(E)$. The compensator of $p$, denoted by $p^{\prime}$, is given from [5] by

[TABLE]

Hence, $q:=p-p^{\prime}$ is a martingale with respect to the filtration generated by $p$, denoted by $(\mathcal{F}_{t}^{p})_{t\in[0,T]}$. Similarly, we introduce $\overline{p}$, $\overline{p}^{\prime}$, $\overline{q}$ and $(\mathcal{F}_{t}^{\overline{p}})_{t\in[0,T]}$ to be the same objects as above but corresponding to the approximation $(\overline{x}_{t})$. The fact that $\overline{p}^{\prime}$ is the compensator of $\overline{p}$ and that $\overline{q}$ is a martingale follows from arguments of the theory of marked point processes, see [4].

Definition 4.2.

Let us define the following operators which act on functions $g$ defined on $\mathbb{R}_{+}\times E$ .

[TABLE]

Remark 4.2.

For all functions $g$ defined on $\mathbb{R}_{+}\times E$ , $\overline{\mathcal{T}}g(t,x,x)=\mathcal{T}g(t,x)$ , so that $\overline{\mathcal{A}}g(t,x,x)=\mathcal{A}g(t,x)$ .

The next theorem provides Itô formulas for the PDMP $(x_{t})$ and its approximation $(\overline{x}_{t})$ . For all $s\in[0,T]$ , we set $\overline{\eta}(s):=\overline{T}_{n}+kh$ if $s\in[\overline{T}_{n}+kh,(\overline{T}_{n}+(k+1)h)\wedge\overline{T}_{n+1}[$ for some $n\geq 0$ and for some $k\in\{0,\ldots,\lfloor(\overline{T}_{n+1}-\overline{T}_{n})/h\rfloor\}$ .
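The map $\overline{\eta}$ simply returns the last point of the random grid $\{\overline{T}_{n}+kh\}$ that does not exceed $s$. A minimal sketch, assuming the jump times are given as a sorted list starting with $\overline{T}_{0}=0$ (the function name and the sample values are illustrative):

```python
import bisect
import math

def eta_bar(s, jump_times, h):
    """Last grid point T_n + k*h that is <= s, with the grid restarted at each jump."""
    n = bisect.bisect_right(jump_times, s) - 1   # index of the last jump time <= s
    k = math.floor((s - jump_times[n]) / h)      # full Euler steps since that jump
    return jump_times[n] + k * h

jump_times = [0.0, 0.37, 0.9]   # assumed jump times of the approximating process
for s in (0.3, 0.4, 0.8, 0.9):
    print(s, eta_bar(s, jump_times, 0.25))
```

Between two jumps the grid has spacing $h$, but the first step after a jump starts at the jump time itself, which is what makes the grid random.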

Theorem 4.2.

Let $(x_{t},t\in[0,T])$ and $(\overline{x}_{t},t\in[0,T])$ be a PDMP and its approximation respectively constructed in section 2.3 with $x_{0}=\overline{x}_{0}=x$ for some $x\in E$ . For all bounded functions $g:\mathbb{R}_{+}\times E\rightarrow\mathbb{R}$ continuously differentiable with bounded derivatives, we have

[TABLE]

where $M^{g}_{t}:=\int_{0}^{t}\int_{E}(g(s,y)-g(s,x_{s-}))q(dsdy)$ is a true $\mathcal{F}_{t}^{p}$ -martingale, and

[TABLE]

where $\overline{M}^{g}_{t}:=\int_{0}^{t}\int_{E}(g(s,y)-g(s,\overline{x}_{s-}))\overline{q}(dsdy)$ is a true $\mathcal{F}_{t}^{\overline{p}}$-martingale.

Proof of Theorem 4.2.

The proof of (35) is given in [5]. We prove (36) following the same arguments. Since $\overline{q}=\overline{p}-\overline{p}^{\prime}$ , we have

[TABLE]

Consider the above sum. As in [5], we write, on the event $\{\overline{N}_{t}=n\}$ , that

[TABLE]

For all $k\leq n-1$ , we decompose the increment $g(\overline{T}_{k+1},\overline{x}_{\overline{T}_{k+1}}^{-})-g(\overline{T}_{k},\overline{x}_{\overline{T}_{k}})$ as a sum of increments on the intervals $[\overline{T}_{k}+ih,(\overline{T}_{k}+(i+1)h)\wedge\overline{T}_{k+1}]\subset[\overline{T}_{k},\overline{T}_{k+1}]$ . Without loss of generality we are led to consider increments of the form $g(t,\theta,\overline{\phi}_{\theta}(t,\nu))-g(ih,\theta,\overline{y}_{i}(x))$ for some $i\geq 0$ , $t\in[ih,(i+1)h]$ and for all $x=(\theta,\nu)\in E$ where we recall that $\overline{\phi}$ is defined by (16). The function $g$ is smooth enough to write

[TABLE]

Then, the above arguments together with Definition 4.2 yield

[TABLE]

∎

The following theorem gives us a way to represent the solution of the integro-differential equation (32) as the conditional expected value of a functional of the terminal value of the PDMP $(x_{t})$ . It plays a key role in the proof of Theorem 4.1.

Theorem 4.3 (PDMP’s Feynman-Kac formula [6]).

Let $F:E\rightarrow\mathbb{R}$ be a bounded function. Then the integro-differential equation (32) has a unique solution $u:\mathbb{R}_{+}\times E\rightarrow\mathbb{R}$ given by

[TABLE]

4.2 Proof of Theorem 4.1

We provide a proof in two steps. First, we give an appropriate representation of the weak error $\mathbb{E}[F(\overline{x}_{T})]-\mathbb{E}[F(x_{T})]$ . Then, we use this representation to identify the coefficient $c_{1}$ in (33).

Step 1: Representing $\mathbb{E}[F(\overline{x}_{T})]-\mathbb{E}[F(x_{T})]$ . Let $u$ denote the solution of (32). By Theorem 4.3, $u(T,\cdot)=F$ and $u(0,x)=\mathbb{E}[F(x_{T})]$ , so we can write $\mathbb{E}[F(\overline{x}_{T})]-\mathbb{E}[F(x_{T})]=\mathbb{E}[u(T,\overline{x}_{T})]-u(0,x)$ . Then, the application of the Itô formula (36) to $u$ at time $T$ yields

[TABLE]

Since $(\overline{M}^{u}_{t})$ is a true martingale, we obtain

[TABLE]

For $s\in[0,T]$ we have $\overline{\mathcal{A}}u(s,\overline{x}_{s},\overline{x}_{\overline{\eta}(s)})=\partial_{t}u(s,\overline{x}_{s})+f(\overline{x}_{\overline{\eta}(s)})\partial_{\nu}u(s,\overline{x}_{s})+\mathcal{S}u(s,\overline{x}_{s})$ (see Definition 4.2). From the regularity of $\lambda$ , $Q$ and $u$ (see assumptions 4.1 and 4.2), the functions $\partial_{t}u$ , $\partial_{\nu}u$ and $\mathcal{S}u$ are smooth enough to apply the Itô formula (36) between $\overline{\eta}(s)$ and $s$ respectively. This yields

[TABLE]

Moreover, since $\overline{\eta}(r)=\overline{\eta}(s)$ for $r\in[\overline{\eta}(s),s]$ , we have

[TABLE]

so that

[TABLE]

where

[TABLE]

Since $\overline{\mathcal{A}}u(t,x,x)=\mathcal{A}u(t,x)$ , the first term in the above equality is 0 by Theorem 4.3. By using Fubini’s theorem and the fact that $(\overline{M}^{\partial_{t}u}_{t})$ and $(\overline{M}^{\mathcal{S}u}_{t})$ are true martingales, we obtain

[TABLE]

Moreover, since $(\overline{M}^{\partial_{\nu}u}_{t})$ is a $\mathcal{F}^{\overline{p}}_{t}$ -martingale, we have

[TABLE]

Collecting the previous results, we obtain $\mathbb{E}[F(\overline{x}_{T})]-\mathbb{E}[F(x_{T})]=\mathbb{E}\left[\int_{0}^{T}\int_{\overline{\eta}(s)}^{s}\Upsilon(r,\overline{x}_{r},\overline{x}_{\overline{\eta}(r)})drds\right].$ We can compute an explicit form of $\Upsilon$ in terms of $u$ , $f$ , $\lambda$ , $Q$ and their derivatives. Indeed, $\Upsilon$ is given by (37), and we have

[TABLE]

The application of the Taylor formula to the functions $\partial^{2}_{tt}u$ , $\partial^{2}_{t\nu}u$ , $\partial^{2}_{\nu\nu}u$ , $\mathcal{S}(\partial_{t}u)$ , $\mathcal{S}(\partial_{\nu}u)$ , $\partial_{t}(\mathcal{S}u)$ , $\partial_{\nu}(\mathcal{S}u)$ and $\mathcal{S}(\mathcal{S}u)$ at the order 0 around $(\overline{\eta}(r),\overline{x}_{\overline{\eta}(r)})$ yields $\Upsilon(r,\overline{x}_{r},\overline{x}_{\overline{\eta}(r)})=\Upsilon(\overline{\eta}(r),\overline{x}_{\overline{\eta}(r)},\overline{x}_{\overline{\eta}(r)})+O(h).$ Setting $\Psi(t,x)=\Upsilon(t,x,x)$ and recalling that for $r\in[\overline{\eta}(s),s]$ , $\overline{\eta}(r)=\overline{\eta}(s)$ and that $|s-\overline{\eta}(s)|\leq h$ , we obtain

[TABLE]

Consider the expectation in the right-hand side of the above equality. We decompose the integral into a (finite) sum of integrals on the intervals $[\overline{T}_{n}+kh,(\overline{T}_{n}+(k+1)h)\wedge\overline{T}_{n+1}]$ where $\Psi$ is constant. Without loss of generality, we are led to consider integrals of the form $\int_{kh}^{t}(s-kh)Cds$ for some $k\geq 0$ , $t\in[kh,(k+1)h]$ and $C$ a bounded constant. We have $\int_{kh}^{t}(s-kh)Cds=\frac{t-kh}{2}\int_{kh}^{t}Cds$ . Moreover, adding and subtracting $h$ in the numerator of $(t-kh)/2$ yields

$$\int_{kh}^{t}(s-kh)Cds=\frac{h}{2}\int_{kh}^{t}Cds+\frac{t-(k+1)h}{2}\int_{kh}^{t}Cds.$$

Since $C$ is bounded we deduce that $\int_{kh}^{t}(s-kh)Cds=\frac{h}{2}\int_{kh}^{t}Cds+O(h^{2})$ . Since $\Psi$ is assumed bounded and $\mathbb{E}[\overline{N}_{T}]<+\infty$ , the above arguments yield the following representation

[TABLE]
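The elementary identity used above for $\int_{kh}^{t}(s-kh)Cds$ can be checked numerically. The sketch below is only an illustration: the function name `split_integral` and the sample values of $h$, $C$ and $k$ are assumptions, not taken from the paper. It computes the exact integral, the leading term $\frac{h}{2}\int_{kh}^{t}Cds$ and the remainder, and verifies that the remainder is of order $h^{2}$ (here bounded by $|C|h^{2}/8$, since $(h-u)u\leq h^{2}/4$ for $u=t-kh\in[0,h]$).

```python
def split_integral(k, t, h, C):
    """Integral of (s - k*h) * C over [k*h, t], split as leading term + remainder."""
    a = k * h
    exact = C * (t - a) ** 2 / 2.0        # closed form of the integral
    leading = (h / 2.0) * C * (t - a)     # (h/2) * integral of C over [kh, t]
    remainder = exact - leading           # equals (t - (k+1)h)/2 * C * (t - kh)
    return exact, leading, remainder


# Hypothetical sample values, chosen only for the check.
h, C, k = 0.1, 3.0, 4
for frac in (0.0, 0.25, 0.5, 0.75, 1.0):
    t = k * h + frac * h
    exact, leading, remainder = split_integral(k, t, h, C)
    # The remainder is O(h^2): |remainder| <= |C| * h^2 / 8 on [kh, (k+1)h].
    assert abs(remainder) <= abs(C) * h ** 2 / 8.0 + 1e-12
```

Halving `h` divides the bound by four, which is the $O(h^{2})$ behaviour used in the representation (38).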

Step 2: From the representation (38) to the expansion at the order one. In this step, we show that $\mathbb{E}\left[\int_{0}^{T}\Psi(\overline{\eta}(s),\overline{x}_{\overline{\eta}(s)})ds\right]=\mathbb{E}\left[\int_{0}^{T}\Psi(s,x_{s})ds\right]+O(h)$ . First, we introduce the random variables $\overline{\Gamma}$ and $\Gamma$ defined by $\overline{\Gamma}:=\int_{0}^{T}\Psi(\overline{\eta}(s),\overline{x}_{\overline{\eta}(s)})ds$ and $\Gamma:=\int_{0}^{T}\Psi(\overline{\eta}(s),x_{\overline{\eta}(s)})ds$ and write

[TABLE]

where $\overline{\tau}^{\dagger}$ is defined in Definition 3.1. Since $\Psi$ is bounded and $\mathbb{P}(\min(T_{\overline{\tau}^{\dagger}},\overline{T}_{\overline{\tau}^{\dagger}})\leq T)=O(h)$ (see the proof of Theorem 3.1), we have $\mathbb{E}\left[|\overline{\Gamma}-\Gamma|\mathds{1}_{\min(T_{\overline{\tau}^{\dagger}},\overline{T}_{\overline{\tau}^{\dagger}})\leq T}\right]=O(h).$ Now, recall from (22) that, on the event $\{\min(T_{\overline{\tau}^{\dagger}},\overline{T}_{\overline{\tau}^{\dagger}})>T\}$ , we have $T_{k}=\overline{T}_{k}$ and $\theta_{k}=\overline{\theta}_{k}$ for all $k\geq 1$ such that $T_{k}\in[0,T]$ . Thus, for all $n\leq\overline{N}_{T}$ and for all $s\in[\overline{T}_{n},\overline{T}_{n+1}[$ we have $\overline{x}_{\overline{\eta}(s)}=(\overline{\theta}_{n},\overline{\phi}_{\overline{\theta}_{n}}(\overline{\eta}(s)-\overline{T}_{n},\overline{\nu}_{n}))$ and $x_{\overline{\eta}(s)}=(\overline{\theta}_{n},\phi_{\overline{\theta}_{n}}(\overline{\eta}(s)-\overline{T}_{n},\nu_{n}))$ . Consequently, on the event $\{\min(T_{\overline{\tau}^{\dagger}},\overline{T}_{\overline{\tau}^{\dagger}})>T\}$ we have

[TABLE]

From the regularity assumptions 4.1 and 4.2, the function $\nu\mapsto\Psi(t,\theta,\nu)$ is uniformly Lipschitz in $(t,\theta)$ with constant $L_{\Psi}$ as a sum and product of bounded Lipschitz functions. Thus, from this Lipschitz property and the application of Lemma 2.1, we get

[TABLE]

From the above inequality, we find that $\mathbb{E}\left[\mathds{1}_{\min(T_{\overline{\tau}^{\dagger}},\overline{T}_{\overline{\tau}^{\dagger}})>T}|\overline{\Gamma}-\Gamma|\right]\leq L_{\Psi}Ce^{LT}Th\mathbb{E}[\overline{N}_{T}(\overline{N}_{T}+1)]$ . Since $\overline{N}_{T}\leq N^{*}_{T}$ and $\mathbb{E}[N^{*}_{T}(N^{*}_{T}+1)]<+\infty$ we conclude that $\mathbb{E}\left[\mathds{1}_{\min(T_{\overline{\tau}^{\dagger}},\overline{T}_{\overline{\tau}^{\dagger}})>T}|\overline{\Gamma}-\Gamma|\right]=O(h)$ . We have shown that $\mathbb{E}\left[\int_{0}^{T}\Psi(\overline{\eta}(s),\overline{x}_{\overline{\eta}(s)})ds\right]=\mathbb{E}\left[\int_{0}^{T}\Psi(\overline{\eta}(s),x_{\overline{\eta}(s)})ds\right]+O(h)$ .

Secondly, from the regularity assumptions 4.1 and 4.2, the function $(t,\nu)\mapsto\Psi(t,\theta,\nu)$ is uniformly Lipschitz in $\theta$ . Moreover, for all $s\in[0,T]$ there exists $k\geq 0$ such that both $s$ and $\overline{\eta}(s)$ belong to the same interval $[\overline{T}_{k},\overline{T}_{k+1}[$ so that $x_{s}=(\theta_{k},\phi_{\theta_{k}}(s-\overline{T}_{k},\nu_{k}))$ and $x_{\overline{\eta}(s)}=(\theta_{k},\phi_{\theta_{k}}(\overline{\eta}(s)-\overline{T}_{k},\nu_{k}))$ . Thus, from the Lipschitz continuity of $\Psi$ , from the fact that $|s-\overline{\eta}(s)|\leq h$ and since $f_{\theta}$ is uniformly bounded in $\theta$ we have $|\Psi(s,x_{s})-\Psi(\overline{\eta}(s),x_{\overline{\eta}(s)})|\leq Ch$ where $C$ is a constant independent of $h$ . Then, we obtain $\sup_{s\in[0,T]}|\mathbb{E}[\Psi(s,x_{s})]-\mathbb{E}[\Psi(\overline{\eta}(s),x_{\overline{\eta}(s)})]|\leq Ch$ from which we deduce that $\left|\mathbb{E}\left[\int_{0}^{T}\Psi(\overline{\eta}(s),x_{\overline{\eta}(s)})ds\right]-\mathbb{E}\left[\int_{0}^{T}\Psi(s,x_{s})ds\right]\right|\leq CTh.$ Finally, the weak error expansion reads

[TABLE]

∎

5 Numerical experiment

In this section, we use the theoretical results above to apply the MLMC method to the 2-dimensional Morris-Lecar PDMP (abbreviated PDMP 2d-ML).

5.1 The PDMP 2-dimensional Morris-Lecar

The deterministic Morris-Lecar model was introduced in 1981 by Catherine Morris and Harold Lecar in [23] to explain the dynamics of the barnacle muscle fiber. This model belongs to the family of conductance-based models (just as the Hodgkin-Huxley model [19]) and takes the following form

[TABLE]

where $M_{\infty}(v)=(1+\tanh[(v-V_{1})/V_{2}])/2$ , $\alpha_{\text{K}}(v)=\lambda_{\text{K}}(v)N_{\infty}(v)$ , $\beta_{\text{K}}(v)=\lambda_{\text{K}}(v)(1-N_{\infty}(v))$ , $N_{\infty}(v)=(1+\tanh[(v-V_{3})/V_{4}])/2$ , $\lambda_{\text{K}}(v)=\overline{\lambda}_{\text{K}}\cosh((v-V_{3})/2V_{4})$ .

In this section we consider the PDMP version of (39) that we denote by $(x_{t},t\in[0,T])$ , $T>0$ , whose characteristics $(f,\lambda,Q)$ are given by

• $f(\theta,\nu)=\frac{1}{C}\Big{(}I-g_{\text{Leak}}(\nu-V_{\text{Leak}})-g_{\text{Ca}}M_{\infty}(\nu)(\nu-V_{\text{Ca}})-g_{\text{K}}\frac{\theta}{N_{\text{K}}}(\nu-V_{\text{K}})\Big{)}$ ,

• $\lambda(\theta,\nu)=(N_{\text{K}}-\theta)\alpha_{\text{K}}(\nu)+\theta\beta_{\text{K}}(\nu)$ ,

• $Q\Big{(}(\theta,\nu),\{\theta+1\}\Big{)}=\frac{(N_{\text{K}}-\theta)\alpha_{\text{K}}(\nu)}{\lambda(\theta,\nu)}$ , $Q\Big{(}(\theta,\nu),\{\theta-1\}\Big{)}=\frac{\theta\beta_{\text{K}}(\nu)}{\lambda(\theta,\nu)}$ .

The state space of the model is $E=\{0,\ldots,N_{\text{K}}\}\times\mathbb{R}$ where $N_{\text{K}}\geq 1$ stands for the number of potassium gates. The values of the parameters used in the simulations are $V_{1}=-1.2$ , $V_{2}=18$ , $V_{3}=2$ , $V_{4}=30$ , $\overline{\lambda}_{\text{K}}=0.04$ , $C=20$ , $g_{\text{Leak}}=2$ , $V_{\text{Leak}}=-60$ , $g_{\text{Ca}}=4.4$ , $V_{\text{Ca}}=120$ , $g_{\text{K}}=8$ , $V_{\text{K}}=-84$ , $I=60$ , $N_{\text{K}}=100$ .
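The characteristics $(f,\lambda,Q)$ and the parameter values above translate directly into code. The following is a minimal sketch, not the authors' C++ implementation: the function names (`drift`, `jump_rate`, `jump_kernel`) are hypothetical, and the argument of $\cosh$ in $\lambda_{\text{K}}$ is read as $(v-V_{3})/(2V_{4})$, which is an assumption about the intended grouping.

```python
import math

# Parameter values quoted in section 5.1.
V1, V2, V3, V4 = -1.2, 18.0, 2.0, 30.0
LAMBDA_K_BAR = 0.04
C, I = 20.0, 60.0
G_LEAK, V_LEAK = 2.0, -60.0
G_CA, V_CA = 4.4, 120.0
G_K, V_K = 8.0, -84.0
N_K = 100


def m_inf(v):
    return 0.5 * (1.0 + math.tanh((v - V1) / V2))


def n_inf(v):
    return 0.5 * (1.0 + math.tanh((v - V3) / V4))


def lambda_k(v):
    # Assumption: (v - V3) / 2V4 is read as (v - V3) / (2 * V4).
    return LAMBDA_K_BAR * math.cosh((v - V3) / (2.0 * V4))


def alpha_k(v):
    return lambda_k(v) * n_inf(v)


def beta_k(v):
    return lambda_k(v) * (1.0 - n_inf(v))


def drift(theta, nu):
    """Vector field f(theta, nu) driving the membrane potential."""
    return (I - G_LEAK * (nu - V_LEAK)
            - G_CA * m_inf(nu) * (nu - V_CA)
            - G_K * (theta / N_K) * (nu - V_K)) / C


def jump_rate(theta, nu):
    """Jump intensity lambda(theta, nu) of the number of open gates."""
    return (N_K - theta) * alpha_k(nu) + theta * beta_k(nu)


def jump_kernel(theta, nu):
    """Probabilities (to theta + 1, to theta - 1) of the kernel Q."""
    rate = jump_rate(theta, nu)
    return (N_K - theta) * alpha_k(nu) / rate, theta * beta_k(nu) / rate
```

By construction the two probabilities returned by `jump_kernel` sum to one, and the boundary states $\theta=0$ and $\theta=N_{\text{K}}$ can only move inwards.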

5.2 Classical and Multilevel Monte Carlo estimators

In this section we introduce the classical and multilevel Monte Carlo estimators in order to estimate the quantity $\mathbb{E}\left[F(x_{T})\right]$ where $(x_{t},t\in[0,T])$ is the PDMP 2d-ML and $F(\theta,\nu)=\nu$ for $(\theta,\nu)\in E$ so that $F(x_{T})$ gives the value of the membrane potential at time $T$ . Note that other possible choices are $F(\theta,\nu)=\nu^{n}$ or $F(\theta,\nu)=\theta^{n}$ for some $n\geq 2$ . In those cases, the quantity $\mathbb{E}\left[F(x_{T})\right]$ gives the moments of the membrane potential or the number of open gates at time $T$ so that we can compute statistics on these biological variables.

Let $X:=F(x_{T})$ . In the sequel it will be convenient to emphasize the dependence of the Euler scheme $(\overline{x}_{t})$ on a time step $h$ . We introduce a family of random variables $(X_{h},h>0)$ defined by $X_{h}:=F(\overline{x}_{T})$ where for a given $h>0$ the corresponding PDP $(\overline{x}_{t})$ is constructed as in section 2.3 with time step $h$ . In particular, the processes $(\overline{x}_{t})$ for $h>0$ are correlated through the same randomness $(U_{k})$ , $(V_{k})$ and $(N^{*}_{t})$ . We build a classical Monte Carlo estimator of $\mathbb{E}[X]$ based on the family $(X_{h},h>0)$ as follows

$$Y^{\text{MC}}=\frac{1}{N}\sum_{k=1}^{N}X_{h}^{k},\qquad(40)$$

where $(X_{h}^{k},k\geq 1)$ is an i.i.d. sequence of random variables distributed like $X_{h}$ . The parameters $h>0$ and $N\in\mathbb{N}$ have to be determined. We build a multilevel Monte Carlo estimator based on the family $(X_{h},h>0)$ as follows

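$$Y^{\text{MLMC}}=\frac{1}{N_{1}}\sum_{k=1}^{N_{1}}X_{h^{*}}^{k}+\sum_{l=2}^{L}\frac{1}{N_{l}}\sum_{k=1}^{N_{l}}\left(X_{h_{l}}^{k}-X_{h_{l-1}}^{k}\right),\qquad(41)$$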

where $\left((X_{h_{l}}^{k},X_{h_{l-1}}^{k}),k\geq 1\right)$ for $l=2,\ldots,L$ are independent sequences of independent copies of the couple $(X_{h_{l}},X_{h_{l-1}})$ and independent of the i.i.d. sequence $(X_{h^{*}}^{k},k\geq 1)$ . The parameter $h^{*}$ is a free parameter that we fix in section 5.4. The parameters $L\geq 2$ , $M\geq 2$ , $N\geq 1$ and $q=(q_{1},\ldots,q_{L})\in]0,1[^{L}$ with $\sum_{l=1}^{L}q_{l}=1$ have to be determined; we then set $N_{l}:=\lceil Nq_{l}\rceil$ , $h_{l}:=h^{*}M^{-(l-1)}$ .
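The level structure described above ($h_{l}=h^{*}M^{-(l-1)}$, $N_{l}=\lceil Nq_{l}\rceil$, a coarse level plus coupled corrections) can be sketched as follows. This is an illustrative sketch under stated assumptions: `level_schedule`, `mlmc_estimate`, `sample_coarsest` and `sample_pair` are hypothetical names, and the toy samplers at the end are a stand-in that shares one Gaussian draw between the fine and coarse values; they do not implement the thinning-based coupling of the PDMP Euler schemes.

```python
import math
import random


def level_schedule(h_star, M, L, N, q):
    """Time steps h_l = h_star * M^-(l-1) and sample sizes N_l = ceil(N * q_l)."""
    if len(q) != L:
        raise ValueError("q must contain one weight per level")
    steps = [h_star * M ** (-(l - 1)) for l in range(1, L + 1)]
    sizes = [math.ceil(N * q_l) for q_l in q]
    return steps, sizes


def mlmc_estimate(sample_coarsest, sample_pair, h_star, M, L, N, q, rng):
    """Coarse-level average plus averaged coupled corrections for l = 2..L."""
    steps, sizes = level_schedule(h_star, M, L, N, q)
    estimate = sum(sample_coarsest(steps[0], rng) for _ in range(sizes[0])) / sizes[0]
    for l in range(1, L):
        correction = 0.0
        for _ in range(sizes[l]):
            # Both values must be built from the same randomness (the coupling).
            fine, coarse = sample_pair(steps[l], steps[l - 1], rng)
            correction += fine - coarse
        estimate += correction / sizes[l]
    return estimate


# Toy stand-in (assumption): X_h = Z + h with Z standard normal, so E[X_h] = h.
def toy_single(h, rng):
    return rng.gauss(0.0, 1.0) + h


def toy_pair(h_fine, h_coarse, rng):
    z = rng.gauss(0.0, 1.0)  # shared draw couples the two levels
    return z + h_fine, z + h_coarse


rng = random.Random(0)
value = mlmc_estimate(toy_single, toy_pair, 0.1, 2, 3, 1000, [0.5, 0.25, 0.25], rng)
```

In expectation the sum telescopes to $\mathbb{E}[X_{h_{L}}]$, so the bias is that of the finest level while most samples are drawn at the cheapest one.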

We also set $\tilde{X}:=F(\tilde{x}_{T})\tilde{R}_{T}$ where $\tilde{R}_{T}$ is defined as in Proposition 2.2 with an intensity $\tilde{\lambda}$ and a kernel $\tilde{Q}$ that will be specified in section 5.4 and let $(\tilde{X}_{h},h>0)$ be such that $\tilde{X}_{h}:=F(\underline{\tilde{x}}_{T})\underline{\tilde{R}}_{T}$ for all $h>0$ . By Proposition 2.2, we have $\mathbb{E}[X]=\mathbb{E}[\tilde{X}]$ and $\mathbb{E}[X_{h}]=\mathbb{E}[\tilde{X}_{h}]$ for $h>0$ . Consequently, we build likewise a multilevel estimator $\tilde{Y}^{\text{MLMC}}$ based on the family $(\tilde{X}_{h},h>0)$ .

The complexity of the classical Monte Carlo estimator $Y^{\text{MC}}$ depends on the parameters $(h,N)$ and that of the multilevel estimators $Y^{\text{MLMC}}$ and $\tilde{Y}^{\text{MLMC}}$ depends on $(L,q,N)$ . In order to compare those estimators we proceed as in [21] (see also [24]), that is to say, for each estimator we determine the parameters which minimize the global complexity (or cost) subject to the constraint that the resulting L2-error must be lower than a prescribed $\epsilon>0$ .

As in [21], we call $V_{1}$ , $c_{1}$ , $\alpha$ , $\beta$ and $\text{Var}(X)$ the structural parameters associated to the family $(X_{h},h>0)$ and $X$ . We know theoretically from Theorem 3.1 (strong estimate) and Theorem 4.1 (weak expansion) that $(\alpha,\beta)=(1,1)$ whereas $V_{1}$ , $c_{1}$ and $\text{Var}(X)$ are not explicit (we explain how we estimate them in section 5.3). Moreover, the structural parameters $\tilde{V}_{1}$ , $\tilde{c}_{1}$ , $\tilde{\alpha}$ , $\tilde{\beta}$ and $\text{Var}(\tilde{X})$ associated to $(\tilde{X}_{h},h>0)$ and $\tilde{X}$ are such that $\tilde{\alpha}=\alpha$ , $\tilde{c}_{1}=c_{1}$ (see (34)), $\tilde{\beta}=2$ (see Theorem 3.2) and $\tilde{V}_{1}$ , $\text{Var}(\tilde{X})$ are not explicit.

The classical and the multilevel estimators defined above are linear and of Monte Carlo type in the sense described in [21]. The optimal parameters of those estimators are then expressed in terms of the corresponding structural parameters as follows (see [21] or [24]). For a user-prescribed $\epsilon>0$ , the classical Monte Carlo parameters $h$ and $N$ are

[TABLE]

where $\rho=\sqrt{V_{1}/\text{Var}(X)}$ . The parameters of the estimator $Y^{\text{MLMC}}$ are given in Table 1 where $n_{l}:=M^{l-1}$ for $l=1,\ldots,L$ with the convention $n_{0}=n_{0}^{-1}=0$ . The parameters of $\tilde{Y}^{\text{MLMC}}$ are given in a similar way using $\tilde{V}_{1}$ , $\tilde{\beta}$ and $\text{Var}(\tilde{X})$ . Finally, the parameter $M(\epsilon)$ is determined as in [21] section 5.1.
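As an illustration of how the entries of Table 1 are evaluated, the sketch below computes the number of levels $L$ from the formula $L=\lceil 1+\frac{\log(|c_{1}|^{1/\alpha}h^{*})}{\log(M)}+\frac{\log(A/\epsilon)}{\alpha\log(M)}\rceil$ given in that table. The function name `mlmc_depth` is hypothetical, and the constant $A$ is passed as an input because its definition belongs to Table 1 and is not restated here; the numerical values in the usage line are arbitrary and are not the paper's.

```python
import math


def mlmc_depth(c1, h_star, M, eps, alpha, A):
    """Number of levels L from the Table 1 formula (A is supplied by the caller)."""
    if c1 == 0:
        raise ValueError("c1 must be non-zero")
    value = (1.0
             + math.log(abs(c1) ** (1.0 / alpha) * h_star) / math.log(M)
             + math.log(A / eps) / (alpha * math.log(M)))
    return math.ceil(value)


# Arbitrary illustrative inputs (assumptions, not values from the paper).
depth = mlmc_depth(c1=10.0, h_star=0.1, M=2, eps=0.5, alpha=1.0, A=math.sqrt(3.0))
```

Since the estimator (41) is defined for $L\geq 2$, a caller would presumably take `max(2, mlmc_depth(...))`; that safeguard is an assumption and is not stated in the text.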

5.3 Methodology

We compare the classical and the multilevel Monte Carlo estimators in terms of precision, CPU-time and complexity. The precision of an estimator $Y$ is defined by the L2-error $\parallel Y-\mathbb{E}[X]\parallel_{2}=\sqrt{(\mathbb{E}[Y]-\mathbb{E}[X])^{2}+\text{Var}(Y)}$ also known as the Root Mean Square Error (RMSE). The CPU-time represents the time needed to compute one realisation of an estimator. The complexity is defined as the number of time steps involved in the simulation of an estimator. Let $Y$ denote the estimator (40) or (41). We estimate the bias of $Y$ by

$$\widehat{b}_{R}=\frac{1}{R}\sum_{k=1}^{R}Y^{k}-\mathbb{E}[X],$$

where $Y^{1},\ldots,Y^{R}$ are $R$ independent replications of the estimator. We estimate the variance of $Y$ by

$$\widehat{v}_{R}=\frac{1}{R}\sum_{k=1}^{R}v^{k},$$

where $v^{1},\ldots,v^{R}$ are $R$ independent replications of $v$ , the empirical variance of $Y$ . In the case where $Y$ is the crude Monte Carlo estimator we set

[TABLE]

If $Y$ is the MLMC estimator, we set

[TABLE]

where $m^{(1)}_{N_{1}}=\frac{1}{N_{1}}\sum_{k=1}^{N_{1}}X_{h^{*}}^{k}$ and for $l\geq 2$ , $m^{(l)}_{N_{l}}=\frac{1}{N_{l}}\sum_{k=1}^{N_{l}}(X_{h_{l}}^{k}-X_{h_{l-1}}^{k})$ . Then, we define the empirical RMSE $\widehat{\epsilon}_{R}$ by

$$\widehat{\epsilon}_{R}=\sqrt{\widehat{b}_{R}^{2}+\widehat{v}_{R}}.\qquad(43)$$
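The empirical RMSE combines the bias and variance estimates in the same way as the L2-error decomposition recalled at the start of this section. A minimal sketch, assuming the bias is the mean of the $R$ replications minus the reference value and the variance is the mean of the $R$ empirical variances (the function name `empirical_rmse` and the sample numbers are hypothetical):

```python
import math


def empirical_rmse(replications, variances, reference_mean):
    """Return (bias, variance, rmse) from R replications of an estimator.

    replications   : the R realisations Y^1, ..., Y^R of the estimator
    variances      : the R empirical variances v^1, ..., v^R
    reference_mean : reference value used in place of E[X]
    """
    R = len(replications)
    if R == 0 or len(variances) != R:
        raise ValueError("need the same positive number of replications and variances")
    bias = sum(replications) / R - reference_mean
    variance = sum(variances) / R
    return bias, variance, math.sqrt(bias ** 2 + variance)


# Made-up numbers, for illustration only.
bias, variance, rmse = empirical_rmse([1.0, 3.0], [0.5, 1.5], 1.0)
```

The reference value has to come from an independent high-accuracy run, since $\mathbb{E}[X]$ has no closed form here.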

The numerical computation of (43) for both estimators (40) and (41) requires the computation of the optimal parameters given by (42) and in Table 1 of section 5.2, which are expressed in terms of the structural parameters $c_{1}$ , $V_{1}$ and $\text{Var}(X)$ . Moreover the computation of the bias requires the value $\mathbb{E}[X]$ . Since there is no closed formula for the mean and variance of $X$ we estimate them using a crude Monte Carlo estimator with $h=10^{-5}$ and $N=10^{6}$ . The constants $c_{1}$ and $V_{1}$ are not explicit; we use the same estimator of $V_{1}$ as in [21] section 5.1, that is

[TABLE]

and we use the following estimator of $c_{1}$

[TABLE]

The estimator of $c_{1}$ is obtained by writing the weak error expansion for the two time steps $h$ and $h/M$ , taking the difference and neglecting the $O(h^{2})$ term. In (44) we use $(h,M)=(0.1,4)$ and in (45) we use $(h,M)=(1,4)$ ; the expectations are estimated using a classical Monte Carlo of size $N=10^{4}$ on $(X_{h/M},X_{h})$ . We emphasize that we are interested in the order of magnitude of $c_{1}$ and $V_{1}$ , so we do not need a precise estimation here.
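The computation behind the estimator of $c_{1}$ can be written out as follows. This is a sketch based only on the first-order weak expansion with $\alpha=1$; the exact normalisation and sign convention of (45), for instance the use of an absolute value, may differ.

```latex
\begin{align*}
\mathbb{E}[X_{h}] &= \mathbb{E}[X] + c_{1}h + O(h^{2}), \\
\mathbb{E}[X_{h/M}] &= \mathbb{E}[X] + c_{1}\frac{h}{M} + O(h^{2}), \\
\mathbb{E}[X_{h}-X_{h/M}] &= c_{1}h\left(1-M^{-1}\right) + O(h^{2}), \\
c_{1} &\approx \left(1-M^{-1}\right)^{-1}h^{-1}\,\mathbb{E}[X_{h}-X_{h/M}].
\end{align*}
```

The unknown $\mathbb{E}[X]$ cancels in the difference, which is why only the coupled pair $(X_{h/M},X_{h})$ needs to be simulated.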

5.4 Numerical results

In this section we first illustrate the results of Theorems 3.1 and 3.2 on the Morris-Lecar PDMP, then we compare the MC and MLMC estimators. The simulations were carried out on a computer with an Intel Core i5-4300U CPU @ 1.90GHz $\times$ 4 processor. The code is written in C++. We implement the estimator $\tilde{Y}^{\text{MLMC}}$ (see section 5.2) for the following choices of the parameters $(\tilde{\lambda},\tilde{Q})$ .

Case 1: $\tilde{\lambda}(\theta)=1$ and $\tilde{Q}\Big{(}\theta,\{\theta+1\}\Big{)}=\frac{N_{\text{K}}-\theta}{N_{\text{K}}}$ , $\tilde{Q}\Big{(}\theta,\{\theta-1\}\Big{)}=\frac{\theta}{N_{\text{K}}}$ .

Case 2: $\tilde{\lambda}(x,t)=\lambda(\theta,v(t))$ and $\tilde{Q}((x,t),dy)=Q((\theta,v(t)),dy)$ where $v$ denotes the first component of the solution of (39).

Cases 1 and 2 correspond to the application of Proposition 2.2. Based on Corollary 2.2 we also consider the following case.

Case 3: Consider the quantity $\mathbb{E}[F(x_{T})-F(\tilde{x}_{T})]$ where $(x_{t})$ and $(\tilde{x}_{t})$ are PDPs with characteristics $(\Phi,\lambda,Q)$ and $(\tilde{\Phi},\lambda,Q)$ respectively. By Corollary 2.2, we have $\mathbb{E}[F(\tilde{x}_{T})]=\mathbb{E}[F(y_{T})\tilde{R}_{T}]$ where $(y_{t})$ is a PDP whose discrete component jumps to the same states and at the same times as the discrete component of $(x_{t})$ does, and $(\tilde{R}_{t})$ is the corresponding corrective process. Thus, we consider the quantity $\mathbb{E}[F(x_{T})-F(y_{T})\tilde{R}_{T}]$ instead of $\mathbb{E}[F(x_{T})-F(\tilde{x}_{T})]$ .

Case 3 requires the following MLMC estimator, which differs slightly from (41).

[TABLE]

where $\left((X_{h_{l}}^{k},\tilde{X}_{h_{l-1}}^{k}),k\geq 1\right)$ for $l=2,\ldots,L$ are independent sequences of independent copies of the couple $(X_{h_{l}},\tilde{X}_{h_{l-1}})=(F(\overline{x}_{T}),F(\overline{y}_{T})\tilde{R}_{T})$ . Here $(\overline{y}_{t})$ is a PDP whose discrete component jumps to the same states and at the same times as that of the Euler scheme $(\overline{x}_{t})$ with time step $h_{l}$ , whose deterministic motions are given by the approximate flows with time step $h_{l-1}$ , and $(\tilde{R}_{t})$ is the corresponding corrective process (see Corollary 2.2).

Figure 2 confirms numerically that $\mathbb{E}[|X_{h_{l}}-X_{h_{l-1}}|^{2}]=O(h_{l})$ and that $\mathbb{E}[|\tilde{X}_{h_{l}}-\tilde{X}_{h_{l-1}}|^{2}]=O(h_{l}^{2})$ for the cases 1, 2 and 3 (see Theorems 3.1 and 3.2 respectively). Indeed, for $T=10$ (see figure 2(a)), we observe that the curve corresponding to the decay of $\mathbb{E}[|X_{h_{l}}-X_{h_{l-1}}|^{2}]$ as $l$ increases is approximately parallel to a line of slope -1 and that the curves corresponding to the decay of $\mathbb{E}[|\tilde{X}_{h_{l}}-\tilde{X}_{h_{l-1}}|^{2}]$ in the cases 1, 2 and 3 are parallel to a line of slope -2. We also see that the curves corresponding to the cases 2 and 3 are approximately similar and that for some value of $l$ those curves go below the one corresponding to $\mathbb{E}[|X_{h_{l}}-X_{h_{l-1}}|^{2}]$ . The curve corresponding to the case 1 is always above all the other ones; this indicates that the L2-error (or the variance) in the case 1 is too large relative to the others, which is why we do not consider this case in the sequel. As $T$ increases (see figures 2(b) and 2(c)), the theoretical order of the numerical schemes is still observed. However, for $T=20$ , a slight difference begins to emerge between the cases 2 and 3 (the case 3 being better) and this difference is accentuated for $T=30$ so that we do not represent the case 2.
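The decay orders read off the curves can also be estimated by a least-squares fit of $\log\mathbb{E}[|X_{h_{l}}-X_{h_{l-1}}|^{2}]$ against $\log h_{l}$. The sketch below is only an illustration: `fit_decay_order` is a hypothetical helper, and the input values are synthetic exact power laws generated in the code, not data from the paper. A fitted order of 1 in $h_{l}$ corresponds to a slope of -1 when plotting against the level $l$ on a $\log_{M}$ scale.

```python
import math


def fit_decay_order(steps, second_moments):
    """Least-squares slope of log(second moment) against log(time step)."""
    xs = [math.log(h) for h in steps]
    ys = [math.log(v) for v in second_moments]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic inputs (assumption): exact power laws c * h^beta on a geometric grid.
steps = [0.1 * 4 ** (-l) for l in range(5)]
order_one = fit_decay_order(steps, [2.0 * h for h in steps])
order_two = fit_decay_order(steps, [2.0 * h ** 2 for h in steps])
```

With simulated second moments in place of the synthetic ones, the fitted value gives a quick numerical check of $\beta=1$ and $\tilde{\beta}=2$.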

For the Monte Carlo simulations we set $T=30$ , $\lambda^{*}=10$ and the time step involved in the first level of the MLMC is set to $h^{*}=0.1$ . We choose this value for $h^{*}$ because it represents (on average) the size of an interval $[T^{*}_{n},T^{*}_{n+1}]$ between two successive jump times of the auxiliary Poisson process $(N^{*}_{t})$ , that is, $1/\lambda^{*}$ . The estimation of the true value and variance gives $\mathbb{E}[X]=-31.4723$ and $\text{Var}(X)=335$ . Note that $v(30)=-35.3083$ where $v$ is the deterministic membrane potential solution of (39) so that there is an offset between the deterministic potential and the mean of the stochastic potential. We replicate 100 times the simulation of the classical and multilevel estimators to compute the empirical RMSE so that $R=100$ in (43).

The results of the Monte Carlo simulations are shown in table 2 for the classical Monte Carlo estimator $Y^{\text{MC}}$ and in tables 3 and 4 for the multilevel estimators $Y^{\text{MLMC}}$ and $\tilde{Y}^{\text{MLMC}}$ (case 3). As an example, the first line of table 3 reads as follows: for a user-prescribed $\epsilon=2^{-1}=0.5$ , the MLMC estimator $Y^{\text{MLMC}}$ is implemented with $L=2$ levels, the time step at the first level is $h^{*}=0.1$ , this time step is refined by a factor $n_{l}=M^{l-1}$ with $M=2$ at each level and the sample size is $N=2600$ . For such parameters, the numerical complexity of the estimator is $\text{Cost}(Y^{\text{MLMC}})=28200$ , the empirical RMSE is $\widehat{\epsilon}_{100}=0.389$ and the computational time of one realisation of $Y^{\text{MLMC}}$ is $0.362$ seconds. We also report the empirical bias $\widehat{b}_{100}$ and the empirical variance $\widehat{v}_{100}$ in view of (43).

The results indicate that the MLMC outperforms the classical MC. More precisely, for large values of $\epsilon$ (i.e. $k=1,2,3$ , where $\epsilon=2^{-k}$ ) the complexity and the CPU-time of the classical and the multilevel MC estimators are of the same order. As $\epsilon$ decreases (i.e. as $k$ increases) the difference in complexity and CPU-time between classical and multilevel MC increases. Indeed, for $k=5$ the complexity of the estimator $Y^{\text{MC}}$ is approximately 13 times that of $Y^{\text{MLMC}}$ and 19 times that of $\tilde{Y}^{\text{MLMC}}$ . The same trend appears in the complexity ratio of the estimators $Y^{\text{MLMC}}$ and $\tilde{Y}^{\text{MLMC}}$ (i.e. Cost( $Y^{\text{MLMC}}$ )/Cost( $\tilde{Y}^{\text{MLMC}}$ )) as $\epsilon$ decreases. However, the difference between the complexity of these two MLMC estimators increases more slowly than the one between a MC and a MLMC estimator. Recall that the computational benefit of the MLMC over the MC grows as the prescribed $\epsilon$ decreases.

Both classical and multilevel estimators provide an empirical RMSE which is close to the prescribed precision (see tables 2, 3 and 4). We can conclude that the choice of the parameters is well adapted. For readability, figures 3(a) and 3(b) show the ratios of the complexities and the CPU-times of the three estimators $Y^{\text{MC}}$ , $Y^{\text{MLMC}}$ and $\tilde{Y}^{\text{MLMC}}$ as a function of $\epsilon$ .

Bibliography

[1] D.F. Anderson and D.J. Higham. Multilevel Monte Carlo for continuous time Markov chains, with applications in biochemical kinetics. Multiscale Model. Simul., 10(1):146–179, 2012.
[2] D.F. Anderson, D.J. Higham, and Y. Sun. Complexity of Multilevel Monte Carlo tau-leaping. SIAM Journal on Numerical Analysis, 52(6):3106–3127, 2014.
[3] M. Benaïm, S. Le Borgne, F. Malrieu, and P-A. Zitt. Quantitative ergodicity for some switched dynamical systems. Electron. Commun. Probab., 17:14 pp., 2012.
[4] P. Brémaud. Point Processes and Queues, Martingale Dynamics. Springer-Verlag New York Inc, 1981.
[5] M.H.A. Davis. Piecewise-deterministic Markov processes: A general class of non-diffusion stochastic models. Journal of the Royal Statistical Society, 46:353–388, 1984.
[6] M.H.A. Davis. Markov Models and Optimization. Chapman and Hall, London, 1993.
[7] S. Dereich. Multilevel Monte Carlo algorithms for Lévy-driven SDEs with Gaussian correction. The Journal of Applied Probability, 21(1):283–311, 2011.
[8] S. Dereich and F. Heidenreich. A Multilevel Monte Carlo algorithm for Lévy-driven Stochastic Differential Equations. Stochastic Processes and their Applications, 121(7):1565–1587, 2011.
