Optimal sampling design for global approximation of jump diffusion SDEs
Pawe{\l} Przyby{\l}owicz

TL;DR
This paper analyzes the optimal sampling strategies for accurately approximating jump diffusion SDEs driven by Poisson and Wiener processes, establishing convergence rates and constructing asymptotically optimal methods.
Contribution
It determines the exact convergence rate of minimal errors for approximating jump diffusion SDEs and constructs optimal Milstein-based methods using various sampling schemes.
Findings
Nonequidistant sampling is more efficient than equidistant sampling.
Optimal methods asymptotically attain the minimal possible errors.
The convergence rate of approximation errors is precisely characterized.
Abstract
The paper deals with strong global approximation of SDEs driven by two independent processes: a nonhomogeneous Poisson process and a Wiener process. We assume that the jump and diffusion coefficients of the underlying SDE satisfy jump commutativity condition. We establish the exact convergence rate of minimal errors that can be achieved by arbitrary algorithms based on a finite number of observations of the Poisson and Wiener processes. We consider classes of methods that use equidistant or nonequidistant sampling of the Poisson and Wiener processes. We provide a construction of optimal methods, based on the classical Milstein scheme, which asymptotically attain the established minimal errors. The analysis implies that methods based on nonequidistant mesh are more efficient than those based on the equidistant mesh.
**Optimal sampling design for global approximation of jump diffusion SDEs**
(This research was partly supported by the Polish NCN grant - decision No. DEC-2013/09/B/ST1/04275 and by AGH local grant.)
Paweł Przybyłowicz
*AGH University of Science and Technology,
Faculty of Applied Mathematics,
Al. Mickiewicza 30, 30-059 Krakow, Poland,
E-mail:* [email protected]
Abstract
The paper deals with strong global approximation of SDEs driven by two independent processes: a nonhomogeneous Poisson process and a Wiener process. We assume that the jump and diffusion coefficients of the underlying SDE satisfy jump commutativity condition (see Chapter 6.3 in [21]). We establish the exact convergence rate of minimal errors that can be achieved by arbitrary algorithms based on a finite number of observations of the Poisson and Wiener processes. We consider classes of methods that use equidistant or nonequidistant sampling of the Poisson and Wiener processes. We provide a construction of optimal methods, based on the classical Milstein scheme, which asymptotically attain the established minimal errors. The analysis implies that methods based on nonequidistant mesh are more efficient than those based on the equidistant mesh.
Key words: nonhomogeneous Poisson process, Wiener process, jump commutativity condition, standard information, minimal strong error, asymptotically optimal algorithm
Mathematics Subject Classification: 68Q25, 65C30.
1 Introduction
We investigate the global approximation for the following jump diffusion stochastic differential equations (SDEs)
[TABLE]
driven by two independent processes: a nonhomogeneous one-dimensional Poisson process with intensity function and a one-dimensional Wiener process . We assume, without loss of generality, that . Jump diffusion SDEs (1) appear in various fields such as physics, biology, engineering and mathematical finance; see, for example, [1], [9], [29], [30] and pages 43-44 in [21]. We are interested in efficient algorithms that approximate whole trajectories of and use only discrete values of the driving Poisson and Wiener processes.
Approximation of stochastic differential equations only driven by a Wiener process has been widely investigated in the literature. In that case, upper bounds on the error of defined methods were established, see, for example, [14]. Lower bounds were also investigated for the strong approximation in the Wiener (Gaussian) case, see, for example, [8], [12], [18]-[20] and [23]-[25].
In the jump diffusion case suitable approximation schemes were provided and upper bounds on their errors discussed, for example, in the monograph [21] and in the articles [3], [6], [9]-[11] and [16]. However, to the best of the author's knowledge, there is so far only one paper that deals with asymptotic lower bounds and the exact rate of convergence of the minimal errors for the global approximation of SDEs with jumps, see [26]. In that paper the author considered pure jump SDEs (1), i.e., and . We can also mention [4], where the authors investigated the optimal rate of convergence for the problem of approximating stochastic integrals of regular functions with respect to a homogeneous Poisson process. Here, we extend the approach used in [26] in order to cover more general SDEs of the form (1).
The purpose of this paper is to find lower bounds on the error and to define optimal methods solving (1). In the purely Gaussian case, similar questions were considered, for example, in [12] and [18]. In order to study jump diffusion equations (1) driven by the Poisson and the Wiener processes a new technique is necessary. The main difference, compared with the Gaussian case, is that we have to use some facts from the theory of stochastic integration with respect to càdlàg, square integrable martingales, see, for example, [15], [17], [22] and [29]. Moreover, when establishing the exact asymptotic constants we have to face the fact that the intensity of the process depends on time. This problem does not appear in [26], where the intensity is constant. Another difference is that we assume that the coefficients and satisfy the jump commutativity condition. This condition is described and discussed in detail in, for example, Chapter 6.3 in [21]. Roughly speaking, it ensures that in the construction of the Itô-Taylor schemes we do not need to know the exact location of the jump times of the Poisson process . In this paper we use this condition extensively when establishing asymptotic lower and upper bounds.
We consider three classes of approximation schemes denoted by , and , dependent on the sampling method for trajectories of the processes and . The class contains methods based on the equidistant discretization of . Methods using the same (but not necessarily equidistant) evaluation points for and belong to a wider class . Methods that can use different, but also not necessarily equidistant, sampling points for the processes and belong to . We have .
The main result of the paper, Theorem 4.2, states that for fixed , , , , and in the case when the underlying SDE (1) is driven by two processes and (i.e., and ) the following holds
[TABLE]
and
[TABLE]
where , . In (2) and (3) the method uses at most evaluations of and . By taking the infimum we mean that we choose mappings along with discretization points in the best possible way. For the subclass of we have
[TABLE]
while in we have that
[TABLE]
In (5) the infimum means that we only choose mappings in the best possible way, while the discretization of is fixed and uniform. As we can see, the order of convergence is , but the asymptotic constant in (5) may be considerably larger than that in (2), (3) and (4). In the class we have a small gap between the upper and lower asymptotic constants. We conjecture that the exact rate of convergence of the minimal errors in is the same as for . Note also that if and then we arrive at results known from [26], while if and then, for the classes and , we restore the results known from [12], see Remark 4.2.
The asymptotically optimal scheme is defined by a piecewise linear interpolation of the classical Milstein steps, performed at suitably selected discretization points. The discretization points are chosen as quantiles of a distribution corresponding to a density . It turns out that in the class the optimal density is proportional to . The main disadvantage of using such regular sampling is the need to use exact values of quantiles of , which might be hard to compute in general. In Section 4.1 we present the exact computation of sampling points in the linear case (Merton's model).
The paper is organized as follows. In Section 2 we give basic notions and definitions. Asymptotic lower bounds on the minimal errors are established in Section 3, while asymptotically optimal methods are defined in Section 4. We chose this order of presentation because the technique used when proving the lower bounds in Section 3 suggests the definitions of the optimal methods in Section 4. Finally, the Appendix contains proofs of auxiliary results used in the paper.
2 Preliminaries
Let be a given real number. We denote and . Let be a complete probability space. We consider on it two independent processes: a one-dimensional Wiener process and a one-dimensional nonhomogeneous Poisson process with continuous intensity function . Let denote the complete filtration generated by the driving processes and . We set and for . The process has independent increments, where the increment has a Poisson law with parameter and for , see [7] or [21]. The compensated Poisson process is defined as follows
[TABLE]
which is a zero mean, square integrable -martingale with càdlàg paths. For a random variable we write , , and , where is a sub-σ-field of . We say that a continuous function belongs to if for the partial derivatives exist and are continuous on , and can be continuously extended to . For a continuous function its modulus of continuity is , . If is a right-continuous process with left-hand limits then we can define for all . We have that if and only if is continuous at . For further properties of càdlàg mappings used in this paper see, for example, Chapter 2.9 in [1]. For we use the following notation
[TABLE]
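The increment property of the nonhomogeneous Poisson process stated above can be illustrated numerically. The following is only a minimal sketch: the function name `poisson_path_on_grid`, the symbol `m` for the integrated intensity, and the example intensity are assumptions introduced here for illustration and are not taken from the paper. It samples the process at the points of a fixed grid by drawing independent Poisson increments whose means are the integrals of the intensity over the grid cells, and then forms the compensated process at the grid points.

```python
import numpy as np

def poisson_path_on_grid(intensity, grid, rng, n_quad=200):
    """Sample a nonhomogeneous Poisson process at the points of `grid`.

    `intensity` must accept a NumPy array of times and return an array.
    Returns (N, m): values of the process and of the integrated
    intensity m(t) = int_0^t intensity(s) ds at the grid points.
    """
    grid = np.asarray(grid, dtype=float)
    means = np.empty(len(grid) - 1)
    for i in range(len(grid) - 1):
        # Composite trapezoidal rule on the cell [grid[i], grid[i+1]].
        s = np.linspace(grid[i], grid[i + 1], n_quad)
        v = intensity(s)
        means[i] = np.sum(0.5 * (v[1:] + v[:-1]) * np.diff(s))
    # Increments over disjoint cells are independent and Poisson distributed.
    increments = rng.poisson(means)
    N = np.concatenate(([0], np.cumsum(increments)))
    m = np.concatenate(([0.0], np.cumsum(means)))
    return N, m

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 11)
# Hypothetical intensity, chosen only for the example.
N, m = poisson_path_on_grid(lambda s: 2.0 + np.sin(s), grid, rng)
N_tilde = N - m  # compensated process at the grid points
```

Only the values at the grid points are produced, which matches the kind of discrete information about the driving process that the methods studied later are allowed to use.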
We impose the following assumptions on the mappings , , and on the intensity function :
- (A)
for .
- (B)
There exists such that for , for all and all
- (B1)
,
- (B2)
,
- (B3)
$\Bigl|\frac{\partial f}{\partial y}(t,y)-\frac{\partial f}{\partial y}(t,z)\Bigr|\leq K|y-z|$.
- (C)
There exists such that for , for all and all
[TABLE]
- (D)
The diffusion and the jump coefficients satisfy the following jump commutativity condition
[TABLE]
for all . (We refer to Chapter 6.3 in [21] where the condition (9) is widely discussed.)
- (E)
The intensity is continuous in .
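Whether a given pair of coefficients satisfies a jump commutativity condition can be checked numerically. Condition (9) itself is not reproduced in this text, so the sketch below assumes the scalar, mark-independent form found in Chapter 6.3 of [21], namely b(t, y + c(t, y)) - b(t, y) = b(t, y) ∂c/∂y(t, y), with the hypothetical names `b` for the diffusion coefficient and `c` for the jump coefficient. If (9) is stated differently in the paper, the function has to be adapted accordingly.

```python
def commutativity_defect(b, c, t, y, eps=1e-6):
    """Return b(t, y + c(t, y)) - b(t, y) - b(t, y) * dc/dy(t, y).

    The derivative dc/dy is approximated by a central difference.
    A value close to zero means the assumed condition holds at (t, y).
    """
    dc_dy = (c(t, y + eps) - c(t, y - eps)) / (2.0 * eps)
    return b(t, y + c(t, y)) - b(t, y) - b(t, y) * dc_dy

# Linear (Merton-type) coefficients satisfy the assumed condition.
b_lin = lambda t, y: 0.3 * y
c_lin = lambda t, y: 0.1 * y
defect_linear = commutativity_defect(b_lin, c_lin, 0.0, 2.0)

# Additive diffusion with a state-dependent jump does not.
b_add = lambda t, y: 1.0
c_id = lambda t, y: y
defect_additive = commutativity_defect(b_add, c_id, 0.0, 2.0)
```

A point check of this kind can only refute the condition; establishing it for all arguments requires a symbolic argument.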
The assumptions (B1) and (B2) imply for and all that
[TABLE]
where depends only on , and . Moreover, by (B1) and (B3) we have for and all that
[TABLE]
From (B1), (10) and (11) we get for and all that
[TABLE]
where .
Unless otherwise stated, all unspecified constants appearing in this paper may only depend on the constant from the assumptions (B)-(C), , , , , , and . Moreover, the same symbol might be used to denote different constants.
The assumptions (A)-(E) are rather standard compared with those known from the literature on approximation of jump diffusion SDEs; see the comment before Theorem 6.1. Only in Section 4.1 do we impose an additional assumption on the coefficients, which in fact turns out to be necessary in order to define an optimal sampling from a probability density function.
For , , and satisfying (B1), (B2) and (E) the equation (1) has a unique strong solution that is adapted to and has càdlàg paths, see [21], [22] or [30]. We also have the following moment estimates for the solution , see, for example, [22] or [21].
Lemma 2.1
Let us assume that the mappings , , and satisfy the assumptions (B1), (B2) and (E). Then there exist positive constants , such that
[TABLE]
and for all
[TABLE]
The following result characterizes the local mean square smoothness of the solution in the terms of the process defined as follows
[TABLE]
Of course has càdlàg paths and it is adapted to . (See Fact 6.1 in Appendix for the further properties of used in the paper.)
Proposition 2.1
Let us assume that the mappings , , and satisfy the assumptions , and . Then for the solution of (1) we have that for all
[TABLE]
almost surely and, in particular,
[TABLE]
Proof. See the Appendix.
By Proposition 2.1 the square root of can be interpreted as a conditional Hölder constant of . This local smoothness is reflected in the exact rate of convergence of the minimal errors established in Section 4. A result similar to Proposition 2.1 for SDEs driven by a multiplicative Wiener process has been obtained in [12], while for SDEs driven by an additive fractional Brownian motion with the Hurst parameter it has been shown in Proposition 1 in [19].
The problem considered in the paper is to find an optimal strong global approximation of the solution of (1). For any fixed an approximation of is given by a method . The method computes the approximation by using some information about the functions , , and , the Poisson process and the Wiener process . We consider methods that are based on a finite number of observations of trajectories of the driving processes and at suitably chosen points from the interval . The cost of the method is measured by the total number of evaluations of the processes and .
We fix and we consider the corresponding equation (1). Any approximation method is defined by three sequences , , , where
[TABLE]
is a measurable mapping and
[TABLE]
is a partition of with
[TABLE]
for . We have that for all and, in particular, we might have for some . The sequences , provide (not necessarily equidistant) discretizations of used by and , respectively. In the literature one mostly has , see, for example, Chapter 6 in [21]. Here, mainly for the lower bound, we allow more general discretizations. By
[TABLE]
we denote a sequence of vectors of size , which provides standard information with evaluations of the Poisson process and evaluations of the Wiener process at the discrete points from , i.e.,
[TABLE]
where for . Recall that . In particular, the sequences , may depend on functions , , , and on but not on trajectories of the processes and . (Information (22) uses the same evaluation points for all trajectories of the Poisson and Wiener processes.) Therefore, information (21) about the processes and is nonadaptive. Moreover, since does not have to be contained in , the information (22) is called nonexpanding, see [24]. We stress that our model of computation covers the regular strong Taylor approximations and it excludes the jump-adapted time discretizations, since we do not assume the knowledge of the jump times for (see Chapters 6 and 8 in [21]). This restriction reflects our assumption that only nonadaptive standard information is available for the process .
After computing the information , we apply the mapping in order to obtain the th approximation in the following way
[TABLE]
The th cost of the method is the total number of evaluations of and used by the th approximation , defined as follows
[TABLE]
(If then we take formally to be a zero vector and the sequence can be arbitrary; we use analogous convention in the case when .) The set of all methods , defined as above, is denoted by . Moreover, we consider the following subclasses of
[TABLE]
and
[TABLE]
Methods based on the sequence of equidistant discretizations (19) belong to the class , while the class contains methods that evaluate and at the same, possibly nonuniform, sampling points. We have that .
The th error of a method is defined as
[TABLE]
The th minimal error, in the respective class of methods under consideration, is defined by
[TABLE]
We will investigate the exact rate of convergence of the th minimal errors (28) together with asymptotic constants. Moreover, we wish to determine (asymptotically) optimal methods , , such that the th errors tend to zero as fast as when .
3 Asymptotic lower bounds
In this section we investigate asymptotic lower bounds for the problem (1) in the classes of methods , . In the next section we give a construction of approximation methods which are asymptotically optimal. Their definitions will be inspired by the technique used for establishing lower bounds given in this section.
We give the definition of the continuous Milstein approximation and we state the properties that we use in order to establish the lower bounds. Moreover, in the next section we use it in order to construct asymptotically optimal methods.
Let and
[TABLE]
be an arbitrary discretization of . We denote by
[TABLE]
for . The continuous Milstein approximation based on (29) is defined as follows. We denote
[TABLE]
and we set
[TABLE]
and
[TABLE]
for , , where
[TABLE]
for . It is well-known that
[TABLE]
where is the th jump time of , and
[TABLE]
Moreover, , , , are independent of , see Fact 6.2 in Appendix.
The main properties of are as follows. For every the process is adapted to and has càdlàg paths. The upper bounds on the error of are given in Theorem 6.1. Furthermore, under the commutativity condition (9) the random variables are measurable with respect to the sigma field
[TABLE]
In particular, this and independence of and imply that for all ,
[TABLE]
where
[TABLE]
The conditional expectations appearing above can be computed explicitly. Namely, from Lemma 8 in [8] and Lemma 6.2 in Appendix we get by direct calculations
[TABLE]
We stress that for any the approximation is not an implementable numerical scheme in our model of computation (even under the commutativity condition (9)), since computation of a trajectory of requires complete knowledge of the corresponding trajectories of and . However, if the condition (9) holds, by (35), (36) and (38), we can compute values of at the discrete points (29) using only function evaluations of and at (29).
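The observation that the Milstein values at the mesh points depend only on the increments of the two driving processes can be made concrete with a short sketch. Formulas (32)-(33) are not reproduced in this text, so the step below uses the standard order 1.0 strong Taylor step for a scalar equation dX = a dt + b dW + c dN under jump commutativity, as given in Chapter 6.3 of [21]. This is an assumption; the paper's scheme may be parametrized differently. The names `a`, `b`, `db_dy` and `c` are hypothetical placeholders for the drift, the diffusion coefficient, its derivative with respect to the state variable, and the jump coefficient.

```python
def milstein_jump_step(a, b, db_dy, c, t, y, dt, dW, dN):
    """One Milstein-type step using only the increments dW and dN.

    Assumed form (scalar case, jump commutativity):
      y + a*dt + b*dW + c*dN
        + 0.5*b*b'*(dW^2 - dt)
        + (b(y + c) - b)*dN*dW
        + 0.5*(c(y + c) - c)*(dN^2 - dN)
    No jump times inside the step are needed.
    """
    by = b(t, y)
    cy = c(t, y)
    return (y
            + a(t, y) * dt
            + by * dW
            + cy * dN
            + 0.5 * by * db_dy(t, y) * (dW ** 2 - dt)
            + (b(t, y + cy) - by) * dN * dW
            + 0.5 * (c(t, y + cy) - cy) * (dN ** 2 - dN))

# Pure jump linear example: two jumps of relative size 0.1 in one step.
zero = lambda t, y: 0.0
jump = lambda t, y: 0.1 * y
after_two_jumps = milstein_jump_step(zero, zero, zero, jump, 0.0, 1.0, 0.1, 0.0, 2)
```

In the pure jump linear example the step reproduces the exact value (1 + 0.1)^2 after two jumps, which illustrates why the exact jump locations inside the step are not required in the commutative case.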
In order to characterize asymptotic lower bounds we define
[TABLE]
where the process is defined in (15). We have that
- (i)
,
- (ii)
iff there exists such that for all
[TABLE]
- (iii)
iff iff for all and almost surely.
We have the following result.
Theorem 3.1
Let us assume that the mappings , , and satisfy the assumptions -.
- (i)
Let be an arbitrary method from . Then
[TABLE]
- (ii)
Let be an arbitrary method from . If and then
[TABLE]
- (iii)
Let be an arbitrary method from . If then
[TABLE]
else
[TABLE]
Proof. We start by showing (49) in the case when and . Let be a method based on an arbitrary sequence of discretizations and , where each and is of the form (19). Every uses information (22) about the processes and . Take any sequence of positive integers such that
[TABLE]
By we denote a sequence of discretizations given by , where every set of equidistant points is defined by . Hence, for all ,
[TABLE]
with
[TABLE]
and
[TABLE]
Therefore, from (53) and (56) we have that
[TABLE]
and, since for all ,
[TABLE]
We denote by , where each vector consists of the values of and at , i.e.,
[TABLE]
Since for all , we have that
[TABLE]
Let us denote by the sequence of continuous Milstein approximations (32)-(33) based on the sequence of discretizations and which use the information about the processes and . From Theorem 6.1 and (58) we have that
[TABLE]
where the positive constant does not depend on . Moreover, let
[TABLE]
for and . Note that for any the random variable is a convex combination of and . Hence, is independent of for all and the processes , are independent. From (58), (60), (61), (39) and Lemma 6.3 we get
[TABLE]
Now, we analyze the asymptotic behavior of the first term in (64). From Lemma 8 in [8] we have that
[TABLE]
For and we define
[TABLE]
Of course and it can be continuously extended to , since and are finite. Therefore, by Lemma 6.2 and from the mean value theorems we get
[TABLE]
for some , . Next, for we have from Theorem 6.1 that
[TABLE]
Therefore, for we have by (65), (3) and (3) that
[TABLE]
This together with the Hölder inequality imply
[TABLE]
We have that
[TABLE]
where
[TABLE]
and, by (13),
[TABLE]
since for all . From the uniform continuity of we get
[TABLE]
Hence, by (70), (71), (73) and Fact 6.1 (ii) we have
[TABLE]
Therefore, by (53), (64), (3) and (74) we obtain
[TABLE]
which ends the proof of (49) in the case when and . If or then ,
[TABLE]
and
[TABLE]
which yield
[TABLE]
For and we obtain trivial lower bound. Finally, if , and then and by (77) we get
[TABLE]
which completes the proof of (49). The proofs of (52) and (52) are straightforward modifications of the proofs of (49) and (50). Hence, we skip them here.
Remark 3.1
Theorem 3.1 gives nontrivial lower bounds only in the case when
[TABLE]
In this case the presented lower bounds still hold even if we allow methods to use arbitrary information about , and , for example, values of partial derivatives or values of arbitrary linear functionals. If for all then (1) becomes (almost surely) a deterministic ODE. Then different lower bounds hold, see, for example, [13].
4 Asymptotically optimal methods
We provide definitions of methods that are asymptotically optimal. The construction is inspired by the technique used for establishing the lower bounds in the previous section. We restrict our consideration to approximation methods based on the regular sequences of discretizations generated by a probability density function , see [28]. For the density we assume that
- (P1)
and for all .
We will use the notation for a sequence of discretizations generated by a density . The knots
[TABLE]
of the th discretization are given by
[TABLE]
Hence, by choosing such a density one gets a whole sequence of discretizations . For instance, the sequence of equidistant discretizations is obtained by taking . Since , we have for all and
[TABLE]
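The knots of a regular sequence of discretizations can be computed numerically when the quantiles of the density are not available in closed form. The sketch below assumes the usual definition of a regular sequence, in which the i-th knot of the k-th discretization is the (i/k)-quantile of the density, consistent with the description in the Introduction. The function name `regular_knots` and the fine-grid inversion are illustrative choices, not the paper's construction.

```python
import numpy as np

def regular_knots(h, T, k, n_fine=20001):
    """Return knots t_0 < ... < t_k with int_0^{t_i} h(s) ds = i/k.

    `h` must be strictly positive on [0, T] and accept NumPy arrays.
    The cumulative integral is tabulated on a fine grid with the
    trapezoidal rule and inverted by linear interpolation.
    """
    s = np.linspace(0.0, T, n_fine)
    v = h(s)
    cdf = np.concatenate(([0.0], np.cumsum(0.5 * (v[1:] + v[:-1]) * np.diff(s))))
    cdf /= cdf[-1]  # renormalise so that the numerical total mass is exactly 1
    levels = np.arange(k + 1) / k
    return np.interp(levels, cdf, s)

# Hypothetical density proportional to exp(s) on [0, 1]; the knots
# cluster where the density is large.
knots = regular_knots(lambda s: np.exp(s), T=1.0, k=8)
```

A constant density returns the equidistant mesh, in line with the remark above. Because the cumulative integral is renormalised inside the function, `h` only needs to be specified up to a positive constant factor.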
We now provide a construction of asymptotically optimal approximation methods. The definition of this method is inspired by the dominating term in the estimation (64). Denote by the sequence of continuous Milstein approximations (32)-(33) based on the sequence of discretizations . For a given density , we define the method by
[TABLE]
where consists of values of the processes and at the points . (Hence, we formally take .) We call (83) the conditional Milstein method. We have that . We present an explicit formula for the algorithm (83) in order to show that it has a form that is allowed in our model of computation. By (9), (33), (83) and (41)-(45) each term can be written as
[TABLE]
for , and . Note that has continuous trajectories and coincides with at the discretization points. In general, the method is not equal to the piecewise linear interpolation of the classical Milstein steps, defined as
[TABLE]
for , , see Remark 4.3. However, we use the method in order to investigate the error of and we show in the sequel that they behave asymptotically in the same way. Moreover, for a fixed discretization the method does not evaluate and its implementation, at least in the case when , is straightforward.
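The piecewise linear interpolation of one-step values can be sketched as follows. This is only an illustration under stated assumptions: `interpolated_scheme` and the callable `step(t, y, dt, dW, dN)` are hypothetical names, and any one-step rule that uses only the increments of the Wiener and Poisson processes over a mesh cell (for example a Milstein-type step) can be passed in.

```python
import numpy as np

def interpolated_scheme(step, x0, knots, W_vals, N_vals):
    """Run a one-step scheme over `knots` and return the piecewise
    linear interpolant of the computed values as a function of t.

    W_vals[i] and N_vals[i] are the observed values of the Wiener and
    Poisson processes at knots[i]; no other information is used.
    """
    knots = np.asarray(knots, dtype=float)
    y = np.empty(len(knots))
    y[0] = x0
    for i in range(len(knots) - 1):
        dt = knots[i + 1] - knots[i]
        dW = W_vals[i + 1] - W_vals[i]
        dN = N_vals[i + 1] - N_vals[i]
        y[i + 1] = step(knots[i], y[i], dt, dW, dN)
    # np.interp performs linear interpolation between the knots.
    return lambda t: np.interp(t, knots, y)

# Trivial check with the deterministic rule y -> y + dt (no noise).
path = interpolated_scheme(lambda t, y, dt, dW, dN: y + dt,
                           0.0, [0.0, 0.5, 1.0], [0.0, 0.0, 0.0], [0, 0, 0])
```

The interpolant is continuous by construction, so it cannot reproduce the jumps of the solution between the knots; it only matches the scheme values at the knots.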
In the following theorem we give the exact convergence rate of the errors for the methods and in the terms of the following asymptotic constant
[TABLE]
The strategy of the proof goes as follows. First, we analyze the error of the conditional Milstein method . Due to its definition given by the conditional expectation (83) this can be done by using some estimates already established in the proof of Theorem 3.1. Then we show that is sufficiently close to . This will give us the asymptotic error for the piecewise linear interpolation method .
Theorem 4.1
Let us assume that the mappings , , , and satisfy the assumptions - and , and let . Then if and
[TABLE]
else
[TABLE]
Proof. From Theorem 6.1 and (82) we get
[TABLE]
where a constant does not depend on . Moreover, the equality (81) and the integral mean value theorem yield
[TABLE]
As in the proof of Theorem 3.1, we use the notation
[TABLE]
for . From (89), (90), Lemma 6.3 and by proceeding analogously as in the proof of Theorem 3.1 we arrive at
[TABLE]
for some , . Moreover, we have
[TABLE]
where
[TABLE]
By Fact 6.1 (i), (13) and (82) we get
[TABLE]
[TABLE]
and
[TABLE]
This and the uniform continuity of imply
[TABLE]
By (4), (4), (97) and Fact 6.1 (ii) we obtain
[TABLE]
which ends the proof of (87) for .
We now analyze the error of . Note that
[TABLE]
for , . In addition
[TABLE]
for , . For and the random variable is -measurable and the estimate (180) holds for . Hence, it is independent of , and . Therefore, by (82) we have that
[TABLE]
for , . Since, from (100)
[TABLE]
we obtain (87) for . This ends the proof.
Let us now assume that the following additional assumption is satisfied:
- (P2)
.
The methods and obtain the exact rate of convergence , with the asymptotic constant which depends on . The best density , which is unique and minimizes among all positive mappings such that , is
[TABLE]
(The minimization property of follows from the application of the Hölder inequality.) We stress that is strictly positive in under the additional assumption (P2). Furthermore,
[TABLE]
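The Hölder-inequality argument behind the optimal density can be checked numerically. The asymptotic constant (86) is not reproduced in this text, so the sketch assumes, as is typical for results of this kind, that its density-dependent part has the form of the integral of g(t)/h(t) over [0, T], where g stands for the function t ↦ E(Y(t)). Under that assumption the minimizer over densities h is proportional to the square root of g. The function g used below is a hypothetical stand-in, and T = 1.

```python
import numpy as np

def integral(v, s):
    """Trapezoidal rule for tabulated values v on the grid s."""
    return float(np.sum(0.5 * (v[1:] + v[:-1]) * np.diff(s)))

def density_cost(g_vals, h_vals, s):
    """Assumed density-dependent part of the constant: int g/h."""
    return integral(g_vals / h_vals, s)

s = np.linspace(0.0, 1.0, 10001)
g = np.exp(2.0 * s)                    # hypothetical stand-in for E(Y(t))
h_uniform = np.ones_like(s)            # equidistant sampling on [0, 1]
h_opt = np.sqrt(g) / integral(np.sqrt(g), s)

cost_uniform = density_cost(g, h_uniform, s)
cost_opt = density_cost(g, h_opt, s)
```

For this choice of g the two values are (e^2 - 1)/2 and (e - 1)^2, so the nonuniform density gives the smaller value. The two coincide exactly when g is constant, which is the situation described by the fact stated next.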
The following fact characterizes the case when the equidistant sampling is the optimal one.
Fact 4.1
Let us assume that the mappings , , and satisfy the assumptions - and . Then the following assertions are equivalent.
- (i)
.
- (ii)
$\displaystyle \mathbb{E}(\mathcal{Y}(t))=\frac{1}{T}\int\limits_{0}^{T}\Bigl(\mathbb{E}(\mathcal{Y}(s))\Bigr)^{1/2}ds$ for all .
- (iii)
.
Proof. The assertion can easily be shown by proving the implications ; we leave this to the reader.
From Theorem 4.1 we directly obtain the following result.
Corollary 4.1
Let us assume that the mappings , , and satisfy the assumptions -.
- (i)
Let us moreover assume that the assumption (P2) is satisfied. If and then for it holds
[TABLE]
else
[TABLE]
- (ii)
Let . If and then it holds
[TABLE]
else
[TABLE]
Theorem 3.1 and Corollary 4.1 imply the main result of the paper.
Theorem 4.2
Let us assume that the mappings , , and satisfy the assumptions -.
- (i)
Let us additionally assume that the assumption (P2) is satisfied. If (* and ) or ( and ) then*
[TABLE]
and the methods , where is defined in (102), are asymptotically optimal in the class . If and then
[TABLE]
- (ii)
If the assumption (P2) is satisfied, and then
[TABLE]
and the methods are asymptotically optimal in the class .
- (iii)
We have that
[TABLE]
and the methods are asymptotically optimal in the class .
As we can see the optimal rate of convergence of the minimal errors in the classes and is proportional to , where is a total number of evaluations of and . In the class we have a gap between upper and lower asymptotic constants. We conjecture that (108) holds also if and .
We end this section with the following remarks.
Remark 4.1
Theorem 4.2 implies that the error can be reduced asymptotically by the factor
[TABLE]
if we use the optimal discretization instead of the equidistant one. However, the optimal density and the optimal sampling , defined by
[TABLE]
can be computed explicitly only in particular cases, see, for example, Section 4.1. Moreover, the additional assumption (P2) is required. We plan to overcome these difficulties in future work.
Remark 4.2
If , and then, for the classes and , Theorem 4.2 restores the results of Theorem 2 (iii) and Proposition 2 from [12] in the Gaussian case, while if , and then we get Theorem 4.2 from [26] for the pure jump case with an additive Poisson noise. In addition to this paper, in [26] the author established a method based on an adaptive stepsize control that does not depend on the knowledge of . The problem of defining such methods for SDEs of the general type (1) will be the topic of our future work.
Remark 4.3
We have , if , and .
4.1 Linear case - Merton's jump diffusion model
Let us consider the following SDE
[TABLE]
that models the stock price in Merton's model, see [21]. We assume to be a constant function and , . The solution of (111) is
[TABLE]
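For a linear jump diffusion the solution at a given time depends only on the values of the two driving processes at that time, which is convenient for testing approximation schemes. Equation (111) and its solution are not reproduced in this text, so the sketch assumes the common parametrization dX = mu X dt + sigma X dW + c X(t-) dN with constant coefficients and c > -1. The names `mu`, `sigma`, `c` and `merton_exact` are hypothetical.

```python
import numpy as np

def merton_exact(x0, mu, sigma, c, t, W_t, N_t):
    """Closed-form solution of the assumed linear jump diffusion
    dX = mu*X dt + sigma*X dW + c*X(t-) dN, X(0) = x0, at time t,
    given the values W_t and N_t of the driving processes at t.
    Each jump multiplies the solution by (1 + c).
    """
    return x0 * np.exp((mu - 0.5 * sigma ** 2) * t + sigma * W_t) * (1.0 + c) ** N_t

# Example: value at t = 1 on a path with W(1) = 0.2 and two jumps.
x_T = merton_exact(1.0, 0.05, 0.3, 0.1, 1.0, 0.2, 2)
```

Evaluating this formula at the mesh points gives reference values against which the values produced by a discretization scheme on the same observed data can be compared.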
We denote and we have
[TABLE]
If then the optimal sampling is the equidistant one and . If then we obtain the following optimal sampling for (111)
[TABLE]
We have that for . Since behaves as when , we can gain by using the nonequidistant mesh.
5 Conclusions
We investigated the minimal asymptotic errors for strong global approximation of SDEs driven by the Poisson and Wiener processes. We considered the cases of equidistant and nonequidistant sampling of and . In both cases, we showed that the minimal error tends to zero like , where is an average in time of a local Hölder constant of and is the number of evaluations of and . However, the asymptotic constant in the case of equidistant sampling can be considerably larger than the asymptotic constant when nonuniform mesh is used. We provided a construction of methods that asymptotically achieve the established minimal errors.
In this paper, we addressed the case when sampling points for the processes and are chosen only in a nonadaptive way with respect to and . Moreover, we assumed that the diffusion and jump coefficients satisfy the jump commutativity condition. For adaptive sampling and the non-commutative case, preliminary considerations indicate that a direct application of the methods developed in this paper is not possible. A further extension of the presented analysis is needed in that case and we postpone this problem to our future work.
**Acknowledgments**
Part of this work was done at the Banff International Research Station for Mathematical Innovation and Discovery (BIRS), Alberta, Canada, where the author participated in the workshop "Approximation of High-Dimensional Numerical Problems - Algorithms, Analysis and Applications", Fall 2015. I would like to thank the staff of BIRS for their great hospitality.
6 Appendix
We use the following version of the Itô formula for semimartingales with jumps, see, for example, [29] or [22].
Lemma 6.1
Let us assume that the mappings , , and satisfy the assumptions , and . Let a function belong to . Then for the solution of (1) it holds
[TABLE]
The proof of the following fact is straightforward.
Fact 6.1
Let the mappings and satisfy the assumptions , and .
- (i)
There exists a constant such that for all and we have
[TABLE]
- (ii)
The mapping
[TABLE]
is continuous.
- (iii)
There exists a constant such that
[TABLE]
Fact 6.2
- (i)
There exists such that for all and we have
[TABLE]
- (ii)
For all and the stochastic integral is independent of .
Proof. The proof of (i) can be derived in a straightforward way from (6), (38), the isometry for stochastic integrals driven by martingales and the independence of and . Hence, we skip it.
For the proof of (ii) note that directly from (35) and (36) we get that , , is independent of . So the only case of interest is when .
Fix , , and let , , be a sequence of discretizations of such that and , where . Moreover, let
[TABLE]
We have that
[TABLE]
Therefore, the sequence converges also in probability and, by the independence of the increments of and , every random variable is independent of . Hence, the limit is also independent of . By (38) we get that also is independent of .
The proof of Proposition 2.1. By the Markov property of the solution we have that . For all and such that we have
[TABLE]
From (10) and (E) we obtain that
[TABLE]
almost surely. By Theorem 88 in [29] we obtain for all and almost surely
[TABLE]
and
[TABLE]
since is a martingale. Therefore, by Minkowski's inequality for conditional expectations (see [5]), we have that
[TABLE]
almost surely. From (13), Fact 6.1 (iii) and Lebesgue's dominated convergence theorem for conditional expectations (see [5]) we have for all and almost surely that
[TABLE]
and
[TABLE]
since and have càdlàg paths and is -measurable. This together with (6) yields (16). Now, (17) follows from (16) and Lebesgue's dominated convergence theorem.
Lemma 6.2
Let and let
[TABLE]
be an arbitrary discretization of the interval and
[TABLE]
Then for all and
- (i)
[TABLE]
almost surely,
- (ii)
[TABLE]
almost surely and, in particular,
[TABLE]
Proof. For , , we directly get (131), (132) and (133). By the results of [2], from the fact that the process has independent increments and by direct calculations we obtain that conditioned on and for , the increment is a binomial random variable with the number of trials and with the probability of success in each trial equal to . Now, the rest of proof goes analogously as the proof of Lemma 3.1 in [26].
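The binomial structure used in this proof can be illustrated with a short sketch. The number of trials and the success probability are not reproduced in this text; the code assumes the standard fact for Poisson processes that, given the values at both ends of a mesh cell, the increment up to an intermediate time is binomial with the number of trials equal to the increment over the whole cell and success probability equal to the ratio of the integrated intensities. The names `conditional_increment`, `m_t`, `m_left` and `m_right` are hypothetical.

```python
import numpy as np

def conditional_increment(m_t, m_left, m_right, n_left, n_right, rng=None):
    """Conditional law of the process at an intermediate time t, given
    its values n_left and n_right at the ends of the cell.

    m_left <= m_t <= m_right are values of the integrated intensity.
    Returns (conditional mean, one sample or None if rng is None).
    """
    p = (m_t - m_left) / (m_right - m_left)   # success probability
    trials = n_right - n_left                 # number of jumps in the cell
    mean = n_left + trials * p
    sample = None if rng is None else n_left + rng.binomial(trials, p)
    return mean, sample

# Four jumps in the cell, intermediate time halfway in the m-scale.
mean, sample = conditional_increment(0.5, 0.0, 1.0, 3, 7, np.random.default_rng(1))
```

Under this assumption the conditional mean is a linear interpolation of the endpoint values in the scale of the integrated intensity, which reduces to ordinary linear interpolation in time when the intensity is constant.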
We now provide an upper bound on the error of the continuous Milstein approximation . A similar result was shown in Theorem 6.4.1 in [21], but under slightly stronger assumptions. In particular, we do not assume the existence of a continuous partial derivative for , and we do not impose any Lipschitz conditions on the second partial derivative of , , with respect to . Moreover, we consider a nonstationary Poisson process, whereas Theorem 6.4.1 in [21] was proven for stationary point processes.
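To make the object of the following theorem concrete, here is a minimal sketch of one Milstein-type step for a scalar, autonomous jump diffusion dX = a(X) dt + b(X) dW + c(X) dN under the jump commutativity condition, in the form of the order 1.0 strong Taylor scheme discussed in [3]. It is not claimed to be the exact scheme of this paper, whose formulas are not recoverable from this copy and whose coefficients may depend on time. The function names, the linear test coefficients and the intensity 1 + t are assumptions made for illustration.

```python
import numpy as np


def milstein_step(x, dt, dW, dN, a, b, db, c):
    # One step for dX = a(X) dt + b(X) dW + c(X) dN (scalar, autonomous).
    # db is the derivative of b; dW and dN are the increments over the step.
    bx, cx = b(x), c(x)
    x_jump = x + cx  # state right after a single jump from x
    return (
        x + a(x) * dt + bx * dW + cx * dN
        + 0.5 * bx * db(x) * (dW**2 - dt)        # double Wiener integral
        + (b(x_jump) - bx) * dN * dW             # mixed terms, merged via commutativity
        + 0.5 * (c(x_jump) - cx) * (dN**2 - dN)  # double Poisson integral
    )


def simulate_linear(x0, mu, sigma, gamma, T, n, seed=0):
    # Milstein endpoint and exact endpoint for the linear test equation
    # dX = mu X dt + sigma X dW + gamma X dN on an equidistant mesh.
    rng = np.random.default_rng(seed)
    Lam = lambda t: t + 0.5 * t * t  # hypothetical cumulative intensity (intensity 1 + t)
    ts = np.linspace(0.0, T, n + 1)
    x, W, N = x0, 0.0, 0
    for t0, t1 in zip(ts[:-1], ts[1:]):
        dt = t1 - t0
        dW = rng.normal(0.0, np.sqrt(dt))
        dN = rng.poisson(Lam(t1) - Lam(t0))  # nonhomogeneous Poisson increment
        x = milstein_step(
            x, dt, dW, dN,
            lambda y: mu * y, lambda y: sigma * y, lambda y: sigma, lambda y: gamma * y,
        )
        W += dW
        N += dN
    # Closed-form solution of the linear equation driven by the same W and N.
    exact = x0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * W) * (1.0 + gamma) ** N
    return x, exact


approx, exact = simulate_linear(1.0, 0.1, 0.2, 0.1, 1.0, 1000)
print(abs(approx - exact))
```

The linear coefficients satisfy the commutativity condition, since b(x) c'(x) = sigma * gamma * x = b(x + c(x)) - b(x), which is why the two mixed double integrals can be merged into the single product dN * dW.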
Theorem 6.1
Let us assume that the mappings , , and satisfy the assumptions , , and . Let and let (29) be an arbitrary discretization of the interval . Then for the continuous Milstein approximation , based on the mesh (29), we have that
[TABLE]
and
[TABLE]
where do not depend on .
Proof. Recall that and is -measurable for , . First, we show that
[TABLE]
We proceed by induction. Let us assume that for and some . (The assumption is fulfilled for .) By (10), (12), (33) and Fact 6.2 we have for all and that
[TABLE]
Hence, and, in particular, . Therefore, we get and (136).
We now justify (135). The solution of (1) and the continuous Milstein approximation can be written as
[TABLE]
where
[TABLE]
and
[TABLE]
We have for all that
[TABLE]
where
[TABLE]
We get from the Hölder inequality and Lemma 2.1 for all that
[TABLE]
From Lemma 6.1 applied to and (6) we have that for
[TABLE]
We denote for and
[TABLE]
We have
[TABLE]
where
[TABLE]
for all . By the Hölder inequality, (10), (11) and Lemma 2.1 we have
[TABLE]
By Theorem 6.5.8 in [17] and Theorem 88 (iii) in [29] we obtain for all
[TABLE]
[TABLE]
We estimate (151) analogously as (151) and we get for all that
[TABLE]
Hence, by (148), (153), (154) and (156) we arrive at
[TABLE]
for all . For (146) we have by the Hölder inequality and (A2) that
[TABLE]
for all . Hence, (143), (147), (157) and (158) yield for all that
[TABLE]
We have for all that
[TABLE]
where
[TABLE]
From the Itô isometry and the Hölder inequality we obtain for
[TABLE]
By the Itô isometry together with the Itô formula we get
[TABLE]
where
[TABLE]
for all . From the Hölder inequality we get
[TABLE]
for all . Since we have for that
[TABLE]
we obtain
[TABLE]
Moreover, for we have that
[TABLE]
which implies
[TABLE]
[TABLE]
Hence, for all we get
[TABLE]
[TABLE]
Therefore, by (160), (161), (173) and (174) we obtain for all
[TABLE]
Now
[TABLE]
where
[TABLE]
for all . Next, we use the decomposition and the martingale isometry. The above terms are then estimated in the same way as for , so we skip the details. We get for all that
[TABLE]
Combining (159), (175) and (177) we get for all
[TABLE]
By (13) and (136) the mapping is bounded and Borel measurable. Hence, by Gronwall's lemma we get (135). The estimate (134) is a consequence of (13) and (135). This ends the proof.
Lemma 6.3
Let us assume that the mappings , , and satisfy the assumptions -. For all ,
[TABLE]
where does not depend on nor .
Proof. From (12) and Theorem 6.1 we have for and that
[TABLE]
where does not depend on nor . Moreover, for and the random variable is -measurable. From Fact 6.2 (ii) and by (41)-(45) we have that , and are independent of . Hence, by (40), Fact 6.2 (i) and (180) we get
[TABLE]
for , which ends the proof of (179).
7 References
- [1] Applebaum, D., Lévy Processes and Stochastic Calculus. 2nd ed., Cambridge University Press, 2011.
- [2] Bonet, E., Nualart, D., Interpolation and forecasting in Poisson's processes. Stochastica 2 (1977), 1–5.
- [3] Bruti-Liberati, N., Platen, E., Strong approximations of stochastic differential equations with jumps. J. Comput. Appl. Math. 205 (2007), 982–1001.
- [4] Debowski, J., Przybyłowicz, P., Optimal approximation of stochastic integrals with respect to a homogeneous Poisson process, submitted.
- [5] Doob, J. L., Measure Theory. Springer-Verlag New York, 1994.
- [6] Gardoń, A., The order of approximations for solutions of Itô-type stochastic differential equations with jumps. Stoch. Anal. Appl. 22 (2004), 679–699.
- [7] Graham, C., Talay, D., Stochastic Simulation and Monte Carlo Methods. Springer-Verlag, Berlin, Heidelberg, 2013.
- [8] Hertling, P., Nonlinear Lebesgue and Itô integration problems of high complexity. J. Complexity 17 (2001), 366–387.
