Statistical inference for misspecified ergodic Lévy driven stochastic differential equation models
Yuma Uehara

TL;DR
This paper demonstrates that Gaussian quasi-likelihood estimators for misspecified ergodic Lévy-driven SDEs are reliable, achieving asymptotic normality and tail probability bounds, thus confirming their practical effectiveness.
Contribution
It extends the Gaussian quasi-likelihood approach to misspecified ergodic Lévy-driven SDE models, providing theoretical guarantees under misspecification.
Findings
Estimators satisfy tail probability estimates.
Estimators achieve asymptotic normality.
Method remains effective under model misspecification.
Abstract
This paper deals with the estimation problem of misspecified ergodic Lévy driven stochastic differential equation models based on high-frequency samples. We utilize the widely applicable and tractable Gaussian quasi-likelihood approach, which focuses on the (conditional) mean and variance structure. It is shown that the corresponding Gaussian quasi-likelihood estimators of the drift and scale parameters satisfy tail probability estimates and asymptotic normality at the same rate as in the correctly specified case. In this process, the extended Poisson equation for time-homogeneous Feller Markov processes plays an important role in handling the misspecification effect. Our result further confirms the practical usefulness of the Gaussian quasi-likelihood approach for SDE models.
Table 2. Mean and standard deviation (in parentheses) of the two estimators in each case; the pair after each case label is the corresponding target optimal value.

| Terminal time (sample size × interval) | Sample size | Observation interval | (i) (0.33, 1.41) | | (ii) (0.37, 1.41) | | (iii) (0.37, 1.41) | | diffusion (0.33, 1.41) | |
|---|---|---|---|---|---|---|---|---|---|---|
| 50 | 1000 | 0.05 | 0.38 | 1.41 | 0.40 | 1.39 | 0.40 | 1.39 | 0.38 | 1.41 |
| | | | (0.12) | (0.11) | (0.16) | (0.29) | (0.15) | (0.19) | (0.13) | (0.10) |
| 100 | 5000 | 0.02 | 0.37 | 1.41 | 0.39 | 1.39 | 0.38 | 1.39 | 0.36 | 1.41 |
| | | | (0.09) | (0.08) | (0.11) | (0.23) | (0.11) | (0.15) | (0.09) | (0.08) |
| 100 | 10000 | 0.01 | 0.36 | 1.41 | 0.37 | 1.39 | 0.38 | 1.40 | 0.36 | 1.41 |
| | | | (0.08) | (0.07) | (0.09) | (0.22) | (0.10) | (0.15) | (0.08) | (0.07) |
Taxonomy
Topics: Stochastic processes and financial applications · Financial Risk and Volatility Modeling · Statistical Methods and Inference
Statistical inference for misspecified ergodic Lévy driven stochastic differential equation models
Yuma Uehara
The Institute of Statistical Mathematics, Japan, 10-3 Midori-cho, Tachikawa, Tokyo 190-8562, Japan
Abstract.
We consider the estimation problem of misspecified ergodic Lévy driven stochastic differential equation models based on high-frequency samples. We utilize a widely applicable and tractable Gaussian quasi-likelihood approach which focuses on mean and variance structure. It is shown that the Gaussian quasi-likelihood estimators of the drift and scale parameters still satisfy polynomial type probability estimates and asymptotic normality at the same rate as the correctly specified case. In their derivation process, the theory of extended Poisson equation for time-homogeneous Feller Markov processes plays an important role. Our result confirms the reliability of the Gaussian quasi-likelihood approach for SDE models.
Key words and phrases:
Lévy driven stochastic differential equation, misspecified model, Gaussian quasi-likelihood estimation, extended Poisson equation, high-frequency sampling, stepwise estimation.
1. Introduction
Recent developments in measurement techniques and computing enable us to obtain time series data observed at high frequency from financial and economic activities, physical and biological phenomena, and so on. In analyzing such data, we often encounter situations where they clearly exhibit non-Gaussian behavior, and in such situations, statistical modeling by diffusion type processes observed at high frequency may give inadequate results. To better describe such non-Gaussianity, stochastic differential equations (SDEs) driven by Lévy processes serve as good candidate models. For this reason, the estimation theory of Lévy driven SDE models based on high-frequency samples has been studied: for instance, the threshold based estimation for jump diffusion models by [38] and [45], the least absolute deviation (LAD)-type estimation for Lévy driven Ornstein-Uhlenbeck models by [28], the non-Gaussian stable quasi-likelihood estimation for locally stable driven SDE models by [32], the least square estimation for small Lévy driven SDE models by [25], the Gaussian quasi-likelihood (GQL) estimation for ergodic Lévy driven SDE models by [29] and [33], and so on. These are parametric methods; concurrently, nonparametric methods have been investigated, for example, the functional estimation and adaptive estimation for jump diffusion models by [3] and [44], Nadaraya-Watson estimation for stable driven SDE models by [44], and the Fourier based method for Lévy processes and Lévy type models by [7], to mention a few.
In statistical modeling, we always face the risk of model misspecification. The statistical theory under model misspecification tells us how close an estimated model is to the data-generating model, and such an interpretation is important, for example, in ensuring the reliability of estimation methods and in comparing candidate models by information criteria. Historically, following the pioneering works by [8], [18], and [53], the theory has been investigated up to the present for such reasons. Concerning SDE models in particular, for instance, [35], [46] and [24, Section 3] focus on misspecified diffusion models; [24, Section 4] deals with the misspecification with respect to the intensity function of Poisson processes; [26] considers the situation where the given model is a diffusion but the data-generating model has jumps. However, the theory does not seem to be well developed in the context of Lévy driven SDE models whose coefficients take various non-linear forms, and indeed the parametric methods introduced above have not been discussed under model misspecification.
In this paper, the data-generating process which is defined on the complete filtered probability space is supposed to be the solution of the following Lévy driven SDE:
[TABLE]
where:
- is a one-dimensional càdlàg Lévy process without Wiener part. It is independent of the initial variable and satisfies , , and for all ;
- The coefficients and are Lipschitz continuous;
- .
We suppose that the discrete but high-frequency observations are obtained from in the so-called “rapidly increasing experimental design”, that is, , , and . For the observations , we suppose that the following parametric one-dimensional SDE model is allocated:
[TABLE]
where the functional forms of the coefficients and are supposed to be known except for a finite-dimensional unknown parameter being an element of the bounded convex domain . We note that the true coefficients may not belong to the parametric family , namely, the misspecification of the coefficient possibly occurs. Hereinafter, the terminologies “misspecified” and “misspecification” will be used for the misspecification with respect to the coefficients unless otherwise mentioned.
To estimate an optimal parameter of , we utilize the GQL procedure used in [33]. Concerning misspecified ergodic diffusion models, it is shown in [46] that although the misspecification with respect to their diffusion term deteriorates the convergence rate of the scale (diffusion) parameter, the Gaussian quasi-maximum likelihood estimator (GQMLE) still has asymptotic normality. We will show that asymptotic normality of the GQMLE holds in the misspecified ergodic Lévy driven SDE models as well. To handle the misspecification effect, we will invoke the theory of the extended Poisson equation (EPE) for homogeneous Feller Markov processes established in [49]. Applying the result of [49] to (1.1), the existence and weighted Hölder regularity of the solution of EPEs will be shown under a mighty mixing condition on . Building on this result and the martingale representation theorem, we will obtain the asymptotic normality of our estimator and its tail probability estimates under sufficient regularity and moment conditions on the ingredients of (1.1) and (1.2). We note that the absence of the Wiener part in (1.1) is essential, whereas it is not in the correctly specified case; for more details, see Remark 3.10.
It will turn out that the convergence rate of the scale parameters is , and it is the same as in the correctly specified case. This is different from the diffusion case (cf. Table 1). Such a difference may stem from applying the GQL to non-Gaussian driving noises, that is, the efficiency loss of the GQMLE may occur even in the correctly specified case. Indeed, the non-Gaussian stable quasi-likelihood is known to estimate the drift and scale parameters faster than the GQMLE in correctly specified locally -stable driven SDE models (cf. [32]); their convergence rates are and , respectively. Further, for correctly specified locally -stable driven Ornstein-Uhlenbeck models, the LAD-type estimators of [28] tend to the true value at the speed of , which is also faster than that of the GQMLE. However, in exchange for this loss of efficiency, the GQL approach is worth considering for the following reasons:
- It does not include any special functions (e.g. Bessel function, Whittaker function, and so on), infinite expansion series, or analytically unsolvable integrals; thus computation based on it is relatively inexpensive.
- It focuses only on the (conditional) mean and covariance structure; thus it does not need much restriction on the driving noise and is robust against the noise structure. In other words, we can construct reasonable estimators of the drift and scale coefficients in a unified way provided only that the driving Lévy noise has moments of any order.

Our result ensures that even if the true coefficients are misspecified and take non-linear forms, the staged GQL estimation still works for Lévy driven SDE models and completely inherits the merits described above.
The rest of this paper is organized as follows: In Section 2, we introduce assumptions and our estimation procedure. Section 3 provides our main results in the following order:
- (1) the tail probability estimates of the GQMLE (Theorem 3.1);
- (2) the existence and weighted Hölder regularity of the solution of EPEs for Lévy driven SDEs (Proposition 3.5);
- (3) the asymptotic normality of the GQMLE at -rate (Theorem 3.7).
A simple numerical experiment is presented in Section 4. We give all proofs of our results in Section 5.
2. Assumptions and Estimation scheme
For notational convenience, we will hereafter use the following notation without further mention:
- stands for the law of .
- denotes the closure of any set .
- represents the Lévy measure of .
- We write for any vector .
- denotes the transition probability of .
- is referred to as a differential operator for any variable .
- denotes the conditional expectation with respect to .
- We write and for any stochastic process .
- implies that there exists a positive constant being independent of satisfying for all large enough .
- For any matrix valued function on , we write ; especially we write .
- Given a function and a signed measure on a one-dimensional Borel space, we write
[TABLE]
To derive our asymptotic results, we introduce some assumptions with some technical comments. Most of them are almost the same as in [30], [33], and [34], except for Assumption 2.1-(2).
Assumption 2.1.
- (1) , , and for all .
- (2) The Blumenthal-Getoor index (BG-index) of is smaller than 2, that is,
[TABLE]
From [42, Theorem 25.3], it is easy to observe that Assumption 2.1 holds if the Lévy measure admits a density with respect to Lebesgue measure satisfying that as for some , and that there exist positive constants , and such that
[TABLE]
for all large enough . Via standardization, various Lévy processes fulfill them, for example, bilateral gamma process, normal tempered stable process, normal inverse Gaussian process, and variance gamma process.
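The standardization mentioned above can be sketched numerically. The following is a minimal illustration for the bilateral gamma case, assuming (as described later in the text) that a bilateral gamma variable is the difference of two independent gamma variables; the function name, the shape and rate values, and the sample size are hypothetical choices for illustration only. Centering and rescaling yields a process with mean 0 and unit variance at time 1, and the empirical moments can be compared with these targets.

```python
import math
import random


def standardized_bilateral_gamma(t, shape_pos, rate_pos, shape_neg, rate_neg, rng):
    """Draw one value at time t of a standardized bilateral gamma process.

    The raw variable is G_plus - G_minus with independent
    G_plus ~ Gamma(shape_pos * t, rate_pos) and G_minus ~ Gamma(shape_neg * t, rate_neg).
    Subtracting t * mean_1 and dividing by the time-1 standard deviation
    gives mean 0 for every t and variance t (hence variance 1 at t = 1).
    """
    mean_1 = shape_pos / rate_pos - shape_neg / rate_neg
    var_1 = shape_pos / rate_pos ** 2 + shape_neg / rate_neg ** 2
    # random.gammavariate takes (shape, scale), so scale = 1 / rate.
    g_plus = rng.gammavariate(shape_pos * t, 1.0 / rate_pos)
    g_minus = rng.gammavariate(shape_neg * t, 1.0 / rate_neg)
    return (g_plus - g_minus - t * mean_1) / math.sqrt(var_1)


rng = random.Random(0)
samples = [
    standardized_bilateral_gamma(1.0, 2.0, 1.5, 1.0, 1.0, rng) for _ in range(20000)
]
sample_mean = sum(samples) / len(samples)
sample_var = sum((x - sample_mean) ** 2 for x in samples) / len(samples)
print("empirical mean and variance at t = 1:", sample_mean, sample_var)
```

The same centering and scaling step applies to the other families listed, with their own mean and variance formulas in place of `mean_1` and `var_1`.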
In the derivation of the asymptotic normality of our estimator, we will evaluate the small time -moment of for some (cf. Lemma 5.3) to handle the solution of extended Poisson equations which are essential to deal with the misspecification effect; thus the additional condition Assumption 2.1-(2) is imposed.
Assumption 2.2.
- (1) The coefficients and are Lipschitz continuous and twice differentiable, and their first and second derivatives are of at most polynomial growth.
- (2) The drift coefficient and scale coefficient are Lipschitz continuous, and for every .
- (3) For each and , the following conditions hold:
  - The coefficients and admit extension in and have the partial derivatives possessing extension in .
  - There exists a nonnegative constant satisfying
[TABLE]
We note that the first part of Assumption 2.1 and Assumption 2.2 ensures the existence of a unique càdlàg adapted strong solution of SDE (1.1) (cf. [2, Theorem 6.2.3 and Theorem 6.2.9]), that is, there exists a measurable function such that .
Assumption 2.3.
- (1) There exists a probability measure such that for every , we can find constants and for which
[TABLE]
for any where .
- (2) For any , we have
[TABLE]
The former property of this assumption is the so-called “-exponentially ergodic” property (cf. [36]); together with the latter condition and the arguments in [20, Lemma 8] and [30, Lemma 4.3], it ensures the ergodic theorem and its moment bound: for any being differentiable with derivatives of polynomial growth, we have
[TABLE]
and for any positive constant ,
[TABLE]
The first convergence in probability (2.4) is a standard condition assumed in the statistical theory of the ergodic processes, while the second moment bound (2.5) is not and is relatively strong. It will be utilized for evaluating the tail probability of the staged GQL random field introduced later. Such evaluation gives the tail probability estimates of our estimator (Theorem 3.1), and in turn, the convergence of moments of any order for it (Remark 3.9).
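As a rough numerical illustration of the ergodic theorem (2.4), the sketch below simulates a toy Ornstein-Uhlenbeck process driven by a variance gamma process and compares the time average of the squared path with the second moment of the invariant law. Everything concrete here is an assumption made for illustration: the model `dX = -lam * X dt + dJ`, the parameter values, the Euler scheme, and the helper name `simulate_ou_vg`. For this toy model with unit-variance noise, the invariant variance equals `1 / (2 * lam)`.

```python
import math
import random


def simulate_ou_vg(lam, nu, h, n, rng, x0=0.0):
    """Euler scheme for dX = -lam * X dt + dJ with variance gamma noise J.

    Each increment of J over a step h is sqrt(G) * Z with
    G ~ Gamma(shape=h / nu, scale=nu) and Z ~ N(0, 1),
    so that E[J_1] = 0 and Var[J_1] = 1.
    """
    path = [x0]
    x = x0
    for _ in range(n):
        subordinator = rng.gammavariate(h / nu, nu)
        jump = math.sqrt(subordinator) * rng.gauss(0.0, 1.0)
        x = x - lam * x * h + jump
        path.append(x)
    return path


rng = random.Random(2)
lam, h, n = 1.0, 0.01, 50000  # terminal time n * h = 500
path = simulate_ou_vg(lam=lam, nu=1.0, h=h, n=n, rng=rng)

# Time average of f(X) = X^2 over the observation grid.
time_average = sum(x * x for x in path[:-1]) / n
invariant_second_moment = 1.0 / (2.0 * lam)
print("time average:", time_average, "invariant value:", invariant_second_moment)
```

The agreement improves as the terminal time grows, which is the qualitative content of the convergence in probability; the moment bound (2.5) is a stronger statement that this sketch does not check.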
Sufficient conditions for the “-exponentially ergodic” property of (1.1) have been investigated in many papers such as [23], [27], and [30]. Among them, we introduce a handy one given in [30, Section 5]:
Condition 1 The coefficients and are of class , and globally Lipschitz, and the scale coefficient is bounded.
Condition 2 The drift coefficient satisfies
[TABLE]
and the scale coefficient , for every .
Condition 3 The Lévy measure of can be decomposed as: for the two Lévy measure, where the restriction of to some open set of the form with some admits a continuously differentiable positive density .
Condition 4 and for some .
Under Condition 1-Condition 4, Assumption 2.3 holds true; for its proof, see [30, Proposition 5.4]. We note here that this sufficient condition still allows the nonlinearity of the coefficients. For example, given a Lévy process fulfilling Condition 3 and Condition 4, the following SDEs satisfy Condition 1, Condition 2, and Assumption 2.2-(1):
- (1) ;
- (2) ;
- (3) .
We introduce a -matrix whose components are defined by:
[TABLE]
Assumption 2.4.
is invertible.
We define an optimal parameter of by
[TABLE]
where and are defined as follows:
[TABLE]
Note that since we impose the extension condition in Assumption 2.2, admit extension in as well. Recall that the parameter space is supposed to be a bounded convex domain. We assume the following identifiability condition for and :
Assumption 2.5.
, and there exist positive constants and such that for all ,
[TABLE]
(2.9) and (2.10) ensure the separability of the models which will also be used for the tail probability estimates of the staged GQL random fields, and the next remark provides a sufficient and non-stringent condition for them.
Remark 2.6.
If the optimal parameter is unique, and and are positive definite, (2.9) and (2.10) hold true for all under Assumption 2.2-2.3. Let and be
[TABLE]
From Assumptions 2.2 and 2.3, the Lebesgue dominated convergence theorem implies that these functions are continuous. Thus, for sufficiently small , we can pick a positive constant satisfying and where denotes the open ball of radius centered at , and is the minimum eigenvalue of . Then, for every , we have by Taylor’s formula. Concerning , it follows that
[TABLE]
Hence (2.9) holds true for all with
[TABLE]
(2.10) can be shown as well.
We now describe our estimation scheme. Recall that we assume that the observation is obtained from with , , and . We define our staged GQMLE in the following manner:
- (1) Drift-free estimation of . Define the maximizing-type estimator (so-called $M$-estimator) by
[TABLE]
for the -valued random function
[TABLE]
- (2) Weighted least square estimation of . Define the least square type estimator by
[TABLE]
for the -valued random function
[TABLE]
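The two stages above can be sketched on a toy example. The code below is a minimal illustration under stated assumptions, not the paper's implementation: it assumes the usual Gaussian quasi-likelihood shape for the first stage (a sum of `log c^2 + (increment)^2 / (h * c^2)` with the drift ignored) and a scale-weighted least squares contrast for the second stage, and it uses a hypothetical parametric model with drift `-alpha * x` and constant scale `gamma`, for which both stages have closed-form solutions. The data come from a correctly specified toy Ornstein-Uhlenbeck model with variance gamma noise, chosen only so that the output can be sanity-checked against known values.

```python
import math
import random


def simulate_ou_vg(lam, nu, h, n, rng, x0=0.0):
    """Euler scheme for dX = -lam * X dt + dJ, J a unit-variance variance gamma process."""
    path = [x0]
    x = x0
    for _ in range(n):
        subordinator = rng.gammavariate(h / nu, nu)
        jump = math.sqrt(subordinator) * rng.gauss(0.0, 1.0)
        x = x - lam * x * h + jump
        path.append(x)
    return path


def staged_gqmle(path, h):
    """Two-stage fit of the toy model a(x, alpha) = -alpha * x, c(x, gamma) = gamma."""
    n = len(path) - 1
    increments = [path[j + 1] - path[j] for j in range(n)]

    # Stage 1 (drift-free): minimize sum_j [log gamma^2 + dX_j^2 / (h * gamma^2)].
    # For a constant scale the minimizer is the normalized realized variance.
    gamma_hat = math.sqrt(sum(d * d for d in increments) / (n * h))

    # Stage 2: minimize sum_j (dX_j + h * alpha * X_j)^2 / (h * gamma_hat^2).
    # The constant weight cancels here, leaving an ordinary least squares solution.
    numerator = sum(path[j] * increments[j] for j in range(n))
    denominator = h * sum(path[j] ** 2 for j in range(n))
    alpha_hat = -numerator / denominator
    return gamma_hat, alpha_hat


rng = random.Random(1)
h, n = 0.01, 40000  # terminal time n * h = 400
path = simulate_ou_vg(lam=1.0, nu=0.5, h=h, n=n, rng=rng)
gamma_hat, alpha_hat = staged_gqmle(path, h)
print("scale estimate:", gamma_hat, "drift estimate:", alpha_hat)
```

For coefficients without closed-form minimizers, each stage would instead be passed to a numerical optimizer; the staging still keeps the two optimizations separate and lower-dimensional.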
Remark 2.7.
Although our estimation method ignores the drift term in the first stage, its effect asymptotically vanishes. This is because the scale term dominates the small time behavior of in -sense. Specifically, we can derive
[TABLE]
for suitable functions and . Indeed, it has already been shown that the asymptotic behavior of the scale estimator constructed in our manner is the same as that of the conventional GQL estimator in the case of correctly specified ergodic diffusion models (cf. [47]) and ergodic Lévy driven SDE models (cf. [33]). Ignoring the drift in this way should help reduce the number of parameters optimized simultaneously; thus our estimator is expected to be numerically more stable and its computation less time-consuming. Moreover, by choosing appropriate functional forms, each estimation stage is reduced to a convex optimization problem. For example, if and are linear and log-linear with respect to the parameters, respectively, then the above argument holds. As for other candidates for their functional forms and details, see [33, Example 3.8].
Remark 2.8.
We defined the optimal parameter of as the argmax point of and ; these two functions are the probability limits of the Gaussian quasi-likelihoods and , respectively. Thus, and can be regarded as Kullback-Leibler (KL) divergence like quantities between the data-generating model and the parametric model . Here we first consider the correctly specified case, that is, there exists an element such that and for a.s. . Fix a positive constant . Then, it can readily be checked that for all , , and that both sides are equal when . Hence, by Assumption 2.5, and coincide with and , respectively. In other words, this asserts that the data-generating model certainly attains the minimization of and . Taking this insight into consideration, we can intuitively interpret the optimality of as the parameter value which yields the closest model to the data-generating model measured by the KL divergence like quantities and .
3. Main results
In this section, we state our main results only for the fully misspecified case, that is, neither of the true coefficients and belongs to the parametric family . Concerning the partly misspecified case (i.e. either of and is correctly specified), similar results can be derived as corollaries (see Remark 3.8). All of their proofs will be given in the Appendix.
The first result provides the tail probability estimates of the normalized , which are theoretically essential, for example, in the derivation of an information criterion, residual analysis, and the measurement of -prediction error.
Theorem 3.1.
Suppose that Assumptions 2.1-2.4 hold. Then, for any and , there exists a positive constant such that
[TABLE]
In the correctly specified case, such estimates are already shown in [33] under sufficient moment and regularity conditions and strong identifiability conditions, and this theorem extends the results to the misspecified case.
Before we state the asymptotic normality of , we roughly explain how the misspecification effect arises in its derivation process, and introduce the useful tool to deal with it. Except for term, each scaled quasi-score function can be decomposed as:
[TABLE]
where the misspecification effect term is expressed as:
[TABLE]
with a specific measurable function satisfying . Celebrated CLT-type theorems for such single functional integration of Markov processes have been reported widely in the literature, for example, [9, Theorem 2.1], [19, Theorem VIII 3.65], [21, Theorem 2.1], [50, Corollary 4.1], and the references therein. However, the combination with the stochastic integral makes it difficult to clarify the asymptotic behavior of the left-hand side. To handle this difficulty, we invoke the concept of the extended Poisson equation (EPE) introduced in [49]:
Definition 3.2.
[49, Definition 2.1] We say that a measurable function belongs to the domain of the extended generator of a càdlàg homogeneous Feller Markov process taking values in if there exists a measurable function such that the process
[TABLE]
is well defined and is a local martingale with respect to the natural filtration of and every measure . For such a pair , we write and .
Remark 3.3.
In the previous definition, the terminology “Feller” means that the corresponding transition semigroup is a mapping into . When it comes to , its homogeneous, Feller and (strong) Markov properties are guaranteed by the argument in [2, Theorem 6.4.6] and [27, 3.1.1 (ii)].
Remark 3.4.
When we consider misspecified ergodic diffusion models, we also encounter a troublesome integral term like (3.3). In that case, [46] utilized the theory of second order differential equations associated with the infinitesimal generator (cf. [39]) and Itô’s formula to derive the asymptotic normality of the GQMLE. However, in our case, the same method cannot be applied since the infinitesimal generator of contains the integro-operator with respect to the Lévy measure of , and it is difficult to verify the existence and regularity of the solution of the corresponding equation.
Hereinafter is referred to as the -th component of any vector . We consider the following EPEs:
[TABLE]
for the extended generator of , and . The right-hand side of each EPE corresponds to in (3.2), and it is trivial that they are identically 0 when the coefficients are correctly specified.
From now on, is referred to as the expectation operator with the initial condition , that is,
[TABLE]
for any measurable function . The next proposition ensures the existence of the solutions of (3.4) and (3.5) and verifies their weighted Hölder continuity:
Proposition 3.5.
Under Assumption 2.1-2.3, there exist unique solutions of (3.4) and (3.5), and the solution vectors and satisfy
[TABLE]
where any , , and some positive constants and . Furthermore,
[TABLE]
and
[TABLE]
are -martingale with respect to for every , and their explicit forms are given as follows:
[TABLE]
Remark 3.6.
Thanks to the result of the previous proposition and the assumptions on the coefficients,
[TABLE]
and
[TABLE]
have finite second-order moments. Thus, slightly refining the argument in [41, the proof of Proposition VII 1.6] with the monotone convergence theorem, the -martingale property of them with respect to can be replaced by the -martingale property with respect to in the previous proposition.
Building on the previous proposition, now we can obtain the asymptotic normality of :
Theorem 3.7.
Under Assumptions 2.1-2.4, there exists a nonnegative definite matrix such that
[TABLE]
and the form of is given by:
[TABLE]
Remark 3.8.
If either of the coefficients is correctly specified, the right-hand side of the associated EPE (3.4) or (3.5) is identically 0. Let and be the elements of and whose definitions are introduced in Remark 2.8. Then we have
[TABLE]
in the case that the scale coefficient is correctly specified and
[TABLE]
in the case that the drift coefficient is correctly specified.
Remark 3.9.
Let be a random variable which obeys . As a consequence of Theorem 3.1 and Theorem 3.7, we have
[TABLE]
for any polynomial growth function . It can be shown in the following way: For any , it follows from [13, Lemma 2.2.8] and Theorem 3.1 that
[TABLE]
Hence is asymptotically uniformly integrable from Markov’s inequality, and [48, Theorem 2.20] implies (3.6).
Remark 3.10.
In this remark, we suppose that the data-generating model defined on the probability space is
[TABLE]
where is a standard Wiener process independent of , and is a measurable function. We look at the following parametric model:
[TABLE]
where is a measurable function. Here other ingredients are defined similarly as above, and we use the same notation for its transition probability, invariant measure, and so on. When the true coefficients are correctly specified, the GQMLE still has asymptotic normality and the sufficient conditions for it are easy to check (cf. [29]). However, we note that it is difficult to give such conditions when they are misspecified. This is because our methodology using the martingale representation theorem becomes insufficient due to the presence of the Wiener component in the derivation of the asymptotic variance (see the proof of Theorem 3.7). To formally derive a result similar to Theorem 3.7, we may additionally have to impose the following condition:
Condition A: There exists a unique -solution on of
[TABLE]
where is a specific function satisfying
[TABLE]
Furthermore, the first and second derivatives of are of at most polynomial growth.
Under Condition A, the limit distribution of the GQMLE can be derived by combining the proof of [47] and Theorem 3.7. It is known that the theory of viscosity solutions for integro-differential equations ensures the existence of in limited situations; for instance, see [4], [5], [16] and [17]. However, it is not so for the regularity of . As another attempt to confirm Condition A, the associated EPE may possibly be helpful. This is because the existence and uniqueness of the solution of the EPE can be verified in an analogous way to Proposition 3.5, and if admits -property and growth conditions in Condition A, then satisfies (3.8). The latter argument can formally be shown as follows:
It is enough to check . Since is a martingale with respect to for all , we have
[TABLE]
Hence it follows from Itô’s formula that as ,
[TABLE]
In this sketch, we implicitly assume suitable regularity and moment conditions on each ingredient, but they are reduced to be conditions on the true coefficients . Thus, verifying the behavior of
[TABLE]
leads to Condition A. Just for Lévy driven Ornstein-Uhlenbeck models, we can observe the property of based on the explicit form of the solution (cf. Example 3.11). Although, for general Lévy driven SDEs, the gradient estimates of their transition probability making use of Malliavin calculus have been investigated lately (cf. [51], [52], and the references therein), the property of is still difficult to check as far as the author knows. Since these issues are beyond the scope of this paper, we do not pursue them further.
Example 3.11.
Here we consider the following Ornstein-Uhlenbeck model:
[TABLE]
for a Lévy process not necessarily being pure-jump type and a positive constant . Applying Itô’s formula to , we have
[TABLE]
and
[TABLE]
for a suitable function . Here is the probability distribution function of whose characteristic function is given by:
[TABLE]
for (cf. [43, Theorem 3.1]). In this case, fulfills Assumption 2.3 provided that Assumption 2.1-(1) holds, and that the Lévy measure of has a continuously differentiable positive density on an open neighborhood around the origin (for more details, see [29, Section 5]). Then, the characteristic function of the invariant measure is given by
[TABLE]
Under such condition, if is differentiable and itself and its derivative are of at most polynomial growth, we have
[TABLE]
for a positive constant K. We can derive similar estimates with respect to its higher-order derivatives in the same way.
Let be a Lévy process such that its moments of any order exist and its triplet is (cf. [2]). Here is allowed to be [math]. Mimicking the previous example, we write as the probability distribution function of for a positive constant and stands for below. Combining the argument in Remark 3.10 and Example 3.11, we obtain the following corollary:
Corollary 3.12.
For a natural number , let be a polynomial growth -function whose derivatives are of at most polynomial growth. Suppose that the integral of with respect to the Borel probability measure whose characteristic function is is [math], and that has a continuously differentiable positive density on an open neighborhood around the origin. Then, the function
[TABLE]
on is the unique solution of the following (first or second order) integro-differential equation
[TABLE]
and moreover, is also a polynomial growth -function.
Remark 3.13.
If the Lévy measure is symmetric (i.e. the imaginary part of is 0), the equation (3.10) is solvable for many odd functions as a matter of course. More specifically, for and , the solution is
[TABLE]
By observing the derivatives of the characteristic function, can be expressed by the moments of , hence the explicit expression of is available.
Remark 3.14.
Besides the estimation of , what is of special interest is the inference for which may often be an infinite dimensional parameter. Even for being constant and specified (i.e. is a Lévy process with drift), it may be of interest in its own right, and a large number of papers have addressed this problem so far. We refer to [31] for comprehensive accounts under being assumed to have a certain parametric structure. As for the situation where only little information on is available, one plausible approach is the method of moments proposed in [14], [15], and [37], for example. In particular, [37] established a Donsker-type functional limit theorem for empirical processes arising from Lévy processes observed at high frequency. When the coefficients and are nonlinear functions but specified, the residual based method of moments for by [34] is effective: using the GQMLE , we have
[TABLE]
for an appropriate -valued function and a matrix which can be constructed only from the observations. For instance, we can choose and (to estimate the -th cumulant of and the cumulant function of , respectively) as ; see [34, Assumption 2.7] for the precise conditions on . As for the misspecified case, if the misspecification is confined to the drift coefficient, then this scheme is still valid thanks to the faster diminishment of the mean activity in small time (cf. Remark 2.7).
4. Numerical experiments
We suppose that the data-generating model is the following Lévy driven Ornstein-Uhlenbeck model:
[TABLE]
and that the parametric model is described as:
[TABLE]
The functional form of the coefficients is the same as in [46, Example 3.1]. We conduct numerical experiments in three situations: (i) , (ii) , and (iii) . (normal inverse Gaussian) random variable is defined by the normal mean-variance mixture of inverse Gaussian random variable, and (bilateral Gamma) random variable is defined by the difference of two independent Gamma random variables. For their technical accounts, we refer to [6] and [22]. To visually observe their non-Gaussianity, each density function at is plotted together with the density of in Figure 1. By taking the limit of (3.9), the characteristic function of the invariant measure is given by
[TABLE]
where . Differentiating , we have for the -th cumulant (resp. ) of (resp. ). Hence we obtain
[TABLE]
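The cumulant relation behind this step can be written out in a generic form. The following is a hedged sketch in notation that is our own assumption rather than recovered from the source: a Lévy driven Ornstein-Uhlenbeck model $dX_t = -\lambda X_t\,dt + dJ_t$ with $\lambda > 0$, the characteristic exponent $\psi$ of $J_1$, the invariant measure $\pi$ with characteristic function $\widehat{\pi}$, and $\kappa_k(\cdot)$ for the $k$-th cumulant.

```latex
\log \widehat{\pi}(u)
  = \int_0^\infty \psi\!\left(e^{-\lambda s} u\right) ds,
\qquad
\kappa_k(\pi)
  = \kappa_k(J_1) \int_0^\infty e^{-k \lambda s}\, ds
  = \frac{\kappa_k(J_1)}{k \lambda},
\quad k \ge 1 .
```

The second identity follows by differentiating the first $k$ times at $u = 0$, since each derivative of $\psi(e^{-\lambda s}u)$ in $u$ contributes one factor $e^{-\lambda s}$. For instance, with unit-variance noise the invariant variance is $1/(2\lambda)$ under this assumed parametrization.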
By solving the estimating equations, the target optimal values are given by
[TABLE]
In the calculation, we used and . Thus, in each case, the optimal parameter is given as follows: (i) , (ii) , and (iii) (we write approximated values obtained by rounding off to four decimal places). Solving the corresponding estimating equations, our staged GQMLE are calculated as:
[TABLE]
We independently generated 10000 paths of each SDE by the Euler-Maruyama scheme and computed the estimators from the above expressions. To generate the small-time increments of the driving noises, we used the function rng provided in the YUIMA package in R [10]. Together with the diffusion case , the mean and standard deviation of each estimator are shown in Table 2, where and denote the sample size and observation interval, respectively. We also present their boxplots for better visibility. We can observe the following from the table and boxplots:
- Overall, the estimation accuracy of improves as and increase and decreases, and this tendency reflects our main result.
- The result of case (i) is almost the same as the diffusion case. This is likely due to the well-known fact that tends to in total variation norm as for any . Indeed, Figure 1 shows that the density functions of and are virtually the same.
- Concerning case (ii), the standard deviation of is larger than in the other cases. This is natural because the asymptotic variance of includes the fourth-order moment of , and has the highest kurtosis value, as can be seen from Figure 1.
- In case (iii), the performance of is the worst in this experiment. This may be caused by the fact that only is not symmetric.
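A Monte Carlo experiment of this type can be sketched in Python as follows. The sketch is only an illustration under stated assumptions and does not reproduce the models or settings above: the drift rate a, the symmetric bilateral Gamma parameters delta and gamma, the sample sizes, and the function names are all hypothetical, and the driving noise has constant scale, in which case the Gaussian quasi-likelihood for the drift reduces to least squares.

```python
import numpy as np

def simulate_ou_paths(a, h, n, n_paths, delta, gamma, rng):
    """Euler scheme for dX = -a X dt + dJ with J a symmetric bilateral Gamma process."""
    # Increment of J over a step h: difference of two independent
    # Gamma(delta * h, rate gamma) random variables.
    dJ = (rng.gamma(delta * h, 1.0 / gamma, size=(n, n_paths))
          - rng.gamma(delta * h, 1.0 / gamma, size=(n, n_paths)))
    X = np.empty((n + 1, n_paths))
    X[0] = 0.0
    for j in range(n):
        X[j + 1] = X[j] - a * X[j] * h + dJ[j]
    return X

def gql_drift_estimate(X, h):
    """Least-squares (Gaussian quasi-likelihood) estimate of a, one value per path."""
    dX = np.diff(X, axis=0)
    X_prev = X[:-1]
    return -np.sum(X_prev * dX, axis=0) / (h * np.sum(X_prev ** 2, axis=0))

rng = np.random.default_rng(0)
a_true, h, n, n_paths = 0.5, 0.02, 5000, 200    # hypothetical design, T = n * h = 100
X = simulate_ou_paths(a_true, h, n, n_paths, delta=1.0, gamma=np.sqrt(2.0), rng=rng)
a_hat = gql_drift_estimate(X, h)

# Summary in the style of the table: Monte Carlo mean and standard deviation.
print(f"mean = {a_hat.mean():.3f}, sd = {a_hat.std():.3f}")
```

Repeating the run with a longer horizon or a different jump distribution shows how the spread of the estimator reacts to the design and to the noise.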
5. Appendix
Throughout the proofs, for functions on , we will sometimes write and instead of and just for simplicity.
Proof of Theorem 3.1 In light of our situation, it is sufficient to check the conditions [A1”], [A4’] and [A6] in [54] for and , respectively. For the sake of convenience, we simply write and below. Without loss of generality, we can assume . First we treat . The conditions hold if we show
[TABLE]
for any . The first two derivatives of are given by
[TABLE]
We further decompose as
[TABLE]
Since the optimal parameter is in , the interchange of the derivative and the integral implies that the function is centered in the sense that its integral with respect to is 0. Thus [30, Lemma 4.3] and [34, Lemma 5.3] lead to (5.1) and (5.4). We also have
[TABLE]
Again applying [30, Lemma 4.3] and [34, Lemma 5.3], we obtain (5.2). Via simple calculation, the third and fourth-order derivatives of can be represented as
[TABLE]
with the matrix-valued functions and defined on , and these are of at most polynomial growth with respect to uniformly in . Hence (5.3) follows from Sobolev’s inequality (cf. [1, Theorem 1.4.2]). Thus [54, Theorem 3-(c)] leads to the tail probability estimates of . From Taylor’s expansion, we get
[TABLE]
Sobolev’s inequality leads to
[TABLE]
for . The last two terms on the right-hand side are finite by [34, Lemma 5.3], and the moment bounds of the three functions () can be obtained analogously. Thus, combined with the tail probability estimates of and Schwarz’s inequality, it suffices to show the conditions for
[TABLE]
instead of and , respectively. Since their estimates can be proved in a similar way to the first half, we omit the details.
To derive Proposition 3.5, we prepare the next lemma. For a metric on , we define the coupling distance between any two probability measures and by
[TABLE]
where denotes the set of all probability measures on with marginals and . is called the probabilistic Kantorovich-Rubinstein metric (or the first Wasserstein metric). The following assertion gives the exponential estimates of :
Lemma 5.1.
If Assumption 2.3 holds, then for any , there exists a positive constant such that for all ,
[TABLE]
Proof.
We introduce the following Lipschitz semi-norm for a suitable real-valued function on :
[TABLE]
From the Kantorovich-Rubinstein theorem (cf. [12, Theorem 11.8.2]) and Assumption 2.3, it follows that for all ,
[TABLE]
∎
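To give a feeling for what an exponential estimate in the coupling distance looks like, the following Python sketch tracks the first Wasserstein distance between the laws of an Euler scheme started from two different points. Everything in it is an assumption made for illustration: the metric is the Euclidean one on the real line (the metric used in Lemma 5.1 may differ), the drift is linear, the noise is a symmetric bilateral Gamma process with unit parameters, and empirical_w1 is a hypothetical helper.

```python
import numpy as np

def empirical_w1(x, y):
    """First Wasserstein distance between two equal-size empirical measures on R
    (Euclidean metric): the mean absolute difference of the order statistics."""
    return float(np.mean(np.abs(np.sort(x) - np.sort(y))))

rng = np.random.default_rng(1)
a, h = 0.5, 0.02                 # hypothetical drift rate and Euler step
n_steps, n_samples = 200, 20_000
x0, y0 = 3.0, -1.0               # two different starting points

def bgamma_increments(size):
    # Symmetric bilateral Gamma increment over a step h (unit shape and rate assumed).
    return rng.gamma(h, 1.0, size) - rng.gamma(h, 1.0, size)

X = np.full(n_samples, x0)
Y = np.full(n_samples, y0)
w1 = {}
for j in range(1, n_steps + 1):
    # The two particle clouds are driven by independent noises.
    X = X - a * X * h + bgamma_increments(n_samples)
    Y = Y - a * Y * h + bgamma_increments(n_samples)
    if j % 50 == 0:
        w1[j] = empirical_w1(X, Y)

for j, value in w1.items():
    # For this linear scheme the two laws differ by a shift of (1 - a*h)^j * (x0 - y0),
    # so the exact distance is |x0 - y0| * (1 - a*h)^j.
    print(j, round(value, 3), round(abs(x0 - y0) * (1 - a * h) ** j, 3))
```

In this linear toy case the decay rate can be read off directly; for nonlinear coefficients no such closed form is available and an assumption of the type used in the lemma takes its place.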
Proof of Proposition 3.5 It is enough to check the conditions of [49, Theorem 3.1.1 and Theorem 3.1.3] for . As was mentioned in the proof of Theorem 3.1,
[TABLE]
and
[TABLE]
are centered. In the following, we give the proof concerning and omit its index for simplicity. The regularity conditions on the coefficients imply that there exist positive constants and such that
[TABLE]
Making use of the trivial inequalities and for any , and , we have
[TABLE]
for any . Recall that we put in Assumption 2.3. The inequality (2.2) gives
[TABLE]
We write for abbreviation. Building on this estimate and the previous lemma, the conditions of [49, Theorem 3.1.1 and Theorem 3.1.3] are satisfied with
[TABLE]
where these symbols correspond to those used in [49]. As for , the conditions can be checked similarly. Hence the desired result follows.
To derive the asymptotic normality of , the following CLT-type theorem for stochastic integrals with respect to Poisson random measures will come into the picture:
Lemma 5.2.
Let be a Poisson random measure associated with a one-dimensional Lévy process defined on a stochastic basis whose Lévy measure is written as . Assume that a continuous vector-valued function on and a -predictable process satisfy:
- (1)
For all and ,
[TABLE]
and there exists a positive definite matrix such that
[TABLE]
as ;
- (2)
there exists such that
[TABLE]
as .
Then, for the associated compensated Poisson random measure , we have
[TABLE]
as .
Proof.
By the Cramér-Wold device, it suffices to consider the one-dimensional case. The proof is almost the same as that of [11, Theorem 14.5.I]. For notational brevity, we set
[TABLE]
Introduce a stopping time . Note that because is continuous. Define a random function by
[TABLE]
Applying Itô’s formula, we obtain
[TABLE]
For later use, we here present the following elementary inequality (cf. [13]): for all and ,
[TABLE]
By the definition of , we have . Since is an -martingale (cf. [2, Section 4]) from these estimates, the optional sampling theorem implies that
[TABLE]
Next we show that
[TABLE]
Again using the above estimates, we have
[TABLE]
where is a positive constant such that
[TABLE]
for all . At last we observe that . In view of Lenglart’s inequality and the isometry property of stochastic integral with respect to Poisson random measure (cf. [2, Section 4]), it suffices to show . However the latter convergence is clear from Assumption (1). Hence the proof is complete. ∎
Next we show the following lemma which gives the fundamental small time moment estimate of :
Lemma 5.3.
Under Assumptions 2.1-2.3, it follows that
[TABLE]
for any positive constant and .
Proof.
Recall that from Assumption 2.1. By Lipschitz continuity of the coefficients and [14, Theorem 1.1], it follows that
[TABLE]
Applying Burkholder-Davis-Gundy’s inequality (cf. [40, Theorem 48]), we have
[TABLE]
for the Poisson random measure associated with . Hence Gronwall’s inequality gives (5.6). ∎
Proof of Theorem 3.7 By the Cramér-Wold device, it is enough to show for . From estimates similar to those used in the proof of Theorem 3.1, we have
[TABLE]
We evaluate each term separately below. Rewriting in a stochastic integral form via Itô’s formula, we have
[TABLE]
for the compensated Poisson random measure associated with . Using Burkholder’s inequality and the isometry property, it follows that for a positive constant ,
[TABLE]
and that
[TABLE]
Hence
[TABLE]
Let us turn to observe . Let for , and especially, let . From Proposition 3.5, we obtain
[TABLE]
For abbreviation, we simply write
[TABLE]
According to Proposition 3.5, the weighted Hölder continuity of , and Lemma 5.3, turns out to be an -martingale. Thus the martingale representation theorem [19, Theorem III. 4. 34] implies that there exists a predictable process such that
[TABLE]
Hence the continuous martingale component of is 0. By the property of , we can define the stochastic integral on and this process is also an -martingale with respect to . Utilizing [19, Theorem I. 4. 52] and [40, Corollary II. 6. 3], we have
[TABLE]
Here denotes the quadratic variation of any semimartingale at time , and we used Burkholder’s inequality for a martingale difference in passing from the first line to the second. By estimates similar to the above, we have . Combining these arguments, we obtain
[TABLE]
We can deduce from Assumption 2.2 and Proposition 3.5 that there exist positive constants and such that for all
[TABLE]
and the last term is -integrable. Then, there exist positive constants and (possibly taking different values from the previous ones) such that for any ,
[TABLE]
Thus the dominated convergence theorem and the isometry property give
[TABLE]
It follows from Assumption 2.3 and Proposition 3.5 that
[TABLE]
From Taylor expansion around , is decomposed as:
[TABLE]
Sobolev’s inequality and the tail probability estimates of imply that the third term on the right-hand side is . Hence, arguing as in the first half, we obtain
[TABLE]
and we have
[TABLE]
From the isometry property and the trivial identity for any , it follows that
[TABLE]
Hence the moment estimates in the proof of Theorem 3.1, Lemma 5.2 and Taylor’s formula yield that
[TABLE]
To achieve the desired result, it suffices to show
[TABLE]
and
[TABLE]
However, the first two convergences follow directly from the proof of Theorem 3.1, and the last convergence follows from the ergodic theorem. Thus the proof is complete.
Acknowledgement
The author would like to thank Professor H. Masuda for his constructive comments. He also thanks Professor A. M. Kulik for his advice on the extended Poisson equation. Finally, he is grateful to the anonymous referees for their valuable and constructive comments. This work was supported by JST CREST Grant Number JPMJCR14D7, Japan.
References
- [1] R. A. Adams. Some integral inequalities with applications to the imbedding of Sobolev spaces defined over irregular domains. Trans. Amer. Math. Soc., 178:401–429, 1973.
- [2] D. Applebaum. Lévy processes and stochastic calculus, volume 116 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, second edition, 2009.
- [3] F. M. Bandi and T. H. Nguyen. On the functional estimation of jump-diffusion models. J. Econometrics, 116(1-2):293–328, 2003. Frontiers of financial econometrics and financial engineering.
- [4] G. Barles, R. Buckdahn, and E. Pardoux. Backward stochastic differential equations and integral-partial differential equations. Stochastics Stochastics Rep., 60(1-2):57–83, 1997.
- [5] G. Barles and C. Imbert. Second-order elliptic integro-differential equations: viscosity solutions’ theory revisited. Ann. Inst. H. Poincaré Anal. Non Linéaire, 25(3):567–585, 2008.
- [6] O. E. Barndorff-Nielsen. Processes of normal inverse Gaussian type. Finance Stoch., 2(1):41–68, 1998.
- [7] D. Belomestny and M. Reiß. Estimation and calibration of Lévy models via Fourier methods. In Lévy matters. IV, volume 2128 of Lecture Notes in Math., pages 1–76. Springer, Cham, 2015.
- [8] R. H. Berk. Limiting behavior of posterior distributions when the model is incorrect. Ann. Math. Statist., 37:51–58, 1966; correction, ibid., 37:745–746, 1966.
