Rate-optimal estimation of the Blumenthal-Getoor index of a Lévy process
Fabian Mies

TL;DR
This paper introduces a new estimator for the Blumenthal-Getoor index of Lévy processes that achieves the optimal convergence rate, improving upon existing methods especially when a diffusion component is present.
Contribution
The paper proposes a novel, rate-optimal estimator for the BG index and related parameters, applicable even with infinite variation jumps, using the generalized method of moments.
Findings
Estimator attains the optimal convergence rate.
Method effectively estimates parameters jointly.
Simulation shows superior finite sample performance.
| BG index | Sampling frequency | Volatility: GMM | Volatility: JT | BG index: GMM | BG index: Reiß | BG index: Bull |
|---|---|---|---|---|---|---|
| 1.3 | 5/23400 | 0.04 | 0.07 | 0.19 | 0.28 | 0.59 |
| 1.3 | 1/23400 | 0.02 | 0.03 | 0.13 | 0.17 | 0.37 |
| 1.3 | 0.2/23400 | 0.007 | 0.010 | 0.08 | 0.10 | 0.25 |
| 1.7 | 5/23400 | 0.32 | 0.43 | 0.23 | 0.22 | 0.31 |
| 1.7 | 1/23400 | 0.16 | 0.22 | 0.11 | 0.11 | 0.30 |
| 1.7 | 0.2/23400 | 0.08 | 0.10 | 0.06 | 0.06 | 0.25 |
Rate-optimal estimation of the Blumenthal–Getoor index of a Lévy process
Fabian Mies (RWTH Aachen University, Institute of Statistics)
Abstract
The Blumenthal–Getoor (BG) index characterizes the jump measure of an infinitely active Lévy process. It determines sample path properties and affects the behavior of various econometric procedures. If the process contains a diffusion term, existing estimators of the BG index based on high-frequency observations only achieve rates of convergence which are suboptimal by a polynomial factor. In this paper, a novel estimator for the BG index and the successive BG indices is presented, attaining the optimal rate of convergence. If an additional proportionality factor needs to be inferred, the proposed estimator is rate-optimal up to logarithmic factors. Furthermore, our method yields a new efficient volatility estimator which accounts for jumps of infinite variation. All parameters are estimated jointly by the generalized method of moments. A simulation study compares the finite sample behavior of the proposed estimators with competing methods from the financial econometrics literature.
**Keywords:** high-frequency; method of moments; jump activity; Fisher information; non-diagonal rate matrix; asymptotic distribution

**MSC 2000 subject classification:** primary 62M05; secondary 60G51
1 Introduction
Models for continuous-time stochastic processes with jumps have attracted increasing interest in the statistical literature, most prominently in financial econometrics, where they are used to model asset prices (Andersen et al., 2002; Christensen et al., 2014). The jump behavior of these processes can be broadly characterized in terms of the jump activity index, given by
[TABLE]
Here, denotes the size of a jump at time . If is a Lévy process, is also known as the Blumenthal-Getoor index (Blumenthal and Getoor, 1961). The index depends on the small jumps only, and for semimartingales, its range is . Various qualitative properties of the process can be expressed in terms of the jump activity index. If the process has only finitely many jumps in total, then , and if the jumps are of finite variation, we have . Conversely, implies jumps of finite variation. Furthermore, the value of has implications for various econometric procedures. For example, if the jumps are treated as a nuisance, jump-robust estimation of integrated volatility requires (Jacod and Reiss, 2014), as does the efficient drift estimator of Gloter et al. (2018). In these applications, a higher jump activity typically induces a non-negligible bias which cannot be easily corrected if the jumps are considered as a nuisance. Hence, highly active jumps need to be modeled more explicitly, as done by Amorino and Gloter (2018) for drift estimation, and by Jacod and Todorov (2014, 2016) for volatility estimation.
As the jump activity index is a central property of infinite activity jump models, it is natural to consider statistical estimation of its precise value. Recent interest in this topic was initiated by Aït-Sahalia and Jacod (2009), who study the estimation of based on discrete high-frequency observations , where is an Itô semimartingale with a non-vanishing diffusion component. They specify (1) more precisely by defining in terms of the spot jump compensator , assuming that as for a predictable process , and some . The statistical challenge is that, based on discrete observations at a given frequency, the small jumps can hardly be distinguished from the continuous diffusion movement. The solution of Aït-Sahalia and Jacod (2009) is to introduce a threshold sequence and consider
[TABLE]
If , the contribution of the diffusion towards the statistic will be negligible. The jump activity can be identified via the approximate scaling relation , and Aït-Sahalia and Jacod (2009) show that this approach yields an estimator of with rate of convergence . Replacing the indicator in (2) by a suitable smooth function, Jing et al. (2012) improve this rate to . So far, the best rates have been achieved by Reiß (2013) for the case that is a Lévy process, and by Bull (2016) for Itô semimartingales. Both authors construct estimators which converge at rate for arbitrary . In both cases, the precise form of the estimator depends on the desired rate defect .
In the considered high-frequency setting, the optimal rate of convergence for estimating is conjectured to be , up to logarithmic factors. This lower bound is justified by the results of Aït-Sahalia and Jacod (2012), who study the diagonal entries of the Fisher matrix of a fully parametric submodel consisting of the sum of a Brownian motion and a symmetric -stable Lévy motion. A matching LAN result is not available since the off-diagonal entries have not been studied. This lower bound is discussed in Section 3. It should be highlighted that the achievable rate of convergence for estimating depends on whether the process contains a non-vanishing diffusion component. If we consider a pure-jump Itô semimartingale, the jump activity index can be estimated at rate based on high-frequency observations (Todorov, 2015).
Although the estimators of Reiß (2013) and Bull (2016) almost achieve the optimal rate of convergence, there is so far no procedure which attains the lower bound, even in the case where is a Lévy process. This issue has also been formulated as an open problem by Reiß (2013). In this paper, we propose a new estimator of for the Lévy case. If only is unknown, the estimator achieves the optimal rate of convergence, matching the lower bound of Aït-Sahalia and Jacod (2012). If an additional proportionality factor needs to be estimated, our estimator is rate-optimal up to a factor of for both and . Furthermore, we show that the diagonally rescaled Fisher matrix in the submodel considered by Aït-Sahalia and Jacod (2012) is asymptotically singular for the combined parameter , and hence we conjecture that our rate of convergence is in fact optimal. Our procedure also yields an efficient estimator of the volatility of the diffusion component of in the presence of jumps of infinite variation. Under analogous conditions on the jump behavior, Jacod and Todorov (2014, 2016) have derived a different efficient estimator of volatility which is robust to highly active jumps. Hence, our estimator is an alternative to the method of Jacod and Todorov (2014), although the latter is valid for Itô semimartingales and we restrict our attention to Lévy processes. The proposed estimator is based on the generalized method of moments, and we estimate the jump and the diffusion parameters jointly in a single step as the solution of a system of estimating equations.
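The scaling idea behind threshold statistics of this kind can be illustrated numerically. The following is a minimal sketch, not the estimator of Aït-Sahalia and Jacod (2009) itself: it counts the increments exceeding two thresholds `u` and `k*u` and converts the ratio of the counts into an activity estimate, under the assumption that the count behaves like a power `u**(-alpha)`. The function names, the factor `k`, and the toy data are hypothetical.

```python
import math

def exceedance_count(increments, u):
    """Number of increments whose absolute value exceeds the threshold u."""
    return sum(1 for x in increments if abs(x) > u)

def activity_from_counts(increments, u, k=2.0):
    """Ratio estimate of the jump activity index.

    Assumes count(u) is roughly proportional to u**(-alpha), so that
    count(u) / count(k*u) is roughly k**alpha.
    """
    n_small = exceedance_count(increments, u)
    n_large = exceedance_count(increments, k * u)
    if n_large == 0:
        raise ValueError("no increment exceeds k*u; choose a smaller threshold")
    return math.log(n_small / n_large) / math.log(k)

# Toy data: 8 increments exceed u = 1, and 2 of them also exceed 2*u = 2.
toy = [1.1, -1.2, 1.3, -1.4, 1.5, -1.6, 2.5, -3.0, 0.1, -0.2]
alpha_hat = activity_from_counts(toy, u=1.0, k=2.0)  # log(8/2) / log(2)
```

In practice the threshold has to be large relative to the diffusion increments, which is exactly the tension that limits the rate of such estimators.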
Our model allows for an asymmetric behavior of the small jumps. In particular, for a Lévy process with characteristic triplet , we suppose that the Lévy measure is locally stable in the sense that, for close to [math],
[TABLE]
Here, is a natural number, , , and the are the successive Blumenthal-Getoor indices, as introduced by Aït-Sahalia and Jacod (2012). The approximation in (3) will be made precise in the sequel. In particular, the BG index of will be . We construct an estimator for the parameter vector consisting of the volatility , the indices , and the proportionality factors .
The remainder of this paper is structured as follows. In Section 2, we present our model and the proposed estimator. A central limit theorem is given, establishing the rate . The rate of convergence and related lower bounds are discussed in Section 3. By means of a simulation study (Section 4), we compare the finite sample properties of our method with the jump activity estimators of Bull (2016) and Reiß (2013) and the volatility estimator of Jacod and Todorov (2014). All technical results, which might be of independent interest, are outlined in Section 5.1, and the detailed proofs are gathered in Section 5.2.
1.1 Notation
For two real numbers , we denote , . The indicator function of a set is denoted as . For a function , denotes the partial derivative w.r.t. , and for a function with , the gradient matrix is denoted by . For , is the ball around [math] with radius in , where is evident from the context. denotes the identity matrix. The multivariate normal distribution with covariance matrix and mean [math] is denoted as , and denotes weak convergence of probability measures resp. random elements. The expectation operator is , and dependence upon a parameter is denoted as .
2 Model and estimator
Consider a univariate Lévy process , , with characteristic triplet for a drift parameter , volatility parameter , and a Lévy measure , i.e. . We choose an odd truncation function such that and for . Then admits the Lévy-Itô decomposition
[TABLE]
where is a Poisson point process with intensity measure , and is a standard Brownian motion, independent of . The value of depends on the choice of the truncation function , but for our purposes, it will turn out that is negligible anyway. To make the approximation (3) precise, we suppose that
[TABLE]
for some and . The approximating measure is given by the Lebesgue density
[TABLE]
for some natural number and parameters , and . The remainder term in (5) is treated as a nuisance. In particular, this remainder may still consist of infinite activity jumps. Our main result will require , such that the nuisance jumps are in a sense less active than the Lévy measure and asymptotically negligible. The parameters of the modeled part are summarized as
[TABLE]
where contains all parameter vectors as specified, such that additionally
[TABLE]
The value is of central importance. In particular, we need to impose the lower bound to ensure identifiability of the full parameter vector , see Aït-Sahalia and Jacod (2012). Note that the definition (6) is the same as given by Jacod and Todorov (2016) for the symmetric case.
In the high-frequency sampling setting considered here, we are given observations , with observation frequency such that is constant. Without loss of generality, let and . Equivalently, we observe the increments , which constitute a triangular array of random variables with iid rows. The law of is not fully described by the parameters due to the remainder in (5). Hence, we approximate it by a fully specified Lévy process with characteristic triplet . The process may be represented as
[TABLE]
where , , are independent Lévy processes, is a standard Brownian motion, and the are skewed -stable processes with Lévy measure .
We propose to estimate the parameter via the method of moments. In particular, we choose functions , , and a suitable scaling factor , and define to be a solution of the equation
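Increments of an approximating process of this type can be simulated exactly by self-similarity. The sketch below is an illustration under simplifying assumptions that are not taken from the paper: a single symmetric stable component without skewness, the Chambers-Mallows-Stuck sampler, and a scale convention for the stable part that need not match the standardization used here. All names (`symmetric_stable`, `simulate_increments`, `scale`) are hypothetical.

```python
import math
import random

def symmetric_stable(alpha, rng):
    """One standard symmetric alpha-stable variate (Chambers-Mallows-Stuck)."""
    v = rng.uniform(-math.pi / 2, math.pi / 2)
    w = rng.expovariate(1.0)
    if alpha == 1.0:
        return math.tan(v)  # Cauchy case
    return (math.sin(alpha * v) / math.cos(v) ** (1.0 / alpha)
            * (math.cos((1.0 - alpha) * v) / w) ** ((1.0 - alpha) / alpha))

def simulate_increments(n, h, sigma, alpha, scale, seed=0):
    """n iid increments of sigma*W + scale*S over a time step h.

    Uses self-similarity: W_h ~ sqrt(h) * N(0, 1) and S_h ~ h**(1/alpha) * S_1.
    """
    rng = random.Random(seed)
    return [sigma * math.sqrt(h) * rng.gauss(0.0, 1.0)
            + scale * h ** (1.0 / alpha) * symmetric_stable(alpha, rng)
            for _ in range(n)]

increments = simulate_increments(n=1000, h=1.0 / 1000, sigma=1.0, alpha=1.3, scale=1.0)
```

Because the Brownian part scales like `h**0.5` and the stable part like `h**(1/alpha)` with `alpha < 2`, the jump contribution to a typical increment is of smaller order, which is why small jumps are hard to separate from the diffusion.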
[TABLE]
Here and in the following, denotes the expectation such that is determined by the parameter vector . Since is a fully parametric approximation of , the function can be computed numerically, such that is a feasible estimator. To distinguish a generic parameter value from the parameters governing , we denote by the true parameter such that (5) holds.
To study the limit of , we employ the standard framework for estimating equations as reviewed by Jacod and Sørensen (2018). Under the assumptions imposed below, we show that , up to negligible terms. In order for to have good asymptotic properties, the choices of the moment functions and the scaling factor are crucial. In particular, to derive a central limit theorem for (see Lemma 5.4), we need to control the sampling variance in (8) as well as the bias incurred by approximating by . Furthermore, the asymptotic behavior of as needs to be treated (see Lemma 5.5). To this end, the following properties turn out to be sufficient.
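To illustrate how an estimating equation of this type is solved, the following toy sketch matches one sample moment to its model counterpart. It is deliberately much simpler than the estimator described here: a single unknown parameter, a pure Gaussian model without jumps, and the hypothetical moment function `cos`, whose model expectation is known in closed form. The names `model_moment`, `sample_moment`, and `solve_moment_equation` are illustrative only.

```python
import math
import random

def model_moment(sigma2, h, u):
    """E cos(X/u) for X ~ N(0, sigma2*h), from the Gaussian characteristic function."""
    return math.exp(-sigma2 * h / (2.0 * u * u))

def sample_moment(increments, u):
    """Empirical counterpart: average of cos(x/u) over the observed increments."""
    return sum(math.cos(x / u) for x in increments) / len(increments)

def solve_moment_equation(increments, h, u, lo=1e-8, hi=100.0, tol=1e-12):
    """Bisection for sigma2 such that model_moment(sigma2) equals the sample moment.

    model_moment is strictly decreasing in sigma2, so the root is unique
    whenever it is bracketed by [lo, hi].
    """
    target = sample_moment(increments, u)
    if not (model_moment(hi, h, u) <= target <= model_moment(lo, h, u)):
        raise ValueError("root not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if model_moment(mid, h, u) > target:
            lo = mid  # model moment still too large: sigma2 must increase
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = random.Random(42)
h, true_sigma2 = 1e-3, 2.0
u = math.sqrt(h)  # scaling factor of the same order as a diffusion increment
data = [math.sqrt(true_sigma2 * h) * rng.gauss(0.0, 1.0) for _ in range(20000)]
sigma2_hat = solve_moment_equation(data, h, u)
```

With several parameters, the scalar bisection would be replaced by a multivariate root finder, and the model moments would have to be evaluated numerically rather than in closed form.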
**Condition (F1).**
For , the functions satisfy for , and .
The smoothness imposed by Condition (F1) is used to bound the bias incurred by approximating by , see Corollary 5.3 below. To control the sampling variance, we do not only require smoothness of the employed moment functions, but they further need to be of a specific shape.
**Condition (F2).**
The function is symmetric and satisfies . The functions , , are identically zero on the interval for some .
Additional identifiability conditions are specified in assumption (I) below. The first moment function is approximately quadratic near zero, and will serve to identify the volatility . The functions are smooth thresholds, which distinguish the diffusion from the jump component. An example of suitable moment functions is given in Section 4. To ensure that the threshold is effective, we require that in probability, i.e. . By choosing an appropriate scaling sequence as follows, the moments , , will be dominated by the jump component.
**Condition (U).**
such that for some .
Although potentially not sharp, the upper bound on the factor is required to derive our asymptotic result. For details, see the technical Lemma 5.1 below and the subsequent discussion. When choosing in accordance with condition (U), it suffices to use a reasonable upper bound on . Furthermore, the simulation results presented in Section 4 show that larger values of also perform well in finite samples.
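A function of the kind required by Condition (F2), identically zero near the origin yet smooth, can be built from the classical `exp(-1/t)` construction. The sketch below is a hypothetical example and not the set of moment functions used in the simulation study; the cutoff `eta` and the names `smooth_step` and `smooth_threshold` are assumptions.

```python
import math

def smooth_step(t):
    """Infinitely differentiable: 0 for t <= 0, increasing towards 1 as t grows."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def smooth_threshold(x, eta=1.0):
    """Symmetric, bounded, smooth function that is identically zero on [-eta, eta]."""
    return smooth_step(abs(x) - eta)

values = [smooth_threshold(x) for x in (-3.0, -1.0, 0.0, 0.5, 1.0, 2.0)]
```

Applied to rescaled increments, such a function ignores everything below the cutoff, so that its sample average is driven by the larger, jump-dominated increments.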
To formulate our main result on the asymptotic behavior of , we introduce the quantities
[TABLE]
which exist if . Furthermore, we introduce the matrices
[TABLE]
and the matrix , given by
[TABLE]
and for , ,
[TABLE]
These derivatives exist because are finite. Finally, we introduce the symmetric positive semidefinite matrix given by
[TABLE]
If clear from the context, we will omit the dependence on . Using this notation, we can formulate the remaining identifiability condition.
**Condition (I).**
For the true parameter , is regular.
**Remark 1.**
Analyzing the degrees of freedom of the equation suggests that condition (I) is, in fact, the generic case. To demonstrate this point, we construct a set of moment functions satisfying the identifiability condition. Consider the case with and , . We can construct a set of moment functions satisfying condition (I) as follows. Let and be symmetric functions satisfying condition (F1) such that , and vanishes on . Furthermore, denote , and . We set , and . Note that , as well as , and . Then one can check that
[TABLE]
with determinant . Hence, is regular for and all if is chosen such that . This is in particular the case for the choice of the moment functions for the simulation study in Section 4.
The main result of this paper is the consistency and asymptotic normality of , as summarized by the following theorem.
**Theorem 2.1.**
Let be a Lévy process satisfying (5) with some , and parameter vector . Let satisfy assumptions (F1) and (F2), and be such that is regular, and let be chosen according to (U). Then there exists a sequence of random vectors solving (8), such that in probability as . This sequence is eventually unique, and, as ,
[TABLE]
The resulting rate of convergence for the BG index is thus found to be , which improves upon existing estimators and matches the lower bound of Aït-Sahalia and Jacod (2012) up to logarithmic factors. However, the rate matrix of Theorem 2.1 is non-diagonal. The phenomenon of a non-diagonal rate matrix has also been observed in the pure jump case, i.e. , see Brouste and Masuda (2018). We further discuss this aspect and the resulting marginal rates of convergence for and in the next section. Nevertheless, the matrices , , and are block-diagonal, such that the volatility estimator is asymptotically independent of the estimator of the jump part.
The presented central limit theorem also holds for the fully specified case without nuisance, i.e. in (5). Even in this parametric case, we find that a simple GMM estimator based on fixed moment functions, corresponding to , will not achieve the best rate of convergence. A careful construction of the estimating equation (8) is thus not only required to handle the nuisance term, but also for the underlying parametric problem itself.
The proposed estimator for can be contrasted with existing methods in the literature. In an earlier study, Reiß (2013) suggests a test procedure for the value of based on a statistic with tuning parameter . Therein, it is established that as at rate , and as . By inverting the function , this approach yields a near-optimal estimator for . The statistics are constructed based on nonlinear sample moments as in (8), where the are linear combinations of trigonometric functions, i.e. . Choosing the weights carefully such that for , Reiß (2013) is able to reduce the variance of the corresponding sample moments. The arbitrarily small defect in the rate of convergence derived therein is thus due to the sampling variance. In contrast, by choosing the moment functions to vanish near zero according to Condition (F2), we obtain a smaller variance of the sample moments.
An alternative estimator achieving the rate is presented by Bull (2016), which also uses functions which vanish near zero. Therein, the value is approximated by a finite series expansion, and extending this expansion reduces the rate defect . In contrast, we use the approximation . Although the latter value is not available in explicit form and needs to be determined numerically, this approach allows us to decrease the bias of the estimating equation further than by any finite series expansion. In particular, we only incur a bias due to approximating the Lévy measure of , but not due to a discretization of the time evolution of the process. Thus, our method effectively circumvents the variance issue of Reiß (2013) and the bias issue of Bull (2016). This allows us to eliminate the polynomial rate defect and achieve a faster rate of convergence.
3 Asymptotic optimality
It is natural to ask whether our proposed estimator is asymptotically optimal. From Theorem 2.1, we find that
[TABLE]
which matches the optimal estimator in the situation without jumps. That is, is efficient. In general, jumps of infinite variation reduce the achievable rate of convergence for volatility estimators (Jacod and Reiss, 2014). Here, we are able to recover efficiency by modeling the infinite variation part of the jump measure explicitly via (5). The same methodology has been applied by Jacod and Todorov (2014, 2016) to construct an efficient estimator of . Note that the latter studies treat more general types of semimartingales, while we only derived a result for Lévy processes. In contrast to the existing estimators, which use a multi-step debiasing procedure, we determine by a single set of estimating equations. While our approach is conceptually simple, solving the estimating equations (8) is computationally expensive. A comparison of the finite sample performance is presented in Section 4.
As the asymptotic variance of the estimators and depends on the choice of , they cannot be expected to be variance efficient. Furthermore, they are coupled via and via the matrix , which is in general dense. Inspecting the limit in Theorem 2.1, we find that
[TABLE]
To assess these rates of convergence, we may compare with the lower bound of Aït-Sahalia and Jacod (2012). Therein, the authors compute the diagonal terms of the Fisher information based on observations of for the symmetric case and . Their analysis of the diagonal entries and suggests that an asymptotically optimal estimator should satisfy
[TABLE]
Notably, even for , the rates (11) are faster than (10) by a logarithmic factor.
This difference could potentially be explained by the neglected off-diagonal terms of . A similar phenomenon occurs in the pure jump case , , where for any sequence of diagonal matrices , the limit of is singular, see Masuda (2015, Thm. 3.4) and Aït-Sahalia and Jacod (2008, Thm. 2). Recently, Brouste and Masuda (2018) studied this case, and established the LAN property with a non-diagonal rescaling matrix . They find that the optimal rate of convergence is slower than suggested by the diagonal entries of the Fisher matrix, by a factor of . A similar phenomenon is observed when estimating the Hurst parameter of a fractional Brownian motion based on high-frequency observations (Brouste and Fukasawa, 2018). There is no LAN result available for estimation of the BG index in the case , and a full investigation of the LAN property in the present case is beyond the scope of this paper. Nevertheless, we can adapt the proof of Aït-Sahalia and Jacod (2012) to unveil the off-diagonal entries . It turns out that the diagonally rescaled Fisher matrix is asymptotically singular, just as in the pure-jump case.
**Proposition 3.1.**
Let denote the Fisher information matrix of with and , . Then, as ,
[TABLE]
In particular, the limiting matrix is singular.
The diagonal entries of the Fisher information matrix should match the optimal rates of convergence in the case where only a single parameter is unknown, e.g. if are known and should be estimated. In this situation, a natural version of our estimator is to consider only a single moment function . Analogous to (8), for any , we may estimate as the solution of
[TABLE]
With a slight abuse of notation, we may also estimate by the equation . To distinguish jumps and diffusion, we suppose satisfies the same conditions as , i.e. it should vanish around zero.
**Proposition 3.2.**
Let be a Lévy process satisfying (5) with some , and parameter vector . Let be a non-negative function satisfying (F1), and for , and choose such that (U) holds. Fix some , and suppose that . Then there exists a consistent sequence of estimators satisfying , such that in probability as , and
[TABLE]
Under the same conditions, and if all parameters except for resp. are known, there exists a consistent sequence of estimators solving such that, as ,
[TABLE]
Since is of order , Proposition 3.2 establishes precisely the rates (11). In the setting of Aït-Sahalia and Jacod (2012), in particular , this shows that resp. are rate efficient if the remaining parameters are known. In contrast, if all parameters are unknown, achieves the optimal rate of convergence, up to a logarithmic factor. Due to the singularity of the Fisher matrix, we conjecture that the achieved rates (10) are in fact optimal.
4 Simulation study
By means of a Monte Carlo study, we compare the finite sample performance of our estimator with the estimators of Reiß (2013) and Bull (2016) for the Blumenthal-Getoor index , and with the volatility estimator of Jacod and Todorov (2014). To this end, we sample paths of a Lévy process given by
[TABLE]
We denote by the -stable Lévy motion with skewness parameter . That is, the characteristic function of is given by (see e.g. Zolotarev (1986))
[TABLE]
The Lévy measure corresponding to this standardization can be expressed in the form (6) with , , and if . Here, we will set and study the cases and . Then (5) is satisfied with , such that is a nuisance term, and . In view of applications in financial econometrics, we consider the time horizon , and sampling frequencies , , and . These sampling schemes correspond to resp. resp. seconds per quote on a trading day of hours.
To determine the solution of the estimating equation (8), we need to compute the moments and their gradients. This can be done numerically by means of a continuous Fourier transform since is available in closed form. The employed moment functions are handcrafted to satisfy (F1) and (F2). In our simulations, we use
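The correspondence between step size and seconds per quote is simple arithmetic. The snippet below is a small sketch that takes the three step sizes appearing in the table of simulation results (5/23400, 1/23400 and 0.2/23400) and assumes that time is measured in trading days of 23400 seconds; the variable names are illustrative.

```python
SECONDS_PER_DAY = 23400  # one trading day, assumed to be the unit of time

hours_per_day = SECONDS_PER_DAY / 3600  # 6.5 hours

grid = []
for seconds_per_quote in (5, 1, 0.2):
    h = seconds_per_quote / SECONDS_PER_DAY          # step size in days
    n = round(SECONDS_PER_DAY / seconds_per_quote)   # observations per day
    grid.append((seconds_per_quote, h, n))
    print(f"{seconds_per_quote:>4} s per quote: h = {h:.3e}, n = {n}")
```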
[TABLE]
Note that vanish on . We use the rescaling factor . Although this choice of is too large to comply with assumption (U), we found it to perform better than smaller values for the given sampling scenario.
The methods of Reiß (2013) and Bull (2016) each have a tuning parameter , and larger values of increase the rate of convergence. However, smaller values of can be superior in finite samples. In our simulations, we found that the estimator of Bull performed best when setting , and the estimator of Reiß performed best when setting , across all observation frequencies. Furthermore, the method of Reiß involves a rescaling parameter and two weighting measures , . We choose the weighting measure to be supported on the set , and to be supported on the set . The truncation parameter is set to , as suggested by equation (3.8) therein.
In Table 1, we compare the simulated performance of our moment estimator for and with the estimators of Jacod and Todorov (2014), Reiß (2013), and Bull (2016). For the latter two, we choose the best tuning parameter as specified above. The estimator of Jacod and Todorov (2014) is implemented as in equation (5.3) therein, with and . It is found that the new estimators perform best in the considered setting . The good performance of the estimator of Reiß in the case is somewhat surprising, since the analysis of Reiß (2013) only yields a suboptimal rate of convergence. However, for the latter estimator, no central limit theorem is available. Hence, it is possible that the estimator in fact converges at a rate which is faster than the rate derived by Reiß (2013). It should also be noted that all benchmarked methods require various tuning parameters. Most notably, all methods require some form of scaling factors. Furthermore, our new estimator depends on the employed moment functions , and the estimator of Bull (2016) requires the choice of a truncation kernel function. It is thus possible that a very careful choice of these parameters might affect the ranking implied by Table 1.
The volatility estimator is efficient, and from (9), the error should be of order . From the results of Table 1, we find that this asymptotic performance is not achieved for the considered sample sizes. This defect holds for our proposed estimator as well as for the benchmark method of Jacod and Todorov (2014), and it is bigger for large values of . This is potentially due to the relatively large jump component of the simulated process (13). On the other hand, the asymptotic distribution of Theorem 2.1 yields a good approximation of the finite sample behavior of , as shown in Figure 1. Clearly, the match with the asymptotic normal distribution improves for smaller . Furthermore, the approximation is better for the smaller value .
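The Fourier-based computation of model moments mentioned above can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used for the simulations: the law is symmetric with an assumed characteristic function `exp(-s2*t**2/2 - c*abs(t)**alpha)`, and the test function is Gaussian-shaped, chosen only because its Fourier transform is explicit (it does not vanish near zero). The names `char_fn`, `f_hat`, and `moment_via_fourier` are hypothetical.

```python
import math

def char_fn(t, s2, c, alpha):
    """Assumed characteristic function: Gaussian part plus symmetric stable part."""
    return math.exp(-0.5 * s2 * t * t - c * abs(t) ** alpha)

def f_hat(t):
    """Fourier transform of f(x) = exp(-x^2/2): integral of f(x) * exp(-i*t*x) dx."""
    return math.sqrt(2.0 * math.pi) * math.exp(-0.5 * t * t)

def moment_via_fourier(s2, c, alpha, t_max=12.0, steps=4800):
    """E f(Z) = (1 / (2*pi)) * integral of f_hat(t) * char_fn(t) dt (trapezoid rule)."""
    dt = 2.0 * t_max / steps
    total = 0.0
    for i in range(steps + 1):
        t = -t_max + i * dt
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * f_hat(t) * char_fn(t, s2, c, alpha)
    return total * dt / (2.0 * math.pi)

gaussian_only = moment_via_fourier(s2=0.5, c=0.0, alpha=1.3)  # closed form: 1/sqrt(1.5)
with_jumps = moment_via_fourier(s2=0.5, c=0.2, alpha=1.3)
```

The Gaussian-only case provides a check against the closed form `1/sqrt(1 + s2)`. Gradients with respect to the parameters could be obtained in the same way by differentiating the characteristic function under the integral.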
5 Technical tools
In this section, we present the proofs of Theorem 2.1 and Propositions 3.1 and 3.2. Preliminary technical results are presented in Subsection 5.1, as they might be of independent interest, in particular Lemma 5.1 and Corollary 5.3. The detailed proofs are presented in Subsection 5.2.
5.1 Preliminary results
To study the asymptotic behavior of the estimating equation (8) by standard techniques (see e.g. Jacod and Sørensen (2018)), we need
- a central limit theorem for the term , and
- properties of the derivatives .
To determine asymptotic variances, as well as for some technical steps of the following proofs, it is useful to derive some explicit approximations of .
**Lemma 5.1.**
Let be such that and are bounded and , and let be a Lévy process with characteristic triplet . The implicit constants in the following expressions depend on and , but neither on nor on . Moreover, all and terms are bounded resp. vanishing uniformly on compacts in .
- (i) If for , then for any such that , as ,
[TABLE]
where
[TABLE]
- (ii) If, alternatively, but , then for any
[TABLE]
- (iii) If but , and are bounded, then for any
[TABLE]
- (iv) If and , , then there exists a constant bounded uniformly on compacts, such that for all and all , ,
[TABLE]
The case (i), which is exploited in the proofs several times, imposes a subtle upper bound on . Although this bound need not be sharp, the Lemma will not hold for if is too large. To make this plausible, note that for an -stable process , the probability tends to zero as , roughly polynomially in . On the other hand, for the Brownian motion, polynomially as well, but the polynomial order of this decay will depend on the specific value of . For the jump term to dominate, as in case (i) of Lemma 5.1, must be small. The uniformity w.r.t. of the previous results will be used later on to derive the consistency of the estimator.
Another ingredient to obtain a central limit theorem is a bias bound, i.e. a bound on the error of approximating by . For two random variables and , recall the definition of the 1-Wasserstein metric and the total variation distance given by
[TABLE]
where the supremum is taken over all bounded resp. Lipschitz continuous, measurable functions . These distances are used in the proof of the following Lemma, which quantifies the error of approximation implied by the local stability assumption (5).
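For intuition about the 1-Wasserstein metric, recall that for two empirical distributions on the real line with the same number of atoms it reduces to the mean absolute difference of the sorted samples. The helper below is an illustrative sketch and plays no role in the proofs; the name `wasserstein1_empirical` is hypothetical.

```python
def wasserstein1_empirical(xs, ys):
    """1-Wasserstein distance between two equally sized empirical distributions on R."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    # On the real line, the optimal coupling matches order statistics.
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

# Two samples that differ by a shift of 0.5 after sorting.
shift = wasserstein1_empirical([0.0, 1.0, 2.0], [2.5, 0.5, 1.5])  # 0.5
```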
**Lemma 5.2.**
Let be two Lévy processes with characteristic triplets and , respectively. Suppose furthermore that for some ,
[TABLE]
There exists a constant depending on , , and , such that for any differentiable function , and any ,
[TABLE]
where . The constant is bounded on compacts in , , and .
**Corollary 5.3.**
Let such that are bounded and . Let be two Lévy processes with characteristic triplets and , respectively. Suppose that , satisfy the conditions of Lemma 5.2. Then, as ,
[TABLE]
The constant is bounded on compacts in , , , and .
Note that the result of Corollary 5.3 cannot be directly formulated in terms of or , distinguishing it from the results of Mariucci and Reiß (2018). An alternative bound on the total variation distance between and is presented by Clément and Gloter (2018, Proposition 4) and Amorino and Gloter (2019, Proposition 2), stating that as . Their assumptions on the Lévy measure imply that our condition (5) holds, with . Thus, if and , our bound (16) is sharper since . In the case , our bound is of the same order of magnitude as the one presented by Clément and Gloter (2018) and Amorino and Gloter (2019). Furthermore, our result may also be applied in the case . However, we impose additional smoothness assumptions upon the considered function , which is suitable for our statistical purposes because the moment functions are chosen by the statistician.
To state the remaining technical results, introduce the notation
[TABLE]
such that
[TABLE]
Corollary 5.3 and Lemma 5.1 allow us to derive the following central limit theorem for the estimated moments. In particular, we use Lemma 5.1 to control the sampling variance, and Corollary 5.3 to control the bias.
**Lemma 5.4.**
Let constant, i.e. , and choose according to (U). Let satisfy (F1) and (F2), and suppose that the Lévy process satisfies (5) with some . Then, as ,
[TABLE]
Note that the rate of convergence for the first moment is slower than for . This is due to our special choice of , which vanish near zero. Hence, these moments are primarily driven by the jump component, which is of a smaller order than the diffusion term. On the other hand, the jump parameters are harder to identify, i.e. . This is established in the following Lemma.
**Lemma 5.5.**
Let be such that are bounded. Let be a Lévy process with characteristic triplet , parameterized by as in (7). Then, as , , such that ,
[TABLE]
and,
[TABLE]
Moreover, if vanishes on and satisfies Condition (U),
[TABLE]
All terms of the form and are bounded resp. vanishing uniformly on compacts in .
**Corollary 5.6.**
Let satisfy (F1) and (F2), and let be a Lévy process with characteristic triplet , parameterized by as in (7). Then, as , , such that ,
[TABLE]
This convergence holds uniformly on compacts in .
These results allow us to establish the consistency of . We do not consider global uniqueness of the solution of the estimating equation (8). Hence, we only obtain the existence of a consistent sequence of random variables satisfying the equation.
**Lemma 5.7 (Consistency).**
Let be a Lévy process satisfying (5) with some , and parameter vector . Let satisfy assumptions (F1), (F2), and (I), and let be chosen according to (U). There exists a sequence of random vectors solving (8), such that in probability as . This sequence is eventually unique, i.e. for any other consistent sequence solving the estimating equation, it holds .
To obtain a central limit theorem for , we may apply a Taylor expansion to obtain the representation
[TABLE]
where for some on the line segment between and , for . This standard approach allows us to establish Theorem 2.1, as detailed in Subsection 5.2.
5.2 Proofs
Proof of Lemma 5.1.
At the price of changing the term , we may assume w.l.o.g. that . In view of the Lévy-Itô decomposition (4), we write
[TABLE]
where is a Poisson counting measure with intensity , and denotes the corresponding integral term. The explicit form of allows for computation of , as
[TABLE]
The term is added to cover the case . This bound on will be used in the sequel.
To derive the claims of the Lemma, we start with a rough bound for the probability
[TABLE]
The first term tends to zero identically as . To study the jump term, choose a bounded, smooth function such that . Then by Itô’s formula, and a substitution in the integral, we obtain
[TABLE]
for a constant which depends on and is bounded on compacts in these parameters. The function can be chosen such that the latter term is finite. Thus, , uniformly on compacts in .
For the Gaussian term in (21), we employ the tail bound
[TABLE]
Now let be such that . Then
[TABLE]
If , i.e. , the latter bound is of order less than , uniformly on compacts. In particular,
[TABLE]
Note that the latter inequality does not hold if for a proportionality factor which is too large.
If is larger, but , the bound on remains unchanged, while we still obtain uniformly on compacts. Thus, if we only suppose , we have uniformly on compacts, for any , but with a slower rate.
To obtain an asymptotically exact value, we plug the former rough bound into Itô’s formula. In case (i), we have
[TABLE]
Here, we used as vanishes on . We moreover used that , and as established previously. These upper bounds hold uniformly on compacts in . To proceed, note that is a bounded continuous function, since
[TABLE]
which is furthermore bounded uniformly on compacts in . By virtue of this boundedness, implies . To ensure that this last approximation holds uniformly on compacts in , note that is also bounded, such that it suffices to control uniformly. But we already established that for any , uniformly on compacts in . Hence,
[TABLE]
uniformly on compacts in . This proves the first claim.
If, on the other hand, , a different term dominates in (22). We obtain
[TABLE]
uniformly on compacts in .
For the case , we may apply the result of case (ii) to obtain , and hence
[TABLE]
For the last claim, we use Itô’s formula again. Recall that the truncation function satisfies for , and . Then
[TABLE]
The additional factor is introduced to cover the special case when computing the integral . ∎
Proof of Lemma 5.2.
Choose some . The process may be decomposed by virtue of the Lévy-Itô decomposition as
[TABLE]
where is a compensated homogeneous Poisson point process with intensity measure , such that is a martingale. For , we have the analogous decomposition . Moreover,
[TABLE]
The second integral is finite. Furthermore, integrating by parts,
[TABLE]
which has a limit as if . Thus, there exists a real number such that as .
By subadditivity of the total variation distance and the Wasserstein distance,
[TABLE]
We treat all terms in (LABEL:eqn:expect-diff) individually.
Part (i) The small jumps can be handled by noting
[TABLE]
Since and have bounded jumps, we have as . Furthermore, as .
Part (ii) As a next step, we study the medium sized jumps . Consider the slightly more general process
[TABLE]
for . Let be defined analogously based on . These are compound Poisson processes, which can be written as
[TABLE]
where is a Poisson counting process with intensity , and the are iid random variables with distribution . The same holds for and with . Then Theorem 10 and Proposition 3 of Mariucci and Reiß (2018) for , yield
[TABLE]
We compute
[TABLE]
Recall that . Then there exists a constant which is bounded on compacts in and , such that for , and ,
[TABLE]
In particular, this yields for a potentially different constant . Here and in the following, the constant may vary from line to line, and is bounded on compacts in , , and .
Furthermore, since and are sufficiently similar,
[TABLE]
for . Thus, the second term in (25) is of order . Moreover, for small , since .
We now consider the distance occurring in (25), which can be expressed in terms of their cumulative distribution functions as
[TABLE]
For , and , it holds
[TABLE]
Recall that . Furthermore, the assumed similarity of and implies that , and
[TABLE]
as , whenever . In this case, for ,
[TABLE]
The analogous bound holds for , when . Now plug (31) into expression (LABEL:eqn:Wasserstein-U) for the Wasserstein distance, to obtain for and ,
[TABLE]
where we used . Using (25), we may hence bound,
[TABLE]
This upper bound will be exploited in the rest of the proof. In particular, for and small enough,
[TABLE]
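The compound Poisson representation of the medium-sized jumps used in Part (ii) can be made concrete by simulation. The following sketch assumes, purely for illustration, a symmetric stable-like jump measure with density |x|^(-1-alpha) restricted to eps <= |x| <= eta; the parameter names `eps`, `eta` and all function names are hypothetical and not taken from the paper.

```python
import random

def jump_intensity(alpha, eps, eta):
    """Total mass of the measure |x|^(-1-alpha) dx on eps <= |x| <= eta."""
    return 2.0 * (eps ** -alpha - eta ** -alpha) / alpha

def sample_jump(alpha, eps, eta, rng):
    """Draw one jump from the normalized measure by inverting its CDF."""
    a, b = eps ** -alpha, eta ** -alpha
    size = (a - rng.random() * (a - b)) ** (-1.0 / alpha)
    return size if rng.random() < 0.5 else -size

def sample_increment(h, alpha, eps, eta, rng):
    """Return the jumps of the compound Poisson process on the interval [0, h]."""
    rate = jump_intensity(alpha, eps, eta)
    jumps, clock = [], rng.expovariate(rate)
    # Jump times are generated from iid exponential waiting times.
    while clock <= h:
        jumps.append(sample_jump(alpha, eps, eta, rng))
        clock += rng.expovariate(rate)
    return jumps

rng = random.Random(0)
jumps = sample_increment(h=0.01, alpha=1.5, eps=0.05, eta=1.0, rng=rng)
increment = sum(jumps)  # value of the compound Poisson process at time h
print(len(jumps), increment)
```

The number of jumps on [0, h] is Poisson distributed with mean `jump_intensity(alpha, eps, eta) * h`, and the jump sizes are iid draws from the normalized jump measure.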
Part (iii) It remains to study the term in (LABEL:eqn:expect-diff) due to the large jumps. Here, our approach is slightly different as we will not (only) bound a metric distance between and . Define
[TABLE]
and we consider , as suggested by (LABEL:eqn:expect-diff). Since is a Lévy process, Itô’s formula yields
[TABLE]
i.e., is the infinitesimal generator of . Analogously, we denote by the generator of . Then integration by parts yields, for any ,
[TABLE]
The same bound holds for the range of integration , such that
[TABLE]
Now note that,
[TABLE]
such that by Fubini’s theorem,
[TABLE]
where we performed a linear substitution in the second step. Hence,
[TABLE]
Using this in (34),
[TABLE]
We now study the latter two terms.
Part (iv) The total variation distance can be bounded by noting that and admit only finitely many jumps. The number of their jumps is Poisson distributed, such that
[TABLE]
In particular,
[TABLE]
Moreover,
[TABLE]
Via the same argument, we also obtain
[TABLE]
From (32), we know that
[TABLE]
In combination with (LABEL:eqn:Ef-J3), we thus obtain
[TABLE]
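The elementary Poisson estimates behind Part (iv) can be checked numerically. This is a minimal sketch of standard facts only, not of the specific bounds in the proof: for a jump count which is Poisson with mean rate * h, the probability of at least one jump is at most rate * h, and the probability of at least two jumps is at most (rate * h)^2 / 2. The function names are illustrative.

```python
import math

def prob_at_least_one_jump(rate, h):
    """P(N >= 1) = 1 - exp(-rate * h) for N Poisson with mean rate * h."""
    return -math.expm1(-rate * h)

def prob_at_least_two_jumps(rate, h):
    """P(N >= 2) = 1 - exp(-x) * (1 + x) with x = rate * h."""
    x = rate * h
    return 1.0 - math.exp(-x) * (1.0 + x)

for h in (1e-1, 1e-2, 1e-3):
    x = 5.0 * h
    # Each probability is printed next to its first-order resp. second-order bound.
    print(h, prob_at_least_one_jump(5.0, h), x, prob_at_least_two_jumps(5.0, h), x * x / 2.0)
```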
Part (v) Now putting (24), (33), and (40) into (LABEL:eqn:expect-diff), and letting ,
[TABLE]
It can be checked that the upper bounds which are summarized in the constant all satisfy the desired uniformity on compacts in , , , and . This concerns the lines (26), (29), (30), (35), (37), (38), (39). ∎
Proof of Corollary 5.3.
Assume without loss of generality. A Taylor expansion yields, for any ,
[TABLE]
We denote , where is the purely discontinuous component of . Introduce for any function the notation . Then for any -th derivative, . In particular, by Lemma 5.1,
[TABLE]
such that
[TABLE]
Moreover, from Lemma 5.2. Applying (42) for the drift , this yields (16). ∎
Proof of Lemma 5.4.
All summands are iid and bounded and , such that the Lindeberg-Feller condition for triangular arrays of independent random variables is satisfied (Durrett, 2005, Thm. 2.4.5). Moreover, the bias is of order by Corollary 5.3. If , this is small enough to ensure . Hence, the bias is asymptotically negligible.
It thus suffices to check the asymptotic covariance structure. Denote . Then is smooth and vanishes on unless . Moreover, and . Corollary 5.3 and Lemma 5.1 yield
[TABLE]
To compute the asymptotic covariance, we further determine
[TABLE]
and for ,
[TABLE]
These approximations can be summarized as
[TABLE]
This scaling behavior yields as , and thus the desired central limit theorem. ∎
Proof of Lemma 5.5.
First, assume to be a Schwartz function with Fourier transform . Then
[TABLE]
where is the Lévy symbol of , i.e. . In particular, for any entry of the parameter vector ,
[TABLE]
Integration and differentiation may be exchanged because is a Schwartz function and has polynomial growth. In particular, via the Lévy-Khintchine formula, the Lévy symbol may be determined as
[TABLE]
The second term appears because the Lévy measure is allowed to be asymmetric. In its expression, we used that for , and denote
[TABLE]
Hence, by inverting the Fourier transform,
[TABLE]
So far, we assumed to be a Schwartz function, but the right-hand side of (44) makes sense whenever . We can extend the whole equation (44) to this case by approximating suitably with a sequence of Schwartz functions , such that as for each , and , and . Hence, standard arguments allow us to pass to the limit on both sides of the equation (44).
To handle the asymmetry term , we exploit (43) to derive
[TABLE]
The second integral can be bounded as follows. For any and any , there is a between and such that
[TABLE]
By continuity, the same bound holds for . Thus, we obtain
[TABLE]
Similarly,
[TABLE]
Note also that .
For specific partial derivatives, we thus have shown that
[TABLE]
For fixed , the functions , and are bounded, uniformly on compacts in . Moreover, uniformly on compacts in for any , as established in the proof of Lemma 5.1. Therefore, uniformly on compacts as , as well as and . This completes the proof of (17), and (18) follows analogously by applying a linear transformation to (45). Finally, (19) is a consequence of (45) upon noting that , see Lemma 5.1. ∎
Proof of Corollary 5.6.
Since is bounded, (17) shows that
[TABLE]
This corresponds to the entries for . For , we have by virtue of Lemma 5.1, since vanishes near zero. Hence, since and ,
[TABLE]
This corresponds to the entries for . In combination with Lemma 5.5, this suffices to establish the convergence (20). ∎
Proof of Lemma 5.7.
Denote the estimating equation (8) as , for
[TABLE]
Let be the true parameters, and reparameterize for , and let
[TABLE]
This is well defined whenever , for , and sufficiently small. In this reparameterized model, we need to show that there exists a sequence of random vectors such that for large , and . This will imply that for a sufficiently large factor .
We know from Lemma 5.4 that
[TABLE]
Furthermore,
[TABLE]
By Corollary 5.6, locally uniformly, and it can be checked that is continuous. Moreover, the definitions of , and readily yield, as ,
[TABLE]
Here, we denote by the spectral norm of a matrix, i.e. is the largest absolute eigenvalue of the symmetrized matrix , and denotes the identity matrix. Thus,
[TABLE]
Now we apply Lemma 6.2 of Jacod and Sørensen (2018) to establish the existence of a solution of the equation . Let , and denote by the event
[TABLE]
Since the first set is deterministic, and since , we have . On the set , it holds that . Then Lemma 6.2 of Jacod and Sørensen (2018) with and states that there exists a unique point which solves .
Returning to the original parametrization, we conclude there exists a random variable such that with probability at least , solves the estimating equation and , i.e. . Theorem 2.1 below establishes that any consistent sequence converges at a rate faster than , such that eventually. Hence, the uniqueness of on implies the uniqueness of , i.e. . ∎
Proof of Theorem 2.1.
Denote the estimating equation as , for as in (46). The mean value theorem yields
[TABLE]
where for some on the line segment between and . Denote by the event that is regular, and introduce furthermore the matrices
[TABLE]
That is, the -th row of and coincide, . Now note that , and for any , as in (47),
[TABLE]
Together with the locally uniform convergence of Corollary 5.6, this yields for each , and thus .
In particular, , and on the set , we may rewrite
[TABLE]
But by Lemma 5.4, and in probability, such that Slutsky’s lemma completes the proof. ∎
Proof of Proposition 3.1.
We show how to adjust the proof of Aït-Sahalia and Jacod (2012) to consider the off-diagonal entries. Denote by the density of a symmetric -stable random variable, standardized to have Lévy measure . This is the same parametrization as implied by (6). Furthermore, let be the density of a standard normal distribution. Then the probability density of is given by the convolution
[TABLE]
Now introduce the terms
[TABLE]
and
[TABLE]
Some technical integral transformations, explained in more detail by Aït-Sahalia and Jacod (2012, cf. (A.3) therein), establish that
[TABLE]
The bulk of the proof given by Aït-Sahalia and Jacod (2012) consists of deriving the limiting behavior of as . They show that
[TABLE]
where
[TABLE]
Using furthermore that , this yields
[TABLE]
Some straightforward manipulations show that
[TABLE]
This limiting matrix is singular. The off-diagonal entry has not been considered by Aït-Sahalia and Jacod (2012). ∎
Proof of Proposition 3.2.
Denote the true parameter by and , respectively. By Lemma 5.5, we have as , ,
[TABLE]
This convergence holds uniformly on compacts in . The limits are positive because by the definition of , and by assumption. Moreover, Lemma 5.4 also holds for , i.e.
[TABLE]
Thus, the existence of a consistent sequence of estimators follows along the same lines as Lemma 5.7.
For the central limit theorem, we use the mean value theorem to obtain, for a value between and ,
[TABLE]
In particular, . Just as in the proof of Theorem 2.1, we may use the convergence of and the central limit theorem (49) to derive the asymptotic distribution of by means of Slutsky’s Lemma. Analogously for . ∎
References
- Aït-Sahalia, Y. and Jacod, J. (2008). Fisher’s information for discretely sampled Lévy processes. Econometrica, 76(4):727–761.
- Aït-Sahalia, Y. and Jacod, J. (2009). Estimating the degree of activity of jumps in high frequency data. The Annals of Statistics, 37(5A):2202–2244.
- Aït-Sahalia, Y. and Jacod, J. (2012). Identifying the successive Blumenthal–Getoor indices of a discretely observed process. The Annals of Statistics, 40(3):1430–1464.
- Amorino, C. and Gloter, A. (2018). Contrast function estimation for the drift parameter of ergodic jump diffusion process. arXiv preprint, 1807.08965.
- Amorino, C. and Gloter, A. (2019). Unbiased truncated quadratic variation for volatility estimation in jump diffusion processes. arXiv preprint, 1904.10660.
- Andersen, T. G., Benzoni, L., and Lund, J. (2002). An empirical investigation of continuous-time equity return models. The Journal of Finance, 57(3):1239–1284.
- Blumenthal, R. M. and Getoor, R. K. (1961). Sample functions of stochastic processes with stationary independent increments. Journal of Mathematics and Mechanics, 10(3):493–516.
- Brouste, A. and Fukasawa, M. (2018). Local asymptotic normality property for fractional Gaussian noise under high-frequency observations. The Annals of Statistics, 46(5):2045–2061.
