Rate-optimal estimation of the Blumenthal-Getoor index of a Lévy process

Fabian Mies

arXiv:1906.08062·math.ST·June 20, 2019

TL;DR

This paper introduces a new estimator for the Blumenthal-Getoor index of Lévy processes that achieves the optimal convergence rate, improving upon existing methods especially when a diffusion component is present.

Contribution

The paper proposes a novel, rate-optimal estimator for the BG index and related parameters, applicable even with infinite variation jumps, using the generalized method of moments.

Findings

01

Estimator attains the optimal convergence rate.

02

Method effectively estimates parameters jointly.

03

Simulation shows superior finite sample performance.

Tables

Table 1: Median absolute errors for the estimation of $\alpha$ and $\sigma^{2}$ in model (13), for different estimators. All values are based on 20000 simulations.

$\alpha$	$h=n^{-1}$	GMM $\hat{\sigma}^{2}$	JT $\hat{\sigma}^{2}$	GMM $\hat{\alpha}$	Reiß $\hat{\alpha}$	Bull $\hat{\alpha}$
1.3	5/23400	0.04	0.07	0.19	0.28	0.59
1.3	1/23400	0.02	0.03	0.13	0.17	0.37
1.3	0.2/23400	0.007	0.010	0.08	0.10	0.25
1.7	5/23400	0.32	0.43	0.23	0.22	0.31
1.7	1/23400	0.16	0.22	0.11	0.11	0.30
1.7	0.2/23400	0.08	0.10	0.06	0.06	0.25

Equations

$$\alpha=\inf\Big\{p:\sum_{s\leq T}|\Delta X_{s}|^{p}<\infty\Big\}.$$

$$U(\tau_{n})=\sum_{i=1}^{n}\mathds{1}\big(X_{\frac{i}{n}}-X_{\frac{i-1}{n}}>\tau_{n}\big).$$

$$\nu(dz)\approx\tilde{\nu}(dz)=\sum_{m=1}^{M}\frac{\alpha_{m}}{|z|^{1+\alpha_{m}}}\big(r_{m}^{+}\mathds{1}_{z>0}+r_{m}^{-}\mathds{1}_{z<0}\big)\,dz.$$

$$X_{t}=\mu t+\sigma B_{t}+\int_{0}^{t}\int(z-\xi(z))\,N(dz,ds)+\int_{0}^{t}\int\xi(z)\,\big(N(dz,ds)-\nu(dz)\otimes ds\big),$$

$$\begin{aligned}\left|\nu([x,\infty))-\tilde{\nu}([x,\infty))\right|&\leq L|x|^{-\rho},&x&\in(0,1],\\ \left|\nu((-\infty,x])-\tilde{\nu}((-\infty,x])\right|&\leq L|x|^{-\rho},&x&\in[-1,0),\end{aligned}$$

$$\tilde{\nu}(dz)=\sum_{m=1}^{M}\frac{\alpha_{m}}{|z|^{1+\alpha_{m}}}\big(r_{m}^{+}\mathds{1}_{z>0}+r_{m}^{-}\mathds{1}_{z<0}\big)\,dz,$$

$$\theta=(\sigma^{2},\alpha_{1},r_{1}^{+},r_{1}^{-},\ldots,\alpha_{M},r_{M}^{+},r_{M}^{-})\in\Theta\subset\mathbb{R}^{3M+1}.$$

$$\alpha=\alpha_{1}>\alpha_{2}>\ldots>\alpha_{M}>\frac{\alpha}{2},\qquad r_{m}^{+}+r_{m}^{-}>0,\quad m=1,\ldots,M,\qquad\sigma^{2}\geq 0.$$

$$\tilde{Z}_{t}=\sigma B_{t}+\sum_{m=1}^{M}S_{t}^{m},$$

$$F_{n}(\hat{\theta}_{n})=\Big[\frac{1}{n}\sum_{i=1}^{n}\boldsymbol{f}(u_{n}\Delta_{n,i}X)\Big]-\mathbb{E}_{\hat{\theta}_{n}}\boldsymbol{f}(u_{n}\tilde{Z}_{h_{n}})\overset{!}{=}0.$$

$\mathcal{J}_{\alpha}^{\pm}g(x)$

$\gamma_{n,m}(\theta)$

$\Gamma_{n}(\theta)$

$\bar{\Lambda}_{n}(\theta)$

$\ldots,u_{n}^{\alpha_{M-1}-\frac{\alpha_{1}}{2}},u_{n}^{\alpha_{M}-\frac{\alpha_{1}}{2}},u_{n}^{\alpha_{M}-\frac{\alpha_{1}}{2}},u_{n}^{\alpha_{M}-\frac{\alpha_{1}}{2}}),$

$$A(\theta)_{1,1}=f_{1}^{\prime\prime}(0)/2,\qquad A(\theta)_{1,j}=A(\theta)_{j,1}=0,\quad j\neq 1,$$

$A(\theta)_{j,3m-1}$

$A(\theta)_{j,3m}$

$\Sigma(\theta)_{1,1}$

$\Sigma(\theta)_{1,j}$

$\Sigma(\theta)_{j,k}$

$A(\theta_{0})$

= f^{''} (0) /2 000 0 (r^{+} + r^{-}) b 2^{α} (r^{+} + r^{-}) (b + a lo g 2) r^{+} b + r^{-} a 2^{α} lo g 2 0 a 2^{α} a a 0 a 2^{α} a a (1 + 2^{α}),

$$\sqrt{n}\,A(\theta_{0})\,\bar{\Lambda}_{n}\,\Gamma_{n}^{-1}(\theta_{0})\,(\hat{\theta}_{n}-\theta_{0})\Rightarrow\mathcal{N}(0,\Sigma(\theta_{0})).$$

$$\sqrt{n}\,(\hat{\sigma}_{n}^{2}-\sigma^{2})\Rightarrow\mathcal{N}(0,2\sigma^{4}),$$

$$\begin{aligned}\hat{\alpha}_{m}-\alpha_{m}&=O_{P}\big(u^{\frac{\alpha_{1}}{2}-\alpha_{m}}\big)=O_{P}\big((n\log n)^{\frac{\alpha_{1}}{4}-\frac{\alpha_{m}}{2}}\big),\\ \hat{r}_{m}^{\pm}-r_{m}^{\pm}&=O_{P}\big(u^{\frac{\alpha_{1}}{2}-\alpha_{m}}\log u\big)=O_{P}\big((n\log n)^{\frac{\alpha_{1}}{4}-\frac{\alpha_{m}}{2}}\log(n)\big).\end{aligned}$$

$$\begin{aligned}\hat{\alpha}_{m}^{*}-\alpha_{m}&=O_{P}\big((n\log n)^{\frac{\alpha_{1}}{4}-\frac{\alpha_{m}}{2}}/\log n\big),\\ \hat{r}_{m}^{*}-r_{m}&=O_{P}\big((n\log n)^{\frac{\alpha_{1}}{4}-\frac{\alpha_{m}}{2}}\big).\end{aligned}$$

$$\frac{(h\log(1/h))^{\frac{\alpha}{2}}}{h}\begin{pmatrix}1&0\\0&\frac{1}{\log(1/h)}\end{pmatrix}\begin{pmatrix}\mathcal{I}_{h}^{r,r}&\mathcal{I}_{h}^{r,\alpha}\\ \mathcal{I}_{h}^{r,\alpha}&\mathcal{I}_{h}^{\alpha,\alpha}\end{pmatrix}\begin{pmatrix}1&0\\0&\frac{1}{\log(1/h)}\end{pmatrix}\longrightarrow\frac{2r}{\sigma^{\alpha}(2-\alpha)^{\frac{\alpha}{2}}}\begin{pmatrix}\frac{1}{r^{2}}&\frac{1}{2r}\\ \frac{1}{2r}&\frac{1}{4}\end{pmatrix}.$$

$$\tilde{F}_{n}(\alpha_{m})=\frac{1}{n}\sum_{i=1}^{n}f(u_{n}\Delta_{n,i}X)-\mathbb{E}_{\theta}f(u_{n}\tilde{Z}_{h})\overset{!}{=}0.$$

$$u_{n}^{\alpha_{m}-\frac{\alpha_{1}}{2}}\log(u_{n})\,(\hat{\alpha}_{m}-\alpha_{m})\Rightarrow\mathcal{N}\left(0,\frac{(r_{1}^{+}\mathcal{J}_{\alpha_{1}}+r_{1}^{-}\mathcal{J}_{\alpha_{1}})f^{2}(0)}{(r_{m}^{+}\mathcal{J}_{\alpha_{m}}^{+}+r_{m}^{-}\mathcal{J}_{\alpha_{m}}^{-})f(0)}\right).$$

$$u_{n}^{\alpha_{m}-\frac{\alpha_{1}}{2}}\,(\hat{r}_{m}^{\pm}-r_{m}^{\pm})\Rightarrow\mathcal{N}\left(0,\frac{(r_{1}^{+}\mathcal{J}_{\alpha_{1}}+r_{m}^{-}\mathcal{J}_{\alpha_{1}})f^{2}(0)}{\mathcal{J}_{\alpha_{m}}^{\pm}f(0)}\right).$$

$$X_{t}=B_{t}+S_{t}^{\alpha,\beta}+0.1\,S_{t}^{0.5,0}.$$

$$\log\mathbb{E}\exp(i\lambda S_{t}^{\alpha,\beta})=-t|\lambda|^{\alpha}\Big[1-i\tan\Big(\frac{\pi\alpha}{2}\Big)\beta\,\mathrm{sign}(\lambda)\Big].$$

Full text

Rate-optimal estimation of the Blumenthal–Getoor index of a Lévy process

Fabian Mies (RWTH Aachen University, Institute of Statistics)

Abstract

The Blumenthal–Getoor (BG) index characterizes the jump measure of an infinitely active Lévy process. It determines sample path properties and affects the behavior of various econometric procedures. If the process contains a diffusion term, existing estimators of the BG index based on high-frequency observations only achieve rates of convergence which are suboptimal by a polynomial factor. In this paper, a novel estimator for the BG index and the successive BG indices is presented, attaining the optimal rate of convergence. If an additional proportionality factor needs to be inferred, the proposed estimator is rate-optimal up to logarithmic factors. Furthermore, our method yields a new efficient volatility estimator which accounts for jumps of infinite variation. All parameters are estimated jointly by the generalized method of moments. A simulation study compares the finite sample behavior of the proposed estimators with competing methods from the financial econometrics literature.

**Keywords:** high-frequency; method of moments; jump activity; Fisher information; non-diagonal rate matrix; asymptotic distribution

**MSC 2000 subject classification:** primary 62M05; secondary 60G51

1 Introduction

Models for continuous-time stochastic processes with jumps have gained increased interest in the statistical literature, most prominently in financial econometrics, where they are used as models for asset prices (Andersen et al., 2002; Christensen et al., 2014). The jump behavior of such a process $X_{t}$ can be broadly characterized in terms of the jump activity index, given by

$$\alpha=\inf\Big\{p:\sum_{s\leq T}|\Delta X_{s}|^{p}<\infty\Big\}.\tag{1}$$

Here, $\Delta X_{s}=X_{s}-X_{s-}$ denotes the size of a jump at time $s$. If $X_{t}$ is a Lévy process, $\alpha$ is also known as the Blumenthal-Getoor index (Blumenthal and Getoor, 1961). The index $\alpha$ depends on the small jumps only, and for semimartingales, its range is $\alpha\in[0,2]$. Various qualitative properties of the process $X_{t}$ can be expressed in terms of the jump activity index. If the process has only finitely many jumps in total, then $\alpha=0$, and if the jumps are of finite variation, we have $\alpha\leq 1$. Conversely, $\alpha<1$ implies jumps of finite variation. Furthermore, the value of $\alpha$ has implications for various econometric procedures. For example, if the jumps are treated as a nuisance, jump-robust estimation of integrated volatility requires $\alpha<1$ (Jacod and Reiss, 2014), as does the efficient drift estimator of Gloter et al. (2018). In these applications, a higher jump activity typically induces a non-negligible bias which cannot be easily corrected if the jumps are considered as a nuisance. Hence, highly active jumps need to be modeled more explicitly, as done by Amorino and Gloter (2018) for drift estimation, and by Jacod and Todorov (2014, 2016) for volatility estimation.

As the jump activity index is a central property of infinite activity jump models, it is natural to consider statistical estimation of its precise value. Recent interest in this topic has been initiated by Aït-Sahalia and Jacod (2009), who study the estimation of $\alpha$ based on discrete high-frequency observations $X_{i/n},i=1,\ldots,n$, where $X$ is an Itô semimartingale with a non-vanishing diffusion component. They specify (1) more precisely by defining $\alpha$ in terms of the spot jump compensator $\nu_{t}$, assuming that $\nu_{t}\left((-x,x)^{c}\right)=r_{t}|x|^{-\alpha}+\mathcal{O}(|x|^{\delta-\alpha})$ as $|x|\to 0$ for a predictable process $r_{t}$ and some $\delta>0$. The statistical challenge is that, based on discrete observations at a given frequency, the small jumps can hardly be distinguished from the continuous diffusion movement. The solution of Aït-Sahalia and Jacod (2009) is to introduce a threshold sequence $\tau_{n}\propto h_{n}^{\omega}\to 0$ and consider

$$U(\tau_{n})=\sum_{i=1}^{n}\mathds{1}\big(X_{\frac{i}{n}}-X_{\frac{i-1}{n}}>\tau_{n}\big).\tag{2}$$

If $\omega<1/2$, the contribution of the diffusion towards the statistic $U(\tau_{n})$ will be negligible. The jump activity can be identified via the approximate scaling relation $U(\tau_{n})\propto\tau_{n}^{-\alpha}$, and Aït-Sahalia and Jacod (2009) show that this approach yields an estimator of $\alpha$ with rate of convergence $n^{\alpha/10}$. Replacing the indicator in (2) by a suitable smooth function, Jing et al. (2012) improve this rate to $n^{\alpha/8}$. So far, the best rates have been achieved by Reiß (2013) for the case that $X_{t}$ is a Lévy process, and by Bull (2016) for Itô semimartingales. Both authors construct estimators which converge at rate $n^{\alpha/4-\epsilon}$ for arbitrary $\epsilon>0$. In both cases, the precise form of the estimator depends on the desired rate defect $\epsilon>0$.
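The scaling relation $U(\tau_{n})\propto\tau_{n}^{-\alpha}$ can be illustrated with a minimal Python sketch. It is not the estimator of any of the cited papers: the function names, the two-threshold log-ratio, and the toy Pareto data (chosen because they obey the scaling relation exactly) are all assumptions made for illustration.

```python
import numpy as np

def exceedance_count(increments, tau):
    """U(tau): number of increments exceeding the threshold tau."""
    return int(np.sum(np.asarray(increments) > tau))

def activity_from_two_thresholds(increments, tau, k=2.0):
    """Log-ratio estimate of alpha from U(tau) and U(k * tau), using U(tau) ~ tau^(-alpha)."""
    u_small = exceedance_count(increments, tau)
    u_large = exceedance_count(increments, k * tau)
    if u_large == 0:
        raise ValueError("no exceedances above k * tau; lower the threshold")
    return np.log(u_small / u_large) / np.log(k)

# Toy data (assumption): i.i.d. Pareto draws with P(X > x) = x^(-alpha) for x >= 1.
# These are NOT increments of a Levy process; they only mimic the tail scaling.
rng = np.random.default_rng(0)
alpha_true = 1.3
toy = (1.0 - rng.uniform(size=200_000)) ** (-1.0 / alpha_true)
alpha_hat = activity_from_two_thresholds(toy, tau=2.0, k=2.0)
```

With real high-frequency increments the diffusion contaminates the counts, which is why the threshold has to shrink slowly enough ($\omega<1/2$) and why the resulting rates are slow.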

In the considered high-frequency setting, the optimal rate of convergence for estimating $\alpha$ is conjectured to be $n^{\alpha/4}$, up to logarithmic factors. This lower bound is justified by the results of Aït-Sahalia and Jacod (2012), who study the diagonal entries of the Fisher matrix of a fully parametric submodel consisting of the sum of a Brownian motion and a symmetric $\alpha$-stable Lévy motion. A matching LAN result is not available since the off-diagonal entries have not been studied. This lower bound is discussed in Section 3. It should be highlighted that the achievable rate of convergence for estimating $\alpha$ depends on whether the process contains a non-vanishing diffusion component. If we consider a pure-jump Itô semimartingale, the jump activity index can be estimated at rate $\sqrt{n}$ based on high-frequency observations (Todorov, 2015).

Although the estimators of Reiß (2013) and Bull (2016) almost achieve the optimal rate of convergence, there is so far no procedure which attains the $n^{\alpha/4}$ lower bound, even in the case where $X_{t}$ is a Lévy process. This issue has also been formulated as an open problem by Reiß (2013). In this paper, we propose a new estimator of $\alpha$ for the Lévy case. If only $\alpha$ is unknown, the estimator achieves the optimal rate of convergence, matching the lower bound of Aït-Sahalia and Jacod (2012). If an additional proportionality factor $r$ needs to be estimated, our estimator is rate-optimal up to a factor of $\log n$ for both $r$ and $\alpha$. Furthermore, we show that the diagonally rescaled Fisher matrix in the submodel considered by Aït-Sahalia and Jacod (2012) is asymptotically singular for the combined parameter $(\alpha,r)$, and hence we conjecture that our rate of convergence is in fact optimal. Our procedure also yields an efficient estimator of the volatility $\sigma^{2}$ of the diffusion component of $X_{t}$ in the presence of jumps of infinite variation. Under analogous conditions on the jump behavior, Jacod and Todorov (2014, 2016) have derived a different efficient estimator of volatility which is robust to highly active jumps. Hence, our estimator is an alternative to the method of Jacod and Todorov (2014), although the latter is valid for Itô semimartingales and we restrict our attention to Lévy processes. The proposed estimator is based on the generalized method of moments, and we estimate the jump and the diffusion parameters jointly in a single step as the solution of a system of estimating equations.

Our model allows for an asymmetric behavior of the small jumps. In particular, for a Lévy process $X_{t}$ with characteristic triplet $(\mu,\sigma^{2},\nu)$, we suppose that the Lévy measure $\nu$ is locally stable in the sense that, for $z$ close to $0$,

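$$\nu(dz)\approx\tilde{\nu}(dz)=\sum_{m=1}^{M}\frac{\alpha_{m}}{|z|^{1+\alpha_{m}}}\big(r_{m}^{+}\mathds{1}_{z>0}+r_{m}^{-}\mathds{1}_{z<0}\big)\,dz.\tag{3}$$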

Here, $M$ is a natural number, $r_{m}^{\pm}\geq 0$, $m=1,\ldots,M$, and $0<\alpha_{M}<\ldots<\alpha_{1}<2$ are the successive Blumenthal-Getoor indices, as introduced by Aït-Sahalia and Jacod (2012). The approximation in (3) will be made precise in the sequel. In particular, the BG index of $X_{t}$ will be $\alpha=\alpha_{1}$. We construct an estimator for the parameter vector $\theta\in\mathbb{R}^{3M+1}$ consisting of the volatility $\sigma^{2}$, the indices $\alpha_{m}$, and the proportionality factors $r_{m}^{\pm}$.

The remainder of this paper is structured as follows. In Section 2, we present our model and the proposed estimator. A central limit theorem is given, establishing the rate $n^{\alpha/4}$. The rate of convergence and related lower bounds are discussed in Section 3. By means of a simulation study (Section 4), we compare the finite sample properties of our method with the jump activity estimators of Bull (2016) and Reiß (2013) and the volatility estimator of Jacod and Todorov (2014). All technical results, which might be of independent interest, are outlined in Section 5.1, and the detailed proofs are gathered in Section 5.2.

1.1 Notation

For two real numbers $a,b$, we denote $a\wedge b=\min(a,b)$, $a\vee b=\max(a,b)$. The indicator function of a set $A$ is denoted as $\mathds{1}_{A}$. For a function $f=f(a,b,\ldots)$, $\partial_{a}f$ denotes the partial derivative w.r.t. $a$, and for a function $f(\theta)\in\mathbb{R}^{m}$ with $\theta\in\mathbb{R}^{k}$, the gradient matrix is denoted by $(\mathrm{D}_{\theta}f)_{j,l}=\partial_{\theta_{l}}f_{j}$. For $\delta>0$, $B_{\delta}(0)$ is the ball around $0$ with radius $\delta$ in $\mathbb{R}^{k}$, where $k$ is evident from the context. $\boldsymbol{I}_{d}\in\mathbb{R}^{d\times d}$ denotes the identity matrix. The multivariate normal distribution with covariance matrix $\Sigma$ and mean $0$ is denoted as $\mathcal{N}(0,\Sigma)$, and $\Rightarrow$ denotes weak convergence of probability measures resp. random elements. The expectation operator is $\mathbb{E}$, and dependence upon a parameter $\theta$ is denoted as $\mathbb{E}_{\theta}$.

2 Model and estimator

Consider a univariate Lévy process $X_{t}$, $X_{0}=0$, with characteristic triplet $(\mu,\sigma^{2},\nu)$ for a drift parameter $\mu\in\mathbb{R}$, volatility parameter $\sigma^{2}>0$, and a Lévy measure $\nu$, i.e. $\int(1\wedge|z|^{2})\,\nu(dz)<\infty$. We choose an odd truncation function $\xi$ such that $|\xi|\leq 2$ and $\xi(z)=z$ for $z\in(-1,1)$. Then $X_{t}$ admits the Lévy-Itô decomposition

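$$X_{t}=\mu t+\sigma B_{t}+\int_{0}^{t}\int(z-\xi(z))\,N(dz,ds)+\int_{0}^{t}\int\xi(z)\,\big(N(dz,ds)-\nu(dz)\otimes ds\big),\tag{4}$$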

where $N(dz,ds)$ is a Poisson point process with intensity measure $\nu(dz)\otimes ds$, and $B_{t}$ is a standard Brownian motion, independent of $N$. The value of $\mu$ depends on the choice of the truncation function $\xi$, but for our purposes, it will turn out that $\mu$ is negligible anyway. To make the approximation (3) precise, we suppose that

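$$\begin{aligned}\left|\nu([x,\infty))-\tilde{\nu}([x,\infty))\right|&\leq L|x|^{-\rho},&x&\in(0,1],\\ \left|\nu((-\infty,x])-\tilde{\nu}((-\infty,x])\right|&\leq L|x|^{-\rho},&x&\in[-1,0),\end{aligned}\tag{5}$$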

for some $L>0$ and $\rho>0$. The approximating measure $\tilde{\nu}$ is given by the Lebesgue density

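$$\tilde{\nu}(dz)=\sum_{m=1}^{M}\frac{\alpha_{m}}{|z|^{1+\alpha_{m}}}\big(r_{m}^{+}\mathds{1}_{z>0}+r_{m}^{-}\mathds{1}_{z<0}\big)\,dz,\tag{6}$$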

for some natural number $M$ and parameters $\boldsymbol{\alpha}=(\alpha_{1},\ldots,\alpha_{M})\in(0,2)^{M}$, and $\boldsymbol{r}=(r_{1}^{+},r_{1}^{-},\ldots,r_{M}^{+},r_{M}^{-})\in\mathbb{R}_{\geq 0}^{2M}$. The remainder term in (5) is treated as a nuisance. In particular, this remainder may still consist of infinite activity jumps. Our main result will require $\rho<\alpha_{M}$, such that the nuisance jumps are in a sense less active than the Lévy measure $\tilde{\nu}$ and asymptotically negligible. The parameters of the modeled part are summarized as
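As a short worked computation (not taken from the paper), the tails of the approximating measure can be integrated explicitly. This assumes the density $\sum_{m}\alpha_{m}|z|^{-1-\alpha_{m}}(r_{m}^{+}\mathds{1}_{z>0}+r_{m}^{-}\mathds{1}_{z<0})$ for $\tilde{\nu}$, and shows why the factor $\alpha_{m}$ is a convenient normalization:

```latex
\begin{aligned}
\tilde{\nu}([x,\infty))
  &= \sum_{m=1}^{M} r_{m}^{+}\int_{x}^{\infty}\frac{\alpha_{m}}{z^{1+\alpha_{m}}}\,dz
   = \sum_{m=1}^{M} r_{m}^{+}\,x^{-\alpha_{m}}, && x>0,\\
\tilde{\nu}((-\infty,-x])
  &= \sum_{m=1}^{M} r_{m}^{-}\int_{x}^{\infty}\frac{\alpha_{m}}{z^{1+\alpha_{m}}}\,dz
   = \sum_{m=1}^{M} r_{m}^{-}\,x^{-\alpha_{m}}, && x>0.
\end{aligned}
```

Hence $\tilde{\nu}((-x,x)^{c})=\sum_{m}(r_{m}^{+}+r_{m}^{-})\,x^{-\alpha_{m}}$, which has the same form as the power-law tail assumed by Aït-Sahalia and Jacod (2009). A remainder of order $|x|^{-\rho}$ with $\rho<\alpha_{M}$ is of smaller order than every modeled term as $x\to 0$.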

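$$\theta=(\sigma^{2},\alpha_{1},r_{1}^{+},r_{1}^{-},\ldots,\alpha_{M},r_{M}^{+},r_{M}^{-})\in\Theta\subset\mathbb{R}^{3M+1},$$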

where $\Theta$ contains all parameter vectors $\theta$ as specified, such that additionally

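$$\alpha=\alpha_{1}>\alpha_{2}>\ldots>\alpha_{M}>\frac{\alpha}{2},\qquad r_{m}^{+}+r_{m}^{-}>0,\quad m=1,\ldots,M,\qquad\sigma^{2}\geq 0.$$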

The value $\alpha=\alpha_{1}$ is of central importance. In particular, we need to impose the lower bound $\alpha_{M}>\alpha/2$ to ensure identifiability of the full parameter vector $\theta$, see Aït-Sahalia and Jacod (2012). Note that the definition (6) is the same as given by Jacod and Todorov (2016) for the symmetric case.
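The constraints defining $\Theta$ can be checked mechanically. The following Python sketch is only an illustration; the `Theta` container and the function name `in_parameter_set` are hypothetical and not part of the paper.

```python
from dataclasses import dataclass
from typing import Sequence

@dataclass(frozen=True)
class Theta:
    """Hypothetical container for theta = (sigma^2, alpha_m, r_m^+, r_m^-), m = 1..M."""
    sigma2: float
    alphas: Sequence[float]
    r_plus: Sequence[float]
    r_minus: Sequence[float]

def in_parameter_set(theta: Theta) -> bool:
    """Check the constraints stated in the text for membership in the parameter set."""
    alphas = list(theta.alphas)
    M = len(alphas)
    if M == 0 or len(theta.r_plus) != M or len(theta.r_minus) != M:
        return False
    if theta.sigma2 < 0:
        return False
    if not all(0 < a < 2 for a in alphas):
        return False
    # alpha_1 > alpha_2 > ... > alpha_M (strictly decreasing)
    if any(a <= b for a, b in zip(alphas[:-1], alphas[1:])):
        return False
    # identifiability bound: alpha_M > alpha_1 / 2
    if alphas[-1] <= alphas[0] / 2:
        return False
    # r_m^+, r_m^- >= 0 with r_m^+ + r_m^- > 0
    for rp, rm in zip(theta.r_plus, theta.r_minus):
        if rp < 0 or rm < 0 or rp + rm <= 0:
            return False
    return True
```

For example, `Theta(1.0, [1.7, 1.0], [1.0, 1.0], [1.0, 0.0])` passes, while a second index of $0.5$ next to $\alpha_{1}=1.3$ violates $\alpha_{M}>\alpha/2$, so such a component would have to be absorbed into the nuisance term.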

In the high-frequency sampling setting considered here, we are given $n$ observations $X_{ih_{n}}$, $i=1,\ldots,n$, with observation frequency $h_{n}\to 0$ such that $nh_{n}=T$ is constant. Without loss of generality, let $T=1$ and $h=h_{n}=1/n$. Equivalently, we observe the $n$ increments $\Delta_{n,i}X=X_{ih_{n}}-X_{(i-1)h_{n}}\sim X_{h_{n}}$, which constitute a triangular array of random variables with iid rows. The law of $X_{h_{n}}$ is not fully described by the parameters $(\sigma^{2},\boldsymbol{r},\boldsymbol{\alpha})$ due to the remainder in (5). Hence, we approximate it by a fully specified Lévy process $\tilde{Z}_{t}$ with characteristic triplet $(0,\sigma^{2},\tilde{\nu})$. The process $\tilde{Z}_{t}$ may be represented as

$$\tilde{Z}_{t}=\sigma B_{t}+\sum_{m=1}^{M}S_{t}^{m},$$

where $B_{t},S_{t}^{m}$, $m=1,\ldots,M$, are independent Lévy processes, $B_{t}$ is a standard Brownian motion, and the $S_{t}^{m}$ are skewed $\alpha_{m}$-stable processes with Lévy measure $\alpha_{m}|z|^{-1-\alpha_{m}}(r_{m}^{+}\mathds{1}_{z>0}+r_{m}^{-}\mathds{1}_{z<0})\,dz$, so that the Lévy measures of the $S_{t}^{m}$ sum to $\tilde{\nu}$.
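Increments of a process of this type can be simulated directly. The sketch below covers only the symmetric case with $M=1$ and uses the Chambers-Mallows-Stuck representation of a standard symmetric stable variable. The parameter `jump_scale` is a hypothetical stand-in: the mapping from $r^{\pm}$ to the scale of the stable law is not derived here, so this is not the paper's normalization.

```python
import numpy as np

def symmetric_stable(alpha, size, rng):
    """Chambers-Mallows-Stuck draw with characteristic function exp(-|t|^alpha), alpha != 1."""
    v = rng.uniform(-np.pi / 2, np.pi / 2, size)
    w = rng.exponential(1.0, size)
    return (np.sin(alpha * v) / np.cos(v) ** (1.0 / alpha)
            * (np.cos((1.0 - alpha) * v) / w) ** ((1.0 - alpha) / alpha))

def simulate_increments(n, sigma, alpha, jump_scale, rng):
    """n increments over step h = 1/n of sigma * B + jump_scale * S (symmetric, M = 1)."""
    h = 1.0 / n
    brownian = sigma * np.sqrt(h) * rng.standard_normal(n)
    # self-similarity of the stable process: S_h has the law of h^(1/alpha) * S_1
    jumps = jump_scale * h ** (1.0 / alpha) * symmetric_stable(alpha, n, rng)
    return brownian + jumps

rng = np.random.default_rng(1)
increments = simulate_increments(n=23_400, sigma=1.0, alpha=1.3, jump_scale=1.0, rng=rng)
```

Because $h^{1/\alpha}\ll h^{1/2}$ for $\alpha<2$ and small $h$, most increments are dominated by the Brownian part, which is the identification problem discussed in the introduction.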

We suggest estimating the parameter $\theta$ via the method of moments. In particular, we choose $3M+1$ functions $f_{j}:\mathbb{R}\to\mathbb{R}$, $\boldsymbol{f}=(f_{1},\ldots,f_{3M+1})$, and a suitable scaling factor $u=u_{n}$, and define $\hat{\theta}=\hat{\theta}_{n}$ to be a solution of the equation

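$$F_{n}(\hat{\theta}_{n})=\Big[\frac{1}{n}\sum_{i=1}^{n}\boldsymbol{f}(u_{n}\Delta_{n,i}X)\Big]-\mathbb{E}_{\hat{\theta}_{n}}\boldsymbol{f}(u_{n}\tilde{Z}_{h_{n}})\overset{!}{=}0.\tag{8}$$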

Here and in the following, $\mathbb{E}_{\theta}f(\tilde{Z}_{h})$ denotes the expectation such that $\tilde{Z}_{h}$ is determined by the parameter vector $\theta$. Since $\tilde{Z}_{h}$ is a fully parametric approximation of $X_{h}$, the function $F_{n}(\theta)$ can be computed numerically, such that $\hat{\theta}_{n}$ is a feasible estimator. To distinguish a generic parameter value from the parameters governing $X_{t}$, we denote by $\theta_{0}$ the true parameter such that (5) holds.
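To make the structure of the estimating equation concrete, here is a deliberately simplified Python sketch for the degenerate case without jumps, where only $\sigma^{2}$ is unknown and a single moment function is used. The function $f_{1}(x)=1-e^{-x^{2}/2}$ and the constant $0.3$ in the scaling factor are hypothetical choices made for illustration. In this toy case the model moment is available in closed form, whereas the general case requires numerical evaluation.

```python
import numpy as np

def f1(x):
    """Hypothetical moment function: symmetric, f1(0) = f1'(0) = 0, f1''(0) = 1."""
    return 1.0 - np.exp(-0.5 * x ** 2)

def model_moment(sigma2, u, h):
    """E f1(u * sigma * B_h) for Brownian motion alone: 1 - (1 + u^2 sigma^2 h)^(-1/2)."""
    return 1.0 - 1.0 / np.sqrt(1.0 + u ** 2 * sigma2 * h)

def solve_sigma2(increments, u):
    """Solve 'sample moment = model moment' for sigma^2 (diffusion-only toy case)."""
    h = 1.0 / len(increments)
    m = np.mean(f1(u * increments))
    return ((1.0 - m) ** -2 - 1.0) / (u ** 2 * h)

rng = np.random.default_rng(2)
n, sigma2_true = 23_400, 1.0
increments = np.sqrt(sigma2_true / n) * rng.standard_normal(n)  # Brownian increments only
u = 0.3 * np.sqrt(n / np.log(n))                                # illustrative scaling factor
sigma2_hat = solve_sigma2(increments, u)
```

With jumps present, the same equation has to be solved jointly with the jump moment conditions, and the closed-form inversion is replaced by a numerical root search.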

To study the limit of $\hat{\theta}_{n}$, we employ the standard framework for estimating equations as reviewed by Jacod and Sørensen (2018). Under the assumptions imposed below, we show that $\hat{\theta}_{n}-\theta_{0}\approx-(\mathrm{D}_{\theta}F_{n}(\theta_{0}))^{-1}F_{n}(\theta_{0})$, up to negligible terms. In order for $\hat{\theta}_{n}$ to have good asymptotic properties, the choices of the moment functions $\boldsymbol{f}$ and the scaling factor $u_{n}$ are crucial. In particular, to derive a central limit theorem for $F_{n}(\theta_{0})$ (see Lemma 5.4), we need to control the sampling variance in (8) as well as the bias incurred by approximating $X_{t}$ by $\tilde{Z}_{t}$. Furthermore, the asymptotic behavior of $\mathrm{D}_{\theta}F_{n}(\theta)$ as $n\to\infty$ needs to be treated (see Lemma 5.5). To this end, the following properties turn out to be sufficient.

Condition (F1).

For $j=1,\ldots,3M+1$, the functions $f_{j}\in\mathcal{C}^{3}(\mathbb{R})$ satisfy $\|f_{j}^{(k)}\|_{\infty}<\infty$ for $k=0,1,2,3$, and $f_{j}^{\prime}\in L_{1}(\mathbb{R})$.

The smoothness imposed by Condition (F1) is used to bound the bias incurred by approximating $\mathbb{E}\boldsymbol{f}(uX_{h_{n}})$ by $\mathbb{E}_{\theta}\boldsymbol{f}(u\tilde{Z}_{h_{n}})$, see Corollary 5.3 below. To control the sampling variance, it is not enough for the moment functions to be smooth; they also need to be of a specific shape.

Condition (F2).

The function $f_{1}$ is symmetric and satisfies $f_{1}(0)=f_{1}^{\prime}(0)=0\neq f_{1}^{\prime\prime}(0)$. The functions $f_{j}$, $j=2,\ldots,3M+1$, are identically zero on the interval $[-\eta,\eta]$ for some $\eta>0$.

Additional identifiability conditions are specified in Condition (I) below. The first moment function $f_{1}$ is approximately quadratic near zero, and will serve to identify the volatility $\sigma^{2}$. The functions $f_{j}(x)$, $j\geq 2$, are smooth thresholds, which distinguish the diffusion from the jump component. An example of suitable moment functions is given in Section 4. To ensure that the threshold is effective, we require that $u_{n}X_{h_{n}}\to 0$ in probability, i.e. $u_{n}=o(\sqrt{n})$. By choosing an appropriate scaling sequence as follows, the moments $\mathbb{E}f_{j}(u_{n}\tilde{Z}_{h_{n}})$, $j\geq 2$, will be dominated by the jump component.
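For illustration, the following Python sketch gives one hypothetical pair of functions with the shapes required by Conditions (F1) and (F2). These are not the functions used in the paper's simulation study. The threshold is built from $t\mapsto e^{-1/t}$, which is smooth, bounded, and has an integrable derivative.

```python
import numpy as np

def f1(x):
    """Symmetric, bounded, with f1(0) = f1'(0) = 0 and f1''(0) = 1 (hypothetical choice)."""
    return 1.0 - np.exp(-0.5 * np.asarray(x, dtype=float) ** 2)

def smooth_threshold(x, eta=1.0):
    """Identically zero on [-eta, eta], smooth, and increasing towards 1 in |x|."""
    t = np.abs(np.asarray(x, dtype=float)) - eta
    safe = np.where(t > 0, t, 1.0)          # avoid division by zero inside the band
    return np.where(t > 0, np.exp(-1.0 / safe), 0.0)

def second_difference_at_zero(f, step=1e-3):
    """Central finite-difference approximation of f''(0), as a numerical sanity check."""
    return float((f(step) - 2.0 * f(0.0) + f(-step)) / step ** 2)

curvature = second_difference_at_zero(f1)   # close to f1''(0) = 1
```

Rescaled copies such as `smooth_threshold(2 * x)` vanish on a smaller band around zero and could serve as further moment functions, in the spirit of the construction $f_{3}(x)=g(2x)$ used in Remark 1.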

Condition (U).

$u_{n}\to\infty$ such that $u_{n}=\frac{\tau\sqrt{n}}{\sqrt{\log n}}$ for some $\tau<\frac{\eta}{\sigma\sqrt{8}}$.

Although potentially not sharp, the upper bound on the factor $\tau$ is required to derive our asymptotic result. For details, see the technical Lemma 5.1 below and the subsequent discussion. When choosing $u_{n}$ in accordance with Condition (U), it suffices to use a reasonable upper bound on $\sigma$. Furthermore, the simulation results presented in Section 4 show that larger values of $u_{n}$ also perform well in finite samples.
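Condition (U) translates into a one-line computation. In the sketch below, the function name and the `safety` factor (used to keep $\tau$ strictly below the bound) are hypothetical conveniences, and `sigma_upper` stands for the upper bound on $\sigma$ mentioned above.

```python
import math

def scaling_factor(n, eta, sigma_upper, safety=0.9):
    """u_n = tau * sqrt(n / log n) with tau = safety * eta / (sigma_upper * sqrt(8))."""
    if not 0 < safety < 1:
        raise ValueError("safety must lie strictly between 0 and 1")
    tau = safety * eta / (sigma_upper * math.sqrt(8.0))
    return tau * math.sqrt(n / math.log(n))

u_n = scaling_factor(n=23_400, eta=1.0, sigma_upper=1.0)
```

Since $\sigma\leq$ `sigma_upper` implies $\eta/(\sigma_{\text{upper}}\sqrt{8})\leq\eta/(\sigma\sqrt{8})$, any `safety` below one keeps $\tau$ within the admissible range.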

To formulate our main result on the asymptotic behavior of $\hat{\theta}$, we introduce the quantities

[TABLE]

which exist if $\|g\|_{\infty},\|g^{\prime\prime}\|_{\infty}<\infty$. Furthermore, we introduce the matrices

[TABLE]

and the matrix $A(\theta)\in\mathbb{R}^{(3M+1)\times(3M+1)}$, given by

$$A(\theta)_{1,1}=f_{1}^{\prime\prime}(0)/2,\qquad A(\theta)_{1,j}=A(\theta)_{j,1}=0,\quad j\neq 1,$$

and for $m=1,\ldots,M$, $j=2,\ldots,3M+1$,

[TABLE]

These derivatives exist because $\|f\|_{\infty},\|f^{\prime\prime}\|_{\infty}$ are finite. Finally, we introduce the symmetric positive semidefinite matrix $\Sigma(\theta)$ given by

[TABLE]

If clear from the context, we will omit the dependence on $\theta$. Using this notation, we can formulate the remaining identifiability condition.

Condition (I).

For the true parameter $\theta_{0}$, $A(\theta_{0})$ is regular.

*Remark 1.*

Analyzing the degrees of freedom of the equation $|A(\theta)|=0$ suggests that Condition (I) is, in fact, the generic case. To demonstrate this point, we construct a set of moment functions satisfying Condition (I) in the case $M=1$, writing $\alpha_{1}=\alpha$ and $r_{1}^{\pm}=r^{\pm}$. Let $f_{1}=f$ and $g$ be symmetric functions satisfying Condition (F1) such that $f_{1}^{\prime\prime}(0)\neq 0$, and $g$ vanishes on $[-1,1]$. Furthermore, denote $a=\mathcal{J}_{\alpha}^{+}g(0)=\mathcal{J}_{\alpha}^{-}g(0)$, and $b=\partial_{\alpha}\mathcal{J}_{\alpha}^{\pm}g(0)$. We set $f_{2}(x)=g(x)$, $f_{3}(x)=g(2x)$, and $f_{4}(x)=g(x)\mathds{1}_{x>0}+g(2x)\mathds{1}_{x<0}$. Note that $\mathcal{J}^{\pm}f_{3}(0)=2^{\alpha}\mathcal{J}^{\pm}g(0)=2^{\alpha}a$, as well as $\mathcal{J}^{+}f_{4}(0)=a$, and $\mathcal{J}^{-}f_{4}(0)=2^{\alpha}a$. Then one can check that

[TABLE]

with determinant $\det(A)=-\frac{f^{\prime\prime}(0)}{2}(r^{+}+r^{-})\,a^{3}\,2^{\alpha}\log 2$. Hence, $A(\theta_{0})$ is regular for $(r^{+}+r^{-})>0$ and all $\alpha\in(0,2)$ if $g$ is chosen such that $a\neq 0$. This is in particular the case for the choice of the moment functions for the simulation study in Section 4.

The main result of this paper is the consistency and asymptotic normality of $\hat{\theta}_{n}$, as summarized by the following theorem.

Theorem 2.1.

Let $X_{t}$ be a Lévy process satisfying (5) with some $\rho<\alpha/2$, and parameter vector $\theta_{0}\in\Theta$. Let $\boldsymbol{f}$ satisfy Conditions (F1) and (F2), and be such that $A(\theta_{0})$ is regular, and let $u_{n}\to\infty$ be chosen according to Condition (U). Then there exists a sequence of random vectors $\hat{\theta}_{n}$ solving (8), such that $\hat{\theta}_{n}\to\theta_{0}$ in probability as $n\to\infty$. This sequence is eventually unique, and, as $n\to\infty$,

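$$\sqrt{n}\,A(\theta_{0})\,\bar{\Lambda}_{n}\,\Gamma_{n}^{-1}(\theta_{0})\,(\hat{\theta}_{n}-\theta_{0})\Rightarrow\mathcal{N}(0,\Sigma(\theta_{0})).$$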

The resulting rate of convergence for the BG index $\alpha=\alpha_{1}$ is thus found to be $(n\log n)^{\frac{\alpha}{4}}$, which improves upon existing estimators and matches the lower bound of Aït-Sahalia and Jacod (2012) up to logarithmic factors. However, the rate matrix of Theorem 2.1 is non-diagonal. The phenomenon of a non-diagonal rate matrix has also been observed in the pure jump case, i.e. $\sigma^{2}=0$, see Brouste and Masuda (2018). We further discuss this aspect and the resulting marginal rates of convergence for $\hat{\alpha}_{m}$ and $\hat{r}_{m}^{\pm}$ in the next section. Nevertheless, the matrices $\Gamma_{n}^{-1}$, $A(\theta_{0})$, and $\Sigma(\theta_{0})$ are block-diagonal, such that the volatility estimator $\hat{\sigma}^{2}$ is asymptotically independent of the estimator of the jump part.
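To get a feeling for the size of the improvement, the rates quoted in this paper can be evaluated numerically. The Python sketch below only plugs numbers into $n^{\alpha/10}$, $n^{\alpha/8}$, $n^{\alpha/4}$ (the limit of $n^{\alpha/4-\epsilon}$ as $\epsilon\to 0$), and $(n\log n)^{\alpha/4}$. Constants are ignored, so the values are indicative of orders of magnitude only, and the function name is hypothetical.

```python
import math

def convergence_rates(n, alpha):
    """Evaluate the convergence rates quoted in the text, ignoring all constants."""
    return {
        "threshold counts, n^(alpha/10)": n ** (alpha / 10),
        "smoothed counts, n^(alpha/8)": n ** (alpha / 8),
        "near-optimal, n^(alpha/4) as eps -> 0": n ** (alpha / 4),
        "Theorem 2.1, (n log n)^(alpha/4)": (n * math.log(n)) ** (alpha / 4),
    }

for alpha in (1.3, 1.7):
    for name, rate in convergence_rates(23_400, alpha).items():
        print(f"alpha={alpha}: {name}: {rate:.1f}")
```

Even at $n=23400$, the sample size behind the $h=1/23400$ rows of Table 1, the polynomial gap between the exponents $\alpha/10$ and $\alpha/4$ is substantial.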

The presented central limit theorem also holds for the fully specified case without nuisance, i.e. $L=0$ in (5). Even in this parametric case, we find that a simple GMM estimator based on $3M+1$ fixed moment functions, corresponding to $u_{n}=1$, will not achieve the best rate of convergence. A careful construction of the estimating equation (8) is thus not only required to handle the nuisance term, but also for the underlying parametric problem itself.

The proposed estimator for $\alpha$ can be contrasted with existing methods in the literature. In an earlier study, Reiß (2013) suggests a test procedure for the value of $\alpha$ based on a statistic $T^{m}_{n}$ with tuning parameter $m\in\mathbb{N}$. Therein, it is established that $T_{n}^{m}\to Q(\alpha)$ as $n\to\infty$ at rate $n^{\frac{\alpha}{4}-\epsilon(m)}$, and $\epsilon(m)\to 0$ as $m\to\infty$. By inverting the function $Q$, this approach yields a near-optimal estimator for $\alpha$. The statistics $T_{n}^{m}$ are constructed based on nonlinear sample moments as in (8), where the $f_{j}$ are linear combinations of trigonometric functions, i.e. $f_{j}(x)=\sum_{k}w_{k,j}\exp(i\lambda_{k}x)$. Choosing the weights $w_{k,j}$ carefully such that $\sum_{k}w_{k,j}\lambda_{k}^{2p}=0$ for $p=1,\ldots,m-1$, Reiß (2013) is able to reduce the variance of the corresponding sample moments. The arbitrarily small defect in the rate of convergence $n^{\alpha/4-\epsilon(m)}$ derived therein is thus due to the sampling variance. In contrast, by choosing the moment functions to vanish near zero according to Condition (F2), we obtain a smaller variance of the sample moments.

An alternative estimator achieving the rate $n^{\alpha/4-\epsilon}$ is presented by Bull (2016), which also uses functions which vanish near zero. Therein, the value $\mathbb{E}\boldsymbol{f}(u_{n}X_{h_{n}})$ is approximated by a finite series expansion, and extending this expansion reduces the rate defect $\epsilon$. In contrast, we use the approximation $\mathbb{E}\boldsymbol{f}(u_{n}X_{h_{n}})\approx\mathbb{E}\boldsymbol{f}(u_{n}\tilde{Z}_{h_{n}})$. Although the latter value is not available in explicit form and needs to be determined numerically, this approach allows us to decrease the bias of the estimating equation further than by any finite series expansion. In particular, we only incur a bias due to approximating the Lévy measure of $X_{t}$, but not due to a discretization of the time evolution of the process. Thus, our method effectively circumvents the variance issue of Reiß (2013) and the bias issue of Bull (2016). This allows us to eliminate the polynomial rate defect and achieve a faster rate of convergence.

3 Asymptotic optimality

It is natural to ask whether our proposed estimator is asymptotically optimal. From Theorem 2.1, we find that

$$\sqrt{n}\,(\hat{\sigma}_{n}^{2}-\sigma^{2})\Rightarrow\mathcal{N}(0,2\sigma^{4}),$$

which matches the optimal estimator in the situation without jumps. That is, $\hat{\sigma}^{2}_{n}$ is efficient. In general, jumps of infinite variation reduce the achievable rate of convergence for volatility estimators (Jacod and Reiss, 2014). Here, we are able to recover efficiency by modeling the infinite variation part of the jump measure explicitly via (5). The same methodology has been applied by Jacod and Todorov (2014, 2016) to construct an efficient estimator of $\sigma^{2}$. Note that the latter studies treat more general types of semimartingales, while we only derived a result for Lévy processes. In contrast to the existing estimators, which use a multi-step debiasing procedure, we determine $\hat{\sigma}^{2}$ by a single set of estimating equations. While our approach is conceptually simple, solving the estimating equations (8) is computationally expensive. A comparison of the finite sample performance is presented in Section 4.
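One practical consequence of such a limit is a plug-in confidence interval for the volatility. The Python sketch below assumes the limit reads $\sqrt{n}\,(\hat{\sigma}_{n}^{2}-\sigma^{2})\Rightarrow\mathcal{N}(0,2\sigma^{4})$ and replaces $\sigma^{4}$ by $\hat{\sigma}^{4}$. The function name is hypothetical, and the interval is asymptotic only.

```python
import math

def volatility_confidence_interval(sigma2_hat, n, z=1.96):
    """Asymptotic interval sigma2_hat +/- z * sqrt(2 / n) * sigma2_hat (z = 1.96 for ~95%)."""
    half_width = z * math.sqrt(2.0) * sigma2_hat / math.sqrt(n)
    return sigma2_hat - half_width, sigma2_hat + half_width

lower, upper = volatility_confidence_interval(sigma2_hat=1.0, n=23_400)
```

The relative half-width $z\sqrt{2/n}$ does not depend on the jump parameters, reflecting the asymptotic independence of $\hat{\sigma}^{2}$ from the estimator of the jump part.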

As the asymptotic variance of the estimators $\hat{\alpha}_{m}$ and $\hat{r}_{m}^{\pm}$ depends on the choice of $\boldsymbol{f}$, they cannot be expected to be variance efficient. Furthermore, they are coupled via $\Gamma_{n}$ and via the matrix $A(\theta_{0})$, which is in general dense. Inspecting the limit in Theorem 2.1, we find that

[TABLE]

To assess these rates of convergence, we may compare with the lower bound of Aït-Sahalia and Jacod, (2012). Therein, the authors compute the diagonal terms of the Fisher information $\mathcal{I}^{n}_{\theta}$ based on $n$ observations of $\tilde{Z}_{1/n}$ for the symmetric case $r_{m}^{+}=r_{m}^{-}=r_{m}$ and $M=2$ . Their analysis of the diagonal entries $\mathcal{I}^{n}_{\alpha_{m},\alpha_{m}}$ and $\mathcal{I}^{n}_{r_{m},r_{m}}$ suggests that an asymptotically optimal estimator $(\hat{\alpha}_{m}^{*},\hat{r}_{m}^{*})$ should satisfy

[TABLE]

Notably, even for $M=1$ , the rates (11) are faster than (10) by a logarithmic factor.

This difference could potentially be explained by the neglected off-diagonal terms of $\mathcal{I}_{\theta}$ . A similar phenomenon occurs in the pure jump case $\sigma^{2}=0$ , $M=1$ , where for any sequence of diagonal matrices $D_{n}$ , the limit of $D_{n}\mathcal{I}^{n}_{(\alpha,r)}D_{n}$ is singular, see (Masuda,, 2015, Thm. 3.4) and (Aït-Sahalia and Jacod,, 2008, Thm. 2). Recently, Brouste and Masuda, (2018) studied this case, and established the LAN property with a non-diagonal rescaling matrix $D_{n}$ . They find that the optimal rate of convergence is slower than suggested by the diagonal entries of the Fisher matrix, by a factor of $\log n$ . A similar phenomenon is observed when estimating the Hurst parameter of a fractional Brownian motion based on high-frequency observations (Brouste and Fukasawa,, 2018). There is no LAN result available for estimation of the BG index in the case $\sigma^{2}>0$ , and a full investigation of the LAN property in the present case is out of scope of this paper. Nevertheless, we can adapt the proof of Aït-Sahalia and Jacod, (2012) to unveil the off-diagonal entries $\mathcal{I}^{n}_{\alpha_{1},r_{1}}$ . It turns out that the diagonally rescaled Fisher matrix is asymptotically singular, just as in the pure-jump case.

Proposition 3.1.

Let $\mathcal{I}^{h}$ denote the Fisher information matrix of $\tilde{Z}_{h}$ with $M=1$ and $\alpha_{1}=\alpha$ , $r_{1}^{+}=r_{1}^{-}=r$ . Then, as $h\to 0$ ,

[TABLE]

In particular, the limiting matrix is singular.

The diagonal entries of the Fisher information matrix should match the optimal rates of convergence in the case where only a single parameter is unknown, e.g. if $(\sigma^{2},r_{1}^{+},r_{1}^{-})$ are known and $\alpha_{1}$ should be estimated. In this situation, a natural version of our estimator is to consider only a single moment function $f$ . Analogous to (8), for any $m\in\{1,\ldots,M\}$ , we may estimate $\alpha_{m}$ as the solution of

[TABLE]

With a slight abuse of notation, we may also estimate $r_{m}^{\pm}$ by the equation $\tilde{F}_{n}(r_{m}^{\pm})=0$ . To distinguish jumps and diffusion, we suppose $f$ satisfies the same conditions as $f_{2},\ldots,f_{3M+1}$ , i.e. it should vanish around zero.

Proposition 3.2.

Let $X_{t}$ be a Lévy process satisfying (5) with some $\rho<\alpha_{1}/2$, and parameter vector $\theta_{0}\in\Theta$. Let $f$ be a non-negative function satisfying (F1) and $f(x)=0$ for $x\in[-\eta,\eta]$, and choose $u_{n}\to\infty$ such that (U) holds. Fix some $m\in\{1,\ldots,M\}$, and suppose that $\mathcal{J}_{\alpha_{m}}^{\pm}f(0)>0$. Then there exists a consistent sequence of estimators $\hat{\alpha}_{m}$ satisfying $\tilde{F}_{n}(\hat{\alpha}_{m})=0$, such that $\hat{\alpha}_{m}\to\alpha_{m}$ in probability as $n\to\infty$, and

[TABLE]

Under the same conditions, and if all parameters except for $r_{m}^{+}$ resp. $r_{m}^{-}$ are known, there exists a consistent sequence of estimators $\hat{r}_{m}^{\pm}$ solving $\tilde{F}_{n}(\hat{r}_{m}^{\pm})=0$ such that, as $n\to\infty$ ,

[TABLE]

Since $u_{n}$ is of order $\sqrt{n/\log n}$ , Proposition 3.2 establishes precisely the rates (11). In the setting of Aït-Sahalia and Jacod, (2012), in particular $M=2$ , this shows that $\hat{\alpha}_{m}$ resp. $\hat{r}_{m}^{\pm}$ are rate efficient if the remaining parameters $\theta$ are known. In contrast, if all parameters $\theta$ are unknown, $\hat{\theta}$ achieves the optimal rate of convergence, up to a logarithmic factor. Due to the singularity of the Fisher matrix, we conjecture that the achieved rates (10) are in fact optimal.

4 Simulation study

By means of a Monte Carlo study, we compare the finite sample performance of our estimator with the estimators of Reiß, (2013) and Bull, (2016) for the Blumenthal-Getoor index $\alpha$ , and with the volatility estimator of Jacod and Todorov, (2014). To this end, we sample paths of a Lévy process $X_{t}$ given by

[TABLE]

We denote by $S^{\alpha,\beta}_{t}$ the $\alpha$-stable Lévy motion with skewness parameter $\beta\in(-1,1)$. That is, the characteristic function of $S^{\alpha,\beta}_{t}$ is given by (see e.g. Zolotarev, (1986))

[TABLE]

The Lévy measure corresponding to this standardization can be expressed in the form (6) with $M=1$, $\frac{r^{+}-r^{-}}{r^{+}+r^{-}}=\beta$, and $(r^{+}+r^{-})=\frac{1}{\Gamma(1-\alpha)\cos(\pi\alpha/2)}$ if $\alpha\neq 1$. Here, we will set $\beta=-1/3$ and study the cases $\alpha=1.3$ and $\alpha=1.7$. Then (5) is satisfied with $\rho=0.5$, such that $S^{0.5,0}_{t}$ is a nuisance term, and $\tilde{Z}_{t}=B_{t}+S_{t}^{\alpha,\beta}$. In view of applications in financial econometrics, we consider the time horizon $T=1$, and sampling frequencies $h=0.2/23400$, $h=1/23400$, and $h=5/23400$. These sampling schemes correspond to $0.2$ resp. $1$ resp. $5$ seconds per quote on a trading day of $6.5$ hours.
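The parameterization above can be checked numerically. The following is a minimal sketch, not part of the original study: it solves the two stated relations for $r^{+}$ and $r^{-}$ and lists the number of observations $n=T/h$ for each sampling scheme. The function name `stable_jump_intensities` and the constant `SECONDS_PER_DAY` are hypothetical and introduced only for illustration.

```python
import math

def stable_jump_intensities(alpha, beta):
    """Solve r+ + r- = 1 / (Gamma(1 - alpha) * cos(pi * alpha / 2))
    and (r+ - r-) / (r+ + r-) = beta for (r+, r-), assuming alpha != 1."""
    if alpha == 1:
        raise ValueError("the stated formula for r+ + r- requires alpha != 1")
    total = 1.0 / (math.gamma(1.0 - alpha) * math.cos(math.pi * alpha / 2.0))
    r_plus = 0.5 * total * (1.0 + beta)
    r_minus = 0.5 * total * (1.0 - beta)
    return r_plus, r_minus

SECONDS_PER_DAY = 23400  # 6.5 trading hours, as in the text

for alpha in (1.3, 1.7):
    r_plus, r_minus = stable_jump_intensities(alpha, beta=-1.0 / 3.0)
    print(f"alpha={alpha}: r+={r_plus:.4f}, r-={r_minus:.4f}")

for seconds in (0.2, 1, 5):
    h = seconds / SECONDS_PER_DAY   # sampling interval for T = 1
    n = round(1.0 / h)              # number of increments on [0, 1]
    print(f"{seconds} s per quote: h={h:.3e}, n={n}")
```

With $\beta=-1/3$ the negative jumps receive twice the weight of the positive ones, since $r^{-}/r^{+}=(1-\beta)/(1+\beta)=2$.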

To determine the solution of the estimating equation (8), we need to compute the moments $\mathbb{E}_{\theta}\boldsymbol{f}(u\tilde{Z}_{h})$ and their gradients. This can be done numerically by means of a continuous Fourier transform since $\mathbb{E}\exp(\boldsymbol{i}\lambda\tilde{Z}_{h})$ is available in closed form. The employed moment functions $f_{1},\ldots,f_{4}$ are handcrafted to satisfy (F1) and (F2). In our simulations, we use

[TABLE]

Note that $f_{2},f_{3},f_{4}$ vanish on $[-1/8,1/8]$. We use the rescaling factor $u=1/\sqrt{h|\log h|}$. Although this choice of $u$ is too large to comply with assumption (U), we found it to perform better than smaller values for the given sampling scenario.
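The Fourier-based evaluation of the model moments can be illustrated as follows. This is only a sketch under simplifying assumptions, not the implementation used in the simulations: it uses the identity $\mathbb{E}f(Y)=\frac{1}{2\pi}\int\hat{f}(\lambda)\,\mathbb{E}e^{i\lambda Y}\,d\lambda$ with $\hat{f}(\lambda)=\int f(x)e^{-i\lambda x}dx$, a Gaussian test function $f(x)=e^{-x^{2}/2}$ (which is not one of the moment functions of the paper), and a symmetric stable component with characteristic exponent $c|\lambda|^{\alpha}$, where the scale `c` is a hypothetical parameter. All function names are illustrative.

```python
import numpy as np

def expected_value_via_fourier(f_hat, char_fn, lam_max=40.0, num=200_001):
    """Approximate E f(Y) = (1 / 2 pi) * integral of f_hat(lam) * E exp(i lam Y)
    by a Riemann sum on [-lam_max, lam_max]."""
    lam = np.linspace(-lam_max, lam_max, num)
    dlam = lam[1] - lam[0]
    integrand = f_hat(lam) * char_fn(lam)
    return float(np.real(np.sum(integrand)) * dlam / (2.0 * np.pi))

def gaussian_f_hat(lam):
    """Fourier transform of the test function f(x) = exp(-x^2 / 2)."""
    return np.sqrt(2.0 * np.pi) * np.exp(-0.5 * lam**2)

def make_char_fn(h, u, sigma2, alpha, c):
    """Characteristic function of u * (sigma * B_h + S_h), where S is assumed
    to be symmetric alpha-stable with E exp(i lam S_h) = exp(-h c |lam|^alpha)."""
    def char_fn(lam):
        scaled = u * lam
        return np.exp(-h * (0.5 * sigma2 * scaled**2 + c * np.abs(scaled)**alpha))
    return char_fn

h = 1.0 / 23400
u = 1.0 / np.sqrt(h * abs(np.log(h)))   # rescaling factor from the text

# Pure diffusion (c = 0): closed form E f(Y) = 1 / sqrt(1 + sigma^2 u^2 h).
numeric = expected_value_via_fourier(gaussian_f_hat, make_char_fn(h, u, 1.0, 1.3, 0.0))
exact = 1.0 / np.sqrt(1.0 + u**2 * h)
print(numeric, exact)

# Adding the (assumed) stable component changes the moment.
with_jumps = expected_value_via_fourier(gaussian_f_hat, make_char_fn(h, u, 1.0, 1.3, 0.5))
print(with_jumps)
```

The pure-diffusion case serves as a correctness check, because the expectation is available in closed form there. Gradients with respect to the parameters could be obtained in the same way by differentiating the characteristic function under the integral.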

The methods of Reiß, (2013) and Bull, (2016) each have a tuning parameter $m\in\mathbb{N}$, and larger values of $m$ increase the rate of convergence. However, smaller values of $m$ can be superior in finite samples. In our simulations, we found that the estimator of Bull performed best when setting $m=3$, and the estimator of Reiß performed best when setting $m=2$, across all observation frequencies. Furthermore, the method of Reiß involves a rescaling parameter $U_{n}$ and two weighting measures $w_{1}$, $w_{2}$. We choose the weighting measure $w_{1}$ to be supported on the set $\{1/m,2/m,\ldots,1\}$, and $w_{2}$ to be supported on the set $\{2/m,4/m,\ldots,2\}$. The truncation parameter is set to $U=h^{-(1-2m)/(4m-1)}$, as suggested by equation (3.8) therein.

In Table 1, we compare the simulated performance of our moment estimator for $\alpha$ and $\sigma^{2}$ with the estimators of Jacod and Todorov, (2014), Reiß, (2013), and Bull, (2016). For the latter two, we choose the best tuning parameter $m$ as specified above. The estimator of Jacod and Todorov, (2014) is implemented as in equation (5.3) therein, with $\zeta=1.5$ and $u=|\log h|^{\frac{1}{30}}$. It is found that the new estimators perform best in the considered setting. The good performance of the estimator of Reiß in the case $\alpha=1.7$ is somewhat surprising, since the analysis of Reiß, (2013) only yields a suboptimal rate of convergence. However, for the latter estimator, no central limit theorem is available. Hence, it is possible that the estimator in fact converges at a rate which is faster than the rate derived by Reiß, (2013). It should also be noted that all benchmarked methods require various tuning parameters. Most notably, all methods require some form of scaling factors. Furthermore, our new estimator depends on the employed moment functions $f_{j}$, and the estimator of Bull, (2016) requires the choice of a truncation kernel function. It is thus possible that a very careful choice of these parameters might affect the ranking implied by Table 1.

The volatility estimator $\hat{\sigma}^{2}$ is efficient, and from (9), the error $\hat{\sigma}^{2}-\sigma^{2}$ should be of order $\sqrt{2\,h\sigma^{4}}$ . From the results of Table 1, we find that this asymptotic performance is not achieved for the considered sample sizes. This defect holds for our proposed estimator as well as for the benchmark method of Jacod and Todorov, (2014), and it is bigger for large values of $\alpha$ . This is potentially due to the relatively large jump component of the simulated process (13). On the other hand, the asymptotic distribution of Theorem 2.1 yields a good approximation of the finite sample behavior of $\hat{\alpha}$ , as shown in Figure 1. Clearly, the match with the asymptotic normal distribution improves for smaller $h$ . Furthermore, the approximation is better for the smaller value $\alpha=1.3$ .
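To relate the asymptotic order $\sqrt{2h\sigma^{4}}$ to the median absolute errors reported in Table 1, one may use that a centered normal variable with standard deviation $s$ has median absolute value $\Phi^{-1}(0.75)\,s\approx 0.6745\,s$. The sketch below computes this benchmark for the three sampling frequencies. It assumes $\sigma^{2}=1$, in line with $\tilde{Z}_{t}=B_{t}+S_{t}^{\alpha,\beta}$, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def asymptotic_median_abs_error(h, sigma2=1.0):
    """Median of |N(0, 2 h sigma^4)|, i.e. the median absolute error
    suggested by the efficient limit distribution of the volatility estimator."""
    std_dev = math.sqrt(2.0 * h * sigma2**2)
    return NormalDist().inv_cdf(0.75) * std_dev

for seconds in (0.2, 1, 5):
    h = seconds / 23400
    print(f"h = {seconds}/23400: benchmark = {asymptotic_median_abs_error(h):.5f}")
```

Values of the simulated median absolute error well above this benchmark indicate that the asymptotic regime has not yet been reached.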

5 Technical tools

In this section, we present the proofs of Theorem 2.1 and Propositions 3.1 and 3.2. Preliminary technical results are presented in Subsection 5.1, as they might be of independent interest, in particular Lemma 5.1 and Corollary 5.3. The detailed proofs are presented in Subsection 5.2.

5.1 Preliminary results

To study the asymptotic behavior of the estimating equation (8) by standard techniques (see e.g. Jacod and Sørensen, (2018)), we need

•

a central limit theorem for the term $\frac{1}{n}\sum_{i=1}^{n}\boldsymbol{f}(u_{n}\Delta_{n,i}X)-\mathbb{E}_{\theta}\boldsymbol{f}(u_{n}\tilde{Z}_{h})$ , and

•

properties of the derivatives $\mathrm{D}_{\theta}\mathbb{E}_{\theta}\boldsymbol{f}(u_{n}\tilde{Z}_{h})$ .

To determine asymptotic variances, as well as for some technical steps of the following proofs, it is useful to derive some explicit approximations of $\mathbb{E}\boldsymbol{f}(u_{n}\tilde{Z}_{h})$ .

Lemma 5.1.

Let $f\in\mathcal{C}^{2}$ be such that $f,f^{\prime}$ and $f^{\prime\prime}$ are bounded and $f(0)=0$ , and let $\tilde{X}_{t}$ be a Lévy process with characteristic triplet $(\mu,\sigma^{2},\tilde{\nu})$ . The implicit constants in the following expressions depend on $f$ and $(\mu,\sigma^{2},\tilde{\nu})$ , but neither on $t$ nor on $u$ . Moreover, all $\mathcal{O}(\cdot)$ and $o(\cdot)$ terms are bounded resp. vanishing uniformly on compacts in $\Theta$ .

(i)

If $f(x)=0$ for $|x|\leq\eta$ , then for any $\lambda\in(0,1)$ such that $u\leq\frac{(1-\lambda)\eta}{\sigma\sqrt{8t|\log t|}}$ , as $t\to 0$ ,

[TABLE]

where

[TABLE]

(ii)

If, alternatively, $f(0)=0$ but $f^{\prime\prime}(0)\neq 0$, then for any $u=o(1/\sqrt{t})$,

[TABLE]

(iii)

If $f(0)=0,f^{\prime\prime}(0)=0$ but $f^{(4)}(0)\neq 0$, and $f^{(3)},f^{(4)}$ are bounded, then for any $u=o(1/\sqrt{t})$,

[TABLE]

(iv)

If $f(0)=0$ and $\mu=0$ , $\sigma^{2}=0$ , then there exists a constant $\tilde{C}$ bounded uniformly on compacts, such that for all $f$ and all $u>1$ , $t\geq 0$ ,

[TABLE]

The case (i), which is exploited in the proofs several times, imposes a subtle upper bound on $u$ . Although this bound need not be sharp, the Lemma will not hold for $u=\tau/\sqrt{t|\log t|}$ if $\tau$ is too large. To make this plausible, note that for an $\alpha$ -stable process $S_{t}^{\alpha}$ , the probability $P(|S_{t}^{\alpha}|\geq\eta\sqrt{t|\log t|}/\tau)$ tends to zero as $t\to 0$ , roughly polynomially in $t$ . On the other hand, for the Brownian motion, $P(|B_{t}|>\eta\sqrt{t|\log t|}/\tau)=P(|B_{1}|>\eta\sqrt{|\log t|}/\tau)\to 0$ polynomially as well, but the polynomial order of this decay will depend on the specific value of $\tau$ . For the jump term to dominate, as in case (i) of Lemma 5.1, $\tau$ must be small. The uniformity w.r.t. $\theta$ of the previous results will be used later on to derive the consistency of the estimator.
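The heuristic comparison of the two decay rates can be made explicit. For $u=\tau/\sqrt{t|\log t|}$, the Gaussian tail bound $P(|B_{1}|>x)\leq 2e^{-x^{2}/2}$ gives a decay of order $t^{\eta^{2}/(2\tau^{2}\sigma^{2})}$, while the jump probability $u^{\alpha}t$ is of order $t^{1-\alpha/2}$, both up to logarithmic factors. The following sketch compares these exponents; it is a rough plausibility check rather than a restatement of the Lemma, and the function names are hypothetical.

```python
import math

def gaussian_tail_exponent(eta, tau, sigma=1.0):
    """Polynomial order (in t) of P(|u sigma B_t| > eta) for
    u = tau / sqrt(t |log t|), ignoring logarithmic factors."""
    return eta**2 / (2.0 * tau**2 * sigma**2)

def jump_exponent(alpha):
    """Polynomial order (in t) of u^alpha * t for the same u."""
    return 1.0 - alpha / 2.0

def jumps_dominate(eta, tau, alpha, sigma=1.0):
    """True if the Gaussian tail decays faster than the jump probability."""
    return gaussian_tail_exponent(eta, tau, sigma) > jump_exponent(alpha)

eta, lam = 1.0 / 8.0, 0.5
tau_lemma = (1.0 - lam) * eta / math.sqrt(8.0)   # largest tau allowed in case (i), sigma = 1
print(jumps_dominate(eta, tau_lemma, alpha=1.3))  # small tau: jumps dominate
print(jumps_dominate(eta, 1.0, alpha=1.3))        # tau = 1: the Gaussian part dominates
```

At the threshold of case (i), the Gaussian exponent equals $4/(1-\lambda)^{2}\geq 4$, which exceeds $1-\alpha/2$ for every $\alpha\in(0,2)$. In contrast, $\tau=1$ with $\eta=1/8$ yields the exponent $1/128$, so that the diffusion term is not negligible in this heuristic.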

Another ingredient to obtain a central limit theorem is a bias bound, i.e. a bound on the error of approximating $\mathbb{E}\boldsymbol{f}(u_{n}\Delta_{n,i}X)$ by $\mathbb{E}_{\theta}\boldsymbol{f}(u_{n}\tilde{Z}_{h})$ . For two random variables $X$ and $Y$ , recall the definition of the 1-Wasserstein metric $d_{W}$ and the total variation distance $d_{TV}$ given by

[TABLE]

where the supremum is taken over all bounded resp. Lipschitz continuous, measurable functions $g:\mathbb{R}\to\mathbb{R}$ . These distances are used in the proof of the following Lemma, which quantifies the error of approximation implied by the local stability assumption (5).
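For real-valued random variables, the 1-Wasserstein distance equals the $L_{1}$ distance between the cumulative distribution functions, a representation which is also used in the proof of Lemma 5.2. The sketch below evaluates both distances for two discrete distributions on a common grid. The total variation distance is normalized here as $\sup_{A}|P(A)-Q(A)|$, which is an assumption and may differ from the convention of the paper by a factor of two. Function names are hypothetical.

```python
import numpy as np

def wasserstein_1d(support, p, q):
    """1-Wasserstein distance between two discrete laws on the same support,
    computed as the integral of |F_p(v) - F_q(v)| dv."""
    support = np.asarray(support, dtype=float)
    order = np.argsort(support)
    x = support[order]
    cdf_p = np.cumsum(np.asarray(p, dtype=float)[order])
    cdf_q = np.cumsum(np.asarray(q, dtype=float)[order])
    # The CDFs are piecewise constant between consecutive support points.
    return float(np.sum(np.abs(cdf_p - cdf_q)[:-1] * np.diff(x)))

def total_variation(p, q):
    """Total variation distance sup_A |P(A) - Q(A)| = 0.5 * sum |p_i - q_i|."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return 0.5 * float(np.sum(np.abs(p - q)))

# Shifting a two-point law by one unit moves every atom by distance 1.
print(wasserstein_1d([0, 1, 2], [0.5, 0.5, 0.0], [0.0, 0.5, 0.5]))
print(total_variation([0.5, 0.5, 0.0], [0.0, 0.5, 0.5]))
```

The example shows that the two distances measure different things: the shifted law is at Wasserstein distance $1$ but at total variation distance $0.5$ in this normalization.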

Lemma 5.2.

Let $X_{t},\tilde{X}_{t}$ be two Lévy processes with characteristic triplets $(\mu,\sigma^{2},\nu)$ and $(\mu,\sigma^{2},\tilde{\nu})$ , respectively. Suppose furthermore that for some $\rho\in(0,1\wedge\alpha)$ ,

[TABLE]

There exists a constant $\tilde{C}$ depending on $L$ , $\rho$ , and $\theta$ , such that for any differentiable function $f:\mathbb{R}\to\mathbb{R}$ , and any $u>1$ ,

[TABLE]

where $\bar{\zeta}=\int\xi(z)(\nu-\tilde{\nu})(dz)\in\mathbb{R}$ . The constant $\tilde{C}$ is bounded on compacts in $\theta\in\Theta$ , $\rho\in(0,1\wedge\alpha)$ , and $L\geq 0$ .

Corollary 5.3.

Let $f\in\mathcal{C}^{3}$ such that $f,f^{\prime},f^{\prime\prime},f^{\prime\prime\prime}$ are bounded and $f^{\prime}\in L_{1}$ . Let $X_{t},\tilde{X}_{t}$ be two Lévy processes with characteristic triplets $(\mu,\sigma^{2},\nu)$ and $(0,\sigma^{2},\tilde{\nu})$ , respectively. Suppose that $\nu$ , $\tilde{\nu}$ satisfy the conditions of Lemma 5.2. Then, as $t\to 0$ ,

[TABLE]

The constant $\tilde{C}$ is bounded on compacts in $\mu\in\mathbb{R}$ , $\theta\in\Theta$ , $\rho\in(0,1\wedge\alpha)$ , and $L\geq 0$ .

Note that the result of Corollary 5.3 cannot be directly formulated in terms of $d_{TV}$ or $d_{W}$, distinguishing it from the results of Mariucci and Reiß, (2018). An alternative bound on the total variation distance between $X_{t}$ and $\tilde{Z}_{t}$ is presented by (Clément and Gloter,, 2018, Proposition 4) and (Amorino and Gloter,, 2019, Proposition 2), stating that $d_{TV}(X_{t},\tilde{Z}_{t})\leq Ct^{1\wedge\frac{1}{\alpha}}\log(t)$ as $t\to 0$. Their assumptions on the Lévy measure $\nu(dz)$ imply that our condition (5) holds, with $\rho\leq(\alpha-1)\vee 0$. Thus, if $\alpha>1$ and $u\ll t^{-1/2}$, our bound (16) is sharper since $tu^{\alpha-1}\ll t^{\frac{3}{2}-\frac{\alpha}{2}}\ll t^{\frac{1}{\alpha}}$. In the case $\alpha\leq 1$, our bound is of the same order of magnitude as the one presented by Clément and Gloter, (2018) and Amorino and Gloter, (2019). Furthermore, our result may also be applied in the case $\rho>\alpha-1$. However, we impose additional smoothness assumptions upon the considered function $f$, which is suitable for our statistical purposes because the moment functions are chosen by the statistician.
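The comparison of the exponents for $\alpha\in(1,2)$ and $u\ll t^{-1/2}$ can be verified directly. The following short computation is added for the reader's convenience and ignores the logarithmic factor in the total variation bound:

```latex
t\,u^{\alpha-1} \;\ll\; t\cdot t^{-\frac{\alpha-1}{2}} \;=\; t^{\frac{3}{2}-\frac{\alpha}{2}},
\qquad
\Bigl(\tfrac{3}{2}-\tfrac{\alpha}{2}\Bigr)-\tfrac{1}{\alpha}
\;=\; \frac{3\alpha-\alpha^{2}-2}{2\alpha}
\;=\; \frac{(\alpha-1)(2-\alpha)}{2\alpha} \;>\; 0 .
```

Since the exponent $\frac{3}{2}-\frac{\alpha}{2}$ strictly exceeds $\frac{1}{\alpha}$ on $(1,2)$, we indeed have $t^{\frac{3}{2}-\frac{\alpha}{2}}\ll t^{\frac{1}{\alpha}}$ as $t\to 0$, with equality of the exponents only at the endpoints $\alpha=1$ and $\alpha=2$.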

To state the remaining technical results, introduce the notation

[TABLE]

such that

[TABLE]

Corollary 5.3 and Lemma 5.1 allow us to derive the following central limit theorem for the estimated moments. In particular, we use Lemma 5.1 to control the sampling variance, and Corollary 5.3 to control the bias.

Lemma 5.4.

Let $nh_{n}=T=1$ be constant, i.e. $h_{n}=1/n$, and choose $u_{n}\to\infty$ according to (U). Let $\boldsymbol{f}$ satisfy (F1) and (F2), and suppose that the Lévy process $X_{t}$ satisfies (5) with some $\rho<\alpha/2$. Then, as $n\to\infty$,

[TABLE]

Note that the rate of convergence for the first moment $f_{1}$ is slower than for $f_{j},j\geq 2$ . This is due to our special choice of $f_{j},j\geq 2$ , which vanish near zero. Hence, these moments are primarily driven by the jump component, which is of a smaller order than the diffusion term. On the other hand, the jump parameters $\alpha_{m},r_{m}^{\pm}$ are harder to identify, i.e. $\partial_{\alpha_{m}}\mathbb{E}_{\theta}\boldsymbol{f}(u\tilde{Z}_{h})\ll\partial_{\sigma^{2}}\mathbb{E}_{\theta}\boldsymbol{f}(u\tilde{Z}_{h})$ . This is established in the following Lemma.

Lemma 5.5.

Let $f\in\mathcal{C}^{2}(\mathbb{R})$ be such that $f,f^{\prime},f^{\prime\prime}$ are bounded. Let $\tilde{X}_{t}$ be a Lévy process with characteristic triplet $(0,\sigma^{2},\tilde{\nu})$ , parameterized by $\theta$ as in (7). Then, as $h\to 0$ , $u\to\infty$ , such that $hu^{2}\to 0$ ,

[TABLE]

and,

[TABLE]

Moreover, if $f$ vanishes on $[-\eta,\eta]$ and $u$ satisfies Condition (U),

[TABLE]

All terms of the form $\mathcal{O}(\cdot)$ and $o(\cdot)$ are bounded resp. vanishing uniformly on compacts in $\Theta$ .

Corollary 5.6.

Let $\boldsymbol{f}$ satisfy (F1) and (F2), and let $\tilde{X}_{t}$ be a Lévy process with characteristic triplet $(0,\sigma^{2},\tilde{\nu})$, parameterized by $\theta$ as in (7). Then, as $h=\frac{1}{n}\to 0$, $u_{n}\to\infty$, such that $u_{n}=o(1/\sqrt{h})$,

[TABLE]

This convergence holds uniformly on compacts in $\theta\in\Theta$ .

These results allow us to establish the consistency of $\hat{\theta}_{n}$. We do not consider global uniqueness of the solution of the estimating equation (8). Hence, we only obtain the existence of a consistent sequence of random variables satisfying the equation.

Lemma 5.7 (Consistency).

Let $X_{t}$ be a Lévy process satisfying (5) with some $\rho<\alpha/2$, and parameter vector $\theta_{0}$. Let $\boldsymbol{f}$ satisfy assumptions (F1), (F2), and (I), and let $u_{n}\to\infty$ be chosen according to (U). There exists a sequence of random vectors $\hat{\theta}_{n}$ solving (8), such that $\hat{\theta}_{n}\to\theta_{0}$ in probability as $n\to\infty$. This sequence is eventually unique, i.e. for any other consistent sequence $\hat{\theta}_{n}^{*}$ solving the estimating equation, it holds $P(\hat{\theta}_{n}\neq\hat{\theta}_{n}^{*})\to 0$.

To obtain a central limit theorem for $\hat{\theta}_{n}$ , we may apply a Taylor expansion to obtain the representation

[TABLE]

where $\widetilde{\mathrm{D}\boldsymbol{f}}_{j,k}=\partial_{\theta_{k}}\mathbb{E}_{\tilde{\theta}^{j}}f_{j}(u_{n}\tilde{Z}_{h})$ for some $\tilde{\theta}^{j}$ on the line segment between $\theta_{0}$ and $\hat{\theta}_{n}$, for $j=1,\ldots,3M+1$. This standard approach allows us to establish Theorem 2.1, as detailed in Subsection 5.2.

5.2 Proofs

Proof of Lemma 5.1.

At the price of changing the term $\mu$ , we may assume w.l.o.g. that $\xi(z)=z\mathds{1}_{|z|\leq 1}$ . In view of the Lévy-Itô decomposition (4), we write

[TABLE]

where $N$ is a Poisson counting measure with intensity $\tilde{\nu}(dz)\otimes ds$ , and $J_{t}^{u}$ denotes the corresponding integral term. The explicit form of $\tilde{\nu}$ allows for computation of $\mu_{u}$ , as

[TABLE]

The term $\log(u)$ is added to cover the case $\alpha_{1}=1$ . This bound on $\mu_{u}$ will be used in the sequel.

To derive the claims of the Lemma, we start with a rough bound for the probability

[TABLE]

The first term tends to zero identically as $t\to 0$ . To study the jump term, choose a bounded, smooth function $g(x)\geq\mathds{1}_{|x|\geq\lambda\eta}$ such that $g(0)=g^{\prime}(0)=0$ . Then by Itô’s formula, and a substitution in the integral, we obtain

[TABLE]

for a constant $\tilde{C}$ which depends on $\boldsymbol{\alpha},\boldsymbol{r}$ and is bounded on compacts in these parameters. The function $g$ can be chosen such that the latter term is finite. Thus, $P(|u\tilde{X}_{t}|>\lambda\eta)=\mathcal{O}(u^{\alpha}t)$, uniformly on compacts in $\boldsymbol{\alpha},\boldsymbol{r}$.

For the Gaussian term in (21), we employ the tail bound

[TABLE]

Now let $a>0$ be such that $u=\frac{(1-\lambda)\eta}{\sqrt{a}\sigma\sqrt{8t|\log t|}}$ . Then

[TABLE]

If $a\geq 1$ , i.e. $u\leq\frac{(1-\lambda)\eta}{\sigma\sqrt{8t|\log t|}}$ , the latter bound is of order less than $\mathcal{O}(u^{\alpha}t)$ , uniformly on compacts. In particular,

[TABLE]

Note that the latter inequality does not hold if $u=\tau/\sqrt{-t\log t}$ for a proportionality factor $\tau$ which is too large.

If $u$ is larger, but $u=o(1/\sqrt{t})$, the bound on $P(|J_{t}^{u}|>\lambda\eta)$ remains unchanged, while we still obtain $P(|u\sigma B_{t}|>\eta)\to 0$ uniformly on compacts. Thus, if we only suppose $u=o(1/\sqrt{t})$, we have $P(|u\tilde{X}_{t}|>\tilde{\eta})\to 0$ uniformly on compacts, for any $\tilde{\eta}>0$, but with a slower rate.

To obtain an asymptotically exact value, we plug the former rough bound into Itô’s formula. In case (i), we have

[TABLE]

Here, we used $\mathbb{E}f^{\prime\prime}(u\tilde{X}_{s})\leq\|f^{\prime\prime}\|_{\infty}P(|u\tilde{X}_{s}|>\eta)=\mathcal{O}(u^{\alpha}t)$ as $f$ vanishes on $[-\eta,\eta]$ . We moreover used that $\mathbb{E}f^{\prime}(u\tilde{X}_{s})=\mathcal{O}(u^{\alpha}t)$ , and $\mu_{u}u=\mathcal{O}(u^{2}t)$ as established previously. These upper bounds hold uniformly on compacts in $\Theta$ . To proceed, note that $\mathcal{J}_{\alpha}^{\pm}f$ is a bounded continuous function, since

[TABLE]

which is furthermore bounded uniformly on compacts in $\alpha$ . By virtue of this boundedness, $u\tilde{X}_{s}\xrightarrow{P}0$ implies $\mathbb{E}\mathcal{J}_{\alpha_{m}}^{\pm}f(u\tilde{X}_{s})=\mathcal{J}_{\alpha_{m}}^{\pm}f(0)+o(1)$ . To ensure that this last approximation holds uniformly on compacts in $\Theta$ , note that $\|(\mathcal{J}_{\alpha_{m}}^{\pm}f)^{\prime}\|_{\infty}=\|\mathcal{J}_{\alpha_{m}}^{\pm}f^{\prime}\|_{\infty}$ is also bounded, such that it suffices to control $\mathbb{E}(|u\tilde{X}_{s}|\wedge 1)$ uniformly. But we already established that for any $\eta$ , $P(|u\tilde{X}_{s}|>\eta)\to 0$ uniformly on compacts in $\Theta$ . Hence,

[TABLE]

uniformly on compacts in $\sigma^{2},\boldsymbol{\alpha},\boldsymbol{r}$ . This proves the first claim.

If, on the other hand, $f(0)=0,f^{\prime\prime}(0)\neq 0$ , a different term dominates in (22). We obtain

[TABLE]

uniformly on compacts in $\Theta$ .

For the case $f^{\prime\prime}(0)=0,f^{(4)}(0)\neq 0$ , we may apply the result of case (ii) to obtain $\mathbb{E}f^{\prime\prime}(u\tilde{X}_{t})=\frac{u^{2}t\sigma^{2}}{2}f^{(4)}(0)+o(u^{2}t)$ , and hence

[TABLE]

For the last claim, we use Itô’s formula again. Recall that the truncation function satisfies $\xi(z)=z$ for $|z|\leq 1$ , and $|\xi(z)|\leq 2$ . Then

[TABLE]

The additional factor $\log(u)$ is introduced to cover the special case $\alpha=1$ when computing the integral $\int_{1/u}^{1}|z|^{-\alpha}dz$ . ∎

Proof of Lemma 5.2.

Choose some $0<\epsilon<\frac{1}{u}$ . The process $X_{t}$ may be decomposed by virtue of the Lévy-Itô decomposition as

[TABLE]

where $(N-\nu)$ is a compensated homogeneous Poisson point process with intensity measure $\nu(dz)$ , such that $J_{t}^{1}$ is a martingale. For $\tilde{X}_{t}$ , we have the analogous decomposition $\tilde{X}_{t}=\mu t+\sigma B_{t}+\tilde{J}^{1}_{t}+\tilde{J}^{2}_{t}+\tilde{J}^{3}_{t}+t\tilde{\zeta}_{\epsilon}$ . Moreover,

[TABLE]

The second integral is finite. Furthermore, integrating by parts,

[TABLE]

which has a limit as $\epsilon\to 0$ if $\rho<1$ . Thus, there exists a real number $\bar{\zeta}$ such that $\zeta_{\epsilon}-\tilde{\zeta}_{\epsilon}\to\bar{\zeta}$ as $\epsilon\to 0$ .

By subadditivity of the total variation distance and the Wasserstein distance,

[TABLE]

We treat all terms in (LABEL:eqn:expect-diff) individually.

Part (i) The small jumps can be handled by noting

[TABLE]

Since $J^{1}_{t}$ and $\tilde{J}^{1}_{t}$ have bounded jumps, we have $\mathbb{E}|J^{1}_{t}|^{2},\mathbb{E}|\tilde{J}^{1}_{t}|^{2}\to 0$ as $\epsilon\to 0$ . Furthermore, $|\bar{\zeta}-(\zeta_{\epsilon}-\tilde{\zeta}_{\epsilon})|\to 0$ as $\epsilon\to 0$ .

Part (ii) As a next step, we study the medium sized jumps $J^{2}_{t}$ . Consider the slightly more general process

[TABLE]

for $0<a<b<1$ . Let $\tilde{J}_{t}^{(a,b]}$ be defined analogously based on $\tilde{X}_{t}$ . These are compound Poisson processes, which can be written as

[TABLE]

where $N_{t}$ is a Poisson counting process with intensity $\eta((a,b])=\nu([-b,-a)\cup(a,b])$ , and the $U_{i}$ are iid random variables with distribution $\frac{\nu(dz)\mathds{1}(a<|z|\leq b)}{\eta((a,b])}$ . Vice versa, the same holds for $\tilde{N}_{t}$ and $\tilde{U}_{i}$ with $\tilde{\eta}((a,b])=\tilde{\nu}([-b,-a)\cup(a,b])$ . Then Theorem 10 and Proposition 3 of Mariucci and Reiß, (2018) for $p=1$ , yield

[TABLE]

We compute

[TABLE]

Recall that $\tilde{\eta}((z,b])=\sum_{m=1}^{M}(r_{m}^{+}+r_{m}^{-})(|z|^{-\alpha_{m}}-b^{-\alpha_{m}})$ . Then there exists a constant $\tilde{C}$ which is bounded on compacts in $\Theta$ and $L$ , such that for $z<b/2$ , and $\alpha=\alpha_{1}$ ,

[TABLE]

In particular, this yields $\mathbb{E}|\tilde{U}_{1}|\leq\tilde{C}a^{1\wedge\alpha}$ for a potentially different constant $\tilde{C}$. Here and in the following, the constant $\tilde{C}$ may vary from line to line, and is bounded on compacts in $\theta$, $L$, and $\rho$.
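The compound Poisson representation of the medium-sized jumps of $\tilde{X}_{t}$ can be illustrated by simulation. The sketch below treats the case $M=1$ and uses the tail mass $\tilde{\eta}((z,b])=(r^{+}+r^{-})(z^{-\alpha}-b^{-\alpha})$ recalled above to draw jump sizes by inversion. It additionally assumes that a jump is positive with probability $r^{+}/(r^{+}+r^{-})$, which is not stated explicitly in this section. The function name and the parameter values are hypothetical.

```python
import numpy as np

def sample_medium_jumps(t, a, b, alpha, r_plus, r_minus, rng):
    """Draw the jumps of size in (a, b] occurring up to time t, for M = 1.
    Returns the array of signed jump sizes; their sum is one draw of the
    compound Poisson variable."""
    tail = a ** (-alpha) - b ** (-alpha)
    intensity = (r_plus + r_minus) * tail            # eta((a, b])
    n_jumps = rng.poisson(intensity * t)
    v = rng.uniform(size=n_jumps)
    # Inverse of z -> P(|U| > z) = (z^-alpha - b^-alpha) / (a^-alpha - b^-alpha).
    sizes = (b ** (-alpha) + v * tail) ** (-1.0 / alpha)
    # Assumed sign distribution: positive with probability r+ / (r+ + r-).
    positive = rng.uniform(size=n_jumps) < r_plus / (r_plus + r_minus)
    return np.where(positive, sizes, -sizes)

rng = np.random.default_rng(0)
jumps = sample_medium_jumps(t=1.0, a=0.01, b=0.5, alpha=1.3,
                            r_plus=0.2, r_minus=0.3, rng=rng)
print(len(jumps), jumps.sum(), np.abs(jumps).mean())
```

For $\alpha>1$ most of the mass sits close to the lower cutoff $a$, so the average absolute jump size is of the order of $a$, in line with the bound $\mathbb{E}|\tilde{U}_{1}|\leq\tilde{C}a^{1\wedge\alpha}$.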

Furthermore, since $\nu$ and $\tilde{\nu}$ are sufficiently similar,

[TABLE]

for $|\xi|\leq 2L(a^{-\rho}+b^{-\rho})\leq 4La^{-\rho}$ . Thus, the second term in (25) is of order $\mathcal{O}(ta^{(1\wedge\alpha)-\rho})$ . Moreover, $|\eta((a,b])|\leq\tilde{C}(a^{-\alpha}+a^{-\rho})=\mathcal{O}(a^{-\alpha})$ for small $a$ , since $\rho<\alpha$ .

We now consider the distance $d_{W}(U_{1},\tilde{U}_{1})$ occurring in (25), which can be expressed in terms of their cumulative distribution functions as

[TABLE]

For $-b\leq v<-a$ , and $b\leq 1$ , it holds

[TABLE]

Recall that $|\eta((a,b])-\tilde{\eta}((a,b])|=\mathcal{O}(a^{-\rho})$ . Furthermore, the assumed similarity of $\nu$ and $\tilde{\nu}$ implies that $|\nu([-b,v])-\tilde{\nu}([-b,v])|\leq L(|v|^{-\rho}+b^{-\rho})\leq 2L|v|^{-\rho}$ , and

[TABLE]

as $a\to 0$ , whenever $b\geq 2a$ . In this case, for $-b\leq v\leq-a$ ,

[TABLE]

The analogous bound holds for $|P(U_{1}>v)-P(\tilde{U}_{1}>v)|$ , when $a\leq v\leq b$ . Now plug (31) into expression (LABEL:eqn:Wasserstein-U) for the Wasserstein distance, to obtain for $a\to 0$ and $a\leq\frac{b}{2}$ ,

[TABLE]

where we used $\rho<1$ . Using (25), we may hence bound,

[TABLE]

This upper bound will be exploited in the rest of the proof. In particular, for $J_{t}^{2}=J_{t}^{(\epsilon,1/u]}$ and $\epsilon$ small enough,

[TABLE]

Part (iii) It remains to study the term in (LABEL:eqn:expect-diff) due to the large jumps. Here, our approach is slightly different as we will not (only) bound a metric distance between $J_{t}^{3}$ and $\tilde{J}_{t}^{3}$ . Define

[TABLE]

and we consider $\left|\mathbb{E}f_{u,t}(J_{t}^{3})-\mathbb{E}f_{u,t}(\tilde{J}_{t}^{3})\right|$ , as suggested by (LABEL:eqn:expect-diff). Since $J_{t}^{3}$ is a Lévy process, Itô’s formula yields

[TABLE]

i.e., $\mathcal{J}^{3}$ is the infinitesimal generator of $J_{t}^{3}$ . Analogously, we denote by $\tilde{\mathcal{J}}^{3}$ the generator of $\tilde{J}_{t}^{3}$ . Then integration by parts yields, for any $x\in\mathbb{R}$ ,

[TABLE]

The same bound holds for the range of integration $z\in(-\infty,-1/u)$ , such that

[TABLE]

Now note that,

[TABLE]

such that by Fubini’s theorem,

[TABLE]

where we performed a linear substitution in the second step. Hence,

[TABLE]

Using this in (34),

[TABLE]

We now study the latter two terms.

Part (iv) The total variation distance can be bounded by noting that $J_{t}^{(1,\infty)}$ and $\tilde{J}_{t}^{(1,\infty)}$ admit only finitely many jumps. The number of their jumps is Poisson distributed, such that

[TABLE]

In particular,

[TABLE]

Moreover,

[TABLE]

Via the same argument, we also obtain

[TABLE]

From (32), we know that

[TABLE]

In combination with (LABEL:eqn:Ef-J3), we thus obtain

[TABLE]

Part (v) Now putting (24), (33), and (40) into (LABEL:eqn:expect-diff), and letting $\epsilon\to 0$ ,

[TABLE]

It can be checked that the upper bounds which are summarized in the constant $\tilde{C}$ all satisfy the desired uniformity on compacts in $\boldsymbol{\alpha}$ , $\boldsymbol{r}$ , $L$ , and $\rho-\alpha<0$ . This concerns the lines (26), (29), (30), (35), (37), (38), (39). ∎

Proof of Corollary 5.3.

Assume $f(0)=0$ without loss of generality. A Taylor expansion yields, for any $a\in\mathbb{R}$ ,

[TABLE]

We denote $\tilde{X}_{t}=\sigma B_{t}+\tilde{J}_{t}$ , where $\tilde{J}_{t}$ is the purely discontinuous component of $\tilde{X}$ . Introduce for any function $g$ the notation $g_{[u]}(x)=\mathbb{E}g(u\sigma B_{t}+x)$ . Then for any $k$ -th derivative, $\|g_{[u]}^{(k)}\|_{\infty}\leq\|g^{(k)}\|_{\infty}$ . In particular, by Lemma 5.1,

[TABLE]

such that

[TABLE]

Moreover, $|\mathbb{E}f(uX_{t})-\mathbb{E}f(u(\tilde{X}_{t}+t\mu-t\bar{\zeta}))|\leq\tilde{C}(tu^{\rho}+t^{2}u^{\alpha+1})$ from Lemma 5.2. Applying (42) for the drift $a=\mu+\zeta_{0}$ , this yields (16). ∎

Proof of Lemma 5.4.

All summands $\boldsymbol{f}(u_{n}\Delta_{n,i}X)$ are iid and bounded and $\tilde{\Lambda}_{n}^{-1}/\sqrt{n}\to 0$, such that the Lindeberg-Feller condition for triangular arrays of independent r.v.s is satisfied (Durrett,, 2005, Thm. 2.4.5). Moreover, the bias is of order $|\mathbb{E}\boldsymbol{f}(u_{n}\Delta_{n,i}X)-\mathbb{E}_{\theta}\boldsymbol{f}(u_{n}\tilde{Z}_{h})|=\mathcal{O}(h_{n}u_{n}^{\rho})$ by Corollary 5.3. If $\rho<\alpha/2$, this is small enough to ensure $\Lambda_{n}^{-1}\sqrt{n}|\mathbb{E}\boldsymbol{f}(u_{n}\Delta_{n,i}X)-\mathbb{E}_{\theta}\boldsymbol{f}(u_{n}\tilde{Z}_{h})|=o(1)$. Hence, the bias is asymptotically negligible.

It thus suffices to check the asymptotic covariance structure. Denote $f_{j,k}(x)=f_{j}(x)f_{k}(x)$ . Then $f_{j,k}$ is smooth and vanishes on $[-\eta,\eta]$ unless $j=1=k$ . Moreover, $f_{1,1}(0)=f_{1,1}^{\prime}(0)=f_{1,1}^{\prime\prime}(0)=0$ and $f_{1,1}^{(4)}(0)=6f_{1}^{\prime\prime}(0)^{2}$ . Corollary 5.3 and Lemma 5.1 yield

[TABLE]

To compute the asymptotic covariance, we further determine

[TABLE]

and for $j\geq 2,k\geq 1$ ,

[TABLE]

These approximations can be summarized as

[TABLE]

This scaling behavior yields $\operatorname{Cov}_{\theta}(\tilde{\Lambda}_{n}^{-1}(\theta)\boldsymbol{f}(\Delta_{n,i}X))\to\Sigma(\theta)$ as $n\to\infty$, and thus the desired central limit theorem. ∎

Proof of Lemma 5.5.

First, assume $f$ to be a Schwartz function with Fourier transform $\hat{f}(\lambda)$ . Then

[TABLE]

where $\psi_{\theta}$ is the Lévy symbol of $\tilde{X}_{h}$ , i.e. $\mathbb{E}_{\theta}\exp(i\lambda\tilde{X}_{h})=\exp(-h\psi_{\theta}(\lambda))$ . In particular, for any entry $\theta_{j}$ of the parameter vector $\theta$ ,

[TABLE]

Integration and differentiation may be exchanged because $f$ is a Schwartz function and $\psi$ has polynomial growth. In particular, via the Lévy-Khintchine formula, the Lévy symbol may be determined as

[TABLE]

The second term appears because the Lévy measure $\tilde{\nu}$ is allowed to be asymmetric. In its expression, we used that $\xi(z)=z$ for $z\in(-1,1)$ , and denote

[TABLE]

Hence, by inverting the Fourier transform,

[TABLE]

So far, we assumed $f$ to be a Schwartz function, but the right hand side of (44) makes sense whenever $f\in\mathcal{C}^{2}$. We can extend the whole equation (44) to this case by approximating $f$ suitably with a sequence of Schwartz functions $f_{n}$, such that $\sup_{|x|\leq K}|f^{(k)}_{n}(x)-f^{(k)}(x)|\to 0$ as $n\to\infty$ for each $K>0$, and $k=0,1,2$, and $\sup_{n}\|f_{n}^{(k)}\|_{\infty}<\infty$. Hence, standard arguments allow us to pass to the limit on both sides of the equation (44).

To handle the asymmetry term $\bar{\xi}_{u}$ , we exploit (43) to derive

[TABLE]

The second integral can be bounded as follows. For any $\epsilon\in(0,1)$ and any $p\neq 1$ , there is a $\tilde{p}$ between $p$ and $1$ such that

[TABLE]

By continuity, the same bound holds for $p=1$ . Thus, we obtain

[TABLE]

Similarly,

[TABLE]

Note also that $\partial_{\sigma^{2}}\overline{\xi}_{u}=0$ .

For specific partial derivatives, we thus have shown that

[TABLE]

For fixed $f$ , the functions $f^{\prime\prime}$ , $\mathcal{J}_{\alpha_{m}}^{\pm}f$ and $\partial_{\alpha_{m}}\mathcal{J}_{\alpha_{m}}^{\pm}f$ are bounded, uniformly on compacts in $\theta$ . Moreover, $P_{\theta}(|u\tilde{X}_{h}|>\eta)\to 0$ uniformly on compacts in $\Theta$ for any $\eta$ , as established in the proof of Lemma 5.1. Therefore, $\mathbb{E}_{\theta}f^{\prime\prime}(u\tilde{X}_{h})\to f^{\prime\prime}(0)$ uniformly on compacts as $h\to 0$ , as well as $\mathbb{E}_{\theta}\mathcal{J}_{\alpha_{m}}^{\pm}f(u\tilde{X}_{h})\to\mathcal{J}_{\alpha_{m}}^{\pm}f(0)$ and $\mathbb{E}_{\theta}\partial_{\alpha_{m}}\mathcal{J}_{\alpha_{m}}^{\pm}f(u\tilde{X}_{h})\to\partial_{\alpha_{m}}\mathcal{J}_{\alpha_{m}}^{\pm}f(0)$ . This completes the proof of (17), and (18) follows analogously by applying a linear transformation to (45). Finally, (19) is a consequence of (45) upon noting that $\mathbb{E}_{\theta}f^{\prime\prime}(u\tilde{X}_{h})=\mathcal{O}(hu^{\alpha})$ , see Lemma 5.1. ∎

Proof of Corollary 5.6.

Since $f_{1}^{\prime}$ is bounded, (17) shows that

[TABLE]

This corresponds to the entries $A(\theta)_{1,k}=0$ for $k\geq 2$ . For $j\geq 2$ , we have $\mathbb{E}_{\theta}f^{\prime}_{j}(u\tilde{X}_{h})=\mathcal{O}(hu^{\alpha})$ by virtue of Lemma 5.1, since $f_{j}$ vanishes near zero. Hence, since $\alpha_{m}>\alpha/2$ and $u\leq\mathcal{O}(\sqrt{h})$ ,

[TABLE]

This corresponds to the entries $A(\theta)_{j,1}=0$ for $j\geq 2$ . In combination with Lemma 5.5, this suffices to establish the convergence (20). ∎

Proof of Lemma 5.7.

Denote the estimating equation (8) as $F_{n}(\hat{\theta}_{n})=0$ , for

[TABLE]

Let $\theta_{0}$ be the true parameters, and reparameterize $\theta=\theta_{0}+\Gamma_{n}(\theta_{0})\bar{\Lambda}^{-1}_{n}(\theta_{0})T$ for $T=\bar{\Lambda}_{n}(\theta_{0})\Gamma_{n}^{-1}(\theta_{0})(\theta-\theta_{0})$ , and let

[TABLE]

This is well defined whenever $T\in B_{d_{n}}(0)$ , for $d_{n}={c}\sqrt{h_{n}}u_{n}^{\alpha_{M}-\frac{\alpha_{1}}{2}}/(\log u_{n})^{3}\to 0$ , and $c>0$ sufficiently small. In this reparameterized model, we need to show that there exists a sequence of random vectors $\hat{T}_{n}\in B_{d_{n}}$ such that $\bar{F}_{n}(\hat{T}_{n})=0$ for large $n$ , and $\Gamma_{n}(\theta_{0})\bar{\Lambda}_{n}^{-1}(\theta_{0})\hat{T}_{n}\to 0$ . This will imply that $\|\hat{\theta}_{n}-\theta_{0}\|\leq C/(\log u_{n})^{2}$ for a sufficiently large factor $C$ .

We know from Lemma 5.4 that

[TABLE]

Furthermore,

[TABLE]

By Corollary 5.6, $\tilde{\Lambda}_{n}^{-1}(\theta)\mathrm{D}_{\theta}F_{n}(\theta)\Gamma_{n}(\theta)\bar{\Lambda}_{n}^{-1}(\theta)\to A(\theta)$ locally uniformly, and it can be checked that $\theta\mapsto A(\theta)$ is continuous. Moreover, the definitions of $\tilde{\Lambda}_{n},\bar{\Lambda}_{n}$ , and $\Gamma_{n}$ readily yield, as $n\to\infty$ ,

[TABLE]

Here, we denote by $\|\cdot\|$ the spectral norm of a matrix, i.e. $\|A\|^{2}$ is the largest eigenvalue of the symmetric matrix $A^{T}A$, and $\mathbf{I}_{d}$ denotes the $d\times d$ identity matrix. Thus,

[TABLE]

Now we apply Lemma 6.2 of Jacod and Sørensen (2018) to establish the existence of a solution $\hat{T}_{n}\in B_{d_{n}}(0)$ of the equation $\bar{F}_{n}(\hat{T}_{n})=0$. Let $\lambda=\frac{1}{2}\|A(\theta_{0})^{-1}\|^{-1}$, and denote by $C_{n}$ the event

[TABLE]

Since the first set is deterministic, and since $\|\bar{F}_{n}(0)\|/d_{n}\xrightarrow{P}0$, we have $P(C_{n})\to 1$. On the set $C_{n}$, it holds that $0\in\overline{B}_{\lambda d_{n}}(\bar{F}_{n}(0))$. Then Lemma 6.2 of Jacod and Sørensen (2018), with $y=0$, $f=\bar{F}_{n}$ and $r=d_{n}$, states that there exists a unique point $\hat{T}_{n}\in\overline{B}_{d_{n}}(0)$ which solves $\bar{F}_{n}(\hat{T}_{n})=0$.
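The spectral norm and the constant $\lambda=\frac{1}{2}\|A(\theta_{0})^{-1}\|^{-1}$ used above can be illustrated numerically. In the following minimal sketch, the matrix `a` is an arbitrary invertible stand-in for $A(\theta_{0})$ and the function names are hypothetical. It shows that $\|A\|$ is the largest singular value of $A$, that $\lambda$ is half the smallest singular value, and that consequently $\|Ax\|\geq 2\lambda\|x\|$ for every vector $x$.

```python
import numpy as np

def spectral_norm(a):
    """Square root of the largest eigenvalue of the symmetric matrix a^T a."""
    eigenvalues = np.linalg.eigvalsh(a.T @ a)  # returned in ascending order
    return float(np.sqrt(eigenvalues[-1]))

def contraction_constant(a):
    """lambda = 0.5 / ||a^{-1}||, i.e. half the smallest singular value of a."""
    return 0.5 / spectral_norm(np.linalg.inv(a))

# Arbitrary invertible example matrix (an assumption for illustration only)
a = np.array([[2.0, 1.0], [0.0, 0.5]])
singular_values = np.linalg.svd(a, compute_uv=False)  # descending order

print(spectral_norm(a), singular_values[0])
print(contraction_constant(a), 0.5 * singular_values[-1])
```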

Returning to the original parametrization, we conclude that there exists a random variable $\hat{\theta}_{n}$ such that, with probability at least $P(C_{n})\to 1$, $\hat{\theta}_{n}$ solves the estimating equation and $\hat{\theta}_{n}-\theta_{0}\in\Gamma_{n}(\theta_{0})\bar{\Lambda}_{n}^{-1}(\theta_{0})\overline{B}_{d_{n}}(0)$, i.e. $\hat{\theta}_{n}-\theta_{0}=\mathcal{O}_{P}(1/\log u_{n})$. Theorem 2.1 below establishes that any consistent sequence $\hat{\theta}_{n}^{*}$ converges at a rate faster than $1/\log u_{n}$, such that $\hat{T}_{n}^{*}=\bar{\Lambda}_{n}(\theta_{0})\Gamma_{n}^{-1}(\theta_{0})(\hat{\theta}_{n}^{*}-\theta_{0})\in\overline{B}_{d_{n}}(0)$ eventually. Hence, the uniqueness of $\hat{T}_{n}$ on $\overline{B}_{d_{n}}(0)$ implies the uniqueness of $\hat{\theta}_{n}$, i.e. $P(\hat{\theta}_{n}^{*}\neq\hat{\theta}_{n})=P(\hat{T}_{n}^{*}\neq\hat{T}_{n})\to 0$. ∎

Proof of Theorem 2.1.

Denote the estimating equation as $F_{n}(\theta)=0$ , for $F_{n}(\theta)$ as in (46). The mean value theorem yields

[TABLE]

where $(\widetilde{F}_{n})_{j,k}=\partial_{\theta_{k}}(F_{n})_{j}(\tilde{\theta}^{j})$ for some $\tilde{\theta}^{j}$ on the line segment between $\theta_{0}$ and $\hat{\theta}_{n}$. Denote by $R_{n}\subset\Omega$ the event that $A_{n}=\tilde{\Lambda}_{n}(\theta_{0})^{-1}\widetilde{F}_{n}\Gamma_{n}(\theta_{0})\bar{\Lambda}_{n}(\theta_{0})^{-1}$ is regular, and introduce furthermore the matrices

[TABLE]

That is, the $j$-th rows of $A_{n}$ and $A_{n}^{j}$ coincide, $(A_{n})_{j,k}=(A_{n}^{j})_{j,k}$. Now note that $\|\tilde{\theta}^{j}-\theta_{0}\|\leq\|\hat{\theta}_{n}-\theta_{0}\|=\mathcal{O}_{P}(1/(\log u_{n})^{2})$, and for any $C>0$, as in (47),

[TABLE]

Together with the locally uniform convergence of Corollary 5.6, this yields $A_{n}^{j}\xrightarrow{P}A(\theta_{0})$ for each $j$ , and thus $A_{n}\xrightarrow{P}A(\theta_{0})$ .

In particular, $P(R_{n})\to 1$ , and on the set $R_{n}$ , we may rewrite

[TABLE]

But $\sqrt{n}\tilde{\Lambda}_{n}^{-1}F_{n}(\theta_{0})\Rightarrow\mathcal{N}(0,\Sigma(\theta_{0}))$ by Lemma 5.4, and $A_{n}^{-1}\to A^{-1}(\theta_{0})$ in probability, such that Slutsky’s lemma completes the proof. ∎
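The structure of this argument (mean value expansion, a central limit theorem for the estimating function, and Slutsky's lemma) can be seen in a one-dimensional toy problem. The sketch below is not the estimator of this paper: it assumes i.i.d. exponential data with rate $\theta_{0}$ and the hypothetical estimating function $F_{n}(\theta)=\bar{X}_{n}-1/\theta$, whose root is $\hat{\theta}_{n}=1/\bar{X}_{n}$. The expansion gives $\hat{\theta}_{n}-\theta_{0}=-F_{n}^{\prime}(\tilde{\theta})^{-1}F_{n}(\theta_{0})$, and replacing $F_{n}^{\prime}(\tilde{\theta})$ by its limit $1/\theta_{0}^{2}$ yields the limit law $\mathcal{N}(0,\theta_{0}^{2})$ for $\sqrt{n}(\hat{\theta}_{n}-\theta_{0})$.

```python
import numpy as np

def simulate(theta0=2.0, n=1000, reps=2000, seed=0):
    """Toy Z-estimator: compare sqrt(n) * (theta_hat - theta0) with its linearization."""
    rng = np.random.default_rng(seed)
    x_bar = rng.exponential(scale=1.0 / theta0, size=(reps, n)).mean(axis=1)
    theta_hat = 1.0 / x_bar                       # root of F_n(theta) = x_bar - 1 / theta
    score = np.sqrt(n) * (x_bar - 1.0 / theta0)   # sqrt(n) * F_n(theta0), approx N(0, theta0^-2)
    derivative = 1.0 / theta0**2                  # F_n'(theta0), the limit of F_n'(theta_tilde)
    linearized = -score / derivative              # Slutsky step: plug in the limiting derivative
    exact = np.sqrt(n) * (theta_hat - theta0)
    return exact, linearized

exact, linearized = simulate()
print("empirical std of sqrt(n) * (theta_hat - theta0):", exact.std())
print("mean absolute linearization error:", np.mean(np.abs(exact - linearized)))
```

The empirical standard deviation should be close to $\theta_{0}=2$, and the linearization error should be small relative to it.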

Proof of Proposition 3.1.

We show how to adjust the proof of Aït-Sahalia and Jacod (2012) to consider the off-diagonal entries. Denote by $\varphi_{\alpha}$ the density of a symmetric $\alpha$-stable random variable, standardized to have Lévy measure $\alpha|x|^{-1-\alpha}dx$. This is the same parametrization as implied by (6). Furthermore, let $\varphi$ be the density of a standard normal distribution. Then the probability density of $\tilde{Z}_{h}$ is given by the convolution

[TABLE]
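A density of this convolution type can be evaluated numerically by Fourier inversion. The following is only a sketch under assumptions that are not taken from the paper: the increment is modelled as $\sigma W_{h}+S_{h}$, where $S$ is a symmetric stable Lévy process with Lévy measure $\alpha|x|^{-1-\alpha}dx$ and $\alpha\in(0,2)$, and no further rescaling is applied. For this Lévy measure the symbol is $c_{\alpha}|\lambda|^{\alpha}$ with $c_{\alpha}=\pi/(\Gamma(\alpha)\sin(\pi\alpha/2))$. Function names and grid parameters are hypothetical.

```python
import math
import numpy as np

def stable_symbol_constant(alpha):
    """c_alpha with psi(lam) = c_alpha * |lam|**alpha for Levy measure alpha * |x|**(-1-alpha) dx."""
    return math.pi / (math.gamma(alpha) * math.sin(math.pi * alpha / 2.0))

def increment_density(x, h, sigma2, alpha, lam_max=200.0, n_grid=200_001):
    """Density of sigma * W_h + S_h (assumed model), i.e. the normal/stable convolution.

    The grid must be wide enough for the characteristic function to be
    negligible at lam_max; this is not checked here.
    """
    lam = np.linspace(0.0, lam_max, n_grid)
    dlam = lam[1] - lam[0]
    char_fn = np.exp(-h * (0.5 * sigma2 * lam**2 + stable_symbol_constant(alpha) * lam**alpha))
    x = np.atleast_1d(np.asarray(x, dtype=float))
    # Symmetric law: p(x) = (1 / pi) * int_0^inf cos(lam * x) * char_fn(lam) dlam
    integrand = np.cos(np.outer(x, lam)) * char_fn
    # Trapezoidal rule along the lam axis
    integral = (integrand.sum(axis=1) - 0.5 * (integrand[:, 0] + integrand[:, -1])) * dlam
    return integral / math.pi

print(increment_density(np.linspace(-1.0, 1.0, 5), h=0.01, sigma2=1.0, alpha=1.5))
```

For $\sigma^{2}=0$ and $\alpha=1$ the result reduces to a Cauchy density with scale $h\pi$, which gives a closed-form check of the normalization constant.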

Now introduce the terms

[TABLE]

and

[TABLE]

Some technical integral transformations, explained in more detail by Aït-Sahalia and Jacod (2012) (cf. (A.3) therein), establish that

[TABLE]

The bulk of the proof given by Aït-Sahalia and Jacod (2012) is devoted to deriving the limiting behavior of $J_{h}^{l,m}$ as $h\to 0$. They show that

[TABLE]

where

[TABLE]

Using furthermore that $v_{h}\to\frac{2}{\alpha(2-\alpha)}$ , this yields

[TABLE]

Some straightforward manipulations show that

[TABLE]

This limiting matrix is singular. The off-diagonal entry $\mathcal{I}_{h}^{\alpha,r}$ has not been considered by Aït-Sahalia and Jacod (2012). ∎

Proof of Proposition 3.2.

Denote the true parameter by $\alpha_{0,m}$ and $r_{0,m}^{\pm}$ , respectively. By Lemma 5.5, we have as $n\to\infty$ , $h=1/n\to 0$ ,

[TABLE]

This convergence holds uniformly on compacts in $\Theta$ . The limits are positive because $r_{m}^{+}+r_{m}^{-}>0$ by the definition of $\Theta$ , and $\mathcal{J}_{\alpha_{m}}^{\pm}f(0)>0$ by assumption. Moreover, Lemma 5.4 also holds for $\tilde{F}_{n}$ , i.e.

[TABLE]

Thus, the existence of a consistent sequence of estimators follows along the same lines as Lemma 5.7.

For the central limit theorem, we use the mean value theorem to obtain, for a value $\tilde{\alpha}_{m}$ between $\alpha_{0,m}$ and $\hat{\alpha}_{m}$ ,

[TABLE]

In particular, $(\hat{\alpha}_{m}-\alpha_{0,m})=-(\partial_{\alpha_{m}}\tilde{F}_{n}(\tilde{\alpha}_{m}))^{-1}\tilde{F}_{n}(\alpha_{0,m})$. Just as in the proof of Theorem 2.1, we may use the convergence of $\partial_{\alpha_{m}}\tilde{F}_{n}(\alpha_{m})$ and the central limit theorem (49) to derive the asymptotic distribution of $\hat{\alpha}_{m}$ by means of Slutsky’s lemma. The argument for $r_{m}^{\pm}$ is analogous. ∎

Bibliography

1. Aït-Sahalia, Y. and Jacod, J. (2008). Fisher’s information for discretely sampled Lévy processes. Econometrica, 76(4):727–761.
2. Aït-Sahalia, Y. and Jacod, J. (2009). Estimating the degree of activity of jumps in high frequency data. The Annals of Statistics, 37(5A):2202–2244.
3. Aït-Sahalia, Y. and Jacod, J. (2012). Identifying the successive Blumenthal–Getoor indices of a discretely observed process. The Annals of Statistics, 40(3):1430–1464.
4. Amorino, C. and Gloter, A. (2018). Contrast function estimation for the drift parameter of ergodic jump diffusion process. arXiv preprint, 1807.08965.
5. Amorino, C. and Gloter, A. (2019). Unbiased truncated quadratic variation for volatility estimation in jump diffusion processes. arXiv preprint, 1904.10660.
6. Andersen, T. G., Benzoni, L., and Lund, J. (2002). An empirical investigation of continuous-time equity return models. The Journal of Finance, 57(3):1239–1284.
7. Blumenthal, R. M. and Getoor, R. K. (1961). Sample functions of stochastic processes with stationary independent increments. Journal of Mathematics and Mechanics, 10(3):493–516.
8. Brouste, A. and Fukasawa, M. (2018). Local asymptotic normality property for fractional Gaussian noise under high-frequency observations. The Annals of Statistics, 46(5):2045–2061.
