Minimax rates for the covariance estimation of multi-dimensional Lévy processes with high-frequency data

Katerina Papagiannouli

arXiv:1903.06585·math.ST·September 24, 2019

TL;DR

This paper develops a spectral estimator for co-integrated volatility in multi-dimensional Lévy processes using high-frequency data, establishing minimax convergence rates and comparing efficiency with existing methods.

Contribution

It introduces a new spectral estimator for co-integrated volatility and proves its minimax optimality for a broad class of Lévy processes.

Findings

01

Convergence rates are 1/√n for r ≤ 1 and (n log n)^{(r-2)/2} for r > 1.

02

The estimator is minimax optimal within the specified class.

03

The co-jump activity index is bounded from below by the harmonic mean of the marginal jump activity indices.

Abstract

This article studies nonparametric methods to estimate the co-integrated volatility for multi-dimensional Lévy processes with high-frequency data. We construct a spectral estimator for the co-integrated volatility and prove minimax rates for an appropriate bounded nonparametric class of semimartingales. Given $n$ observations of increments over intervals of length $1/n$, the rates of convergence are $1/\sqrt{n}$ if $r \leq 1$ and $(n \log n)^{(r-2)/2}$ if $r > 1$, which are optimal in a minimax sense. We bound the co-jump index activity from below with the harmonic mean. Finally, we assess the efficiency of our estimator by comparing it with estimators in the existing literature.

Equations

B(r)=\int_{\mathbb{R}^{2}}\left(1\wedge\|\textbf{x}\|^{r}\right)F(d\textbf{x}),\qquad I=\{0<r<2:B(r)<\infty\},\qquad r^{*}=\inf I

\int_{\mathbb{R}^{2}}\left(1\wedge|x_{1}x_{2}|^{r/2}\right)F(dx_{1},dx_{2})<\infty.

\textbf{X}_{t}=\textbf{b}t+\textbf{W}_{t}+\int_{0}^{t}\int_{\|\textbf{x}\|\leq 1}\textbf{x}\,(\mu-\tilde{\mu})(ds,d\textbf{x})+\int_{0}^{t}\int_{\|\textbf{x}\|>1}\textbf{x}\,\mu(ds,d\textbf{x}).

\mathrm{Var}(W_{t}^{(1)})=\langle W_{t}^{(1)},W_{t}^{(1)}\rangle=t

\mathrm{Var}(W_{t}^{(2)})=\langle W_{t}^{(2)},W_{t}^{(2)}\rangle=\rho^{2}t+(1-\rho^{2})t=t.

\mathrm{Cov}(W_{t}^{(1)},W_{t}^{(2)})=\langle W_{t}^{(1)},W_{t}^{(2)}\rangle=\rho\langle W_{t}^{(1)},W_{t}^{(1)}\rangle+\sqrt{1-\rho^{2}}\langle W_{t}^{(1)},W_{t}^{(3)}\rangle=\rho t;

\Sigma=\begin{pmatrix}\sigma^{(1)}&0\\ \rho\sigma^{(2)}&\sqrt{1-\rho^{2}}\,\sigma^{(2)}\end{pmatrix}\quad\mbox{so that}\quad\Sigma\Sigma^{\top}=\begin{pmatrix}(\sigma^{(1)})^{2}&\rho\sigma^{(1)}\sigma^{(2)}\\ \rho\sigma^{(1)}\sigma^{(2)}&(\sigma^{(2)})^{2}\end{pmatrix}

\langle X_{t}^{(1)},X_{t}^{(2)}\rangle=\int_{0}^{t}\rho\sigma^{(1)}\sigma^{(2)}\,ds+\sum_{s\leq t}\Delta X_{s}^{(1)}\Delta X_{s}^{(2)}

C_{t}^{12}=\int_{0}^{t}\rho\sigma^{(1)}\sigma^{(2)}\,ds.

\int_{\mathbb{R}^{2}}\left(1\wedge|x_{1}x_{2}|^{r/2}\right)F(dx_{1},dx_{2})\leq\int_{\mathbb{R}^{2}}\left(1\wedge(x_{1}^{2}+x_{2}^{2})^{r/2}\right)F(dx_{1},dx_{2})=\int_{\mathbb{R}^{2}}\left(1\wedge\|\textbf{x}\|^{r}\right)F(d\textbf{x}).

r^{*}=\inf\left\{r\in[0,2):\int_{\mathbb{R}^{2}}\left(1\wedge\|\textbf{x}\|^{r}\right)F(d\textbf{x})<\infty\right\}.

\int_{\mathbb{R}^{2}}\left(1\wedge\|\textbf{x}\|^{r}\right)F(d\textbf{x})=\int_{\mathbb{R}}\left(1\wedge|x_{1}|^{r}\right)F_{1}(dx_{1})+\int_{\mathbb{R}}\left(1\wedge|x_{2}|^{r}\right)F_{2}(dx_{2})<\infty,

\int_{\mathbb{R}^{2}}\left(1\wedge|x_{1}x_{2}|^{r/2}\right)\left(\mathds{1}_{\{x_{1}=0\}}+\mathds{1}_{\{x_{2}=0\}}\right)F(dx_{1},dx_{2})=0,

\int_{B_{1}(0)}|x_{1}x_{2}|^{r/2}F(dx_{1},dx_{2})=\int_{B_{1}(0)}|x_{1}x_{2}|^{r/2}F_{1}(dx_{1})F_{2}(dx_{2})\leq\int_{-1}^{1}\left(\int_{-1}^{1}|x_{1}|^{r/2}F_{1}(dx_{1})\right)|x_{2}|^{r/2}F_{2}(dx_{2})<\infty

\|C\|_{\infty}+\int_{\mathbb{R}^{2}}\left(1\wedge|x_{1}x_{2}|^{r/2}\right)F(dx_{1},dx_{2})\leq M.

\phi_{n}(\textbf{u}_{n})=\exp\bigg\{\frac{1}{n}\bigg(i\langle\textbf{u}_{n},\textbf{b}\rangle-\frac{\langle C\textbf{u}_{n},\textbf{u}_{n}\rangle}{2}+\int_{\mathbb{R}^{2}}\big(e^{i\langle\textbf{u}_{n},\textbf{x}\rangle}-1-i\langle\textbf{u}_{n},\textbf{x}\rangle\mathds{1}_{\{\|\textbf{x}\|_{\mathbb{R}^{2}}\leq 1\}}\big)F(d\textbf{x})\bigg)\bigg\},

\langle C\textbf{u}_{n},\textbf{u}_{n}\rangle=C^{11}U_{n}^{2}+C^{22}U_{n}^{2}+2C^{12}U_{n}^{2}

\langle C\tilde{\textbf{u}}_{n},\tilde{\textbf{u}}_{n}\rangle=C^{11}U_{n}^{2}+C^{22}U_{n}^{2}-2C^{12}U_{n}^{2}.

C^{12}=\frac{\langle C\textbf{u}_{n},\textbf{u}_{n}\rangle-\langle C\tilde{\textbf{u}}_{n},\tilde{\textbf{u}}_{n}\rangle}{4U_{n}^{2}}.

\widehat{\phi}_{n}(\textbf{u}_{n})=\frac{1}{n}\sum^{n}_{j=1}e^{i\langle\textbf{u}_{n},\Delta^{n}_{j}\textbf{X}\rangle},\qquad\textbf{u}_{n}\in\mathbb{R}^{2}.

\widehat{C}_{n}^{12}(U_{n})=\frac{n}{2U_{n}^{2}}\left(\log|\widehat{\phi}_{n}(\tilde{\textbf{u}}_{n})|\,\mathds{1}_{\{\widehat{\phi}_{n}(\tilde{\textbf{u}}_{n})\neq 0\}}-\log|\widehat{\phi}_{n}(\textbf{u}_{n})|\,\mathds{1}_{\{\widehat{\phi}_{n}(\textbf{u}_{n})\neq 0\}}\right).

U_{n}=\begin{cases}\sqrt{n}\quad&\mbox{if $r\leq 1$}\\ \sqrt{(r-1)n\log n}/\sqrt{M}\quad&\mbox{if $r>1$}\end{cases}

w_{n}=\begin{cases}1/\sqrt{n}\quad&\mbox{if $r\leq 1$}\\ (n\log n)^{\frac{r-2}{2}}\quad&\mbox{if $r>1$}.\end{cases}

\liminf_{n\to\infty}\,\inf_{\widehat{C}_{n}^{12}}\,\sup_{C^{12}\in\mathcal{L}^{r}_{M}}\mathbb{P}\left[d(\widehat{C}_{n}^{12},C^{12})>Aw_{n}\right]\geq B>0,

U(x_{1},x_{2})=F_{\gamma}\big([x_{1},+\infty)\times[x_{2},+\infty)\big)=C_{\gamma}\big(U_{1}(x_{1}),U_{2}(x_{2})\big),\quad x_{1},x_{2}\in[0,\infty]

C_{\gamma}(u_{1},u_{2})=\gamma C_{\bot}(u_{1},u_{2})+(1-\gamma)C_{\parallel}(u_{1},u_{2}),

r>\frac{2r_{1}r_{2}}{r_{1}+r_{2}}\geq r_{1}\wedge r_{2}

F^{(i)}(dx_{i})=c_{i}x_{i}^{-1-r_{i}}\mathds{1}(x_{i}>0)\,dx_{i}


Full text

Minimax rates for the covariance estimation of multi-dimensional Lévy processes with high-frequency data

Katerina Papagiannouli

Institut für Mathematik

Humboldt-Universität zu Berlin

Unter den Linden 6

10099 Berlin

Germany

Abstract

This article studies nonparametric methods to estimate the co-integrated volatility for multi-dimensional Lévy processes with high frequency data. We construct a spectral estimator for the co-integrated volatility and prove minimax rates for an appropriate bounded nonparametric class of Lévy processes. Given $n$ observations of increments over intervals of length $1/n$ , the rates of convergence are $1/\sqrt{n}$ if $r\leq 1$ and $(n\log n)^{(r-2)/2}$ if $r>1$ , where $r$ is the co-jump index activity and corresponds to the intensity of dependent jumps. These rates are optimal in a minimax sense. We bound the co-jump index activity from below with the harmonic mean. Finally, we assess the efficiency of our estimator by comparing it with estimators in the existing literature.

MSC classes: 60G51, 62G05, 62G10, 62C20, 60J75

Keywords: co-jumps, infinite variation, co-integrated volatility, high-frequency data

1 Introduction

Lévy processes are the main building blocks for stochastic continuous-time jump models. Whenever the modeling of a stochastic process in finance requires the inclusion of jumps, Lévy processes are those to be considered. They play an instrumental role, for example, in the modeling of financial data, see Carr et al. (2002); Barndorff-Nielsen and Shephard (2004, 2006); Wu (2007); Eberlein and Papapantoleon (2005); Geman (2002).

Consequently, the large number of applications has given rise to a great demand for statistical methods in the study of Lévy processes, especially nonparametric methods, which relax the dependence on a particular model. The problem of estimating the characteristics of a Lévy process has received considerable attention over the past decade. Starting with the work by Belomestny and Reiß (2006), a number of articles have considered nonparametric estimation methods for Lévy processes. One important task is therefore to provide estimation methods for the characteristics of a Lévy process.

Moreover, statistical methods require the nature of the observation schemes to be classified as high frequency or low frequency; here, we focus on a high frequency setting. If we can assume high-frequency observations for a Lévy process, we can discretize a natural estimator based on continuous-time observations, where the jumps and the diffusion part are observed directly. In recent years, the literature on this subject has grown extensively, see Figueroa-Lopez and Houdré (2004); Todorov and Tauchen (2011); Coca (2018); Comte and Genon-Catalot (2009); Neumann and Reiß (2009). We now have vast amounts of data on the prices of various assets, exchange rates, and so on, typically tick data which are recorded at every transaction time.

Much has been written on the estimation of the Lévy density using nonparametric techniques, for instance Nickl et al. (2016), Duval and Mariucci (2017), Comte and Genon-Catalot (2014) and the references therein. However, we are interested in the estimation of the continuous part of a Lévy process, although jumps still play a central role in this estimation. In the univariate context, the seminal works of Andersen and Bollerslev (1998) and Barndorff-Nielsen and Shephard (2002) proposed realized variance as an estimator for quadratic variation. In the presence of jumps, a well-known theoretical result proves that the realized variation converges in probability to the global quadratic variation as the time between two consecutive observations tends towards zero. This result motivated estimators that filter out the jumps, like Bipower Variation by Barndorff-Nielsen and Shephard (2004) and Truncated Realized Variation by Mancini (2009).

In the multivariate context, the recovery of co-integrated volatility (also known as covariance) becomes more complicated. Among various prominent works see Christensen et al. (2013), Bibinger and Vetter (2015), Bibinger and Winkelmann (2015). For models incorporating jumps, the realized covariation converges in probability to the global quadratic variation containing the co-jumps. Co-jumps refer to the case when the underlying processes jump at the same time in the same direction. This raises the question of how we can assess the dependence structure among the jump components. We find the answer in the Lévy copula, a subject studied by Tankov (2004) in his PhD thesis. The interested reader should refer to Tankov and Cont (2003) and Kallsen and Tankov (2006). The Lévy copula is the basic tool for the class of multidimensional Lévy processes. To mention only the few approaches which are close to our focus on Lévy processes, we refer to Mancini (2017), Christensen et al. (2013), Martin and Vetter (2017), Bibinger et al. (2014), Jacod and Reiß (2014), Bücher and Vetter (2013) and Belomestny and Trabs (2018).

Our aim in the present work is to provide minimax rates of convergence for the estimation of co-integrated volatility when the underlying process belongs to a certain class of multi-dimensional Lévy processes. Many features of co-integrated volatility have already been studied, such as asynchronous observations, microstructure noise, and dependency among the jump components. Whereas most of the aforementioned results prove central limit theorems for their estimators, to the best of our knowledge no work has dealt with optimal rates of convergence in the minimax sense. This work serves to fill this gap. Jacod and Reiß (2014) proposed a spectral estimator for integrated volatility achieving minimax rates. In the present work, we generalize their work to finite dimensions. For the sake of simplicity, we concentrate primarily on the two-dimensional case, but extensions to the general multi-dimensional setting are straightforward to obtain as well.

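For this purpose let us define, for a two-dimensional Lévy process $\textbf{X}$, the Blumenthal-Getoor index $r^{*}$:

$$B(r)=\int_{\mathbb{R}^{2}}\left(1\wedge\|\textbf{x}\|^{r}\right)F(d\textbf{x}),\qquad I=\{0<r<2:B(r)<\infty\},\qquad r^{*}=\inf I, \tag{1.1}$$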

where $\textbf{x}=(x_{1},x_{2})\in\mathbb{R}^{2}$ is the vector of jump sizes, $\|\cdot\|$ is the Euclidean norm in $\mathbb{R}^{2}$, and $F$ is the Lévy measure. The quantity $B(r)$ is not of interest in itself, but the BG index is the infimum of the values $r$ for which $B(r)$ is finite. This index is central to the study of Lévy processes because it describes the behavior of the small jump components around $0$. A two-dimensional Lévy process has either independent jumps (i.e. disjoint jumps, or jumps on the axes) or dependent jumps (i.e. co-jumps, or joint jumps). In the present work we focus on the case of co-jumps, when the two marginals jump at the same time in the same direction. The index $r^{*}$ therefore carries information about both the disjoint and the joint small jumps around $0$. The behavior of co-jumps around $0$ is described by

$$\int_{\mathbb{R}^{2}}\left(1\wedge|x_{1}x_{2}|^{r/2}\right)F(dx_{1},dx_{2})<\infty. \tag{1.2}$$

Here, we are interested in investigating the optimal rates for the estimation of co-integrated volatility when the model falls in a class of two-dimensional Lévy processes, in case the jump components are either of finite or infinite variation and satisfy (1.2). Let $X=(X^{(1)},X^{(2)})$ and $r_{1}$ , $r_{2}$ be the index of jump activity for the small jump components of each process $X^{(1)},X^{(2)}$ respectively. We find that $r$ , the index activity of co-jumps, is bounded from below by the harmonic mean of $r_{1},r_{2}$ , even in the case of infinite variation jumps. This was not known up to now. Under this assumption for co-jumps we show that our spectral estimate for co-integrated volatility converges at a rate $(n\log n)^{\frac{(r-2)}{2}}$ if $r>1$ and $\frac{1}{\sqrt{n}}$ if $r\leq 1$ .

Assuming a 2-dimensional Itô semimartingale, Mancini (2017) proposed a truncated covariance estimator that estimates the co-integrated volatility at the rate $\frac{1}{\sqrt{n}}$ when $r_{1}$ is small and $r_{2}$ is close to 1, at the rate $n^{-\frac{1}{2}\big{(}1+\frac{r_{2}}{r_{1}}-r_{2}\big{)}}$ when $r_{1},r_{2}$ are much bigger than $1$ and close to $2$, and at the rate $n^{\big{(}\frac{r_{2}}{2}-1\big{)}}$ when $r_{1}$ is small and $r_{2}$ is much bigger than $r_{1}$, or in the case of independent small jump components. However, these rates are sub-optimal for the class described in the last paragraph.

Let us describe the outline of this paper. In Section 2 we state the underlying model. In Section 3 we give the assumptions needed to prove the minimax rates. In Section 4 we construct our spectral estimator and state the results of this work. Section 5 gives the insight behind the co-jump index activity. In Section 6 we prove the upper bound for the family of our estimators. In Section 7 we present the proof of the lower bound in a minimax sense. We compare our estimator with existing estimators in the literature in Section 8. In the last section we provide a simulation study.

2 The underlying model

We assume equidistant discrete observations at times $i\Delta_{n}$, $i=0,\dots,n$, for a mesh $\Delta_{n}\to 0$. Here, we use the mesh $\Delta_{n}=\frac{1}{n}$ with $n\to\infty$, so that the process is observed on the finite time span $[0,1]$. Let $\textbf{X}=(X^{(1)},X^{(2)})^{\top}$ be a two-dimensional Lévy process with Lévy-Itô decomposition

$$\textbf{X}_{t}=\textbf{b}t+\textbf{W}_{t}+\int_{0}^{t}\int_{\|\textbf{x}\|\leq 1}\textbf{x}\,(\mu-\tilde{\mu})(ds,d\textbf{x})+\int_{0}^{t}\int_{\|\textbf{x}\|>1}\textbf{x}\,\mu(ds,d\textbf{x}).$$

Unless stated otherwise, from now on $\textbf{b}$ is a drift vector in $\mathbb{R}^{2}$, $\textbf{W}=(W^{(1)},W^{(2)})$ denotes a bivariate Brownian motion with covariance matrix $\Sigma\Sigma^{\top}$, and $\mu,\tilde{\mu}$ are the jump measure and its compensator, respectively. The compensator takes the form $\tilde{\mu}=dsF(d\textbf{x})$, where $F$ is the Lévy measure of $\textbf{X}$.

Due to the independence of the continuous part and the discontinuous (jump) part of a Lévy process, the analysis of $\textbf{X}$ canonically splits into the inference on the covariance matrix and the inference on the jump measure $F$. Our focus in this paper is to investigate an estimator for the co-integrated volatility of $\textbf{X}$.

We assume a filtered probability space $(\Omega,\mathcal{F},\left(\mathcal{F}_{t}\right)_{t\geq 0},\mathbb{P})$ supporting two independent standard Brownian motions $W^{(1)},W^{(3)}$ and two Poisson random measures $\mu^{(j)}$ for $j=1,2$ on $\mathbb{R}^{2}\times[0,1]$. Recall that $W^{(1)},W^{(2)}$ are correlated with $d\langle W^{(1)},W^{(2)}\rangle_{t}=\rho\,dt$, where $\rho$ is a constant in $[0,1]$. We construct $W^{(2)}$ as a linear combination of the two independent Brownian motions, $W^{(2)}_{t}=\rho W^{(1)}_{t}+\sqrt{1-\rho^{2}}W^{(3)}_{t}$. Calculating the variances of $W^{(1)}$ and $W^{(2)}$, we see that

$$\mathrm{Var}(W_{t}^{(1)})=\langle W_{t}^{(1)},W_{t}^{(1)}\rangle=t,$$

$$\mathrm{Var}(W_{t}^{(2)})=\langle W_{t}^{(2)},W_{t}^{(2)}\rangle=\rho^{2}t+(1-\rho^{2})t=t.$$

For the covariance we obtain

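$$\mathrm{Cov}(W_{t}^{(1)},W_{t}^{(2)})=\langle W_{t}^{(1)},W_{t}^{(2)}\rangle=\rho\langle W_{t}^{(1)},W_{t}^{(1)}\rangle+\sqrt{1-\rho^{2}}\langle W_{t}^{(1)},W_{t}^{(3)}\rangle=\rho t;$$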

the last equality holds because of $W^{(1)},W^{(3)}$ being independent. So, without loss of generality we assume that

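$$\Sigma=\begin{pmatrix}\sigma^{(1)}&0\\ \rho\sigma^{(2)}&\sqrt{1-\rho^{2}}\,\sigma^{(2)}\end{pmatrix}\quad\mbox{so that}\quad\Sigma\Sigma^{\top}=\begin{pmatrix}(\sigma^{(1)})^{2}&\rho\sigma^{(1)}\sigma^{(2)}\\ \rho\sigma^{(1)}\sigma^{(2)}&(\sigma^{(2)})^{2}\end{pmatrix},$$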

where $\sigma^{(i)},i=1,2$ are deterministic. Therefore, the global quadratic variation of X is given by:

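$$\langle X_{t}^{(1)},X_{t}^{(2)}\rangle=\int_{0}^{t}\rho\sigma^{(1)}\sigma^{(2)}\,ds+\sum_{s\leq t}\Delta X_{s}^{(1)}\Delta X_{s}^{(2)},$$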

where the first term is the co-integrated volatility and the second term is the sum of products of simultaneous jumps (called co-jumps). Our target of inference, the co-integrated volatility at time $0\leq t\leq 1$ , is

$$C_{t}^{12}=\int_{0}^{t}\rho\sigma^{(1)}\sigma^{(2)}\,ds.$$
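The construction of $W^{(2)}$ from two independent Brownian motions can be checked numerically. The following is a minimal sketch and not part of the paper: the function name `correlated_bm_increments`, the seed and the parameter values are illustrative assumptions, and jumps are deliberately omitted, so the realized covariation over $[0,1]$ should be close to $C_{1}^{12}=\rho\sigma^{(1)}\sigma^{(2)}$.

```python
import numpy as np


def correlated_bm_increments(n, rho, sigma1=1.0, sigma2=1.0, seed=0):
    """Simulate n increments of (sigma1*W1, sigma2*W2) on a grid of mesh 1/n.

    W2 is built as rho*W1 + sqrt(1 - rho^2)*W3 with W1, W3 independent,
    mirroring the construction in the text. No jumps are simulated.
    """
    rng = np.random.default_rng(seed)
    dt = 1.0 / n
    dW1 = rng.normal(0.0, np.sqrt(dt), size=n)
    dW3 = rng.normal(0.0, np.sqrt(dt), size=n)
    dW2 = rho * dW1 + np.sqrt(1.0 - rho**2) * dW3
    return np.column_stack((sigma1 * dW1, sigma2 * dW2))


# Illustrative parameters (assumed, not taken from the paper).
increments = correlated_bm_increments(n=100_000, rho=0.6, sigma1=1.0, sigma2=2.0)

# Realized covariation: sum of products of simultaneous increments.
realized_cov = float(np.sum(increments[:, 0] * increments[:, 1]))
target = 0.6 * 1.0 * 2.0  # rho * sigma1 * sigma2
print(realized_cov, target)
```

Without jumps the realized covariation is a natural benchmark; once co-jumps are present it also picks up the sum of products of simultaneous jumps, which is what motivates a different estimator.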

3 Assumptions

To derive an estimator for co-integrated volatility and then prove a minimax bound for this estimator, we need some assumptions regarding the behavior of small jumps and the class of processes under consideration. In particular, our setup is intrinsically nonparametric and related to the properties of the observed path. We use the following notation for a matrix: $\|\cdot\|_{\infty}$ is the maximum absolute row sum of the matrix (i.e. the $\infty$-norm).

Assumption (1-M).

The $\infty$ -norm of the covariance matrix is assumed to be bounded, i.e. $\|\Sigma\Sigma^{\top}\|_{\infty}\leq M$ .

Assumption (2-M).

$\int_{\mathbb{R}^{2}}\left(1\wedge|x_{1}x_{2}|^{r/2}\right)F(dx_{1},dx_{2})\leq M$ , where $r\in[0,2)$ .

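Notice that Assumption (2-M) follows from the classical condition that controls the activity of small jumps in two dimensions. Indeed, since $|x_{1}x_{2}|\leq x_{1}^{2}+x_{2}^{2}$, a trivial calculation gives

$$\int_{\mathbb{R}^{2}}\left(1\wedge|x_{1}x_{2}|^{r/2}\right)F(dx_{1},dx_{2})\leq\int_{\mathbb{R}^{2}}\left(1\wedge(x_{1}^{2}+x_{2}^{2})^{r/2}\right)F(dx_{1},dx_{2})=\int_{\mathbb{R}^{2}}\left(1\wedge\|\textbf{x}\|^{r}\right)F(d\textbf{x}).$$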

By using this unconventional Assumption (2-M), we relax the classical condition for small jumps in two dimensions and make our results stronger, since we consider the case of dependent jumps.

Assumption (2-M) concerns the behavior of jump components of size smaller than or equal to one. With this assumption we control the activity of co-jumps, i.e. joint jumps. Below, a co-jump, say at time $t$, means that both components jump at this time, but their jump sizes may not be the same. Ultimately, we are asking whether the small jump components are of finite or infinite variation. This question concerns the behavior of the Lévy measure $F$ near $0$. The major difficulties here come from the possibly erratic behavior of $F$ near $0$ and the possible dependence between the jump components. In Section 5 we describe in more detail the dependence structure of the jump components and the co-jump index activity $r$.

The Blumenthal-Getoor (BG) index allows us to classify the processes from least active to most active, according to the above description. We denote by $r^{*}$ the BG index for a two-dimensional Lévy process which satisfies:

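$$r^{*}=\inf\left\{r\in[0,2):\int_{\mathbb{R}^{2}}\left(1\wedge\|\textbf{x}\|^{r}\right)F(d\textbf{x})<\infty\right\}.$$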

Note that a stable Lévy process of index $\beta\in(0,2)$ satisfies $\int_{\mathbb{R}^{2}}\left(1\wedge||\textbf{x}||^{r}\right)F(d\textbf{x})<\infty$ for all $r>\beta$, but not for $r\leq\beta$. The BG index of a $\beta$-stable process is therefore exactly $\beta$.
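The threshold at $r=\beta$ can be seen from a short calculation. As a simplified illustration (an assumption made here for readability, not the two-dimensional setting of the text), take a one-dimensional, one-sided $\beta$-stable Lévy measure $F(dx)=c\,x^{-1-\beta}\mathds{1}(x>0)\,dx$ with a constant $c>0$:

```latex
\begin{align*}
\int_{0}^{\infty}\left(1\wedge x^{r}\right)c\,x^{-1-\beta}\,dx
  &= c\int_{0}^{1}x^{r-\beta-1}\,dx + c\int_{1}^{\infty}x^{-1-\beta}\,dx \\
  &= \begin{cases}
       \dfrac{c}{r-\beta}+\dfrac{c}{\beta} & \text{if } r>\beta,\\[2ex]
       +\infty & \text{if } r\leq\beta.
     \end{cases}
\end{align*}
```

The large-jump part is always finite, so only the small jumps near $0$ decide the outcome, and the infimum of the admissible $r$ is $\beta$.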

The problem of BG index estimation from discrete observations of a Lévy process has drawn much attention in the literature. In the case of high-frequency data, Aït-Sahalia and Jacod (2011) studied the problem of estimating the jump activity index that is defined for any Itô semimartingale. A consistent estimator for the BG index based on one-dimensional Lévy processes with low-frequency data was obtained in Belomestny (2010). The interested reader should refer to Belomestny and Reiß (2015), Section 7 for a detailed review of these results. An extension to time-changed Lévy processes can be found in Belomestny and Panov (2013a, b).

Now we will test Assumption (2-M) about the boundedness of small co-jumps with some simple examples. Despite its simple nature, the following example offers significant insight into co-jumps with infinite variation.

Example 3.1.

Suppose we have independent jumps in the coordinates and $F\left(d\textbf{x}\right)$ is a Lévy measure on $\mathbb{R}^{2}$ and x is a vector in $\mathbb{R}^{2}$ . Then, $supp(F)\subseteq\left\{\mathbb{R}\times\left\{0\right\}\cup\left\{0\right\}\times\mathbb{R}\right\}$ which means that

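$$\int_{\mathbb{R}^{2}}\left(1\wedge\|\textbf{x}\|^{r}\right)F(d\textbf{x})=\int_{\mathbb{R}}\left(1\wedge|x_{1}|^{r}\right)F_{1}(dx_{1})+\int_{\mathbb{R}}\left(1\wedge|x_{2}|^{r}\right)F_{2}(dx_{2})<\infty,$$

provided the corresponding integrals of the marginal Lévy measures $F_{1},F_{2}$ are finite, as in the one-dimensional case; see the assumption section in Jacod and Reiß (2014).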

In this example, we notice that

$$\int_{\mathbb{R}^{2}}\left(1\wedge|x_{1}x_{2}|^{r/2}\right)\left(\mathds{1}_{\{x_{1}=0\}}+\mathds{1}_{\{x_{2}=0\}}\right)F(dx_{1},dx_{2})=0,$$

since the integrand is always equal to zero. This means that the deterministic error of our estimator is equal to zero, see Section 6.2 for further details. This example shows us something more: whenever we have independent jumps, no matter the choice of $F$, we can always find a control for the activity of small jumps, even if the jumps are of infinite variation.

Example 3.2.

Suppose we have independent jump size distribution, which means that $F(dx)=F_{1}(dx_{1})F_{2}(dx_{2})$ , so

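$$\int_{B_{1}(0)}|x_{1}x_{2}|^{r/2}F(dx_{1},dx_{2})=\int_{B_{1}(0)}|x_{1}x_{2}|^{r/2}F_{1}(dx_{1})F_{2}(dx_{2})\leq\int_{-1}^{1}\left(\int_{-1}^{1}|x_{1}|^{r/2}F_{1}(dx_{1})\right)|x_{2}|^{r/2}F_{2}(dx_{2})<\infty$$

if and only if $\int^{1}_{-1}|x_{i}|^{r/2}F_{i}(dx_{i})<\infty$ for $i=1,2$, in which case Assumption (2-M) holds.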

4 Theoretical results

We use standard notation for asymptotic quantities like $X_{n}=O_{\mathbb{P}}(w_{n})$ if $(X_{n}/w_{n})_{n\geq 1}$ is stochastically bounded (i.e. bounded in probability or tight). We are in a nonparametric setting in which the process X belongs to the class $\mathcal{L}^{r}_{M}$ . Let us now define this class.

Definition 4.1.

For $M>0$ and $r\in[0,2)$ , we define the class $\mathcal{L}^{r}_{M}$ , the set of all Lévy processes, satisfying

$$\|C\|_{\infty}+\int_{\mathbb{R}^{2}}\left(1\wedge|x_{1}x_{2}|^{r/2}\right)F(dx_{1},dx_{2})\leq M.$$

We adapt an estimator proposed by Jacod and Reiß (2014). Specifically, we let X be a two-dimensional Lévy process with characteristic triplet $(\textbf{b},C,F)$ . Let us remember that we are in a high-frequency setting and the consecutive time between two observations is $\frac{1}{n}$ . The characteristic function of $\textbf{X}_{1/n}$ is given by:

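$$\phi_{n}(\textbf{u}_{n})=\exp\bigg\{\frac{1}{n}\bigg(i\langle\textbf{u}_{n},\textbf{b}\rangle-\frac{\langle C\textbf{u}_{n},\textbf{u}_{n}\rangle}{2}+\int_{\mathbb{R}^{2}}\big(e^{i\langle\textbf{u}_{n},\textbf{x}\rangle}-1-i\langle\textbf{u}_{n},\textbf{x}\rangle\mathds{1}_{\{\|\textbf{x}\|_{\mathbb{R}^{2}}\leq 1\}}\big)F(d\textbf{x})\bigg)\bigg\},$$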

where $C=\left(\begin{smallmatrix}C^{11}&C^{12}\\ C^{21}&C^{22}\end{smallmatrix}\right)$ is the covariance matrix and $\textbf{u}_{n}=(U_{n},U_{n})$. In the same vein, we define the characteristic function $\phi_{n}(\tilde{\textbf{u}}_{n})$ where $\tilde{\textbf{u}}_{n}=(U_{n},-U_{n})$. Here, we focus on evaluating the characteristic function on the diagonals of the first and fourth quadrants in order to simplify our calculations. The results still hold when we move away from the diagonals. A direct calculation gives

$$\langle C\textbf{u}_{n},\textbf{u}_{n}\rangle=C^{11}U_{n}^{2}+C^{22}U_{n}^{2}+2C^{12}U_{n}^{2},$$

$$\langle C\tilde{\textbf{u}}_{n},\tilde{\textbf{u}}_{n}\rangle=C^{11}U_{n}^{2}+C^{22}U_{n}^{2}-2C^{12}U_{n}^{2}.$$

So, the covariance is given by

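$$C^{12}=\frac{\langle C\textbf{u}_{n},\textbf{u}_{n}\rangle-\langle C\tilde{\textbf{u}}_{n},\tilde{\textbf{u}}_{n}\rangle}{4U_{n}^{2}}.$$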
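To connect this identity with characteristic functions, the following sketch may help. It assumes the usual Lévy-Khintchine representation of $\phi_{n}$ for an increment over a time step $1/n$ and a symmetric matrix $C$; the splitting into a Gaussian term and a jump term is written out here for illustration and is not quoted from the paper.

```latex
\begin{align*}
\log|\phi_{n}(\textbf{u})|
  &= \frac{1}{n}\Big(-\tfrac{1}{2}\langle C\textbf{u},\textbf{u}\rangle
     + \int_{\mathbb{R}^{2}}\big(\cos\langle\textbf{u},\textbf{x}\rangle-1\big)F(d\textbf{x})\Big),\\
\log|\phi_{n}(\tilde{\textbf{u}}_{n})|-\log|\phi_{n}(\textbf{u}_{n})|
  &= \frac{1}{2n}\big(\langle C\textbf{u}_{n},\textbf{u}_{n}\rangle
     -\langle C\tilde{\textbf{u}}_{n},\tilde{\textbf{u}}_{n}\rangle\big)
     + \frac{1}{n}\int_{\mathbb{R}^{2}}\big(\cos\langle\tilde{\textbf{u}}_{n},\textbf{x}\rangle
     -\cos\langle\textbf{u}_{n},\textbf{x}\rangle\big)F(d\textbf{x})\\
  &= \frac{2U_{n}^{2}}{n}\,C^{12}
     + \frac{1}{n}\int_{\mathbb{R}^{2}}\big(\cos\langle\tilde{\textbf{u}}_{n},\textbf{x}\rangle
     -\cos\langle\textbf{u}_{n},\textbf{x}\rangle\big)F(d\textbf{x}).
\end{align*}
```

Multiplying by $n/(2U_{n}^{2})$ isolates $C^{12}$ up to a deterministic term that depends only on the jumps; the drift does not appear because it only affects the phase of $\phi_{n}$.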

We consider, based on the observations, the empirical characteristic function of the increments, at each stage n:

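$$\widehat{\phi}_{n}(\textbf{u}_{n})=\frac{1}{n}\sum^{n}_{j=1}e^{i\langle\textbf{u}_{n},\Delta^{n}_{j}\textbf{X}\rangle},\qquad\textbf{u}_{n}\in\mathbb{R}^{2}.$$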

Similarly, we consider the empirical characteristic function $\widehat{\phi}_{n}(\tilde{\textbf{u}}_{n})$ . Based on the trivial calculation (4.3), we now define the spectral estimator

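$$\widehat{C}_{n}^{12}(U_{n})=\frac{n}{2U_{n}^{2}}\left(\log|\widehat{\phi}_{n}(\tilde{\textbf{u}}_{n})|\,\mathds{1}_{\{\widehat{\phi}_{n}(\tilde{\textbf{u}}_{n})\neq 0\}}-\log|\widehat{\phi}_{n}(\textbf{u}_{n})|\,\mathds{1}_{\{\widehat{\phi}_{n}(\textbf{u}_{n})\neq 0\}}\right).$$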
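A spectral estimator of this kind, built from the empirical characteristic function at $\textbf{u}_{n}=(U_{n},U_{n})$ and $\tilde{\textbf{u}}_{n}=(U_{n},-U_{n})$, can be sketched in a few lines. This is an illustrative implementation under stated assumptions and not the author's code: the name `spectral_covariance`, the seed and the test covariance matrix are hypothetical, and the data are pure Brownian increments without jumps, with $U_{n}=\sqrt{n}$.

```python
import numpy as np


def spectral_covariance(increments, U):
    """Spectral estimate of C^{12} from an (n, 2) array of increments.

    Uses the empirical characteristic function at u = (U, U) and
    u_tilde = (U, -U); a log term is dropped (set to 0) when the
    corresponding empirical characteristic function is exactly zero.
    """
    increments = np.asarray(increments, dtype=float)
    n = increments.shape[0]
    u = np.array([U, U])
    u_tilde = np.array([U, -U])
    phi_u = np.mean(np.exp(1j * (increments @ u)))
    phi_u_tilde = np.mean(np.exp(1j * (increments @ u_tilde)))
    log_u = np.log(np.abs(phi_u)) if phi_u != 0 else 0.0
    log_u_tilde = np.log(np.abs(phi_u_tilde)) if phi_u_tilde != 0 else 0.0
    return float(n / (2.0 * U**2) * (log_u_tilde - log_u))


# Illustrative data: bivariate Brownian increments over steps of length 1/n
# with covariance matrix [[1, 0.5], [0.5, 1]] (assumed values, no jumps).
rng = np.random.default_rng(1)
n = 20_000
cov = np.array([[1.0, 0.5], [0.5, 1.0]])
increments = rng.multivariate_normal(np.zeros(2), cov / n, size=n)

estimate = spectral_covariance(increments, U=np.sqrt(n))
print(estimate)  # should be close to the true value C^{12} = 0.5
```

Taking the modulus before the logarithm removes the drift, and the difference between the two directions cancels the diagonal entries $C^{11}$ and $C^{22}$.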

The first result of this paper is the following theorem.

Theorem 4.2.

Let X belong to the class $\mathcal{L}^{r}_{M}$ . Assume $M>0$ and $r\in[0,2)$ , then as $n\to\infty$ the family of estimators $\widehat{C}_{n}^{12}(U_{n})$ with

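$$U_{n}=\begin{cases}\sqrt{n}\quad&\mbox{if $r\leq 1$}\\ \sqrt{(r-1)n\log n}/\sqrt{M}\quad&\mbox{if $r>1$}\end{cases}$$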

satisfies $|\widehat{C}_{n}^{12}(U_{n})-C^{12}|=O_{\mathbb{P}}(w_{n})$ within the class $\mathcal{L}^{r}_{M}$ where

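$$w_{n}=\begin{cases}1/\sqrt{n}\quad&\mbox{if $r\leq 1$}\\ (n\log n)^{\frac{r-2}{2}}\quad&\mbox{if $r>1$}\end{cases}. \tag{4.6}$$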

Particularly, we have that the family of estimators $\widehat{C}_{n}^{12}$ is consistent with the theoretical co-integrated volatility $C^{12}$ with the exact rates of convergence $w_{n}$ .
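The frequency $U_{n}$ and the rate $w_{n}$ in Theorem 4.2 are explicit functions of $n$, $r$ and $M$, so they can be tabulated directly. The helper below is a small sketch; the name `spectral_tuning` and the example values are assumptions made for illustration.

```python
import math


def spectral_tuning(n, r, M):
    """Return (U_n, w_n) as stated in Theorem 4.2 for sample size n,
    co-jump activity index r in [0, 2) and class bound M > 0."""
    if n < 2 or not (0 <= r < 2) or M <= 0:
        raise ValueError("need n >= 2, 0 <= r < 2 and M > 0")
    if r <= 1:
        return math.sqrt(n), 1.0 / math.sqrt(n)
    U_n = math.sqrt((r - 1) * n * math.log(n)) / math.sqrt(M)
    w_n = (n * math.log(n)) ** ((r - 2) / 2)
    return U_n, w_n


# Example values (illustrative only).
U_low, w_low = spectral_tuning(n=10_000, r=0.5, M=1.0)
U_high, w_high = spectral_tuning(n=10_000, r=1.5, M=1.0)
print(U_low, w_low)
print(U_high, w_high)
```

For $r\leq 1$ the rate is the parametric $1/\sqrt{n}$; for $r>1$ the exponent $(r-2)/2$ lies in $(-1/2,0)$, so the rate slows down as the co-jump activity increases.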

Theorem 4.2 gives us an upper bound for the family of our estimators $\widehat{C}_{n}^{12}$, which we prove in Section 6. Let us finally show that on the class $\mathcal{L}^{r}_{M}$ the rate $w_{n}$ (4.6) can be achieved exactly and thus constitutes the exact minimax optimal rate.

Theorem 4.3.

Let X belong to $\mathcal{L}^{r}_{M}$ , $r\in[0,2)$ and $M>0$ . Then there are constants $A,B>0$ such that

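$$\liminf_{n\to\infty}\,\inf_{\widehat{C}_{n}^{12}}\,\sup_{C^{12}\in\mathcal{L}^{r}_{M}}\mathbb{P}\left[d(\widehat{C}_{n}^{12},C^{12})>Aw_{n}\right]\geq B>0,$$

where $\widehat{C}_{n}^{12}$ is any estimator for the co-integrated volatility and $d$ is the Euclidean distance on $\mathbb{R}^{2}$.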

Theorem 4.3 gives us a lower bound for the family of our estimators $\widehat{C}_{n}^{12}$ within the class $\mathcal{L}^{r}_{M}$ . The rates $w_{n}$ (4.6) for estimating $C^{12}$ , namely the co-integrated volatility at time $t=1$ are optimal in a minimax sense. In Section 7 we prove this result.

5 Co-jump index activity

We are interested in bounding from below the co-jump activity in the case that at least one of the jump components is of infinite variation. Each component $X^{(i)}$ of a two-dimensional Lévy process has its own index activity $r_{i}$ for $i=1,2$. In the following we describe the method for bounding from below the index activity of co-jumps. The BG index of a Lévy process depends only on the Lévy measure $F$. The index $r$ accounts for both positive and negative jumps; for simplicity, but without loss of generality, we develop our method for the case in which the Lévy measure is one-sided, i.e. $X^{(i)}$ only makes positive jumps. Thus, $r$ will be influenced by the dependence structure between the jump components.

We will use a Lévy copula to describe this dependency. The concept of a Lévy copula allows us to characterize, in a time-independent way, the dependence structure of the pure jump part of a Lévy process. Here, we use a Lévy copula that permits a range from total dependence to total independence. For the definitions of the independence copula and the total positive dependence copula we refer to Kallsen and Tankov (2006). The next definition is taken from Mancini (2017).

Definition 5.1.

The occurrence of joint jumps in $(X^{(1)},X^{(2)})$ is described by the following tail integrals

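$$U(x_{1},x_{2})=F_{\gamma}\big([x_{1},+\infty)\times[x_{2},+\infty)\big)=C_{\gamma}\big(U_{1}(x_{1}),U_{2}(x_{2})\big),\quad x_{1},x_{2}\in[0,\infty],$$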

where $C_{\gamma}:[0,\infty]^{2}\to[0,\infty]$ is a Lévy copula of the form

$$C_{\gamma}(u_{1},u_{2})=\gamma C_{\bot}(u_{1},u_{2})+(1-\gamma)C_{\parallel}(u_{1},u_{2}),$$

where $C_{\bot}(u_{1},u_{2})=u_{2}\mathds{1}(u_{1}=\infty)+u_{1}\mathds{1}(u_{2}=\infty)$ is the independence copula, $C_{\parallel}(u_{1},u_{2})=u_{1}\wedge u_{2}$ is the total positive dependence copula and $\gamma$ varies in $[0,1]$.

The following remark gives us some clarifications on the definition of the above Lévy copula.

Remark 5.2.

The marginal tails $U_{i}$ are defined on $[0,\infty]$ and the joint tail is defined on $[0,\infty]^{2}$. Here $u_{1},u_{2}$ stand for $U_{1}(x_{1}),U_{2}(x_{2})$, and $(u_{1},u_{2})=(+\infty,+\infty)$ is allowed: both $U_{i}(x_{i})$ can be $\infty$, namely when both $x_{i}=0$. In that case $U(x_{1},x_{2})=C_{\gamma}(U_{1}(x_{1}),U_{2}(x_{2}))$ is $+\infty$, i.e. $C_{\gamma}(\infty,\infty)=\infty$. $C_{\gamma}$ is a Lévy copula because it is a convex combination of two Lévy copulas, i.e. $C_{\gamma}$ is $2$-increasing, grounded and has uniform margins, because $C_{\bot}$ and $C_{\parallel}$ are such. $C_{\gamma}(u_{1},u_{2})$ is not a tail integral and has different properties; for instance $C_{\gamma}(u_{1},+\infty)=u_{1}$, while for any tail $U$ we have $U(x_{1},+\infty)=0$. Finally, when $\gamma=0$ the jump components are totally dependent, while when $\gamma=1$ they are independent.
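The copula family of Definition 5.1 is simple enough to evaluate directly, which makes the properties listed in the remark easy to check. The sketch below is illustrative only: the function names are assumptions, infinite arguments are represented by `math.inf`, and the indicator terms are coded as branches so that no product of the form $0\cdot\infty$ is ever formed.

```python
import math


def independence_copula(u1, u2):
    """C_perp(u1, u2) = u2 * 1(u1 = inf) + u1 * 1(u2 = inf)."""
    value = 0.0
    if math.isinf(u1):
        value += u2
    if math.isinf(u2):
        value += u1
    return value


def dependence_copula(u1, u2):
    """C_par(u1, u2) = min(u1, u2), the total positive dependence copula."""
    return min(u1, u2)


def levy_copula(u1, u2, gamma):
    """Convex combination gamma * C_perp + (1 - gamma) * C_par on [0, inf]^2."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    total = 0.0
    if gamma > 0.0:  # skip the term entirely when its weight is zero
        total += gamma * independence_copula(u1, u2)
    if gamma < 1.0:
        total += (1.0 - gamma) * dependence_copula(u1, u2)
    return total


print(levy_copula(2.0, 3.0, 0.0))       # total dependence: min(u1, u2)
print(levy_copula(2.0, 3.0, 1.0))       # independence: no mass off the axes
print(levy_copula(2.0, math.inf, 0.3))  # uniform margin: returns u1
```

The last call illustrates the uniform-margin property $C_{\gamma}(u_{1},+\infty)=u_{1}$ for an intermediate value of $\gamma$.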

We observe that the index activity of co-jumps is bounded from below by the harmonic mean.

Lemma 5.3.

Suppose that Assumption (2-M) holds. Let $X^{(i)}$ be a one-sided $r_{i}$-stable Lévy process for $i=1,2$ with positive jumps. Given $r_{1},r_{2}\in[0,2)$, assume without loss of generality that $r_{1}\leq r_{2}$ and $r_{2}\geq 1$. We assume either completely dependent or independent jumps. Then, we have that

$$r>\frac{2r_{1}r_{2}}{r_{1}+r_{2}}\geq r_{1}\wedge r_{2},$$

where $r$ is the index activity of co-jumps.

Proof.

Each $X^{(i)}$ is an $r_{i}$-stable Lévy process with Lévy measure

$$F^{(i)}(dx_{i})=c_{i}x_{i}^{-1-r_{i}}\mathds{1}(x_{i}>0)\,dx_{i}$$

for each $i=1,2$ . We assume without loss of generality that $c_{1}\leq c_{2}$ . We denote by

[TABLE]

the tail integral of the marginal Lévy measure $F^{(i)}$ . Note that $r_{i}$ is the BG index of $X^{(i)}$ .
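For the one-sided stable Lévy measure $F^{(i)}(dx_{i})=c_{i}x_{i}^{-1-r_{i}}\mathds{1}(x_{i}>0)\,dx_{i}$, the tail integral and its inverse have closed forms. The following worked computation is a sketch under the assumption $r_{i}>0$ and uses the usual definition of the tail integral as the mass of $[x_{i},\infty)$; it is included only to make the composition $U_{2}^{-1}\circ U_{1}$ used in the argument explicit.

```latex
\begin{align*}
U_{i}(x_{i}) &= \int_{x_{i}}^{\infty}c_{i}\,y^{-1-r_{i}}\,dy
             = \frac{c_{i}}{r_{i}}\,x_{i}^{-r_{i}}, \qquad x_{i}>0,\\
U_{2}^{-1}(u) &= \Big(\frac{r_{2}\,u}{c_{2}}\Big)^{-1/r_{2}},\\
U_{2}^{-1}\big(U_{1}(x_{1})\big)
             &= \Big(\frac{c_{1}r_{2}}{r_{1}c_{2}}\,x_{1}^{-r_{1}}\Big)^{-1/r_{2}}
             = \Big(\frac{c_{1}r_{2}}{r_{1}c_{2}}\Big)^{-1/r_{2}}x_{1}^{r_{1}/r_{2}}.
\end{align*}
```

The last line is a power function of $x_{1}$ with exponent $r_{1}/r_{2}\leq 1$, so under total dependence the second component's small jumps are at least of the same order as the first component's.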

The independent jumps have sizes of either $(x_{1},0)$ or $(0,x_{2})$. This means that we have jumps only on the Cartesian axes. The independence copula regulates such jumps. On the other hand, the completely dependent jumps are regulated by the total dependence copula; their sizes are points $(x_{1},x_{2})$ with both coordinates non-zero. The completely dependent jumps are completely positively monotonic, i.e. there exists a strictly increasing and positive function $f$ such that $\forall t>0$, $\Delta X_{t}^{(2)}=f(\Delta X_{t}^{(1)})$. This means that whenever $x_{1}$ is a jump realisation of the first component, there is a realisation $x_{2}$ of the second component such that $x_{2}=f(x_{1})$, and $x_{1}$ is interpreted as the first component of the joint jump. In fact, the sizes $(x_{1},x_{2})$ are supported by the graph $x_{2}=f(x_{1})$. For the total dependence copula we need the minimum between $U_{1}(x_{1})$ and $U_{2}(x_{2})$, which is attained when $U_{1}(x_{1})=U_{2}(x_{2})$. Hence, the graph $x_{2}=U_{2}^{-1}(U_{1}(x_{1}))$ supports the joint jumps.

In our case we assume one-sided $r_{i}$ -stable processes, which means that the union graph of the joint jumps is given by $x_{2}=\left(\frac{c_{1}r_{2}}{r_{1}c_{2}}\right)^{-1/r_{2}}\cdot x_{1}^{r_{1}/r_{2}}$ . We denote by $F_{\gamma}$ the Lévy measure in terms of the Lévy copula, using the Definition 5.1. Therefore,

[TABLE]

Observe that $\int 1\wedge(x_{1}x_{2})^{r/2}dC_{\bot}(U_{1}(x_{1}),U_{2}(x_{2}))=0$, since the independence copula places the jumps on the axes. Inserting (5.1) into Assumption (2-M), we get, for $\epsilon$ smaller than $1$,

[TABLE]

The first equality holds because the integrand is identically zero in the case of independent jumps. For the sake of simplicity, we assume $\gamma=0$, i.e. totally dependent jumps. We assume that the jump sizes $(x_{1},x_{2})$ fall into the interval $(0,\epsilon)$ for sufficiently small $\epsilon>0$. Recall that $r_{1}\leq r_{2}$ and $c_{1}\leq c_{2}$; then we have $U_{1}(\epsilon)\leq U_{2}(\epsilon)$, which implies $\epsilon\geq U_{2}^{-1}\big(U_{1}(\epsilon)\big)=f(\epsilon)$. Since we require $x_{1}\leq\epsilon$ and $x_{2}=f(x_{1})\leq\epsilon$, this gives us $x_{1}\leq f^{-1}(\epsilon)\wedge\epsilon=\epsilon$. Hence,

[TABLE]

where $C=\Big(\frac{c_{1}r_{2}}{r_{1}c_{2}}\Big)^{-\frac{1}{r_{2}}}$.

In light of the above calculations, the integral in (5.3) converges only if $\Big(\frac{r_{1}}{r_{2}}+1\Big)\frac{r}{2}-1-r_{1}>-1$, that is, $r>\frac{2r_{1}r_{2}}{r_{1}+r_{2}}$. Thus $r$, the index activity of co-jumps, is at least the harmonic mean of the indices $r_{1},r_{2}$. In addition, $\frac{2r_{1}r_{2}}{r_{1}+r_{2}}\geq r_{1}$, since we assume $r_{1}\leq r_{2}$. To conclude, the Blumenthal-Getoor (BG) index of the co-jump activity is bounded from below by

[TABLE]

The proof is now complete. ∎

We see here that the higher the activity of one jump component, the higher the activity of co-jumps.
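The threshold derived in the proof can be checked numerically. The following is a minimal sketch, assuming only the exponent condition quoted above for the integral in (5.3); the function names and the sample indices $r_{1}=0.8$, $r_{2}=1.6$ are hypothetical and chosen for illustration.

```python
def integrand_exponent(r, r1, r2):
    """Exponent of x_1 near zero in the integral (5.3), as quoted in the proof."""
    return (r1 / r2 + 1.0) * r / 2.0 - 1.0 - r1


def harmonic_mean(r1, r2):
    """Harmonic mean 2 r1 r2 / (r1 + r2) of the two marginal indices."""
    return 2.0 * r1 * r2 / (r1 + r2)


def integral_converges(r, r1, r2):
    """int_0^eps x^a dx is finite if and only if a > -1."""
    return integrand_exponent(r, r1, r2) > -1.0


# Hypothetical marginal indices with r1 <= r2 and r2 >= 1.
r1, r2 = 0.8, 1.6
threshold = harmonic_mean(r1, r2)

print("harmonic-mean threshold:", threshold)
print("converges just above   :", integral_converges(threshold + 0.01, r1, r2))
print("converges just below   :", integral_converges(threshold - 0.01, r1, r2))
```

The exponent equals exactly $-1$ at $r=\frac{2r_{1}r_{2}}{r_{1}+r_{2}}$, which is why the inequality for $r$ is strict.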

Next we proceed to the proof of the upper bound in Theorem 4.2, using a spectral estimate for the co-integrated volatility. Given an estimate $\widehat{IV}$ of the integrated volatility, a straightforward estimate of the co-integrated volatility is available by polarization: $\widehat{IV}\left(X^{(1)}+X^{(2)}\right)/2-\widehat{IV}\left(X^{(1)}\right)/2-\widehat{IV}\left(X^{(2)}\right)/2$ is a possible estimator. However, we refrain from using it because its rate of convergence is slower than that of the procedure in Section 6. Let us illustrate this argument with an example.

Example 5.4.

Let $(X_{t})\equiv(X_{t}^{(1)},X_{t}^{(2)})$ be a Lévy process with characteristic triplet $(0,0,F(d\textbf{x}))$, i.e., without a Gaussian part. We assume its components are independent $r_{i}$-stable Lévy processes for $i=1,2$ such that $0\leq r_{1}\leq r_{2}<2$ and $r_{2}\geq 1$. By Lemma 4.1 of Kallsen and Tankov (2006), $F$ is supported by the coordinate axes and can be written as $F(d\textbf{x})=F^{(1)}(dx_{1})+F^{(2)}(dx_{2})$. The Lévy measures of the components are

[TABLE]

More precisely,

[TABLE]

For the integrals in the last equality to converge we need $r>r_{2}$ and $r>r_{1}$, that is, $r>\max(r_{1},r_{2})$. Using (3.1) we find that the Blumenthal-Getoor index is $r^{*}=r_{2}$.
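The polarization estimator mentioned before Example 5.4 can be sketched as follows. This is an illustration only: it uses the plain realized variance as a stand-in for $\widehat{IV}$ and simulates a Brownian motion without jumps, both of which are assumptions made here and not choices taken from the text. The covariance matrix and the sample size are likewise hypothetical.

```python
import numpy as np


def realized_variance(increments):
    """Sum of squared increments, used here as a stand-in for IV-hat."""
    return float(np.sum(increments ** 2))


def polarized_covariance(dx1, dx2):
    """(IV(X1 + X2) - IV(X1) - IV(X2)) / 2, the polarization estimator."""
    return 0.5 * (realized_variance(dx1 + dx2)
                  - realized_variance(dx1)
                  - realized_variance(dx2))


# Toy data: n Brownian increments over intervals of length 1/n, true covariance 1.
rng = np.random.default_rng(0)
n = 50_000
cov = np.array([[2.0, 1.0], [1.0, 1.0]])
incr = rng.multivariate_normal(np.zeros(2), cov / n, size=n)

estimate = polarized_covariance(incr[:, 0], incr[:, 1])
realized_cov = float(np.sum(incr[:, 0] * incr[:, 1]))

print("polarization estimate:", estimate)
print("realized covariance  :", realized_cov)
```

With the realized variance plugged in, polarization reproduces the realized covariance exactly, up to floating-point error. Any loss in the rate therefore comes from the jump-robust choice of $\widehat{IV}$, not from the identity itself.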

6 Upper Bound

In this section we prove Theorem 4.2. We say that a sequence of estimators $\widehat{C}_{n}^{12}$ achieves the rate $w_{n}$ on $\mathcal{L}^{r}_{M}$, for estimating $C_{12}$, if $|\widehat{C}_{n}^{12}-C_{12}|=O_{\mathbb{P}}(w_{n})$. This means that the family $\frac{1}{w_{n}}|\widehat{C}_{n}^{12}-C_{12}|$ is tight. The argument follows a bias-variance decomposition.

6.1 The bias-variance decomposition

We start by deriving a bias-variance-type decomposition of the estimation error of the estimator for the co-integrated volatility.

Lemma 6.1.

We have that

[TABLE]

The deterministic error is given by

[TABLE]

and the stochastic error by

[TABLE]

Proof.

We set $C_{n}^{12}(U_{n})=\frac{n}{2U^{2}_{n}}\Big(\log|\phi_{n}(\tilde{\textbf{u}}_{n})|-\log|\phi_{n}(\textbf{u}_{n})|\Big)$, recalling the form of the estimator (4.5). We get

[TABLE]

Inserting (4.5) into (6.3), we find that the estimation error is given by $\widehat{C}^{12}_{n}-C_{12}=D_{n}+H_{n}$. Both quantities $\widehat{\phi}_{n}(\textbf{u}_{n})$ and $\widehat{\phi}_{n}(\tilde{\textbf{u}}_{n})$ must be nonzero; otherwise the logarithms are undefined and the decomposition does not hold. ∎
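To make the objects in Lemma 6.1 concrete, here is a minimal sketch of a plug-in version of $C_{n}^{12}(U_{n})$. It assumes that the estimator (4.5) has the same form with the empirical characteristic function $\widehat{\phi}_{n}$ in place of $\phi_{n}$, and it takes $\textbf{u}_{n}=(U,U)$ and $\tilde{\textbf{u}}_{n}=(U,-U)$. The data are Brownian increments without jumps or drift, and the frequency $U=\sqrt{n}/2$ is a hypothetical choice for illustration, not the choice (6.10).

```python
import numpy as np


def empirical_cf(increments, u):
    """Empirical characteristic function (1/n) * sum_j exp(i <u, Delta_j X>)."""
    return np.mean(np.exp(1j * (increments @ u)))


def spectral_covariance(increments, U):
    """Plug-in estimate n / (2 U^2) * (log|phi_hat(u~)| - log|phi_hat(u)|)."""
    n = increments.shape[0]
    u = np.array([U, U])          # diagonal of the first quadrant
    u_tilde = np.array([U, -U])   # diagonal of the fourth quadrant
    abs_phi_u = abs(empirical_cf(increments, u))
    abs_phi_ut = abs(empirical_cf(increments, u_tilde))
    if abs_phi_u == 0.0 or abs_phi_ut == 0.0:
        raise ValueError("empirical characteristic function vanished; log undefined")
    return float(n / (2.0 * U ** 2) * (np.log(abs_phi_ut) - np.log(abs_phi_u)))


# Toy data: Brownian increments over intervals of length 1/n, true C_12 = 1.
rng = np.random.default_rng(0)
n = 20_000
cov = np.array([[2.0, 1.0], [1.0, 1.0]])
increments = rng.multivariate_normal(np.zeros(2), cov / n, size=n)

U = 0.5 * np.sqrt(n)  # illustrative frequency only
estimate = spectral_covariance(increments, U)
print("spectral estimate of C_12:", estimate)
```

In this jump-free toy model the deterministic error $D_{n}$ vanishes, so the deviation from $C_{12}=1$ is purely the stochastic error $H_{n}$.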

Our goal is to show that the estimation error is stochastically bounded at the rate $w_{n}$, i.e. that it is $O_{\mathbb{P}}(w_{n})$. First, we bound the deterministic error.

6.2 Bounding the deterministic error

Lemma 6.2.

Grant Assumption (2-M). The deterministic error satisfies $|D_{n}|\leq\frac{M}{2}U^{r-2}_{n}+AU^{-2}_{n}$, where $A$ is a positive constant.

Proof.

Recall the characteristic function of $\textbf{X}_{1/n}$ in (4.2). We define

[TABLE]

Therefore,

[TABLE]

and $|\phi_{n}(\tilde{\textbf{u}}_{n})|=\exp\left(-\frac{1}{2n}\Big(\left\langle C\tilde{\textbf{u}}_{n},\tilde{\textbf{u}}_{n}\right\rangle+\tilde{d}_{n}\Big)\right)$. Here we use that $|e^{z}|=e^{\operatorname{Re}z}$ for $z\in\mathbb{C}$: after taking the absolute value of the characteristic function, the imaginary part of the exponent vanishes. Summing up,

[TABLE]

By (6.6), we have

[TABLE]

where we used the fact that $|\cos x-\cos y|\leq 2\wedge|x^{2}-y^{2}|$ . Using the inequality $a\wedge b\leq a^{p}b^{1-p}$ for $p\in(0,1)$ , the last term can be bounded as follows

[TABLE]

Here $\frac{r}{2}\in(0,1)$ because $r\in(0,2)$ is the co-jump index activity. By Assumption (2-M) and for some constant $A>0$,

[TABLE]

as required. ∎
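The two elementary inequalities used in this proof, $|\cos x-\cos y|\leq 2\wedge|x^{2}-y^{2}|$ and $a\wedge b\leq a^{p}b^{1-p}$ for $a,b\geq 0$ and $p\in(0,1)$, can be sanity-checked on random inputs. The sketch below is a numerical check, not a proof; the sampling ranges and the tolerance are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
size = 100_000

# Check |cos x - cos y| <= min(2, |x^2 - y^2|) on random pairs.
x = rng.uniform(-10.0, 10.0, size)
y = rng.uniform(-10.0, 10.0, size)
lhs_cos = np.abs(np.cos(x) - np.cos(y))
rhs_cos = np.minimum(2.0, np.abs(x ** 2 - y ** 2))
cos_inequality_holds = bool(np.all(lhs_cos <= rhs_cos + 1e-12))

# Check min(a, b) <= a^p * b^(1 - p) for positive a, b and p in (0, 1).
a = rng.uniform(1e-6, 5.0, size)
b = rng.uniform(1e-6, 5.0, size)
p = rng.uniform(0.01, 0.99, size)
lhs_min = np.minimum(a, b)
rhs_min = a ** p * b ** (1.0 - p)
min_inequality_holds = bool(np.all(lhs_min <= rhs_min + 1e-12))

print("cosine inequality holds on sample :", cos_inequality_holds)
print("minimum inequality holds on sample:", min_inequality_holds)
```

The first bound follows from $|\cos x-\cos y|=2\left|\sin\frac{x+y}{2}\sin\frac{x-y}{2}\right|\leq\frac{1}{2}|x^{2}-y^{2}|$, and the second from the fact that a weighted geometric mean lies between its two arguments.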

6.3 Bounding the stochastic error

We want to investigate how close the empirical characteristic function is to the characteristic function of a two-dimensional Lévy process. The variables $e^{i\left\langle\textbf{u}_{n},\Delta^{n}_{j}\textbf{X}\right\rangle}$ are i.i.d. as $j$ varies, with expectation $\phi_{n}(\textbf{u}_{n})$ . The same statement holds true for $e^{i\left\langle\tilde{\textbf{u}}_{n},\Delta^{n}_{j}\textbf{X}\right\rangle}$ as well. So $\widehat{\phi}_{n}(\textbf{u}_{n})$ is an unbiased estimator because $\mathbb{E}[\widehat{\phi}_{n}(\textbf{u}_{n})]=\phi_{n}(\textbf{u}_{n})$ . Also, the variance of the empirical characteristic function is given by $\textsf{Var}(\widehat{\phi}_{n}(\textbf{u}_{n}))=\frac{1}{n}\left(1-|\phi_{n}(\textbf{u}_{n})|^{2}\right)$ .

Definition 6.3.

For a $\mathbb{C}$ - valued random variable $Z$ we define

[TABLE]

Lemma 6.4.

Let $V_{n}=\widehat{\phi}_{n}(\textbf{u}_{n})-\phi_{n}(\textbf{u}_{n})$ and $\tilde{V}_{n}=\widehat{\phi}_{n}(\tilde{\textbf{u}}_{n})-\phi_{n}(\tilde{\textbf{u}}_{n})$, where $V_{n},\tilde{V}_{n}\in\mathbb{C}$. Then, $\mathbb{E}(|V_{n}|^{2})\leq\frac{1}{n}$ and $\mathbb{E}(|\tilde{V}_{n}|^{2})\leq\frac{1}{n}$.

Proof.

Set $Z=V_{n}=\widehat{\phi}_{n}(\textbf{u}_{n})-\phi_{n}(\textbf{u}_{n})\in\mathbb{C}$. Recall that $\widehat{\phi}_{n}$ is unbiased, i.e. $\mathbb{E}[\widehat{\phi}_{n}(\textbf{u}_{n})]=\phi_{n}(\textbf{u}_{n})$, thus $|\mathbb{E}(Z)|^{2}=0$. Combining this with Definition 6.3, we obtain that $\mathbb{E}[|Z|^{2}]=\textsf{Var}[Z]$. Therefore,

[TABLE]

The same argument holds also for $\mathbb{E}(|\tilde{V}_{n}|^{2})$ . This completes the proof. ∎
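The variance identity $\textsf{Var}(\widehat{\phi}_{n}(\textbf{u}_{n}))=\frac{1}{n}(1-|\phi_{n}(\textbf{u}_{n})|^{2})$ and the bound of Lemma 6.4 can be illustrated by Monte Carlo. The sketch below assumes a Brownian motion without drift or jumps, for which $\phi_{n}(\textbf{u})=\exp(-\langle C\textbf{u},\textbf{u}\rangle/(2n))$; the covariance matrix, the frequency $U$ and the number of replications are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps, U = 200, 4000, 8.0
cov = np.array([[2.0, 1.0], [1.0, 1.0]])
u = np.array([U, U])

# True characteristic function of one increment N(0, cov / n); real in this model.
phi = float(np.exp(-(u @ cov @ u) / (2.0 * n)))

# reps independent samples, each consisting of n i.i.d. increments.
incr = rng.multivariate_normal(np.zeros(2), cov / n, size=(reps, n))
phi_hat = np.mean(np.exp(1j * (incr @ u)), axis=1)

mean_sq_error = float(np.mean(np.abs(phi_hat - phi) ** 2))  # estimates E|V_n|^2
theory = (1.0 - phi ** 2) / n

print("Monte Carlo E|V_n|^2:", mean_sq_error)
print("(1 - |phi|^2) / n   :", theory)
print("upper bound 1 / n   :", 1.0 / n)
```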

We choose $\textbf{u}_{n}=(U_{n},U_{n})$ and $\tilde{\textbf{u}}_{n}=(U_{n},-U_{n})$. Recall that, to simplify the calculations, we evaluate the characteristic function on the diagonals of the first and fourth quadrants. In particular, for $M>0$, $r\in[0,2)$ and $n$ large enough, we choose

[TABLE]

Lemma 6.5.

Grant Assumption (1-M). For some positive constants $A,\Gamma,M$ and on the event $\left\{|V_{n}|\leq\frac{1}{n^{r/4}}\right\}$ the stochastic error satisfies:

[TABLE]

Proof.

Recalling the form of stochastic error (6.2), the first quantity we need to bound is:

[TABLE]

The first inequality holds because

[TABLE]

using the Cauchy-Schwarz inequality $|\langle\textbf{u}_{n},\textbf{x}\rangle|^{2}\leq\|\textbf{u}_{n}\|^{2}\|\textbf{x}\|^{2}$ and the fact that $U_{n}\geq 1$. The last inequality in (6.12) follows from Assumption (1-M) and the fact that we always have $\int_{\mathbb{R}^{2}}(1\wedge\|\textbf{x}\|^{2})F(d\textbf{x})<\infty$. Next, the form of $U_{n}$ in (6.10) implies that

[TABLE]

where $\Gamma=e^{2M}$ . Let us now argue that

[TABLE]

as soon as $n\geq n_{0}=(2\Gamma)^{\frac{4}{(2-r)\wedge r}}$, by (6.13), on the events $\left\{|V_{n}|\leq\frac{1}{n^{r/4}}\right\}$ and $\left\{|\tilde{V}_{n}|\leq\frac{1}{n^{r/4}}\right\}$. Therefore, $\left|\frac{V_{n}}{\phi_{n}}\right|\leq\frac{1}{2}$. Accordingly, on these two events we obtain for the stochastic error, for some deterministic constant $A$:

[TABLE]

In the last inequality, we linearize the stochastic errors, $\log\left|1+\frac{\tilde{V}_{n}}{\phi_{n}(\tilde{\textbf{u}}_{n})}\right|\thickapprox\left|\frac{\tilde{V}_{n}}{\phi_{n}(\tilde{\textbf{u}}_{n})}\right|$, which is justified because $\frac{\tilde{V}_{n}}{\phi_{n}(\tilde{\textbf{u}}_{n})}$ and $\frac{V_{n}}{\phi_{n}(\textbf{u}_{n})}$ are small enough. So there is a positive constant $A$ such that $\log\left|1+\frac{\tilde{V}_{n}}{\phi_{n}(\tilde{\textbf{u}}_{n})}\right|\leq A\left|\frac{\tilde{V}_{n}}{\phi_{n}(\tilde{\textbf{u}}_{n})}\right|$. Therefore,

[TABLE]

Henceforth, for $n\geq n_{0}$ , and for some constant $A>0$ , we have

[TABLE]

The third inequality follows from the Cauchy-Schwarz inequality and Lemma 6.4. To sum up, by (6.13) we get that

[TABLE]

as required. ∎

Remark 6.6.

Here, we are interested in the events $\left\{|V_{n}|\leq\frac{1}{n^{r/4}}\right\}$ and $\left\{|\tilde{V}_{n}|\leq\frac{1}{n^{r/4}}\right\}$ because the probabilities of the complementary events $\left\{|V_{n}|>\frac{1}{n^{r/4}}\right\}$ and $\left\{|\tilde{V}_{n}|>\frac{1}{n^{r/4}}\right\}$ are negligible. Indeed, applying the Chebyshev inequality and Lemma 6.4 we get

[TABLE]

which tends towards zero as $n\to\infty$ . Likewise, the probability of the event $\left\{|\tilde{V}_{n}|>\frac{1}{n^{r/4}}\right\}$ tends towards zero as $n\to\infty$ .
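The size of this tail probability can be made explicit. Assuming the bound has the form $\mathbb{P}(|V_{n}|>n^{-r/4})\leq n^{r/2}\,\mathbb{E}|V_{n}|^{2}\leq n^{r/2-1}$, which is what the Chebyshev inequality gives when combined with $\mathbb{E}|V_{n}|^{2}\leq 1/n$, the following sketch tabulates it for a few hypothetical values of $n$ and $r$.

```python
def tail_bound(n, r):
    """Chebyshev bound n^(r/2 - 1) on P(|V_n| > n^(-r/4)), assuming E|V_n|^2 <= 1/n."""
    if not 0.0 <= r < 2.0:
        raise ValueError("r must lie in [0, 2)")
    return n ** (r / 2.0 - 1.0)


for r in (0.5, 1.0, 1.9):
    bounds = [tail_bound(n, r) for n in (10 ** 2, 10 ** 4, 10 ** 6)]
    print("r =", r, "bounds for n = 1e2, 1e4, 1e6:", bounds)
```

The bound tends to zero for every $r<2$, but only slowly when $r$ is close to $2$.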

So far we have bounded the deterministic and stochastic errors from above. We are now ready to prove that the family $\frac{1}{w_{n}}|\widehat{C}_{n}^{12}-C_{12}|$ is tight on $\mathcal{L}^{r}_{M}$ and thus establish an upper bound for our estimator.

End of the proof of Theorem 4.2

Applying the Markov inequality, we get for every $\epsilon,L>0$

[TABLE]

Further applying Lemmas 6.5 and 6.2, we deduce that, as $n\to\infty$

[TABLE]

Whenever the bound (6.18) is smaller than $L$, this proves that the family $\frac{1}{w_{n}}|\widehat{C}_{n}^{12}(U_{n})-C_{12}|$ is tight over $\textbf{X}\in\mathcal{L}^{r}_{M}$. The proof is complete.

7 Lower Bound

In nonparametric statistics it is common to use a minimax approach to establish the optimality of estimators. In the previous section, we proved Theorem 4.2, which gives an upper bound for the estimation of the co-integrated volatility using a spectral approach and establishes the rates (4.6) on the class $\mathcal{L}^{r}_{M}$.

In this section, we prove Theorem 4.3. Together with the upper bound, a matching lower bound on the class $\mathcal{L}^{r}_{M}$ establishes the exact minimax rates for the estimation of the co-integrated volatility. In fact, the lower bound gives slightly more: any estimator on a general class of Itô semimartingales satisfying Definition 4.1 is subject to a lower bound with rates (4.6). So far, we do not know whether the spectral approach for the upper bound yields the same optimal rate on the larger class of Itô semimartingales.

We refer to Chapter 2 in Tsybakov (2009) for the techniques used to prove lower bounds. We establish the lower bound via a two-hypothesis test. Next, we introduce a distance between probability measures that will be useful for the lower bound.

Definition 7.1.

The total variation distance between $\mathbb{P}_{0},\mathbb{P}_{1}$ is defined as follows:

[TABLE]

where $p_{0}=d\mathbb{P}_{0}/d\nu$, $p_{1}=d\mathbb{P}_{1}/d\nu$ and $\nu=\mathbb{P}_{0}+\mathbb{P}_{1}$ is a $\sigma$-finite measure.

To sum up, in order to prove a lower bound on the minimax probability of error for two hypotheses we use Theorem 2.2 in Tsybakov (2009). The lower bound is obtained when the following two properties are satisfied. First, we choose the parameters for the co-integrated volatility to be close enough but distinguishable. Second, we bound from above the total variation distance between the probability distributions associated with the two parameters.

Let us illustrate the above procedure with a simple lemma, showing that these arguments are sufficient to obtain the lower bound for our class $\mathcal{L}^{r}_{M}$. The interested reader may refer to Lehmann and Romano (2006), who explore many examples of hypothesis testing and distances between Gaussian random variables.

Lemma 7.2.

(No jumps for a two-dimensional Lévy process). Assume X belongs to the class $\mathcal{L}^{r}_{M}$ with Lévy-Khintchine triplet $(0,\Sigma\Sigma^{\top},0)$ . Then there are constants $A,K$ such that

[TABLE]

where $\widehat{C}_{n}^{12}$ is any estimator for the co-integrated volatility within the class $\mathcal{L}^{r}_{M}$, $d$ is the Euclidean distance on $\mathbb{R}^{2}$ and $w_{n}=\frac{1}{n}$.

Proof.

Consider X and Y belonging to $\mathcal{L}^{r}_{M}$. We also assume that no jumps occur, so the Lévy-Khintchine triplets of the two processes are $(0,\Sigma_{\textbf{X}}\Sigma_{\textbf{X}}^{\top},0)$ and $(0,\Sigma_{\textbf{Y}}\Sigma_{\textbf{Y}}^{\top},0)$ respectively. As a result, X evolves as follows:

[TABLE]

similarly for Y. We know that the Itô integral $dX^{(1)}(t)=\sigma^{(1)}_{t}dW^{(1)}_{t}$ is normally distributed with mean zero, and its variance is given by the Itô isometry, which yields

[TABLE]

Therefore, X follows the parametric model

[TABLE]

similarly for Y.

We will prove the lower bound using the two-hypothesis test, as mentioned in the beginning of this section. We observe that

[TABLE]

where $\mathcal{B}_{M}$ is the class of all Brownian motions where the covariance matrix is bounded component-wise by a constant $M$ . As a consequence,

[TABLE]

This is enough to prove a lower bound for the rate $w_{n}=\frac{1}{n}$ for the class of all Brownian motions.

The two-hypothesis test is the following

$\mathbb{P}_{\textbf{X}}=\mathcal{N}\left(0,\Sigma_{\textbf{X}}\Sigma_{\textbf{X}}^{\top}\right)$ vs $\mathbb{P}_{\textbf{Y}}=\mathcal{N}\left(0,\Sigma_{\textbf{Y}}\Sigma_{\textbf{Y}}^{\top}\right),$

where the covariance matrices are given by $\Sigma_{\textbf{X}}\Sigma_{\textbf{X}}^{\top}=\left(\begin{smallmatrix}2&1\\ 1&1\end{smallmatrix}\right)$ and $\Sigma_{\textbf{Y}}\Sigma_{\textbf{Y}}^{\top}=\left(\begin{smallmatrix}2&1+\frac{1}{n}\\ 1+\frac{1}{n}&1\end{smallmatrix}\right)$. Intuitively, we perturb the off-diagonal elements, namely the covariance, by the rate we want to achieve. Following the two-hypothesis argument, it is sufficient to prove that the total variation distance is bounded. To do so, we use the Pinsker inequality, which gives

[TABLE]

where $KL(\mathbb{P}_{\textbf{Y}},\mathbb{P}_{\textbf{X}})$ is the Kullback-Leibler divergence. Next, we show that the Kullback-Leibler divergence is bounded. We recall the Kullback-Leibler divergence between two multivariate normal distributions, writing $\Sigma_{1}=\Sigma_{\textbf{X}}\Sigma_{\textbf{X}}^{\top}$ and $\Sigma_{2}=\Sigma_{\textbf{Y}}\Sigma_{\textbf{Y}}^{\top}$. Therefore,

[TABLE]

where $|\cdot|$ denotes the determinant of a matrix. Calculating the appropriate quantities, we obtain

[TABLE]

Therefore,

[TABLE]

Consequently, the right-hand side tends to zero as $n\to\infty$. By the Pinsker inequality, the total variation distance tends to zero as well. Hence, by Theorem 2.2 in Tsybakov (2009), the minimax probability of error is bounded from below by a quantity tending to $1/2$, and the claim follows. ∎
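The Kullback-Leibler computation in this proof can be reproduced numerically. The sketch below uses the standard closed form for two centred Gaussians, $KL(\mathcal{N}(0,\Sigma_{p})\,\|\,\mathcal{N}(0,\Sigma_{q}))=\frac{1}{2}\big(\operatorname{tr}(\Sigma_{q}^{-1}\Sigma_{p})-k+\log\frac{|\Sigma_{q}|}{|\Sigma_{p}|}\big)$, and the Pinsker inequality in the form $TV\leq\sqrt{KL/2}$, where $TV$ is the supremum over events. These normalisations are assumptions and may differ by constants from the conventions of Definition 7.1. The perturbation size `delta` is a hypothetical parameter standing for the off-diagonal shift.

```python
import numpy as np


def kl_zero_mean_gaussians(sigma_p, sigma_q):
    """KL( N(0, sigma_p) || N(0, sigma_q) ) for positive definite matrices."""
    k = sigma_p.shape[0]
    q_inv = np.linalg.inv(sigma_q)
    _, logdet_p = np.linalg.slogdet(sigma_p)
    _, logdet_q = np.linalg.slogdet(sigma_q)
    return float(0.5 * (np.trace(q_inv @ sigma_p) - k + logdet_q - logdet_p))


def pinsker_tv_bound(kl):
    """Pinsker: TV <= sqrt(KL / 2), with TV = sup_A |P(A) - Q(A)|."""
    return float(np.sqrt(kl / 2.0))


def hypotheses(delta):
    """Covariance matrices of the two hypotheses with off-diagonal shift delta."""
    sigma_x = np.array([[2.0, 1.0], [1.0, 1.0]])
    sigma_y = np.array([[2.0, 1.0 + delta], [1.0 + delta, 1.0]])
    return sigma_x, sigma_y


delta = 0.01
sigma_x, sigma_y = hypotheses(delta)
kl = kl_zero_mean_gaussians(sigma_y, sigma_x)
tv_bound = pinsker_tv_bound(kl)

print("KL divergence  :", kl)
print("Pinsker TV bound:", tv_bound)
```

For these matrices the divergence equals $\frac{1}{2}\big(-2\delta-\log(1-2\delta-\delta^{2})\big)$, which is of order $\delta^{2}$ for small $\delta$, so it vanishes as the perturbation shrinks.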

To prove Theorem 4.3 we need to construct a two-hypothesis test in order to bound the minimax probability of error from below, as described in Lemma 7.2.

7.1 Two-hypothesis test

We let X, Y be two-dimensional Lévy processes with respective triplets $(0,\Sigma_{\textbf{X}}\Sigma_{\textbf{X}}^{\top},F_{n})$ , $(0,\Sigma_{\textbf{Y}}\Sigma_{\textbf{Y}}^{\top},G_{n})$ , where $F_{n}$ , $G_{n}$ are Lévy measures in $\mathbb{R}^{2}$ satisfying

[TABLE]

where $\textbf{x}=(x_{1},x_{2})$ is a vector in $\mathbb{R}^{2}$ representing the size of small jumps for each process and $M$ is a constant (below $M$ changes from line to line and may depend on $r$, but all constants are denoted as $M$). We set $\Sigma_{\textbf{X}}\Sigma_{\textbf{X}}^{\top}=\left(\begin{smallmatrix}2&1\\ 1&1\end{smallmatrix}\right)$ and $\Sigma_{\textbf{Y}}\Sigma_{\textbf{Y}}^{\top}=\left(\begin{smallmatrix}2&1+2w_{n}\\ 1+2w_{n}&1\end{smallmatrix}\right)$ to be the parameters of our two-hypothesis test. Under this setting, we perturb the off-diagonal elements by the rate at which we want to achieve the lower bound. The quantity we want to recover is the co-integrated volatility, which is why the off-diagonal elements are perturbed. We use matrices of this form so that the Gaussian part is non-degenerate, namely so that the eigenvalues of the matrices are positive. As discussed at the beginning of this section, it is sufficient to construct two sequences $\textbf{X}_{n}$, $\textbf{Y}_{n}$ belonging to the class $\mathcal{L}^{r}_{M}$ with the following two properties:

Property 1.

The two parameters, namely the two covariance matrices, are close enough but distinguishable.

Note that for this property the object of our study is the distance between matrices. We take the distance induced by the Frobenius norm, and everything still holds. By construction,

[TABLE]

which means that the parameters are close enough but distinguishable.
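Property 1 and the non-degeneracy of the Gaussian part can be checked directly from the two matrices. The sketch below computes the Frobenius distance, which for this construction equals $\sqrt{2\,(2w_{n})^{2}}=2\sqrt{2}\,w_{n}$, and the smallest eigenvalue of each matrix. The value `w = 0.01` is a hypothetical stand-in for $w_{n}$.

```python
import numpy as np


def hypothesis_covariances(w):
    """Covariance matrices of X and Y with off-diagonal perturbation 2 * w."""
    sigma_x = np.array([[2.0, 1.0], [1.0, 1.0]])
    sigma_y = np.array([[2.0, 1.0 + 2.0 * w], [1.0 + 2.0 * w, 1.0]])
    return sigma_x, sigma_y


w = 0.01  # hypothetical value of the rate w_n
sigma_x, sigma_y = hypothesis_covariances(w)

frobenius_distance = float(np.linalg.norm(sigma_y - sigma_x, ord="fro"))
min_eig_x = float(np.linalg.eigvalsh(sigma_x).min())
min_eig_y = float(np.linalg.eigvalsh(sigma_y).min())

print("Frobenius distance :", frobenius_distance)
print("2 * sqrt(2) * w    :", 2.0 * np.sqrt(2.0) * w)
print("smallest eigenvalues:", min_eig_x, min_eig_y)
```

Both smallest eigenvalues stay positive as long as $(1+2w_{n})^{2}<2$, which holds for all sufficiently small $w_{n}$.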

Property 2.

The total variation distance between $\mathbb{P}_{\textbf{X}}$ and $\mathbb{P}_{\textbf{Y}}$ tends towards zero.

As far as the second property is concerned, showing that the total variation distance tends towards zero is not trivial. In fact, establishing the second property is quite demanding, and we prove several lemmas to reach it.

7.2 Construction of the co-jump measure in $\mathbb{R}^{2}$

First, we have to construct a measure to satisfy property (7.1). Before we proceed with the technical part of this construction, let us highlight the idea behind it.

Note that we are studying a two-dimensional Lévy process, so it is reasonable to include the possibility of dependence between the two jump components, more specifically the common jumps, i.e. the co-jumps.

Observe here that co-jumps are one-dimensional objects. Co-jumps are the jumps on the diagonal, due to the fact that the two processes jump at the same time with the same jump size. Mathematically speaking, this can be formalized as follows:

Definition 7.3.

(Co-jump measure) Let $\textbf{X}=\left(X^{(1)},X^{(2)}\right)$ be a Lévy process, with $\Delta X^{(j)}_{t}\neq 0$ for $j=1,2$. Here, $\Delta X^{(j)}_{t}=X_{t}^{(j)}-X^{(j)}_{t^{-}}$ denotes the possible jump at time $t$. The measure on $\mathbb{R}^{2}$ is defined by:

[TABLE]

where $B=\left\{(x_{1},x_{2})\in\mathbb{R}^{2}:x_{1}=x_{2}\right\}$. This is called the Lévy measure in $\mathbb{R}^{2}$ of co-jumps for X. $F_{n}(B)$ is the expected number of joint jumps, per unit of time, whose size falls into $B$, and $\mu^{X}$ is the Poisson random measure of co-jumps, $\mu^{X}(\omega;t,B)=\sum_{s\leq t}\mathds{1}_{B}(\Delta X^{(1)}_{s},\Delta X^{(2)}_{s})$.

Because the jump dynamics of the co-jump measure are dictated by its density, say $f_{n}$, we can write the measure of the co-jumps as follows, for $A\subset B\subset\mathbb{R}^{2}$:

[TABLE]

where $\tilde{A}=\{x:(x,x)\in A\}$ .

The support of the co-jump measure is one-dimensional (the diagonal), although the measure itself lives on $\mathbb{R}^{2}$. We focus on the case of co-jumps, i.e., when $X^{(1)}$ and $X^{(2)}$ jump at the same time with the same jump size. We are interested in the jumps on the diagonal.

Further, we do not integrate with respect to the two-dimensional Lebesgue measure, since the diagonal has Lebesgue measure zero. Instead we integrate with respect to a measure that is not absolutely continuous with respect to the Lebesgue measure, which we call the co-jump measure. We assume that $F_{n},G_{n}$ have densities $f_{n}$ and $g_{n}$ respectively. By (7.2) we want to show that

[TABLE]

without being equal to zero. Since we are interested in the set of co-jumps, we pass from two dimensions to one dimension. Co-jumps express total dependence between the small jump components, and we use this dependence to reduce the dimension. To make this argument precise, we need the following lemma.

Lemma 7.4.

Let $g:\mathbb{R}^{2}\to\mathbb{R}$ be a measurable function and $F_{n}$ be the co-jump measure on $\mathbb{R}^{2}$ . Then

[TABLE]

where $F_{n}(A)=\int f_{n}(x)\mathds{1}_{A}(x,x)dx$ is the measure of co-jumps, $f_{n}$ the density function of the co-jump measure $F_{n}$ , $A\subset B$ , and $B=\left\{(x_{1},x_{2})\in\mathbb{R}^{2}:x_{1}=x_{2}\right\}$ .

Proof.

We first prove the lemma for step functions. The result then extends to all measurable functions $g$ by linearity and by taking limits, so it suffices to treat step functions. Let $g(x_{1},x_{2})=\sum_{k=1}^{m}a_{k}\mathds{1}_{A_{k}}(x_{1},x_{2})$, where $A_{k}\subset A$ and $\cup_{k=1}^{m}A_{k}=A$. Therefore,

[TABLE]

and the claim follows. ∎

Furthermore, we need to find a measure whose mass is bounded away from the origin but may explode around 0 and integrates $\|x\|^{2}$ . In order to construct the co-jump measure with the above properties, we need to find an appropriate density function for the co-jumps measure $F_{n}(A)$ so as to satisfy the following condition for $r\in(1,2)$ and $\textbf{x}=(x,x)$ :

[TABLE]

Indeed, the following proposition implies condition (7.1) when the density function of the co-jumps is chosen properly.

Proposition 7.5.

Let $w_{n}$ be defined by (4.6) and $r\in(0,2)$. Consider the even functions $h_{n}:\mathbb{R}^{2}\to\mathbb{R}$ such that $h_{n}(\textbf{u})=\tilde{h}_{n}(U)\cdot\tilde{h}_{n}(U)$, where $\textbf{u}=(U,U)$,

[TABLE]

and $U_{n}=2w_{n}^{1/(r-2)}$ . Then, for any $A\in\mathcal{B}(\mathbb{R}^{2})$

[TABLE]

where $F_{n}(A)=\int_{\mathbb{R}}\frac{|H_{n}(x)|}{x^{2}}\mathds{1}_{A}(x,x)dx$ and $H_{n}$ is the Fourier transform of $h_{n}$ .

Proof.

The mathematical tool used to construct the density function is the Fourier transform. Intuitively, the function $h_{n}$ is constant inside a fixed interval and decays exponentially outside this interval. Notice also that we use the power $3$ in the exponential because we need to differentiate twice, as we shall see later.

Notice that $h_{n}$ takes values in $\mathbb{R}$, which is why we use the Fourier transform on $\mathbb{R}$. The Fourier transform pair takes the following form:

[TABLE]

and the respective first derivatives will have the form

[TABLE]

For a thorough analysis of the Fourier transform the interested reader should refer to Bracewell (1986).

In the next step, the pair of Fourier transform will provide us with a proper and well-defined density function for the co-jump measure. First, we note that the $\mathbb{L}^{2}$ -norm of $h_{n}$ is bounded. Indeed,

[TABLE]

In the last inequality we used the fact that $\int_{0}^{\infty}e^{-2K^{3}}dK$ is bounded by a constant $C$ . In addition, $h_{n}$ is an $\mathbb{L}^{2}$ -function. Applying the Plancherel theorem we deduce that

[TABLE]

Similarly, we get a bound for the first derivative of $H_{n}$

[TABLE]

Moreover, $\|H_{n}\|_{\mathbb{L}^{1}}$ is also bounded

[TABLE]

We get the first inequality because of the Cauchy-Schwarz inequality. By means of (7.7) and (7.8) the $\mathbb{L}^{1}$ -norm of $H_{n}$ is bounded.

At this point we are ready to define the co-jumps measures $F_{n}(A)$ and $G_{n}(A)$ in terms of the Fourier transform $H_{n}(x)$ .

[TABLE]

for any $A\in\mathcal{B}(\mathbb{R}^{2})$ .

These measures satisfy the basic properties of Lévy measures. They are non-negative, integrate $x^{2}$ , and may explode around zero since $H_{n}(0)\to\infty$ .

It remains to prove (7.1) under this argumentation. Based on the above construction and $A=\{(x_{1},x_{2})\in\mathbb{R}^{2}:x_{1}=x_{2}=x\}$ , (7.1) transforms into:

[TABLE]

so we need to show that (7.11) is finite. Next we show how to bound from above $|H_{n}(x)|$ .

[TABLE]

In the first line the second term is equal to zero since it is the integral of the product of an even and an odd function.

[TABLE]

Note that in the second inequality the integral is always bounded from above by a constant $C$ .

On the sets $\left\{|x|\leq\frac{1}{U_{n}}\right\}$, $\left\{\frac{1}{U_{n}}<|x|\leq 1\right\}$, $\left\{|x|>1\right\}$ we deduce that

1. $|x|\leq\frac{1}{U_{n}}\Rightarrow|U_{n}x|\leq 1\Rightarrow|\sin(U_{n}x)|\leq U_{n}|x|\Rightarrow\frac{|\sin(U_{n}x)|}{|x|}+1\leq U_{n}+1\leq 2U_{n}$, using $U_{n}\geq 1$.

2. $\frac{1}{U_{n}}<|x|\leq 1\Rightarrow\frac{|\sin(U_{n}x)|}{|x|}+1=\frac{|\sin(U_{n}x)|+|x|}{|x|}\leq\frac{2}{|x|}$.

3. $|x|>1\Rightarrow\frac{|\sin(U_{n}x)|}{|x|}+1\leq 2$.
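The three regimes can be verified numerically on a grid. The sketch below compares $\frac{|\sin(U_{n}x)|}{|x|}+1$ with a piecewise envelope, using $U_{n}+1$ on the first set, which is what $|\sin(U_{n}x)|\leq U_{n}|x|$ gives directly. The value `U = 50` and the grid are hypothetical.

```python
import numpy as np


def target(x, U):
    """The quantity |sin(U x)| / |x| + 1 bounded in the three cases."""
    return np.abs(np.sin(U * x)) / np.abs(x) + 1.0


def envelope(x, U):
    """Piecewise bound: U + 1 on |x| <= 1/U, 2/|x| on 1/U < |x| <= 1, and 2 beyond."""
    ax = np.abs(x)
    return np.where(ax <= 1.0 / U, U + 1.0, np.where(ax <= 1.0, 2.0 / ax, 2.0))


U = 50.0
# Symmetric grid on [-5, 5] that avoids x = 0, where the ratio is undefined.
positive = np.linspace(1e-6, 5.0, 200_001)
x = np.concatenate([-positive[::-1], positive])

bound_holds = bool(np.all(target(x, U) <= envelope(x, U) + 1e-9))
print("piecewise bound holds on the grid:", bound_holds)
```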

In turn, we get that

[TABLE]

By splitting the integration domain into the sets $\left\{|x|\leq\frac{1}{U_{n}}\right\}$ , $\left\{\frac{1}{U_{n}}<|x|\leq 1\right\}$ , $\left\{|x|>1\right\}$ and recalling that $r\in(1,2)$ , condition (7.1) will take the form:

[TABLE]

In light of the form of $U_{n}$ the last inequality holds. Recall that $U_{n}=2w_{n}^{1/(r-2)}$. Therefore, (7.5) is satisfied, which implies condition (7.1), and the proof is complete. ∎

So far we have constructed the co-jump measure, which satisfies (7.1), and the covariance matrices for the hypothesis test, so the triplets for the hypothesis test are now defined. As a next step, we study the characteristic functions of the two processes, which will be useful later in the proof of Property 2.

7.3 Characteristic functions of $\textbf{X}_{1/n}$ and $\textbf{Y}_{1/n}$

At this point, we study the processes $\textbf{X},\textbf{Y}$ for one observation at time $t=\frac{1}{n}$. We denote by $\psi_{n}(\textbf{u})$, $\phi_{n}(\textbf{u})$ the characteristic functions of $\textbf{X}_{1/n}$, $\textbf{Y}_{1/n}$ respectively, and by $\eta_{n}(\textbf{u})=\psi_{n}(\textbf{u})-\phi_{n}(\textbf{u})$ their difference. The characteristic triplet for each process is

[TABLE]

where $\Sigma_{\textbf{X}}\Sigma_{\textbf{X}}^{\top}=\left(\begin{smallmatrix}2&1\\ 1&1\end{smallmatrix}\right)$ and $\Sigma_{\textbf{Y}}\Sigma_{\textbf{Y}}^{\top}=\left(\begin{smallmatrix}2&1+2w_{n}\\ 1+2w_{n}&1\end{smallmatrix}\right)$ . Denote by $C_{\textbf{X}}=\Sigma_{\textbf{X}}\Sigma_{\textbf{X}}^{\top}$ and $C_{\textbf{Y}}=\Sigma_{\textbf{Y}}\Sigma_{\textbf{Y}}^{\top}$ . The characteristic functions will be defined as follows

[TABLE]

and

[TABLE]

We denote by

[TABLE]

and

[TABLE]

because of the form of co-jump measure (7.10) and the fact that $H_{n}$ is an even function, its Fourier transform will be a real function. Also, recall the Fourier transform of the co-jump measure has support on $\mathbb{R}$ and $A$ is a subset of the diagonal. Moreover,

[TABLE]

Next we bound from above (7.17) and (7.18). First, observe that

[TABLE]

since $H_{n}$ is an even function. We consider the following two cases:

[TABLE]

Now, concerning the $\tilde{\phi}_{n}(U)$ we exploit the same arguments as before and by (7.9) we obtain

[TABLE]

7.4 Total variation distance

In order to establish a lower bound for our class with the rates (4.6), the last ingredient is to show that the total variation distance between $\mathbb{P}_{\textbf{X}}$ and $\mathbb{P}_{\textbf{Y}}$ tends towards zero, i.e. Property 2. Formally,

[TABLE]

As we discussed in the first step, X and Y have a nonvanishing Gaussian part so that the variables $\textbf{X}_{1/n}$ and $\textbf{Y}_{1/n}$ have densities. Here, $f_{1/n}$ and $g_{1/n}$ denote their densities respectively. Also, $k_{n}=f_{1/n}-g_{1/n}$ denotes the difference between the densities. One would be tempted to use the following

[TABLE]

In the second equality we wrote the density function as the inverse Fourier transform of its characteristic function. But the last integral is infinite, so this approach does not serve our purpose. Since we want to prove that

[TABLE]

we use that the total variation distance between $\mathbb{P}_{\textbf{X}}$ and $\mathbb{P}_{\textbf{Y}}$ is at most $2n$ times $\int|k_{n}(x)|dx$. Using the same argument as in Theorem 3.1 of Jacod and Reiß (2014), together with the Cauchy-Schwarz inequality and the Plancherel theorem, we obtain

[TABLE]

where $\partial_{1}\eta_{n}$ is the first derivative of $\eta_{n}(U)$. In the second inequality we used the Cauchy-Schwarz inequality, and in the last one the Plancherel identity. For simplicity, recall that we use equal coordinates in the vector $\textbf{u}=(U,U)$.

Thus, the only ingredient which remains to be shown is the following lemma in order to satisfy Property 2.

Lemma 7.6.

We have that

[TABLE]

as $n\to\infty$ .

Proof.

First, we study the convergence of $\eta_{n}(U)$ :

[TABLE]

The last inequality holds due to the fact that $1-e^{-x}\leq x$ .

Observe that when $|U|\leq U_{n}$ , $\eta_{n}(U)=0$ because of the constant value of $h_{n}$ inside this interval. Thus the difference of the characteristic functions vanishes for $|U|\leq U_{n}$ because of $2\tilde{\psi}_{n}(U)=w_{n}U^{2}$ by (7.18).

By means of $\tilde{\phi}_{n}(U),\tilde{\psi}_{n}(U)\geq 0$ , we get that $\psi_{n}(U)\leq e^{-\frac{U^{2}}{2n}}$ and $\phi_{n}(U)\leq e^{-\frac{U^{2}}{2n}}$ . Thus,

[TABLE]

We define by

[TABLE]

Using Cauchy-Schwarz inequality we bound the integral (7.27) from above by

[TABLE]

The integrals to the right can be calculated exactly by calculus methods, and recalling $U_{n}=2w_{n}^{1/(r-2)}$ we get

[TABLE]

The first part of (7.25) follows. Now recall the form of the characteristic functions and their difference $\eta_{n}=\psi_{n}(U)-\phi_{n}(U)$

[TABLE]

Therefore by (7.19), (7.20), and the fact that $\psi_{n}(U)\leq e^{-\frac{U^{2}}{2n}}$ , $\phi_{n}(U)\leq e^{-\frac{U^{2}}{2n}}$ we get that

[TABLE]

Therefore,

[TABLE]

Now, $\frac{1+w_{n}^{\frac{r-1}{r-2}}}{n^{2}}$ and $\frac{1+w_{n}^{\frac{r-1}{r-2}}}{n}$ tend towards zero. Additionally, the integrals on the right side can be bounded again using Cauchy-Schwarz inequality, like integral A. Following these, we can calculate the integrals through basic calculus methods exactly. As a result,

[TABLE]

which also goes to zero as $n\to\infty$ and the proof is completed. ∎

End of the proof of Theorem 4.3

Lower bound for the rate $w_{n}=\frac{1}{\sqrt{n}}$ when $r\in(0,2)$.

To prove this bound, it is enough to show that it holds on the subclass of all Brownian motions, since $\mathcal{B}_{M}\subset\mathcal{L}^{r}_{M}$. Together with Lemma 7.2, this yields the bound.

Lower bound for the rate $w_{n}=1/(n\log n)^{\frac{2-r}{2}}$ when $r\in(1,2)$ .

The main steps of this proof are to show that Property 1 and Property 2 are satisfied. With reference to Lemma 7.4 for the construction of the co-jump measure, Proposition 7.5 and Lemma 7.6, we conclude the proof of Theorem 4.3.
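The two regimes of the rate can be compared numerically. The sketch below assumes that (4.6) reads $w_{n}=1/\sqrt{n}$ for $r\leq 1$ and $w_{n}=(n\log n)^{(r-2)/2}$ for $r\in(1,2)$, as stated in the two lower bounds above; the function name and the sample values of $n$ and $r$ are hypothetical.

```python
import math


def minimax_rate(n, r):
    """Rate w_n: n^(-1/2) for r <= 1 and (n log n)^((r - 2) / 2) for 1 < r < 2."""
    if n < 2:
        raise ValueError("n must be at least 2")
    if not 0.0 <= r < 2.0:
        raise ValueError("r must lie in [0, 2)")
    if r <= 1.0:
        return n ** -0.5
    return (n * math.log(n)) ** ((r - 2.0) / 2.0)


n = 10_000
for r in (0.5, 1.0, 1.2, 1.5, 1.9):
    print("r =", r, "w_n =", minimax_rate(n, r))
```

For fixed $n$ the rate deteriorates as $r$ increases beyond $1$, reflecting that more active co-jumps make the covariance harder to separate from the jump part.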

8 Discussion

In this section we make some remarks concerning the upper bound and the rates of convergence. First, we compare the efficiency of our estimator with the work of Mancini (2017), who considered a two-dimensional Itô semimartingale in which at least one jump component has infinite variation. Mancini (2017) introduced the truncated realized covariance (TRC) as an estimator for the co-integrated volatility. The proposed estimator is

[TABLE]

where $r_{h}=h^{2u}$ is the truncation level with $h=1/n$, $u\in(0,\frac{1}{2})$ and $n\to\infty$. It is clear that, as $r_{h}\to 0$, asymptotically all jumps are excluded. It is assumed that the two jump components have activity indices $r_{1}$, $r_{2}$ with $0\leq r_{1}\leq r_{2}<2$ and $r_{2}\geq 1$. Recall from Lemma 5.3 that in our setting the index $r$ refers to the activity of the co-jumps. In the following we write “$\gg 1$” to mean that an activity index is much greater than $1$ and close to $2$.

The truncated estimator achieves the rate $(1-\gamma)\sqrt{r_{h}}^{(1+\frac{r_{2}}{r_{1}}-r_{2})}$ when $r_{1},r_{2}\gg 1$. It reaches the rate $h\sqrt{r_{h}}^{-\frac{r_{2}}{2}}$ when the two jump components are independent, or when $r_{2}\gg r_{1}$ and $r_{1}$ is small, and the rate $\sqrt{h}$ when $r_{1}$ is small and $r_{2}$ is close to $1$. The parameter $\gamma\in[0,1]$ describes the dependence structure of the two jump components: $\gamma=0$ corresponds to full dependence, while $\gamma=1$ corresponds to independence.

Finally, for a fair comparison with the spectral estimator we take $\sqrt{r_{h}}$ to be approximately $\frac{1}{\sqrt{n}}$ as truncation level, since $u\in(0,\frac{1}{2})$. This truncation level is not optimal; Figueroa-López and Mancini (2017) proposed an optimal choice of the truncation level, based on the mean and conditional mean square error, for the case of a one-dimensional Itô semimartingale.
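The displayed formula for the TRC is not reproduced above, so the following Python sketch only illustrates the truncation idea under an assumed form: a sum of products of increments in which a term is kept only when both squared increments are at most $r_{h}=h^{2u}$. The function name `truncated_realized_covariance`, the exact form of the indicator and the toy increments are assumptions made for illustration and need not coincide with the precise definition in Mancini (2017).

```python
from typing import Sequence


def truncated_realized_covariance(
    dx1: Sequence[float], dx2: Sequence[float], n: int, u: float
) -> float:
    """Sum of increment products, keeping only terms whose two squared
    increments are both at most r_h = h^(2u) with h = 1/n (assumed form)."""
    if len(dx1) != len(dx2):
        raise ValueError("increment series must have the same length")
    h = 1.0 / n
    r_h = h ** (2.0 * u)  # truncation level from the text
    total = 0.0
    for a, b in zip(dx1, dx2):
        if a * a <= r_h and b * b <= r_h:
            total += a * b
    return total


# Toy increments (hypothetical): the third pair mimics a large co-jump
# and is removed by the truncation.
toy_dx1 = [0.01, -0.02, 0.50, 0.015]
toy_dx2 = [0.02, 0.01, 0.40, -0.01]
trc_value = truncated_realized_covariance(toy_dx1, toy_dx2, n=1000, u=0.387)
print(trc_value)
```

In this toy example only the three small increment pairs contribute to the sum, which is the mechanism by which jumps are excluded as $r_{h}\to 0$.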

The performance of the two estimators (spectral and truncated) is summarized in the following table. For simplicity we consider only the two extreme cases of dependence. In the first two rows we assume $\gamma=0$, i.e., full dependence among the jump components, while in the last row we consider $\gamma=1$, i.e., the jumps are independent. To compare the estimators we use Lemma 5.3 and Example 5.4.

[TABLE]

Here we notice that, under dependence among the jump components, the spectral estimator achieves a faster rate than the truncated estimator when both jump components have infinite variation. The truncated estimator, however, attains the same rate as the spectral estimator when both jump components have index activity close to $1$. Finally, when the jump components are independent, or when $r_{1}$ is much smaller than $r_{2}$, the spectral estimator reaches a faster rate.
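To get a feel for the size of the spectral rates entering this comparison, the short Python sketch below evaluates the two regimes $1/\sqrt{n}$ for $r\leq 1$ and $(n\log n)^{(r-2)/2}$ for $r\in(1,2)$ on a few sample sizes. The function name `spectral_rate` and the chosen grid of $n$ and $r$ are illustrative assumptions; constants are ignored, so only orders of magnitude are meaningful.

```python
import math


def spectral_rate(n: int, r: float) -> float:
    """Rate w_n of the spectral estimator, up to constants:
    n^(-1/2) for r <= 1 and (n log n)^((r-2)/2) for 1 < r < 2."""
    if not 0.0 < r < 2.0:
        raise ValueError("the co-jump activity index r must lie in (0, 2)")
    if r <= 1.0:
        return n ** -0.5
    return (n * math.log(n)) ** ((r - 2.0) / 2.0)


for r in (0.5, 1.0, 1.2, 1.5, 1.8):
    rates = [spectral_rate(n, r) for n in (1_000, 10_000, 100_000)]
    print(r, ["%.4f" % w for w in rates])
```

For a fixed $r>1$ the rate deteriorates as $r$ approaches $2$, since the exponent $(r-2)/2$ tends to zero.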

8.1 Numerical experiments.

In this section we test our estimates with Monte-Carlo experiments (the interested reader can view the code at https://github.com/KarinaPapayia/Co-integrated-volatility-multidimensional-Levy-processes ). This means that we first have to simulate the sample paths of a bivariate Lévy process on $[0,1]$.

Section 6 of Tankov and Cont (2003) suggested various simulation algorithms for Lévy processes. We extend here Algorithms 6.6, 6.5, 6.3 to a bivariate setting. In addition, we use the generalized shot noise method for series representation of a two-dimensional Lévy process of infinite variation introduced by Rosinski (1990).

We now perform Monte-Carlo tests of our spectral estimate $\widehat{C}_{N}^{12}(U_{N})$, comparing it to the truncated realized covariance (TRC) estimate $\widehat{IC}_{T}$ of Mancini (2017) for a two-dimensional Itô semimartingale. To provide a balanced comparison, we draw our observations from a process $X_{t}=B_{t}+J_{t}$, where $B_{t}$ is a two-dimensional Brownian motion and $J_{t}$ is a two-dimensional jump process whose jumps are driven by a two-dimensional $r$-stable process. Thus $X_{t}$ has both diffusion and jump components. In each run of our simulation we generate $N=1,000$ observations, corresponding to increments observed every $1/1,000$ over the time interval $[0,1]$.
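As a rough illustration of how increments of such a process $X_{t}=B_{t}+J_{t}$ can be generated, the Python sketch below adds Gaussian increments to stable jump increments. It deliberately swaps in a simpler technique than the series representation used in our experiments: each jump component is drawn with the Chambers-Mallows-Stuck method for a symmetric stable law and rescaled by $h^{1/r_{i}}$. The symmetry, the unit scale, the independence of the two components (both Brownian and jump) and the names `symmetric_stable` and `simulate_increments` are all assumptions of this sketch.

```python
import math
import random
from typing import List, Tuple


def symmetric_stable(alpha: float, rng: random.Random) -> float:
    """One draw from a standard symmetric alpha-stable law
    (Chambers-Mallows-Stuck method)."""
    v = rng.uniform(-math.pi / 2.0, math.pi / 2.0)
    w = rng.expovariate(1.0)
    return (
        math.sin(alpha * v) / math.cos(v) ** (1.0 / alpha)
        * (math.cos((1.0 - alpha) * v) / w) ** ((1.0 - alpha) / alpha)
    )


def simulate_increments(
    n: int, r1: float, r2: float, seed: int = 0
) -> Tuple[List[float], List[float]]:
    """Increments of X = B + J on [0, 1] with step h = 1/n.
    Assumption: independent components, for illustration only."""
    rng = random.Random(seed)
    h = 1.0 / n
    dx1: List[float] = []
    dx2: List[float] = []
    for _ in range(n):
        db1 = math.sqrt(h) * rng.gauss(0.0, 1.0)
        db2 = math.sqrt(h) * rng.gauss(0.0, 1.0)
        # Self-similarity: a stable increment over h scales like h^(1/alpha).
        dj1 = h ** (1.0 / r1) * symmetric_stable(r1, rng)
        dj2 = h ** (1.0 / r2) * symmetric_stable(r2, rng)
        dx1.append(db1 + dj1)
        dx2.append(db2 + dj2)
    return dx1, dx2


sim_dx1, sim_dx2 = simulate_increments(n=1000, r1=1.2, r2=1.8)
print(len(sim_dx1), max(abs(x) for x in sim_dx1), max(abs(x) for x in sim_dx2))
```

With independent components there are no genuine co-jumps, so this sketch only reproduces the marginal jump activity, not the dependence structure studied above.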

The estimates $\widehat{C}_{N}^{12}(U_{N})$ and $\widehat{IC}_{T}$ depend on a number of parameters. We begin by considering the covariance matrix $C=\left(\begin{smallmatrix}2&1\\ 1&1\end{smallmatrix}\right)$ for two correlated Brownian motions. In our simulations the co-integrated volatility of $X_{t}$ is therefore equal to $1$, and we choose the parameters accordingly. In our tests, we found that the value $M=4.229$ worked well for bounding the jump activity from above in the case of infinite variation jumps. For $\widehat{IC}_{T}$ we chose $h=1/1,000$, $u=0.387$, and hence the truncation level $r_{h}=\left(\frac{1}{1,000}\right)^{2\cdot 0.387}$. We found that this truncation level cuts the jumps bigger than $\left(\frac{1}{1,000}\right)^{2\cdot 0.387}$, which means that almost all jumps were eliminated.
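The role of the covariance matrix $C$ can be checked with a small Python sketch: correlated Brownian increments are generated through the Cholesky factor of $C$, and the plain realized covariance of a jump-free path should then be close to $C^{12}=1$. The helper names `brownian_increments` and `realized_covariance` and the seed are hypothetical, and the realized covariance is used here only as a sanity check, not as one of the two estimators under study.

```python
import math
import random
from typing import List, Sequence, Tuple


def brownian_increments(n: int, seed: int = 0) -> Tuple[List[float], List[float]]:
    """Increments over steps h = 1/n of a Brownian motion with
    covariance matrix C = [[2, 1], [1, 1]]."""
    rng = random.Random(seed)
    h = 1.0 / n
    # Cholesky factor L of C (L L^T = C):
    # L = [[sqrt(2), 0], [1/sqrt(2), 1/sqrt(2)]]
    l11 = math.sqrt(2.0)
    l21 = 1.0 / math.sqrt(2.0)
    l22 = 1.0 / math.sqrt(2.0)
    db1: List[float] = []
    db2: List[float] = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        db1.append(math.sqrt(h) * l11 * z1)
        db2.append(math.sqrt(h) * (l21 * z1 + l22 * z2))
    return db1, db2


def realized_covariance(dx1: Sequence[float], dx2: Sequence[float]) -> float:
    """Sum of products of increments."""
    return sum(a * b for a, b in zip(dx1, dx2))


bm1, bm2 = brownian_increments(n=1000)
print(realized_covariance(bm1, bm2))  # should be near C^{12} = 1
print(realized_covariance(bm1, bm1))  # should be near C^{11} = 2
```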

Figure 1 plots the simulated distributions of the estimates $\widehat{C}_{N}^{12}(U_{N})$ and $\widehat{IC}_{T}$ together with the density of a standard Gaussian distribution, shown as a solid line. We can see that, for every choice of $r_{1},r_{2}$, the estimates are centered around $1$, which is the theoretical co-integrated volatility.

Figure 2 plots the RMSEs of the estimates $\widehat{C}_{N}^{12}(U_{N})$ and $\widehat{IC}_{T}$ against different choices of the index activity of the co-jumps. We study the performance of the estimates under finite, moderate, and infinite activity of co-jumps. We can see that, as $N$ grows, the RMSE of the spectral estimate $\widehat{C}_{N}^{12}(U_{N})$ becomes smaller than that of the truncated estimate. However, for $N=1,000$ the RMSEs of the truncated estimate $\widehat{IC}_{T}$ are smaller than those of the spectral estimate.
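For readers who wish to reproduce this kind of RMSE curve, the following Python sketch shows the Monte-Carlo bookkeeping only. As a stand-in estimator it uses the plain realized covariance on a jump-free Brownian path with $C^{12}=1$, not the spectral or truncated estimate; the number of repetitions, the seed and the helper names `rmse` and `realized_cov_one_run` are assumptions of the sketch.

```python
import math
import random
from typing import Dict, Sequence


def rmse(estimates: Sequence[float], truth: float) -> float:
    """Root mean squared error of Monte-Carlo estimates around the true value."""
    return math.sqrt(sum((e - truth) ** 2 for e in estimates) / len(estimates))


def realized_cov_one_run(n: int, rng: random.Random) -> float:
    """Realized covariance of one simulated Brownian path with
    covariance matrix [[2, 1], [1, 1]] (stand-in estimator)."""
    h = 1.0 / n
    total = 0.0
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        db1 = math.sqrt(2.0 * h) * z1
        db2 = math.sqrt(h / 2.0) * (z1 + z2)
        total += db1 * db2
    return total


rng = random.Random(1)
runs = 100  # number of Monte-Carlo repetitions (assumed)
rmse_by_n: Dict[int, float] = {}
for n in (1000, 4000):
    estimates = [realized_cov_one_run(n, rng) for _ in range(runs)]
    rmse_by_n[n] = rmse(estimates, truth=1.0)
    print(n, rmse_by_n[n])
```

Replacing `realized_cov_one_run` by an implementation of either estimator, applied to paths with jumps, gives the quantities plotted against the index activity.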

We observe this behavior of the truncated estimate $\widehat{IC}_{T}$ in Figure 2 because of our choice of truncation level, which is not optimal. While the threshold $r_{h}=\left(\frac{1}{1,000}\right)^{2\cdot 0.387}$ works well for $N=1,000$, it does not work well when the number of observations is larger, for example when $N=10,000$.
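One way to see why a threshold tuned for $N=1,000$ may be unsuitable for larger $N$ is to express the cutoff in units of the standard deviation of a Brownian increment. The Python sketch below does this under two assumptions that are not stated above: the truncation is applied to squared increments, so that the cutoff for $|\Delta X|$ is $\sqrt{r_{h}}$, and the threshold is either held fixed at its $N=1,000$ value or rescaled as $h^{2u}$ with $h=1/N$. The variable names are hypothetical.

```python
import math

U = 0.387
fixed_r_h = (1.0 / 1000.0) ** (2.0 * U)  # threshold tuned for N = 1,000

cutoff_in_sd = {}
for n in (1000, 5000, 10000):
    h = 1.0 / n
    scaled_r_h = h ** (2.0 * U)  # threshold recomputed with h = 1/N
    sd_first = math.sqrt(2.0 * h)  # sd of a Brownian increment with C^{11} = 2
    cutoff_in_sd[n] = (
        math.sqrt(fixed_r_h) / sd_first,
        math.sqrt(scaled_r_h) / sd_first,
    )
    print(n, "fixed: %.2f sd" % cutoff_in_sd[n][0],
          "rescaled: %.2f sd" % cutoff_in_sd[n][1])
```

Under these assumptions a fixed threshold becomes increasingly loose relative to the diffusive increments as $N$ grows, so more jump increments survive the truncation.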

Figures 3 and 4 give violin plots for the spectral estimate $\widehat{C}_{N}^{12}(U_{N})$ for several choices of the number of observations $N$ and of the index activity of the co-jumps, while Figures 5 and 6 show violin plots for the truncated estimate $\widehat{IC}_{T}$ under the same settings. The number of observations varies from $1,000$ to $10,000$ in steps of $1,000$.

In Figure 3 we used the index activities $r_{1}=0.5$, $r_{2}=0.8$ for the jumps, while in Figure 4 we set $r_{1}=1.2$ and $r_{2}=1.8$. In the latter case, the estimate of the covariance deviates slightly from the center as $N$ grows. This effect can be expected to disappear as $N$ tends to infinity.

Figures 5 and 6 show again that the truncated estimate $\widehat{IC}_{T}$ deviates strongly from the center as $N$ grows, an expected effect due to the choice of truncation level. The chosen threshold works well for $N=1,000$ but not for larger $N$. We expect this effect to disappear once the optimal choice for the threshold $r_{h}$ is established.

Finally, $U_{N}$ is the parameter which controls the frequency for our spectral estimate $\widehat{C}_{N}^{12}(U_{N})$; it depends on $N$, $M$ and $r$. In view of the form (6.10) of $U_{N}$, we can find a multiplicative constant that gives the optimal choice of $U_{N}$, and the results still hold. Figure 7 shows violin plots for the spectral estimate as the parameter $M$ is tuned over the range from $3.30$ to $4.40$. In fact, Figure 7 shows that for $M>3.31$, $N=5,000$ and $r=1.5$ the spectral estimate is centered around the theoretical co-integrated volatility $C^{12}$.

Figure 8 shows that the truncated estimate $\widehat{IC}_{T}$ is quite sensitive to the choice of threshold. Here, we used $N=5,000$ , $r_{1}=1.2$ , $r_{2}=1.5$ , $h=1/5,000$ and $u$ varies from $0.41$ to $0.42$ . Recall that $r_{h}=h^{2u}$ .
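The sensitivity to $u$ can already be seen from the arithmetic of $r_{h}=h^{2u}$. The Python sketch below tabulates the threshold for $h=1/5,000$ over a few values of $u$ in the range used for Figure 8; the particular grid of $u$ values and the variable names are chosen here for illustration only.

```python
h = 1.0 / 5000.0

thresholds = {}
for u in (0.410, 0.412, 0.415, 0.420):
    thresholds[u] = h ** (2.0 * u)  # r_h = h^(2u)
    print("u = %.3f  r_h = %.6f" % (u, thresholds[u]))

# Relative change of the threshold across the whole range of u.
ratio = thresholds[0.410] / thresholds[0.420]
print("r_h(0.41) / r_h(0.42) = %.4f" % ratio)
```

A change of $0.01$ in $u$ thus moves the threshold by a factor $5{,}000^{0.02}$, which is a noticeable shift relative to the size of the increments being truncated.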

We notice that the threshold estimate deviates strongly from the theoretical co-integrated volatility. Figure 8 shows that the threshold estimate is centered around $C^{12}$ when $u=0.412$. The optimal choice of the threshold is not trivial. To sum up, it is easier to tune the parameters of the spectral estimate than the threshold of the truncated estimate.

Acknowledgement

The author is very grateful to an anonymous referee for many helpful questions and remarks that have led to considerable improvements and to Markus Reiß for stimulating comments and discussions. This work was partially supported by Deutsche Forschungsgemeinschaft via IRTG 1792 High Dimensional Nonstationary Time Series.

Bibliography

1. Aït-Sahalia and Jacod (2011): Y. Aït-Sahalia and J. Jacod. Testing whether jumps have finite or infinite activity. The Annals of Statistics, 2011.
2. Andersen and Bollerslev (1998): T. Andersen and T. Bollerslev. Answering the skeptics: Yes, standard volatility models do provide accurate forecasts. International Economic Review, 1998.
3. Barndorff-Nielsen and Shephard (2002): O. Barndorff-Nielsen and N. Shephard. Econometric analysis of realized volatility and its use in estimating stochastic volatility models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 2002.
4. Barndorff-Nielsen and Shephard (2004): O. Barndorff-Nielsen and N. Shephard. Power and bipower variation with stochastic volatility and jumps. Journal of Financial Econometrics, 2004.
5. Barndorff-Nielsen and Shephard (2006): O. Barndorff-Nielsen and N. Shephard. Econometrics of testing for jumps in financial economics using bipower variation. Journal of Financial Econometrics, 2006.
6. Belomestny (2010): D. Belomestny. Spectral estimation of the fractional order of a Lévy process. The Annals of Statistics, 2010.
7. Belomestny and Panov (2013a): D. Belomestny and V. Panov. Abelian theorems for stochastic volatility models with application to the estimation of jump activity of volatility. Stochastic Processes and their Applications, 2013a.
8. Belomestny and Panov (2013b): D. Belomestny and V. Panov. Estimation of the activity of jumps in time-changed Lévy models. Electronic Journal of Statistics, 2013b.
