# On a spiked model for large volatility matrix estimation from noisy high-frequency data

**Authors:** Keren Shen, Jianfeng Yao, Wai Keung Li

arXiv: 1702.03417 · 2017-02-14

## TL;DR

This paper proposes a spiked model for large integrated covariance (volatility) matrices estimated from noisy high-frequency data. The model infers the spikes of the true matrix from those of the pre-averaging estimator (PA-RCov), and it is validated by simulations and by US and Hong Kong market data.

## Contribution

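The paper proposes a spiked model for integrated covariance matrices (ICVs) in which the spikes are inferred from those of the observable PA-RCov matrices. The inference procedure is proven consistent, and on real data the model outperforms the existing one in predicting the existence of spikes and in mimicking the empirical PA-RCov matrices.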

## Key findings

- Empirical PA-RCov spectra are spiked: a few large eigenvalues separate from a continuous bulk, so inference on the ICV must account for this structure.
- Simulation studies confirm the consistency of the inference procedure.
- On US and Hong Kong market data, the spiked model outperforms the existing model in predicting the existence of spikes and in mimicking the empirical PA-RCov matrices.

## Abstract

Recently, inference about high-dimensional integrated covariance matrices (ICVs) based on noisy high-frequency data has emerged as a challenging problem. In the literature, a pre-averaging estimator (PA-RCov) is proposed to deal with the microstructure noise. Using the large-dimensional random matrix theory, it has been established that the eigenvalue distribution of the PA-RCov matrix is intimately linked to that of the ICV through the Marcenko-Pastur equation. Consequently, the spectrum of the ICV can be inferred from that of the PA-RCov. However, extensive data analyses demonstrate that the spectrum of the PA-RCov is spiked, that is, a few large eigenvalues (spikes) stay away from the others which form a rather continuous distribution with a density function (bulk). Therefore, any inference on the ICVs must take into account this spiked structure. As a methodological contribution, we propose a spiked model for the ICVs where spikes can be inferred from those of the available PA-RCov matrices. The consistency of the inference procedure is established and is checked by extensive simulation studies. In addition, we apply our method to the real data from the US and Hong Kong markets. It is found that our model clearly outperforms the existing one in predicting the existence of spikes and in mimicking the empirical PA-RCov matrices.
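To make the notion of a spiked spectrum concrete, here is a minimal sketch in a much simpler setting than the paper's. It assumes i.i.d. Gaussian observations and a plain sample covariance matrix, not noisy high-frequency data or the PA-RCov estimator. The population covariance is the identity except for two spikes; the sample eigenvalues then split into a Marcenko-Pastur bulk and a few outliers, and the classical spiked-covariance relation `lam = alpha + y * alpha / (alpha - 1)` (with `y = p / n`) is inverted to recover the population spikes. All function names, parameter values, and the 10% detection margin are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def mp_upper_edge(y):
    """Right edge of the Marcenko-Pastur bulk for unit-variance noise, y = p / n."""
    return (1.0 + np.sqrt(y)) ** 2


def invert_spike(lam, y):
    """Map a sample spike eigenvalue lam back to a population spike alpha.

    Solves lam = alpha + y * alpha / (alpha - 1) for the root with
    alpha > 1 + sqrt(y); requires lam above the bulk edge.
    """
    b = 1.0 + lam - y
    return 0.5 * (b + np.sqrt(b * b - 4.0 * lam))


def simulate(p=200, n=800, spikes=(10.0, 5.0), margin=0.1, seed=0):
    """Simulate a spiked sample covariance and estimate its spikes.

    Hypothetical toy setup: identity population covariance with the first
    len(spikes) variances replaced by the spike values.
    """
    rng = np.random.default_rng(seed)
    pop = np.ones(p)
    pop[: len(spikes)] = spikes
    # Rows are observations; scale each column by its population std.
    X = rng.standard_normal((n, p)) * np.sqrt(pop)
    S = X.T @ X / n
    eigs = np.linalg.eigvalsh(S)[::-1]  # descending order
    y = p / n
    # Assumed ad hoc margin to absorb finite-sample fluctuation at the edge.
    threshold = mp_upper_edge(y) * (1.0 + margin)
    detected = eigs[eigs > threshold]
    return eigs, detected, invert_spike(detected, y)


if __name__ == "__main__":
    eigs, detected, estimates = simulate()
    print("bulk edge:", mp_upper_edge(200 / 800))
    print("detected sample spikes:", detected)
    print("estimated population spikes:", estimates)
```

The raw sample spikes are biased upward relative to the population values, which is why the inversion step matters. The paper addresses the analogous problem for ICVs observed through PA-RCov matrices, where the mapping between sample and population spikes is different from the one assumed here.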

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1702.03417/full.md

## Figures

23 figures with captions in the complete paper: https://tomesphere.com/paper/1702.03417/full.md

## References

36 references — full list in the complete paper: https://tomesphere.com/paper/1702.03417/full.md

---
Source: https://tomesphere.com/paper/1702.03417