Robust High-Dimensional Time-Varying Coefficient Estimation
Minseok Shin, Donggyu Kim

TL;DR
This paper introduces RED-LASSO, a robust method for high-dimensional, time-varying coefficient estimation using high-frequency data, effectively handling heavy tails and sparsity.
Contribution
The paper develops a novel robust estimation procedure combining Huber loss, debiasing, and thresholding for high-dimensional, time-varying models with heavy-tailed data.
Findings
Achieves near-optimal convergence rates.
Successfully applied to high-frequency trading data.
Handles heavy tails and coefficient sparsity effectively.
In-sample

| Period | RED-LASSO | ED-LASSO | LASSO |
|---|---|---|---|
| Whole period | 0.261 | 0.196 | 0.220 |
| 2013 | 0.254 | 0.151 | 0.202 |
| 2014 | 0.233 | 0.201 | 0.187 |
| 2015 | 0.282 | 0.272 | 0.257 |
| 2016 | 0.267 | 0.085 | 0.214 |
| 2017 | 0.206 | 0.137 | 0.158 |
| 2018 | 0.339 | 0.335 | 0.315 |
| 2019 | 0.247 | 0.191 | 0.208 |

Out-of-sample

| Period | RED-LASSO | ED-LASSO | LASSO |
|---|---|---|---|
| Whole period | 0.248 | 0.167 | 0.216 |
| 2014 | 0.214 | 0.136 | 0.181 |
| 2015 | 0.270 | 0.234 | 0.245 |
| 2016 | 0.248 | 0.082 | 0.210 |
| 2017 | 0.194 | 0.094 | 0.152 |
| 2018 | 0.329 | 0.281 | 0.304 |
| 2019 | 0.231 | 0.173 | 0.202 |

| Type | Symbol | AAPL RED | AAPL ED | AAPL LASSO | BRK.B RED | BRK.B ED | BRK.B LASSO | GM RED | GM ED | GM LASSO | GOOG RED | GOOG ED | GOOG LASSO | XOM RED | XOM ED | XOM LASSO |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Commodity | CA | 0 | 20 | 0 | 0 | 22 | 0 | 0 | 27 | 0 | 1 | 28 | 0 | 0 | 29 | 0 |
| | CL | 0 | 15 | 0 | 0 | 15 | 2 | 1 | 21 | 1 | 1 | 15 | 0 | 11 | 34 | 48 |
| | GC | 1 | 17 | 0 | 2 | 11 | 0 | 4 | 25 | 0 | 2 | 15 | 0 | 0 | 25 | 1 |
| | HG | 1 | 16 | 0 | 1 | 23 | 0 | 1 | 19 | 1 | 4 | 18 | 0 | 3 | 16 | 4 |
| | HO | 0 | 12 | 0 | 0 | 20 | 0 | 3 | 12 | 1 | 0 | 7 | 0 | 2 | 14 | 42 |
| | ML | 0 | 20 | 0 | 0 | 15 | 0 | 0 | 22 | 0 | 1 | 17 | 0 | 0 | 12 | 1 |
| | NG | 2 | 8 | 0 | 0 | 9 | 0 | 1 | 10 | 0 | 0 | 3 | 0 | 0 | 6 | 1 |
| | OJ | 1 | 11 | 0 | 0 | 9 | 0 | 0 | 15 | 0 | 0 | 23 | 0 | 1 | 15 | 0 |
| | PA | 0 | 9 | 0 | 1 | 7 | 0 | 1 | 11 | 0 | 1 | 13 | 0 | 0 | 11 | 1 |
| | PL | 2 | 7 | 0 | 1 | 14 | 0 | 0 | 22 | 0 | 2 | 14 | 0 | 0 | 15 | 1 |
| | RB | 1 | 9 | 0 | 2 | 15 | 0 | 2 | 14 | 1 | 0 | 17 | 0 | 2 | 12 | 36 |
| | RM | 0 | 15 | 0 | 0 | 14 | 0 | 0 | 15 | 0 | 0 | 12 | 0 | 0 | 10 | 0 |
| | RS | 0 | 9 | 0 | 0 | 10 | 0 | 0 | 7 | 0 | 0 | 7 | 0 | 0 | 6 | 0 |
| | SI | 0 | 18 | 0 | 1 | 16 | 0 | 3 | 13 | 0 | 2 | 18 | 0 | 0 | 17 | 1 |
| | ZC | 0 | 21 | 0 | 1 | 19 | 0 | 3 | 26 | 0 | 0 | 16 | 0 | 0 | 16 | 0 |
| | ZL | 1 | 19 | 0 | 1 | 15 | 0 | 0 | 20 | 0 | 2 | 15 | 0 | 0 | 19 | 1 |
| | ZM | 1 | 13 | 0 | 1 | 17 | 0 | 1 | 19 | 0 | 1 | 19 | 0 | 2 | 14 | 0 |
| | ZO | 0 | 10 | 0 | 0 | 16 | 0 | 1 | 16 | 0 | 1 | 19 | 0 | 1 | 16 | 0 |
| | ZR | 0 | 12 | 0 | 0 | 14 | 0 | 0 | 12 | 0 | 0 | 16 | 0 | 0 | 17 | 0 |
| | ZW | 1 | 15 | 0 | 0 | 13 | 0 | 0 | 16 | 0 | 3 | 23 | 0 | 0 | 12 | 0 |
| Currency | A6 | 1 | 15 | 0 | 2 | 23 | 1 | 1 | 24 | 2 | 1 | 17 | 0 | 1 | 11 | 6 |
| | AD | 0 | 20 | 0 | 2 | 13 | 0 | 2 | 14 | 2 | 3 | 11 | 0 | 4 | 18 | 13 |
| | B6 | 0 | 19 | 0 | 0 | 17 | 0 | 4 | 23 | 0 | 1 | 20 | 0 | 1 | 21 | 0 |
| | BR | 0 | 6 | 0 | 0 | 14 | 0 | 0 | 10 | 0 | 0 | 13 | 0 | 1 | 11 | 1 |
| | DX | 3 | 25 | 1 | 0 | 15 | 0 | 1 | 26 | 0 | 0 | 16 | 0 | 1 | 16 | 1 |
| | E1 | 1 | 15 | 0 | 2 | 17 | 0 | 2 | 22 | 0 | 1 | 18 | 0 | 0 | 14 | 0 |
| | E6 | 1 | 16 | 0 | 0 | 26 | 0 | 0 | 25 | 0 | 0 | 15 | 0 | 1 | 17 | 0 |
| | J1 | 2 | 18 | 2 | 2 | 18 | 9 | 3 | 23 | 5 | 1 | 20 | 1 | 3 | 23 | 3 |
| | RP | 0 | 15 | 0 | 0 | 11 | 0 | 1 | 24 | 0 | 0 | 14 | 0 | 0 | 21 | 0 |
| | RU | 0 | 7 | 0 | 0 | 9 | 0 | 0 | 9 | 0 | 0 | 11 | 0 | 1 | 8 | 1 |
| Interest rate | BTP | 0 | 39 | 0 | 1 | 33 | 0 | 2 | 44 | 0 | 0 | 30 | 0 | 0 | 30 | 0 |
| | ED | 0 | 2 | 0 | 0 | 4 | 0 | 0 | 10 | 0 | 0 | 3 | 0 | 0 | 7 | 0 |
| | G | 0 | 46 | 0 | 2 | 47 | 0 | 2 | 39 | 1 | 1 | 41 | 0 | 1 | 44 | 0 |
| | GG | 0 | 27 | 0 | 0 | 19 | 2 | 2 | 27 | 1 | 3 | 20 | 0 | 0 | 27 | 0 |
| | HR | 0 | 9 | 0 | 2 | 14 | 0 | 1 | 14 | 0 | 3 | 16 | 0 | 0 | 17 | 0 |
| | US | 1 | 15 | 1 | 0 | 9 | 5 | 3 | 21 | 1 | 1 | 14 | 0 | 3 | 15 | 1 |
| | ZF | 1 | 14 | 0 | 2 | 12 | 4 | 0 | 19 | 1 | 0 | 10 | 0 | 3 | 14 | 0 |
| | ZN | 0 | 10 | 1 | 0 | 13 | 4 | 1 | 13 | 2 | 1 | 13 | 0 | 2 | 15 | 0 |
| | ZQ | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | ZT | 1 | 14 | 1 | 0 | 9 | 0 | 0 | 8 | 2 | 1 | 10 | 0 | 0 | 3 | 0 |
| Stock market index | DY | 3 | 34 | 18 | 0 | 38 | 26 | 3 | 40 | 18 | 3 | 30 | 10 | 3 | 43 | 25 |
| | ES | 53 | 45 | 81 | 71 | 71 | 84 | 14 | 36 | 67 | 64 | 54 | 81 | 30 | 35 | 76 |
| | EW | 8 | 42 | 18 | 20 | 35 | 56 | 28 | 49 | 58 | 6 | 46 | 13 | 6 | 43 | 41 |
| | FX | 1 | 25 | 9 | 2 | 35 | 19 | 3 | 27 | 20 | 1 | 26 | 5 | 0 | 21 | 27 |
| | MME | 18 | 25 | 28 | 8 | 24 | 33 | 12 | 29 | 26 | 26 | 34 | 23 | 8 | 25 | 51 |
| | MX | 1 | 35 | 17 | 1 | 24 | 32 | 3 | 30 | 24 | 4 | 25 | 7 | 1 | 27 | 38 |
| | NQ | 84 | 84 | 84 | 14 | 40 | 37 | 8 | 40 | 36 | 84 | 84 | 84 | 13 | 51 | 21 |
| | RTY | 9 | 26 | 27 | 4 | 28 | 27 | 14 | 34 | 48 | 5 | 39 | 24 | 3 | 29 | 22 |
| | VX | 14 | 34 | 29 | 14 | 24 | 37 | 5 | 20 | 21 | 21 | 30 | 23 | 5 | 15 | 29 |
| | X | 3 | 26 | 11 | 6 | 26 | 32 | 7 | 30 | 26 | 4 | 27 | 5 | 12 | 43 | 58 |
| | XAE | 0 | 19 | 0 | 1 | 17 | 1 | 2 | 22 | 5 | 0 | 20 | 0 | 66 | 71 | 57 |
| | XAF | 1 | 17 | 3 | 39 | 50 | 35 | 4 | 27 | 12 | 2 | 16 | 0 | 1 | 18 | 11 |
| | XAI | 2 | 23 | 2 | 2 | 24 | 4 | 1 | 18 | 5 | 0 | 15 | 0 | 2 | 17 | 6 |
| | YM | 43 | 60 | 62 | 67 | 60 | 84 | 13 | 31 | 56 | 6 | 27 | 50 | 55 | 63 | 84 |
| Six factors | HML | 9 | 23 | 0 | 28 | 37 | 12 | 26 | 50 | 9 | 22 | 38 | 1 | 23 | 45 | 42 |
| | SMB | 8 | 30 | 0 | 65 | 68 | 19 | 10 | 29 | 1 | 5 | 22 | 0 | 57 | 63 | 19 |
| | RMW | 2 | 13 | 0 | 25 | 44 | 7 | 37 | 51 | 4 | 8 | 24 | 0 | 59 | 65 | 49 |
| | CMA | 2 | 18 | 2 | 5 | 24 | 0 | 11 | 35 | 4 | 18 | 32 | 5 | 45 | 56 | 39 |
| | MOM | 6 | 30 | 2 | 29 | 46 | 11 | 40 | 54 | 23 | 22 | 37 | 5 | 63 | 72 | 39 |
| | MKT | 13 | 51 | 32 | 84 | 83 | 84 | 82 | 82 | 83 | 19 | 35 | 32 | 84 | 84 | 84 |

| | AAPL RED | AAPL ED | AAPL LASSO | BRK.B RED | BRK.B ED | BRK.B LASSO | GM RED | GM ED | GM LASSO | GOOG RED | GOOG ED | GOOG LASSO | XOM RED | XOM ED | XOM LASSO |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Non-zero frequency | 3.595 | 15.095 | 5.130 | 6.083 | 16.845 | 7.940 | 4.392 | 17.511 | 6.750 | 4.261 | 15.333 | 4.392 | 6.904 | 18.261 | 11.678 |

| Type | Symbol | Description |
|---|---|---|
| Commodity | CA | Cocoa |
| | CL | Crude Oil WTI |
| | GC | Gold |
| | HG | Copper |
| | HO | NY Harbor ULSD (Heating Oil) |
| | ML | Milling Wheat |
| | NG | Henry Hub Natural Gas |
| | OJ | Orange Juice |
| | PA | Palladium |
| | PL | Platinum |
| | RB | RBOB Gasoline |
| | RM | Robusta Coffee |
| | RS | Canola |
| | SI | Silver |
| | ZC | Corn |
| | ZL | Soybean Oil |
| | ZM | Soybean Meal |
| | ZO | Oats |
| | ZR | Rough Rice |
| | ZW | Wheat |
| Currency | A6 | Australian Dollar |
| | AD | Canadian Dollar |
| | B6 | British Pound |
| | BR | Brazilian Real |
| | DX | US Dollar Index |
| | E1 | Swiss Franc |
| | E6 | Euro FX |
| | J1 | Japanese Yen |
| | RP | Euro/British Pound |
| | RU | Russian Ruble |
| Interest rate | BTP | Euro BTP Long-Bond |
| | ED | Eurodollar |
| | G | 10-Year Long Gilt |
| | GG | Euro Bund |
| | HR | Euro Bobl |
| | US | 30-Year US Treasury Bond |
| | ZF | 5-Year US Treasury Note |
| | ZN | 10-Year US Treasury Note |
| | ZQ | 30-Day Fed Funds |
| | ZT | 2-Year US Treasury Note |
| Stock market index | DY | DAX |
| | ES | E-mini S&P 500 |
| | EW | E-mini S&P 500 Midcap |
| | FX | Euro Stoxx 50 |
| | MME | MSCI Emerging Markets Index |
| | MX | CAC 40 |
| | NQ | E-mini Nasdaq 100 |
| | RTY | E-mini Russell 2000 |
| | VX | VIX |
| | X | FTSE 100 |
| | XAE | E-mini Energy Select Sector |
| | XAF | E-mini Financial Select Sector |
| | XAI | E-mini Industrial Select Sector |
| | YM | E-mini Dow |

Taxonomy
Topics: Stochastic processes and financial applications · Risk and Portfolio Optimization · Monetary Policy and Economic Impact
Robust High-Dimensional Time-Varying Coefficient Estimation
Minseok Shin and Donggyu Kim*
*Corresponding author. Address: College of Business, KAIST, Seoul 02455, South Korea. E-mail: [email protected].
Korea Advanced Institute of Science and Technology (KAIST)
Abstract
In this paper, we develop a novel high-dimensional coefficient estimation procedure based on high-frequency data. Unlike usual high-dimensional regression procedures such as LASSO, we additionally handle the heavy-tailedness of high-frequency observations as well as time variations of coefficient processes. Specifically, we employ the Huber loss and a truncation scheme to handle heavy-tailed observations, while $\ell_1$-regularization is adopted to overcome the curse of dimensionality. To account for the time-varying coefficient, we estimate local coefficients, which are biased due to the $\ell_1$-regularization. Thus, when estimating integrated coefficients, we propose a debiasing scheme to enjoy the law of large numbers property and employ a thresholding scheme to further accommodate the sparsity of the coefficients. We call this the Robust thrEsholding Debiased LASSO (RED-LASSO) estimator. We show that the RED-LASSO estimator can achieve a near-optimal convergence rate. In the empirical study, we apply the RED-LASSO procedure to the high-dimensional integrated coefficient estimation using high-frequency trading data.
Keywords: Debias, diffusion process, LASSO, factor model, sparsity, Huber loss, heavy-tail.
1 Introduction
With the wide availability of high-frequency financial data, researchers have developed financial models that can incorporate high-frequency data, and empirical studies have shown that these models better account for market dynamics. For example, auto-regressive-type models have been introduced based on high-frequency-based measures, such as realized volatility and realized beta estimators (Andersen et al., 2006; Corsi, 2009; Engle and Gallo, 2006; Hansen et al., 2012; Kim and Wang, 2016; Kim and Fan, 2019; Shephard and Sheppard, 2010; Song et al., 2021). Empirical studies have demonstrated that capturing the auto-regressive structures of high-frequency measures helps explain financial market dynamics. On the other hand, we often employ the realized volatility estimators when analyzing regression models, such as the Capital Asset Pricing Model (CAPM) (Lintner, 1965; Sharpe, 1964) and multi-factor models (Fama and French, 1992). For example, market beta can be estimated by a ratio of the realized covariance between assets and systematic factors to the realized variance of the systematic factors (Barndorff-Nielsen and Shephard, 2004). See Andersen et al. (2006), Mykland and Zhang (2009), and Reiß et al. (2015) for the related literature. Li et al. (2017) derived the asymptotic efficiency bound for betas in a linear continuous-time regression model. Furthermore, to handle the time-varying feature of the beta process (Ferson and Harvey, 1999; Kalnina, 2022; Reiß et al., 2015), Aït-Sahalia et al. (2020) employed time-localized regressions for the multi-factor models. Chen (2018) introduced the general nonparametric inference for nonlinear volatility functionals of general multivariate Itô semimartingales. These models and estimation methods have shown that incorporating high-frequency data helps better account for the beta dynamics in the finite-dimensional set-up.
In modern financial studies and practices, researchers have found a large number of factor candidates (Bali et al., 2011; Campbell et al., 2008; Cochrane, 2011; Harvey et al., 2016; Hou et al., 2020; McLean and Pontiff, 2016). Thus, we often encounter the curse of dimensionality, and the beta estimation methods designed for the finite dimension are neither efficient nor effective. To handle the high-dimensionality, we often employ LASSO (Tibshirani, 1996), SCAD (Fan and Li, 2001), and the Dantzig selector (Candes and Tao, 2007) under the sparsity condition of model parameters. However, direct application of these methods cannot handle the time-varying feature of beta processes. Recently, Kim and Shin (2022) developed a Thresholded dEbiased Dantzig (TED) estimator that can handle the high-dimensionality and time variation of beta processes. Specifically, they employed the Dantzig selector (Candes and Tao, 2007) for each time window and estimated the integrated beta with the debiasing and truncation schemes. They established the asymptotic properties of the TED estimator under the sub-Gaussianity assumption on the high-frequency log-return data. However, we often observe that high-frequency financial data exhibit heavy tails (Cont, 2001; Fan and Kim, 2018; Mao and Zhang, 2018; Shin et al., 2023). Under the heavy-tailedness assumption, the existing estimation methods, including the TED estimator (Kim and Shin, 2022), cannot consistently estimate the time-varying betas. These facts create a demand for methodologies that can simultaneously handle heavy-tailed observations, the curse of dimensionality, and time-varying beta processes.
In this paper, we develop a robust integrated beta estimator based on high-dimensional regression jump-diffusion processes. To handle the high-dimensionality and time-varying beta, we assume that the beta processes are sparse and follow a continuous diffusion process. To account for the heavy-tailedness of financial data, we assume that the residual process and jump size processes satisfy the finite th moment condition for . That is, we assume that the sources of the heavy-tailedness are the residual process and jumps. We first estimate the instantaneous betas as follows. We employ the $\ell_1$-penalty, Huber loss, and truncation method to manage the curse of dimensionality, heavy-tailedness of the residual process, and jumps, respectively. We show that the proposed instantaneous beta estimator has the desirable convergence rate. However, the instantaneous beta estimator has non-negligible biases coming from the Huber loss and $\ell_1$-penalty. Thus, to estimate the integrated beta using the instantaneous beta estimators, we need to mitigate the biases. Since the biases are heavy-tailed, the existing debiasing scheme cannot efficiently adjust the biases. To tackle this problem, we propose a novel debiasing scheme and obtain an integrated beta estimator. We show that the debiased integrated beta estimator has a near-optimal convergence rate and outperforms the simple integration of the instantaneous beta estimators without a debiasing scheme. However, due to the bias adjustment, the debiased integrated beta estimator is not sparse; thus, we further regularize it to accommodate the sparsity. We call this the Robust thrEsholding Debiased LASSO (RED-LASSO) estimator. We also show that the RED-LASSO estimator has a near-optimal convergence rate.
The rest of the paper is organized as follows. Section 2 introduces the high-dimensional regression jump-diffusion process. Section 3 proposes the RED-LASSO estimator and establishes its asymptotic properties. In Section 4, we conduct a simulation study to check the finite sample performance of the proposed estimation method. In Section 5, we apply the proposed estimation procedure to high-frequency financial data. The conclusion is presented in Section 6, and all of the proofs are collected in the Appendix.
2 The model set-up
We first fix some notations. For any given by matrix , let
[TABLE]
The Frobenius norm of is denoted by and the matrix spectral norm is the square root of the largest eigenvalue of . We will use ’s to denote generic constants whose values are free of and and may change from appearance to appearance.
Let and be the dependent process and -dimensional multivariate covariate process, respectively. We employ the following non-parametric time-series regression jump-diffusion model:
[TABLE]
where and are the continuous parts of and , respectively, is the jump part of , is a jump size, is a Poisson process with a bounded intensity process, is a coefficient process, and is a residual process. We note that the subscript represents the continuous part of the process. The covariate process and residual process satisfy
[TABLE]
where is the jump part of , is a jump size process, is a -dimensional Poisson process with bounded intensity processes, is a by matrix, and and are -dimensional and one-dimensional independent Brownian motions, respectively. The stochastic processes , , , and are defined on a filtered probability space with filtration satisfying the usual conditions, such as adapted and càdlàg process. We assume that the coefficient satisfies the following diffusion model:
[TABLE]
where is a by matrix, is a -dimensional independent Brownian motion, and and are predictable. The main interest of this paper is to investigate the latent regression diffusion process. In this point of view, the jump part can be considered as noises, and we discuss how to overcome this in the following section. The parameter of interest is the integrated beta:
[TABLE]
The integrated beta can be considered as the average of spot betas. That is, the integrated beta represents the average effect of the increment of the covariate process. When the beta process is constant, the integrated beta is the same as the usual beta in the regression model.
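To make the averaging interpretation concrete, the following minimal sketch approximates the integrated beta of a single coefficient by a midpoint Riemann sum of a spot beta path. It assumes that the time horizon is normalized to $[0,1]$; the function name `integrated_beta` and the two example paths are hypothetical illustrations and are not taken from the paper.

```python
def integrated_beta(spot_beta, n_grid=1000):
    """Approximate the integral of spot_beta(t) over [0, 1] by the midpoint rule."""
    h = 1.0 / n_grid
    return sum(spot_beta((i + 0.5) * h) for i in range(n_grid)) * h


# Constant beta: the integrated beta coincides with the usual regression beta.
constant = integrated_beta(lambda t: 1.3)

# Time-varying beta (illustrative linear path): the integrated beta is its time average.
varying = integrated_beta(lambda t: 1.0 + 0.5 * t)
```

In this toy example the constant path returns the constant itself, while the linear path returns its average level over the horizon.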
In the regression-based financial models, there are hundreds of potential factor candidates (Bali et al., 2011; Campbell et al., 2008; Cochrane, 2011; Harvey et al., 2016; Hou et al., 2020; McLean and Pontiff, 2016). To account for this, we allow the dimension to be large; thus, we need to handle the curse of dimensionality. To do this, we assume that the coefficient beta process satisfies the following sparsity condition:
[TABLE]
where , is diverging slowly in , and is defined as 0. This general sparsity condition includes the exact sparsity condition, i.e., . We note that the exact sparsity condition implies that only several factors are significant, while most factors do not affect the dependent process. Thus, intuitively, we assume that the relatively small number of factors are significant. We note that since the beta process is an Itô diffusion process, in general, the boundedness in the sparsity condition (2.5) is satisfied with high probability. However, for simplicity, we assume the almost sure boundedness.
3 Robust high-dimensional high-frequency regression
3.1 Integrated beta estimation procedure
In this section, we propose a robust integrated beta estimation procedure for the high-dimensional regression diffusion model defined in (2.1)–(2.3). Recently, under the sub-Gaussian assumption, Kim and Shin (2022) proposed an integrated beta estimator that can handle the curse of dimensionality and time-varying beta. However, empirical studies have demonstrated that the stock log-return data often exhibit heavy tails (Cont, 2001; Fan and Kim, 2018; Mao and Zhang, 2018; Shin et al., 2023). To account for this, we impose the finite moment condition for the residual process, , and jump sizes, and (see Assumption 1). Then, we propose a robust estimation procedure. We first estimate the instantaneous betas. To do this, we employ the local regression as follows. For any process and , let for . Define
[TABLE]
where is the number of observations for each local regression, is an indicator function, and , , are the threshold levels. We use for some large constants , . In the numerical study, we choose
[TABLE]
where the bipower variation . This choice of is similar to the usual choice in the literature (Aït-Sahalia et al., 2020; Aït-Sahalia and Xiu, 2019). We note that the thresholding can detect the jumps in the covariate process and mitigate their impact on beta estimators. On the other hand, the thresholding is not used for the dependent process since the robustification method outlined in (3.3) and (3.5) can handle both heavy-tailedness of the residual process and jumps in the dependent process . Meanwhile, when calculating local regressions, we need to handle the curse of dimensionality and heavy-tailedness. To overcome high-dimensionality, we often employ the penalized regression procedure under the sparsity assumption. For example, we often use the LASSO (Tibshirani, 1996) and Dantzig (Candes and Tao, 2007) estimators with the sub-Gaussian conditions. However, these estimators cannot handle the heavy-tailed observations, and furthermore, they are not consistent. To tackle this issue, we use the following Huber loss (Huber, 1964):
[TABLE]
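As a point of reference, the sketch below writes out the standard form of the Huber (1964) loss and its derivative. The display above is not recoverable from this copy, so the exact normalization used in the paper may differ; the function names `huber_loss` and `huber_score` and the argument name `tau` for the robustification parameter are assumptions made for illustration.

```python
def huber_loss(x, tau):
    """Standard Huber loss: quadratic for |x| <= tau, linear beyond tau."""
    if abs(x) <= tau:
        return 0.5 * x * x
    return tau * abs(x) - 0.5 * tau * tau


def huber_score(x, tau):
    """Derivative of the Huber loss: the input clipped to [-tau, tau]."""
    return max(-tau, min(tau, x))


# A small residual is treated as in least squares, a large one only linearly.
small = huber_loss(0.5, 1.0)
large = huber_loss(3.0, 1.0)
```

Because the derivative is bounded by `tau`, a single extreme residual can move the fitted coefficients only by a bounded amount, which is the property exploited for heavy-tailed observations.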
where is the robustification parameter. We denote for any vector . The Huber loss mitigates the effect of outliers coming from the heavy-tailedness of the residual process and jump size process . Thus, by employing the truncation, Huber loss, and $\ell_1$-regularization, we can simultaneously deal with the three issues of the jumps, heavy-tailedness, and curse of dimensionality. Specifically, we propose the following instantaneous beta estimator at time :
[TABLE]
where is the regularization parameter, and the empirical loss function is
[TABLE]
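A local regression of this type can be sketched as an $\ell_1$-penalized Huber regression solved by proximal gradient descent. This is only an illustrative sketch under simplifying assumptions: it minimizes the average Huber loss plus `eta` times the $\ell_1$ norm on generic inputs, it ignores the jump truncation of the covariates and any scaling by the sampling interval, and the solver, the function name `huber_lasso`, and the simulated data are not taken from the paper.

```python
import random


def clip(x, tau):
    """Derivative of the Huber loss: x clipped to [-tau, tau]."""
    return max(-tau, min(tau, x))


def soft_threshold(x, h):
    """Proximal operator of h * |x|."""
    if x > h:
        return x - h
    if x < -h:
        return x + h
    return 0.0


def huber_lasso(X, y, tau, eta, n_iter=500):
    """Minimize (1/m) * sum_i huber_tau(y_i - x_i'b) + eta * ||b||_1 by proximal gradient."""
    m, p = len(X), len(X[0])
    # Step size 1/L, where trace(X'X)/m bounds the largest eigenvalue of X'X/m.
    lipschitz = sum(v * v for row in X for v in row) / m
    step = 1.0 / lipschitz
    beta = [0.0] * p
    for _ in range(n_iter):
        # Gradient of the smooth Huber part: -(1/m) * sum_i clip(r_i) * x_i.
        grad = [0.0] * p
        for row, yi in zip(X, y):
            r = yi - sum(b * v for b, v in zip(beta, row))
            w = clip(r, tau)
            for j in range(p):
                grad[j] -= w * row[j] / m
        # The l1 penalty is handled by coordinate-wise soft thresholding.
        beta = [soft_threshold(b - step * g, step * eta) for b, g in zip(beta, grad)]
    return beta


# Toy data: only the first covariate matters, and one response is a gross outlier.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [2.0 * row[0] + 0.1 * rng.gauss(0, 1) for row in X]
y[0] += 100.0
beta_hat = huber_lasso(X, y, tau=1.0, eta=0.05)
```

In this toy setting the first coefficient is recovered up to the shrinkage induced by the penalty, and the outlier has only a bounded influence because its residual is clipped at `tau`.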
In Theorem 1, we show that the proposed instantaneous beta estimator is consistent with appropriate and . Then, we can estimate the integrated beta using the integration of ’s. However, their integration cannot enjoy the law of large numbers property since each is biased due to the regularization term. That is, the error of their integration is dominated by the bias terms, which leads to the same convergence rate as that of . Thus, to reduce the effect of the bias and obtain a faster convergence rate, we propose a debiasing scheme as follows. First, we estimate the inverse instantaneous volatility matrix at time , , where . Specifically, we use the following constrained $\ell_1$-minimization for inverse matrix estimation (CLIME) (Cai et al., 2011):
[TABLE]
where is the tuning parameter, which will be specified in Theorem 2. With the inverse volatility matrix estimator , we usually adjust the instantaneous beta estimator as follows:
[TABLE]
This debiasing scheme performs well under the sub-Gaussian assumption (Javanmard and Montanari, 2014, 2018; Kim and Shin, 2022; Van de Geer et al., 2014). However, has only finite th moment for ; thus, the debiased instantaneous beta estimator has heavy tails. To handle this issue, we employ the Winsorization method as follows. Define the truncation (Winsorization) function
[TABLE]
where is a truncation parameter and denote for any vector . Using this truncation function, we adjust as
[TABLE]
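The role of Winsorization in the bias correction can be illustrated with the following sketch. It assumes the usual one-step form (estimate plus the inverse matrix estimate times an averaged score) with Winsorized residuals, which is one plausible reading of the adjustment and not necessarily the paper's exact display; the function names, the argument `cap` for the truncation parameter, and the toy data are hypothetical.

```python
def winsorize(x, cap):
    """Truncation function: sign(x) * min(|x|, cap)."""
    return max(-cap, min(cap, x))


def debias(beta_hat, omega_hat, X, y, cap):
    """One-step bias correction using Winsorized residuals (illustrative form only)."""
    m, p = len(X), len(beta_hat)
    score = [0.0] * p
    for row, yi in zip(X, y):
        r = winsorize(yi - sum(b * v for b, v in zip(beta_hat, row)), cap)
        for j in range(p):
            score[j] += r * row[j] / m
    return [beta_hat[j] + sum(omega_hat[j][k] * score[k] for k in range(p))
            for j in range(p)]


# One covariate, true coefficient 2, shrunken initial estimate 1.5.
X = [[1.0], [-1.0], [1.0], [-1.0]]
y_clean = [2.0, -2.0, 2.0, -2.0]
y_outlier = [102.0, -2.0, 2.0, -2.0]  # first response contaminated

clean = debias([1.5], [[1.0]], X, y_clean, cap=1.0)
robust = debias([1.5], [[1.0]], X, y_outlier, cap=1.0)
naive = debias([1.5], [[1.0]], X, y_outlier, cap=float("inf"))
```

With clean data the correction removes the shrinkage exactly. With the contaminated response, the Winsorized version stays close to the true value, whereas the version without truncation is dominated by the single outlier.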
where the truncation parameter will be specified in Theorem 2. We note that for the debiasing step, we use the non-overlapping window for and , which helps enjoy the martingale property. Specifically, since is measurable at time , we can handle the noises from and using the martingale convergence theorem. We also note that the purpose of the debiasing is to enjoy the law of large numbers property when obtaining the integrated beta estimator. Usually, the debiasing scheme is employed to obtain the asymptotic normality, which enables hypothesis testing or confidence interval construction (Javanmard and Montanari, 2014, 2018; Van de Geer et al., 2014; Zhang and Zhang, 2014). However, in this paper, we do not focus on this issue and mainly focus on the integrated beta estimation. Then, the integrated beta estimator is defined as follows:
[TABLE]
The debiased LASSO integrated beta estimator can achieve a faster convergence rate than the simple integration of the instantaneous beta estimators. However, due to the bias adjustment term, it cannot account for the sparsity structure of the integrated beta. To accommodate the sparsity, we employ the following thresholding scheme:
[TABLE]
where the thresholding function satisfies and is a thresholding level, which will be specified in Theorem 3. For example, we can employ the hard thresholding function or soft thresholding function . In the empirical study, we used the hard thresholding function . We call this the Robust thrEsholding Debiased LASSO (RED-LASSO) estimator. We describe the RED-LASSO estimation procedure in Algorithm 1.
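The two thresholding rules mentioned above can be sketched as follows, using their standard definitions. The vector `debiased` and the level `h` are made-up numbers for illustration only; in the paper the thresholding level is specified in Theorem 3.

```python
def hard_threshold(x, h):
    """Keep x if |x| >= h, otherwise set it to zero."""
    return x if abs(x) >= h else 0.0


def soft_threshold(x, h):
    """Shrink x toward zero by h: x - sign(x) * min(|x|, h)."""
    if x > h:
        return x - h
    if x < -h:
        return x + h
    return 0.0


# Hypothetical debiased integrated beta estimates for four factors.
debiased = [0.80, -0.03, 0.02, -0.45]
h = 0.05
red_hard = [hard_threshold(b, h) for b in debiased]
red_soft = [soft_threshold(b, h) for b in debiased]
```

Hard thresholding leaves the surviving entries unchanged, while soft thresholding additionally shrinks them by `h`; both set the small entries exactly to zero, which restores sparsity after the bias adjustment.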
3.2 Theoretical results
In this section, we investigate asymptotic properties of the proposed RED-LASSO estimation procedure. To investigate the theoretical properties, we make the following assumptions.
Assumption 1.
- (a) The residual process and jump size processes, and , satisfy, for some ,
[TABLE]
- (b) The processes , , , , and are almost surely entry-wise bounded, and a.s.
- (c) The processes and satisfy the following sparsity condition for :
[TABLE]
- (d) for some positive constants , , and , and as .
- (e) Define , where is the subvector obtained by stacking , is the subvector obtained by stacking , is the subvector obtained by stacking , and of . Then, there exists a positive constant such that the following inequality holds for some and , where the specific value of is given in Theorem 1:
[TABLE]
- (f) The volatility process satisfies the following condition:
[TABLE]
Remark 1.
Assumption 1(a) is the finite moment condition, which implies that the dependent process , covariate process , and residual process have heavy tails. We note that the moment condition for is satisfied when is an independent random variable and , or $\sup_{0\leq t\leq 1}\sup_{t\leq s\leq 1}\mathbb{E}\left\{|\nu(s)|^{\gamma}\,\big|\,\mathcal{F}_{t}\right\}\leq C$ a.s. The latter condition can be satisfied when consists of the bounded continuous process and independent jump process. The boundedness condition Assumption 1(b) implies the sub-Gaussianity for the continuous part of the covariate process, , and target parameter, , which are often required to investigate high-dimensional inferences. However, the boundedness condition can be relaxed to the local boundedness condition by Lemma 4.4.9 in Jacod and Protter (2011). Specifically, if the asymptotic result, such as stable convergence in law or convergence in probability, is satisfied under the boundedness condition, it is also satisfied under the local boundedness condition. On the other hand, for the continuous-time regression model, we usually assume that the smallest eigenvalue of is bounded from below, which implies that the largest eigenvalue of is bounded. From this point of view, the condition is not restrictive. Even if this condition is replaced by the sparsity condition , where , and and are the sparsity related variables, the difference in theoretical results is up to order. Assumption 1(c) is the sparsity condition for the beta process, which is required to investigate the discretization error when estimating instantaneous betas. Assumption 1(e) is the eigenvalue condition for the Hessian matrix , which is called the localized restricted eigenvalue (LRE) condition (Fan et al., 2018; Sun et al., 2020). This implies strictly positive restricted eigenvalues over a local neighborhood. We note that converges to zero for the choice of in Theorems 1–2. When the coefficient process satisfies the exact sparsity condition, i.e., , is replaced by a -cone , where . Finally, we need the continuity condition Assumption 1(f) to investigate asymptotic behaviors of the CLIME estimator. We note that this condition is obtained with high probability when follows a continuous Itô diffusion process with bounded drift and instantaneous volatility processes.
The following theorem derives the asymptotic properties of instantaneous beta estimator . Note that the subscript [math] represents the true parameters.
Theorem 1.
Under Assumption 1(a)–(e), let for some constants and . For any given positive constant , choose and $\eta=C_{\eta,a}\Big[s_{p}n^{-3/2}\sqrt{k_{n}\log p}+n^{-1}k_{n}^{-1/2}(\log p)^{3/4}\Big]$ for some large constants and . Then, we have, for large ,
[TABLE]
with probability greater than .
Remark 2.
Theorem 1 shows the and norm error bounds of the instantaneous beta estimator. We note that as increases, the statistical estimation error decreases and time variation approximation error increases. To achieve the optimality, we choose , which implies that these two errors have the same convergence rate. Then, the instantaneous beta estimator has the convergence rate of and convergence rate of with the order and sparsity level terms.
To estimate the integrated beta, we can use the integration of the instantaneous beta estimators. However, as discussed in Section 3.1, it cannot enjoy the law of large numbers property due to the heavy-tailed biases. To tackle this problem, we employ the robust debiasing method (3.5) and obtain the debiased LASSO integrated beta estimator in (3.6). The following theorem establishes the asymptotic behaviors of .
Theorem 2.
Under the assumptions in Theorem 1 and Assumption 1(f), choose for some constant . For any given positive constant , let and for some constants and . Then, we have, with probability greater than ,
[TABLE]
where .
Remark 3.
Theorem 2 shows the max norm error bound of the debiased LASSO integrated beta estimator. When the beta process satisfies the exact sparsity condition, i.e., , the debiased LASSO integrated beta estimator has the convergence rate of , while we have a slower convergence rate of without a debiasing scheme. The term is the optimal convergence rate of estimating model parameters given observations. For the order term, the usual optimal rate is in high-dimensional inferences. However, we have term since the additional term comes from bounding the time-varying processes, such as the target process . In sum, the debiased LASSO integrated beta estimator has the optimal convergence rate up to and orders.
Theorem 2 reveals that the debiased LASSO integrated beta estimator performs better than the integration of the instantaneous beta estimators. Finally, to account for the sparsity structure, we threshold the debiased LASSO integrated beta estimator and obtain the RED-LASSO estimator. Theorem 3 establishes the convergence rate of the RED-LASSO estimator.
Theorem 3.
Under the assumptions in Theorem 2, for any given positive constant , choose for some constant . Then, we have, with probability greater than ,
[TABLE]
Theorem 3 shows that the proposed RED-LASSO estimator is consistent in terms of the norm. We note that under the sub-Gaussian assumption on the log-return data, Kim and Shin (2022) proposed an integrated beta estimator that has the convergence rate of , where , and and are the sparsity-related terms for the inverse volatility matrix. Thus, the cost of handling the heavy-tailedness is at most order.
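The thresholding step that turns the debiased integrated beta estimator into the RED-LASSO estimator can be illustrated with a minimal sketch. It assumes a hard-thresholding rule, which is only one admissible choice of thresholding function; the function name `hard_threshold`, the threshold level, and the input vector are hypothetical and are not taken from the paper.

```python
import numpy as np

def hard_threshold(beta_debiased, level):
    """Zero out entries whose absolute value is below `level` (assumed hard-thresholding rule)."""
    beta = np.asarray(beta_debiased, dtype=float)
    return np.where(np.abs(beta) >= level, beta, 0.0)

# Hypothetical debiased integrated beta estimates for five factors.
beta_tilde = np.array([0.80, -0.03, 0.00, 0.12, -0.45])
beta_red = hard_threshold(beta_tilde, level=0.10)
```

Small entries, which are dominated by estimation noise after debiasing, are set exactly to zero, so the output is sparse while the large entries are left unchanged.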
3.3 Discussion on the tuning parameter selection
In this section, we discuss how to choose the tuning parameters to implement the RED-LASSO estimation procedure. We first obtain the variables , , based on the threshold level (3.1). Then, to handle the scale problem, we standardize the variables and , , to have a zero mean and unit variance. The re-scaling is employed after obtaining the RED-LASSO estimator. In the local regression stage (3.2), we select . Also, we choose
[TABLE]
where , , , , and are tuning parameters. For the simulation and empirical studies, we choose , , and that minimize the corresponding mean squared prediction error (MSPE). The results are , , and . Details can be found in Section 5. Also, we select , which minimizes the corresponding Bayesian information criterion (BIC). Finally, we choose that minimizes the following loss function:
[TABLE]
where is the -dimensional identity matrix.
4 A simulation study
To check the finite sample performance of the proposed RED-LASSO estimator, we conducted simulations. Based on the models (2.1)–(2.3), we generated the data using the heavy-tail and sub-Gaussian processes with frequency . Specifically, we employed the following time-series regression jump-diffusion model:
[TABLE]
where the jump sizes and were obtained from 0.1 times i.i.d. -distribution with degrees of freedom , and and were generated by Poisson processes with the intensities and , respectively. We chose as and for the heavy-tailed and sub-Gaussian processes, respectively. The initial values of and were set as zero, and we generated as follows:
[TABLE]
where , , are the i.i.d. -distributions with degrees of freedom , and , , were generated from the following Ornstein-Uhlenbeck process:
[TABLE]
where and is an independent Brownian motion. We note that the process is not realistic. However, to investigate the effect of the heavy-tailedness of the return process, the structure of is imposed. To generate the volatility process , we first generated the Ornstein-Uhlenbeck process as follows:
[TABLE]
where and is an independent Brownian motion. Then, we took as a Cholesky decomposition of , where . To generate the coefficient , we considered the exact sparse process, i.e., for . Specifically, we generated as follows:
[TABLE]
where , , and is a -dimensional independent Brownian motion. For , the initial value and for . The process was taken to be , where is the -dimensional identity matrix and follows the Ornstein-Uhlenbeck process:
[TABLE]
where and is an independent Brownian motion. We chose , , , and we varied from to . When implementing the RED-LASSO estimation procedure, the tuning parameters were selected as discussed in Section 3.3.
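Several components of the simulation design are driven by Ornstein-Uhlenbeck processes. The following is a minimal sketch of how such a path could be generated with an Euler-Maruyama scheme. The parametrization dX = kappa (mu - X) dt + sigma dW and all numerical values are assumptions for illustration, not the settings used in the paper.

```python
import numpy as np

def simulate_ou(n_steps, dt, kappa, mu, sigma, x0, rng):
    """Euler-Maruyama discretization of dX = kappa * (mu - X) dt + sigma dW."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    # Brownian increments over each step have variance dt.
    shocks = rng.standard_normal(n_steps) * np.sqrt(dt)
    for i in range(n_steps):
        x[i + 1] = x[i] + kappa * (mu - x[i]) * dt + sigma * shocks[i]
    return x

# Hypothetical parameters: 390 steps on the unit interval.
rng = np.random.default_rng(0)
path = simulate_ou(n_steps=390, dt=1.0 / 390, kappa=5.0, mu=0.2, sigma=0.1, x0=0.2, rng=rng)
```

The mean-reversion term pulls the path back toward `mu`, which keeps the simulated process stationary over the sampling period.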
To investigate the effect of the robustification of the RED-LASSO estimator, we employed a thrEsholding Debiased LASSO (ED-LASSO) estimator. The ED-LASSO estimator uses the same estimation procedure as the RED-LASSO estimator with . Since the ED-LASSO estimator does not employ the Huber loss and Winsorization method, the jump adjustment for the dependent process is needed. Thus, we used instead of for the ED-LASSO estimator, where
[TABLE]
In the simulation and empirical studies, we choose , where the bipower variation . We note that the ED-LASSO estimator can enjoy the same theoretical properties as the RED-LASSO estimator under the sub-Gaussian process, but it cannot explain the heavy-tailed process. As a benchmark, we also considered the LASSO estimator (Tibshirani, 1996), which accounts for neither the heavy-tailed distribution nor the time-varying beta process. Specifically, we employed the LASSO estimator as follows:
[TABLE]
where the regularization parameter was selected by minimizing the corresponding Bayesian information criterion (BIC). The average estimation errors under the max norm, norm, and norm were computed over 1000 simulations.
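The jump adjustment used for the ED-LASSO estimator relies on a bipower-variation-based truncation level. As a hedged illustration, the sketch below computes the standard bipower variation of Barndorff-Nielsen and Shephard, (pi/2) times the sum of products of adjacent absolute returns, and truncates returns that exceed a given level. The normalization and truncation rule used in the paper may differ, and the truncation level here is a hypothetical input.

```python
import numpy as np

def bipower_variation(returns):
    """Standard bipower variation: (pi / 2) * sum_i |r_i| * |r_{i-1}|."""
    r = np.abs(np.asarray(returns, dtype=float))
    return (np.pi / 2.0) * np.sum(r[1:] * r[:-1])

def truncate_jumps(returns, level):
    """Set returns whose absolute value exceeds `level` to zero (assumed truncation rule)."""
    r = np.asarray(returns, dtype=float)
    return np.where(np.abs(r) <= level, r, 0.0)

# Hypothetical intraday log-returns with one jump-like observation.
returns = np.array([0.01, -0.02, 0.01, 0.5, -0.01])
bv = bipower_variation(returns)
adjusted = truncate_jumps(returns, level=0.1)  # level chosen by hand for illustration
```

Because each term multiplies two adjacent returns, a single jump contributes much less to the bipower variation than to the realized variance, which is why it is a common scale estimate for setting truncation levels.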
Figure 1 plots the log max, , and norm errors of the RED-LASSO, ED-LASSO, and LASSO estimators with for the heavy-tail and sub-Gaussian processes. From Figure 1, we can find that the estimation errors of the RED-LASSO estimator decrease as the sample size increases. As expected, the RED-LASSO estimator performed the best for the heavy-tail process. This may be because the RED-LASSO estimator can explain the heavy-tailedness while other estimators cannot. For the sub-Gaussian process, the RED-LASSO and ED-LASSO estimators showed better performance than the LASSO estimator. This is because the LASSO estimator cannot account for the time variation of the beta process. We note that, even for the sub-Gaussian process, the RED-LASSO estimator showed better performance than the ED-LASSO estimator. One possible explanation for this is that the true return process can have some extreme values over time even if the sub-Gaussian random variables are used. From this result, we can conjecture that the RED-LASSO estimator is robust to the heavy-tailedness of the log-return process.
5 An empirical study
In this section, we apply the proposed RED-LASSO estimator to high-frequency trading data from January 2013 to December 2019. We took stock price data, futures price data, and firm fundamentals from the End of Day website, FirstRate Data website, and Center for Research in Security Prices (CRSP)/Compustat Merged Database, respectively. We obtained 5-min log-price data with the previous tick scheme (Wang and Zou, 2010; Zhang, 2011) and processed the data similarly to the procedure in Kim and Shin (2022). The days with half trading hours were not included. For the dependent process, we collected the log-price data of the following five assets: Apple Inc. (AAPL), Berkshire Hathaway Inc. (BRK.B), General Motors Company (GM), Alphabet Inc. (GOOG), and Exxon Mobil Corporation (XOM). These firms have the top market values in their global industry classification standard (GICS) sectors. For the covariate process, we first obtained the log-prices of 54 futures, which are often used as the market macro variables. Specifically, we selected 20 commodity futures, 10 currency futures, 10 interest rate futures, and 14 stock market index futures. The specific list is presented in Table 4 in the Appendix. Then, we constructed the Fama-French five factors (Fama and French, 2015) and the momentum factor (Carhart, 1997), which are widely used in stock market analysis, with the assets listed on NYSE, NASDAQ, and AMEX. We note that MKT, HML, SMB, RMW, CMA, and MOM represent the market, value, size, profitability, investment, and momentum factors, respectively. First, we calculated MKT as the return of a value-weighted portfolio of all assets. Then, we obtained the other factors as follows:
[TABLE]
where small (S) and big (B) portfolios represent the small and big market equities, respectively, while we classified high (H), medium (M), and low (L) portfolios according to their ratio of book equity to market equity. On the other hand, robust (R), neutral (N), and weak (W) portfolios were classified by their profitability, while we obtained conservative (C), neutral (N), and aggressive (A) portfolios using their investment data. Also, up (U), flat (F), and down (D) portfolios were classified by their momentum of the return. The portfolio constituents were updated monthly, and, with 5-min frequency, we obtained the portfolio return as follows:
[TABLE]
where is the portfolio return for the th day and th time interval, is the number of portfolio components on the th day, the superscript is used to represent the th stock of the portfolio, and is calculated by
[TABLE]
where is the market capitalization of the th stock at the market close time on the day , and represents the overnight return from the day to day . To sum up, the five assets and 60 factors were used for the dependent and covariate processes, respectively. The details of the data processing can be found in Aït-Sahalia et al. (2020) and Kim and Shin (2022).
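The value-weighted portfolio return described above can be sketched as follows. This is an illustration under stated assumptions rather than the exact weighting formula of the paper: weights are taken proportional to the previous close market capitalization grown by a simple overnight return, and they are held fixed within the day. The function name and all inputs are hypothetical.

```python
import numpy as np

def value_weighted_returns(intraday_returns, prev_close_caps, overnight_returns):
    """Portfolio return per interval from stock returns of shape (n_intervals, n_stocks)."""
    r = np.asarray(intraday_returns, dtype=float)
    # Assumed opening capitalization: previous close cap times (1 + overnight return).
    caps = np.asarray(prev_close_caps, dtype=float) * (1.0 + np.asarray(overnight_returns, dtype=float))
    weights = caps / caps.sum()
    return r @ weights

# Two hypothetical stocks observed over two 5-min intervals.
vw = value_weighted_returns(
    intraday_returns=[[0.01, 0.02], [-0.01, 0.0]],
    prev_close_caps=[100.0, 300.0],
    overnight_returns=[0.0, 0.0],
)
```

With these inputs the weights are 0.25 and 0.75, so the larger stock dominates the portfolio return in each interval.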
To determine the tuning parameters , , and , we employed the mean squared prediction error (MSPE) with the data in . For the choice of , we defined
[TABLE]
where is the instantaneous beta estimator at time with the tuning parameter for the th month in and th stock. Then, we selected by minimizing over . Based on the selected , we defined
[TABLE]
where is the debiased instantaneous beta estimator at time with the tuning parameter for the th month in and th stock. We chose which minimizes over . Finally, with the selected and , we defined
[TABLE]
where is the RED-LASSO estimator with the tuning parameter and is the debiased integrated beta estimator for the th month in and th stock. Then, we selected by minimizing over . The results are , , and . We note that the stationarity assumption for the beta process is reasonable, which motivates and justifies the above tuning parameter selection procedure. Then, using the RED-LASSO, ED-LASSO, and LASSO estimation procedures, we obtained the monthly integrated betas for each of the five assets. The tuning parameters were selected based on Section 3.3 and Section 4. For the non-trading period, we set the beta estimates as zero.
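Each step of the tuning procedure above selects one parameter by minimizing a mean squared prediction error over a candidate set, holding the previously selected parameters fixed. A minimal sketch of one such step is given below. The helper `select_by_mspe`, the toy predictor, and the candidate grid are hypothetical stand-ins for the estimators and grids used in the paper.

```python
import numpy as np

def select_by_mspe(candidates, predict, y_true):
    """Return the candidate minimizing the mean squared prediction error, plus all scores."""
    y = np.asarray(y_true, dtype=float)
    scores = {c: float(np.mean((y - predict(c)) ** 2)) for c in candidates}
    best = min(scores, key=scores.get)
    return best, scores

# Toy example: the prediction scales a fixed signal by the candidate value.
signal = np.array([1.0, 2.0, 3.0])
y_true = signal.copy()
best, scores = select_by_mspe([0.5, 1.0, 1.5], lambda c: c * signal, y_true)
```

In a sequential scheme, the selected value would then be fixed inside the `predict` function used for the next tuning parameter.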
We first compare the performances of the RED-LASSO, ED-LASSO, and LASSO estimators. To do this, we calculated the monthly in-sample and out-of-sample with the monthly integrated beta estimates. The out-of-sample was calculated using the integrated betas from the previous month, and it was obtained excluding the year 2013 since the tuning parameters were chosen based on the data in 2013. For each year, we calculated the average across the five assets and twelve months. Table 1 shows the average in-sample and out-of-sample of the RED-LASSO, ED-LASSO, and LASSO estimators. As seen in Table 1, the RED-LASSO estimator shows the best performance for all periods. This may be because only the RED-LASSO estimator can handle both the heavy-tailed distribution of the return process and time-varying property of the beta process.
Table 2 shows the non-zero frequency of the RED-LASSO, ED-LASSO, and LASSO estimators for the five assets and 60 factors over 84 months. Table 3 shows the monthly average of non-zero frequency over factors and time for the RED-LASSO, ED-LASSO, and LASSO estimators for the five assets. As seen in Tables 2 and 3, the RED-LASSO estimator can better account for the sparsity of the integrated betas than the ED-LASSO and LASSO estimators. From this result, we can conjecture that the proposed RED-LASSO provides sparser beta estimates, which is an important property in practice. Furthermore, as discussed above, the RED-LASSO estimator shows the best performance in terms of in Table 1. That is, the RED-LASSO estimator can explain the market dynamics well with a simpler model. We note that for the RED-LASSO estimates, the stock market index futures factors had non-zero integrated betas more often than the other futures factors. This result is consistent with the multi-factor models (Asness et al., 2013; Carhart, 1997; Fama and French, 1992, 2015) since the market factors can be partially explained by the stock market index futures factors.
Now, we investigate the result of the RED-LASSO estimator. Figure 2 shows the monthly integrated betas from the RED-LASSO estimator for the five assets and 60 factors. Figure 3 depicts the non-zero frequency of the RED-LASSO estimator for the five groups, consisting of the commodity futures group, currency futures group, interest rate futures group, stock market index futures group, and market factor group. From Figures 2 and 3, we see that integrated betas change over time, and only a small number of factors had non-zero integrated betas in most periods. To investigate time-series of the significant betas, we plotted the integrated beta estimates for the three factors that most frequently had non-zero integrated betas in Figure 4. The AAPL has NQ (E-mini Nasdaq 100), ES (E-mini S&P 500), and YM (E-mini Dow); BRK.B has MKT, ES, and YM; GM has MKT, MOM, and RMW; GOOG has NQ, ES, and MME (MSCI Emerging Markets Index); and XOM has MKT, XAE (E-mini Energy Select Sector), and MOM. In sum, either the NQ factor or MKT factor most frequently had non-zero integrated betas, while the other factors had non-zero integrated betas only for some time periods.
When building regression-based financial models, we often employ the six factors, MKT, HML, SMB, RMW, CMA, and MOM (Asness et al., 2013; Carhart, 1997; Fama and French, 2015, 2016). To investigate their beta behaviors in more detail, we plotted the integrated betas with the RED-LASSO and ED-LASSO estimators for these six factors in Figure 5. As expected, the MKT factor played a significant role for BRK.B, GM, and XOM; however, the six factors had zero integrated betas in most periods for AAPL and GOOG. This may be because technology companies, such as AAPL and GOOG, have recently shown outstanding performance in the U.S. market; thus, the NQ (E-mini Nasdaq-100) factor can explain their movements well, as shown in Figure 4. We note that the results of the two estimators are similar, but the RED-LASSO estimator gives more stable results. Thus, we can conjecture that considering both the heavy-tailed distribution and the time variation of the beta process helps better explain the beta dynamics.
6 Conclusion
In this paper, we developed a novel RED-LASSO estimation procedure that can handle the heavy-tailedness of financial data and account for the time variation and sparsity of the high-dimensional beta process. To estimate the instantaneous beta, we proposed a robust estimator that employs the Huber loss, truncation method, and -penalty. We demonstrated that the proposed instantaneous beta estimator can handle the heavy-tailedness and the curse of dimensionality with a desirable convergence rate. To handle the heavy-tailed bias coming from the Huber loss and -penalty, we developed a robust debiasing scheme and proposed an integrated beta estimator. We showed that the proposed debiasing method sufficiently mitigates the effect of the bias, and the integrated beta estimator can enjoy the law of large numbers property. Then, the debiased integrated beta estimator is further regularized to account for the sparsity of the integrated beta. We demonstrated that the proposed RED-LASSO estimator can achieve the near-optimal convergence rate.
In the empirical study, the RED-LASSO estimation procedure shows the best performance in terms of and the sparsity of the beta estimates. This suggests that when estimating integrated beta in the high-dimensional high-frequency set-up, the RED-LASSO estimation method helps account for the features of the time-varying beta process and heavy-tailed distributions of observed log-returns. On the other hand, we did not consider microstructure noise. Microstructure noise could be another source of heavy tails, and accommodating it would allow applications to higher-frequency observations. However, if we impose the microstructure noise structure on the regression diffusion model, we have an unbalanced order relationship between the noise and regression variables, which ruins the usual regression structure. Hence, it is difficult to apply the existing estimation methods. It would be interesting and important to develop a robust estimation method that can handle microstructure noise. We leave this issue for a future study.
Funding
This work was supported by the National Research Foundation of Korea [2021R1C1C1003216].
Appendix A Appendix
A.1 Proof of Theorem 1
Without loss of generality, it is enough to show the statement for fixed . For simplicity, we denote by .
Proposition 1.
Under the assumptions in Theorem 1, we have
[TABLE]
with probability greater than for any given positive constant .
Proof of Proposition 1. Define
[TABLE]
We have
[TABLE]
Thus, for , we have
[TABLE]
where
[TABLE]
[TABLE]
First, we consider . By the boundedness condition Assumption 1(b), we can show, with probability at least ,
[TABLE]
for some positive constant . Then, we have
[TABLE]
For , by (A.3), we have
[TABLE]
For , let . Then, we have
[TABLE]
Consider the first term. Similar to the proof of Theorem 1 in Kim and Shin (2022), we can show, for any constant ,
[TABLE]
Then, by the Cauchy–Schwarz inequality, we have, with probability at least ,
[TABLE]
Also, for , we have
[TABLE]
Thus, we have, with probability at least ,
[TABLE]
Consider the second term. By (A.10) and Hölder’s inequality, we have, with probability at least ,
[TABLE]
Then, using the fact that
[TABLE]
we have, with probability at least ,
[TABLE]
where the second inequality is due to Hölder’s inequality and the last inequality is from (A.10) and (A.14). By (A.12) and (A.15), we have
[TABLE]
For , by (2.1) in Freedman (1975), we have
[TABLE]
Also, by (A.10) and (A.14), we have, with probability at least ,
[TABLE]
where the second inequality is due to Hölder’s inequality. Thus, we have
[TABLE]
By (A.7), (A.9), (A.19), and (A.22), we have, with probability at least ,
[TABLE]
Consider . For some large constant , define
[TABLE]
By (A.3), we have
[TABLE]
By the boundedness of the intensity process, we have
[TABLE]
Under the event , we have, for large ,
[TABLE]
Thus, we have
[TABLE]
We note that, for any ,
[TABLE]
Hence, under the event , we have
[TABLE]
which implies
[TABLE]
Combining (A.2), (A.23), and (A.25), we have, with probability greater than ,
[TABLE]
Proof of Theorem 1. By Proposition 1, it is enough to show the statement under (A.1). First, we investigate . Since
[TABLE]
we have
[TABLE]
Then, we have
[TABLE]
Thus, we have
[TABLE]
where is defined in Assumption 1(e).
Now, we investigate and . By (2.5), we have
[TABLE]
Thus, by (A.27)–(A.28), we have
[TABLE]
where the second inequality is due to the Cauchy–Schwarz inequality. Suppose that
[TABLE]
Then, we have
[TABLE]
From the optimality of and the integral form of the Taylor expansion, we have
[TABLE]
For the first and second terms, we have
[TABLE]
where the second inequality is due to (A.32). For the last term, let
[TABLE]
Then, for any , we have
[TABLE]
where the last inequality is due to (A.32). Thus, by Assumption 1(e), we have
[TABLE]
Combining (A.33)–(A.39), we have
[TABLE]
which implies
[TABLE]
This contradicts (A.31); thus, we obtain the norm error bound. Then, by (A.29), we can show the norm error bound.
A.2 Proof of Theorem 2
Proof of Theorem 2. We first investigate and . By (2.5), (3.7), and Assumption 1(c), we can show, with probability at least ,
[TABLE]
For , similar to the proof of Theorem 1 in Kim and Shin (2022), we can show, with probability at least ,
[TABLE]
Thus, we have, with probability at least ,
[TABLE]
Consider . For each , there exists standard Brownian motion such that
[TABLE]
Then, by the proof of Theorem 1 in Kim and Shin (2022), we have
[TABLE]
where
[TABLE]
Note that
[TABLE]
Hence, similar to the proofs of (A.25), we can show
[TABLE]
where
[TABLE]
Let
[TABLE]
where
[TABLE]
Then, we have
[TABLE]
Consider . By the boundedness of the intensity, we can show . Thus, we have
[TABLE]
For , by (A.45)–(A.46), we have, with probability at least ,
[TABLE]
Consider . Similar to the proof of (A.20) in Kim and Shin (2022), we can show, with probability at least ,
[TABLE]
Consider . By Assumption 1(b) and (f), we can show, with probability at least ,
[TABLE]
Thus, by Assumption 1(f), we can show, with probability at least ,
[TABLE]
Then, by (A.24), (A.42), and (A.45), we have, with probability at least ,
[TABLE]
For , let . We have
[TABLE]
For the first term, by the boundedness of the intensity process and (A.45), we can show, with probability at least ,
[TABLE]
Thus, from (A.42), we have, with probability at least ,
[TABLE]
Then, by (2.1) in Freedman (1975), we have, for ,
[TABLE]
which implies
[TABLE]
For the second term, we have, with probability at least ,
[TABLE]
Thus, we have
[TABLE]
Consider . By the sub-Gaussianity of the beta process, we can show, with probability at least ,
[TABLE]
For , by Assumption 1(b), we have
[TABLE]
Combining (A.53)–(A.69), we have, with probability greater than ,
[TABLE]
A.3 Proof of Theorem 3
Proof of Theorem 3. By (3.8), there exists such that, with probability greater than ,
[TABLE]
Thus, it is enough to show the statement under the event . Similar to the proof of Theorem 1 in Kim and Shin (2022), we can show
[TABLE]
References
- Aït-Sahalia, Y., Kalnina, I., and Xiu, D. (2020). High-frequency factor models and regressions. Journal of Econometrics, 216(1):86–105.
- Aït-Sahalia, Y. and Xiu, D. (2019). Principal component analysis of high-frequency data. Journal of the American Statistical Association, 114(525):287–303.
- Andersen, T. G., Bollerslev, T., Diebold, F. X., and Wu, G. (2006). Realized beta: Persistence and predictability. Emerald Group Publishing Limited.
- Asness, C. S., Moskowitz, T. J., and Pedersen, L. H. (2013). Value and momentum everywhere. The Journal of Finance, 68(3):929–985.
- Bali, T. G., Cakici, N., and Whitelaw, R. F. (2011). Maxing out: Stocks as lotteries and the cross-section of expected returns. Journal of Financial Economics, 99(2):427–446.
- Barndorff-Nielsen, O. E. and Shephard, N. (2004). Econometric analysis of realized covariation: High frequency based covariance, regression, and correlation in financial economics. Econometrica, 72(3):885–925.
- Cai, T., Liu, W., and Luo, X. (2011). A constrained $\ell_{1}$ minimization approach to sparse precision matrix estimation. Journal of the American Statistical Association, 106(494):594–607.
- Campbell, J. Y., Hilscher, J., and Szilagyi, J. (2008). In search of distress risk. The Journal of Finance, 63(6):2899–2939.
