Volatility Analysis with Realized GARCH-Itô Models
Xinyu Song, Donggyu Kim, Huiling Yuan, Xiangyu Cui, Zhiping Lu, Yong Zhou, Yazhen Wang

TL;DR
This paper develops a unified realized GARCH-Ito model for high-frequency financial data, capturing both continuous and jump components, and proposes estimation methods validated through simulations and empirical analysis.
Contribution
It introduces the realized GARCH-Ito model embedding discrete realized GARCH in continuous volatility, with new estimation techniques and empirical validation.
Findings
Model effectively captures volatility dynamics with jumps.
Proposed estimation methods show good finite sample performance.
Empirical application demonstrates practical usefulness.
Table 1: MSEs for the jump parameters.

| MSE | | | |
|---|---|---|---|
| 457.691 | 329.315 | 180.923 | 1.868 |
| 456.705 | 327.808 | 177.606 | 1.609 |
| 453.558 | 327.112 | 176.499 | 1.480 |
| 450.895 | 325.991 | 175.006 | 1.227 |
Table 2: MSEs for the QMLE-HL (four parameter columns) and the QMLE-HLO (seven parameter columns).

| n | m | QMLE-HL | | | | QMLE-HLO | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 125 | 390 | 22.514 | 83.956 | 401.751 | 80.986 | 6.349 | 70.967 | 220.209 | 73.972 | 13.829 | 7.301 | 2.707 |
| | 780 | 13.874 | 59.539 | 263.366 | 64.944 | 1.946 | 52.000 | 64.073 | 55.349 | 5.973 | 5.622 | 1.456 |
| | 2340 | 12.759 | 27.847 | 154.776 | 32.416 | 1.549 | 21.814 | 55.391 | 27.515 | 3.915 | 5.341 | 0.896 |
| | 23400 | 11.414 | 9.172 | 100.052 | 12.784 | 1.430 | 2.500 | 36.110 | 2.417 | 1.801 | 2.574 | 0.057 |
| 250 | 390 | 9.197 | 76.620 | 266.625 | 75.162 | 3.862 | 69.965 | 169.024 | 65.612 | 11.865 | 3.784 | 2.018 |
| | 780 | 4.645 | 50.045 | 146.061 | 58.639 | 1.106 | 46.422 | 34.844 | 50.483 | 4.224 | 3.189 | 1.426 |
| | 2340 | 3.604 | 20.791 | 73.631 | 25.116 | 0.850 | 19.384 | 27.946 | 20.633 | 2.154 | 2.947 | 0.723 |
| | 23400 | 3.089 | 4.571 | 47.478 | 5.838 | 0.762 | 1.356 | 16.774 | 1.209 | 1.135 | 1.557 | 0.029 |
| 500 | 390 | 4.552 | 71.620 | 187.886 | 69.883 | 2.633 | 65.360 | 140.817 | 60.363 | 10.524 | 2.300 | 1.895 |
| | 780 | 1.767 | 46.471 | 71.798 | 53.530 | 0.561 | 42.864 | 18.012 | 45.275 | 2.939 | 1.983 | 1.357 |
| | 2340 | 1.232 | 17.835 | 42.019 | 18.183 | 0.421 | 13.502 | 16.107 | 15.762 | 1.288 | 1.873 | 0.597 |
| | 23400 | 1.108 | 2.127 | 24.276 | 2.645 | 0.390 | 0.718 | 8.779 | 0.609 | 0.611 | 0.841 | 0.014 |
| 1000 | 390 | 2.544 | 69.202 | 139.960 | 60.467 | 1.808 | 60.128 | 126.474 | 52.889 | 9.530 | 1.694 | 1.646 |
| | 780 | 0.706 | 44.901 | 34.083 | 44.476 | 0.293 | 38.988 | 8.461 | 36.868 | 1.942 | 1.569 | 1.174 |
| | 2340 | 0.522 | 16.317 | 23.354 | 13.971 | 0.271 | 10.613 | 7.610 | 8.862 | 0.855 | 1.222 | 0.500 |
| | 23400 | 0.454 | 1.087 | 13.779 | 1.301 | 0.247 | 0.366 | 4.518 | 0.306 | 0.325 | 0.436 | 0.007 |
Table 3: MSPEs in the simulation study.

| n | m | Realized GARCH-Itô (QMLE-HL) | Realized GARCH-Itô (QMLE-HLO) | Unified GARCH-Itô (QMLE-HL) | Jump-adjusted MSRV |
|---|---|---|---|---|---|
| 125 | 390 | 4.017 | 3.303 | 7.560 | 7.869 |
| | 780 | 2.119 | 1.839 | 7.570 | 5.287 |
| | 2340 | 1.296 | 1.141 | 8.284 | 3.229 |
| | 23400 | 0.578 | 0.459 | 8.806 | 1.205 |
| 250 | 390 | 3.819 | 3.240 | 7.957 | 7.959 |
| | 780 | 1.990 | 1.715 | 8.088 | 5.346 |
| | 2340 | 1.206 | 1.035 | 9.182 | 3.284 |
| | 23400 | 0.500 | 0.438 | 9.593 | 1.231 |
| 500 | 390 | 3.657 | 3.101 | 8.127 | 8.004 |
| | 780 | 1.860 | 1.657 | 8.138 | 5.478 |
| | 2340 | 1.007 | 0.911 | 8.483 | 3.286 |
| | 23400 | 0.438 | 0.396 | 9.664 | 1.202 |
| 1000 | 390 | 3.501 | 2.998 | 8.052 | 7.963 |
| | 780 | 1.775 | 1.601 | 8.378 | 5.403 |
| | 2340 | 0.903 | 0.852 | 8.474 | 3.165 |
| | 23400 | 0.401 | 0.389 | 9.141 | 1.235 |
MSPE

| Forecast Origin | Realized GARCH-Itô (QMLE-HL) | Realized GARCH-Itô (QMLE-HLO) | Unified GARCH-Itô (QMLE-HL) | Jump-adjusted MSRV |
|---|---|---|---|---|
| | 2.527 | 2.323 | 3.141 | 2.655 |
| | 3.024 | 2.770 | 3.744 | 3.177 |
| | 3.851 | 3.510 | 4.766 | 4.040 |
| | 5.005 | 4.536 | 6.189 | 5.251 |
| | 4.052 | 3.913 | 6.813 | 4.134 |
| | 6.628 | 5.073 | 12.559 | 6.578 |
Volatility Analysis with Realized GARCH-Itô Models
Xinyu Song1, Donggyu Kim2, Huiling Yuan3, Xiangyu Cui1, Zhiping Lu4, Yong Zhou4, Yazhen Wang4&5*
*Corresponding author: Yazhen Wang. Address: 1175 Medical Science Center, 1300 University Avenue, Madison, WI 53706. Phone: 6082626399. Fax: 6082620032. E-mail: [email protected].
1 Shanghai University of Finance and Economics
2 Korea Advanced Institute of Science and Technology (KAIST)
3 City University of Hong Kong
4 East China Normal University
5 University of Wisconsin-Madison
Abstract
This paper introduces a unified approach for modeling high-frequency financial data that accommodates both the continuous-time jump-diffusion model and the discrete-time realized GARCH model by embedding the discrete realized GARCH structure in the continuous instantaneous volatility process. The key feature of the proposed model is that the corresponding conditional daily integrated volatility adopts an autoregressive structure, where both integrated volatility and jump variation serve as innovations. We call it the realized GARCH-Itô model. Given the autoregressive structure in the conditional daily integrated volatility, we propose a quasi-likelihood function for parameter estimation and establish its asymptotic properties. To improve the parameter estimation, we propose a joint quasi-likelihood function that combines the daily integrated volatility estimated from high-frequency data with a nonparametric volatility estimator obtained from option data. We conduct a simulation study to check the finite-sample performance of the proposed methodologies and an empirical study with S&P500 stock index and option data.
JEL classification: C10, C22, C58
Keywords: High-frequency financial data, option data, quasi-maximum likelihood estimation, stochastic differential equation, volatility estimation and prediction.
1 Introduction
In modern financial markets, volatility measures the degree of dispersion of asset returns and plays a crucial role in portfolio allocation, performance evaluation, and risk management. Low-frequency and high-frequency stock data are widely used to model the dynamic evolution of daily volatilities. Option data provide another natural source for more precise volatility forecasts and have been investigated thoroughly since the seminal work of Black and Scholes (1973). In traditional volatility analysis, researchers employ discrete-time parametric econometric models and low-frequency data. Examples include the generalized autoregressive conditional heteroskedasticity (GARCH) models (Bollerslev, 1986; Engle, 1982), which adopt squared daily log returns as innovations in the conditional volatilities. However, when the volatility changes rapidly to a new level, it is often difficult to catch up with the new level immediately using only daily log returns as the innovations (Andersen et al., 2003). On the other hand, high-frequency financial data, which refer to intra-daily observations such as tick-by-tick stock prices, have become available thanks to advances in information technology. Major challenges in estimating volatilities with high-frequency data are market microstructure noises and price jumps. In the absence of price jumps, Zhang et al. (2005) proposed the two-time scale realized volatility (TSRV), which is a consistent estimator of daily variation, while Zhang (2006) further improved the TSRV to the multi-scale realized volatility (MSRV), which achieves the optimal convergence rate. Other estimators that achieve the optimal convergence rate when only market microstructure noises are present are the kernel realized volatility (KRV) (Barndorff-Nielsen et al., 2008), the quasi-maximum likelihood estimator (QMLE) (Aït-Sahalia et al., 2010; Xiu, 2010), the pre-averaging realized volatility (PRV) (Jacod et al., 2009), and the robust pre-averaging realized volatility (Fan and Kim, 2018). Empirical studies support the existence of price jumps, and decomposing daily variation into its continuous and jump components can improve volatility forecasts (Aït-Sahalia et al., 2012; Andersen et al., 2007; Barndorff-Nielsen and Shephard, 2006; Corsi et al., 2010). For example, Mancini (2004) studied a threshold method for jump detection and presented the order of an optimal threshold, and Davies and Tauchen (2018) further examined a data-driven threshold method. Fan and Wang (2007) and Zhang et al. (2016) employed wavelet methods to identify jumps in noisy high-frequency data. We refer to the estimators of daily variation based on high-frequency data as the realized volatility estimators. Used as innovations, such estimators are more informative than simple squared daily log returns, which may help to track rapid changes in the volatility process.
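To make the noise problem concrete, the following is a minimal sketch of the two-time scale idea of Zhang et al. (2005): average the realized variances over sparse sub-grids, then subtract a noise-bias estimate built from the all-data realized variance. The function names, the simulated price path, the noise level, and the choice `K=300` are illustrative assumptions and are not taken from the paper; small-sample adjustments are omitted.

```python
import random

def realized_variance(log_prices):
    """Sum of squared increments of a log-price series."""
    return sum((b - a) ** 2 for a, b in zip(log_prices, log_prices[1:]))

def tsrv(log_prices, K):
    """Two-time scale realized volatility with K offset sub-grids."""
    n = len(log_prices) - 1                      # number of returns
    # Slow scale: average the realized variances over the K sub-grids.
    avg_sub = sum(realized_variance(log_prices[k::K]) for k in range(K)) / K
    n_bar = (n - K + 1) / K                      # average returns per sub-grid
    # Fast scale: the all-data RV is dominated by noise and estimates the bias.
    return avg_sub - (n_bar / n) * realized_variance(log_prices)

# Hypothetical one-day sample: constant volatility plus i.i.d. Gaussian noise.
rng = random.Random(0)
m, sigma, noise_sd = 23400, 0.01, 0.0005
true_iv = sigma ** 2                             # integrated variance of the day
x, observed = 0.0, []
for _ in range(m + 1):
    observed.append(x + rng.gauss(0.0, noise_sd))  # noisy observation
    x += rng.gauss(0.0, sigma / m ** 0.5)          # latent log-price step

rv_all = realized_variance(observed)             # inflated by the noise
tsrv_est = tsrv(observed, K=300)                 # bias-corrected estimate
```

In this toy setting the all-data realized variance is dominated by the noise term, while the two-scale estimate stays close to the integrated variance.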
Volatility estimation efforts usually employ low- and high-frequency data independently. However, the inter-correlation between low- and high-frequency data gathered at the two different time scales cannot be ignored, as low-frequency data present high-frequency data in an aggregated form. There have been several attempts to bridge the gap between the two types of data. For example, multiple studies proposed new GARCH type models that include realized volatilities as innovations in the conditional volatilities (Engle and Gallo, 2006; Shephard and Sheppard, 2010; Hansen et al., 2012). On the other hand, Wang (2002) showed that the standard GARCH model and its diffusion limit are not asymptotically equivalent, which discredits the direct application of statistical inferences derived for the GARCH model to its diffusion limit. Thus, Kim and Wang (2016) introduced the unified GARCH-Itô model by embedding the standard GARCH volatility structure in the instantaneous volatilities of an Itô diffusion process. The unified GARCH-Itô model is a continuous-time process at the high-frequency timescale and, when restricted to the low-frequency timescale, retains the standard GARCH structure.
In this paper, we extend the unified GARCH-Itô model (Kim and Wang, 2016) so that features of financial data at both frequencies can be better captured, as follows. First, price jumps that are well documented in empirical studies are allowed, and we incorporate squared price jumps into the volatility dynamics through a structure similar to the ones introduced in the COGARCH model (Klüppelberg et al., 2004) and the jump-driven volatility model (Todorov, 2011). Second, we embed the realized GARCH volatility structure (Hansen et al., 2012) in the instantaneous volatilities of a jump-diffusion process, which employs the more informative high-frequency data-based innovations. Third, the well-known intra-day U-shaped volatility pattern is accounted for (Admati and Pfleiderer, 1988; Andersen et al., 1997, 2019; Hong and Wang, 2000). We call the proposed model the realized GARCH-Itô model. The key feature of the proposed model is that its conditional volatility has integrated volatility and jump variation as innovations. Based on the structure of the conditional volatility process, we propose a quasi-likelihood function for estimating model parameters. Specifically, the quasi-likelihood function that is usually adopted in standard GARCH type models is employed, and the realized volatility estimators are used as the proxy for conditional volatilities. We call the proposed estimator the quasi-maximum likelihood estimator based on high-frequency data and low-frequency structure (QMLE-HL). The proposed model and this estimation approach are constructed purely from stock data. We also harness option data to improve the model parameter estimation. Specifically, Todorov (2019) developed a nonparametric volatility estimator based on a portfolio of short-dated option contracts in a general setting where jumps are present. As stated in Todorov (2019), the estimator can be viewed as the option counterpart of high-frequency data-based volatility estimators. To incorporate the option-based nonparametric volatility estimator, we construct a joint quasi-likelihood function. We call the proposed estimator the quasi-maximum likelihood estimator based on high-frequency data, low-frequency structure and additional option data (QMLE-HLO). We establish convergence rates and asymptotic normality for both the QMLE-HL and the QMLE-HLO. In the numerical analysis, we further demonstrate that the joint estimation method QMLE-HLO performs better in estimation and prediction than the QMLE-HL.
This paper is organized as follows. Section 2 introduces the realized GARCH-Itô model. We demonstrate its connection with the realized GARCH model and discuss its advantages compared with the unified GARCH-Itô model. Section 3 introduces quasi-likelihood estimation methods and investigates their asymptotic behaviors. Section 4 conducts a simulation study to check the finite-sample performance of the proposed estimators. Section 5 carries out an empirical analysis with S&P500 stock and option data to demonstrate the advantage of the proposed model in volatility analysis. We collect all the proofs in the Appendix.
2 Realized GARCH-Itô model
The realized GARCH-Itô model is a new jump-diffusion process that incorporates the structures of the high-frequency based volatility model (Shephard and Sheppard, 2010) and the realized GARCH model (Hansen et al., 2012). Let and be the set of all non-negative integers. Our proposed model is formulated as follows.
Definition 1.
Log stock price , , obeys a realized GARCH-Itô model if it satisfies
[TABLE]
[TABLE]
where denotes the ceiling of , , and are standard Brownian motions with respect to filtration with a.s., is a predictable process that is known as the drift, and is the volatility process that is adapted to . For the jump part, is the standard Poisson process with constant intensity and denotes the i.i.d. jump sizes which are independent of the Poisson and continuous diffusion processes.
Remark 1.
The i.i.d. assumption on jump sizes can be rewritten as
[TABLE]
where ’s are i.i.d. random variables with mean zero and variance , is restricted to be positive. For instance, if the jump sizes ’s obey the Normal distribution with mean and variance , then the corresponding takes value while has mean zero and variance .
The instantaneous volatility in (2.3) is defined at all times for and exhibits a U-shaped intra-day pattern. Specifically, the deterministic part of the instantaneous volatility is convex in time and, for an appropriate parameter, attains its smallest value in the middle of the day. This U-shaped pattern is often observed in empirical data and is supported by the financial market literature (Admati and Pfleiderer, 1988; Andersen et al., 1997, 2019; Hong and Wang, 2000). Moreover, random fluctuations are accounted for in the instantaneous volatility process. We note that when the process is restricted to integer times, it has a realized GARCH type structure (Hansen et al., 2012) with an additional jump innovation term as follows:
[TABLE]
where and . Therefore, the instantaneous volatility process is affected by both the integrated volatilities and the jump variations of the stock price process. Compared with the unified GARCH-Itô model (Kim and Wang, 2016), the realized GARCH-Itô model allows for price jumps, accounts for the intra-day U-shaped volatility pattern, and adopts richer volatility dynamics with random fluctuations.
For statistical inferences, we study the integrated volatilities obtained from the realized GARCH-Itô model over consecutive integers, that is, .
Proposition 1.
An iterative relationship holds for the integrated volatilities of the realized GARCH-Itô model defined in Definition 1 when condition (2.4) is met.
(a) For and , the realized GARCH-Itô model implies that
[TABLE]
where
[TABLE]
[TABLE]
and
[TABLE]
are all martingale differences.
(b) For and ,
[TABLE]
where is defined in (2.7).
(c) For and ,
[TABLE]
where , and are defined in (2.8).
Proposition 1 (a) indicates that the daily integrated volatility can be decomposed into the realized GARCH volatility and the martingale difference , where the GARCH volatility can be further explained by historical integrated volatilities and jump variations. We exploit this model feature to build parameter estimation methods. Moreover, this paper uses the integrated volatilities as a proxy to develop an estimation procedure for the GARCH parameter in Section 3. This is because, without spot volatility estimation, we cannot distinguish the intercept parameters , , and .
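The autoregressive structure described above can be sketched as a simple recursion in which the next conditional volatility is a linear function of the previous conditional volatility, the previous integrated volatility, and the previous jump variation. The linear form, the parameter names `omega`, `gamma`, `alpha`, `beta`, and all numerical values below are assumptions made for illustration; the exact expression is the one given in Proposition 1 (a).

```python
def realized_garch_path(theta, iv, jv, h0):
    """Conditional daily volatilities from an assumed linear recursion.

    theta = (omega, gamma, alpha, beta) are hypothetical parameter names:
    intercept, persistence, integrated-volatility and jump-variation loadings.
    iv[i], jv[i] are the integrated volatility and jump variation of day i+1.
    Returns h with h[0] = h0 and h[i+1] computed from day i+1 information.
    """
    omega, gamma, alpha, beta = theta
    h = [h0]
    for iv_prev, jv_prev in zip(iv, jv):
        h.append(omega + gamma * h[-1] + alpha * iv_prev + beta * jv_prev)
    return h

# Toy inputs: constant innovations, so the recursion settles at a fixed point.
theta = (0.1, 0.3, 0.4, 0.2)
iv = [1.0] * 200
jv = [0.5] * 200
h = realized_garch_path(theta, iv, jv, h0=1.0)
fixed_point = (0.1 + 0.4 * 1.0 + 0.2 * 0.5) / (1 - 0.3)
```

With constant innovations and a persistence parameter below one, the path converges geometrically to `fixed_point`, which is a quick sanity check on any implementation of the recursion.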
3 Parameter estimation
In this section, we first discuss the model set-up and review nonparametric estimation methods for the integrated volatility in the presence of market microstructure noises given the jump-diffusion process. With the well-performing realized volatility and jump variation estimators, we construct quasi-maximum likelihood estimation procedures and investigate their asymptotic behaviors.
3.1 The model set-up and realized volatility estimators
Let $n$ be the total number of low-frequency observations and $m_i$ be the total number of high-frequency observations during the $i$th low-frequency period, for example, the $i$th day. We further denote . The underlying log price process is assumed to obey the realized GARCH-Itô model described in Definition 1. The low-frequency data are the true log prices at integer times, . The high-frequency data are observations between integer times and are contaminated by market microstructure noises. Major sources of market microstructure noise are bid-ask bounce, discreteness of price changes, and infrequent trading, which play a role only at high sampling frequencies (Ait-Sahalia and Yu, 2009). We let be the high-frequency observation time points during the $i$th low-frequency period such that . In this regard, we adopt the standard assumption in the high-frequency literature that
[TABLE]
where ’s are market microstructure noises, assumed to be stationary random variables with . Moreover, the effect of the drift term on high-frequency data based volatility estimators is asymptotically negligible, so we set the drift to zero to focus on modeling the volatility and jump processes.
In the absence of price jumps, researchers have constructed nonparametric realized volatility estimators that use sub-sampling and local-averaging techniques to remove the effect of market microstructure noises so that the integrated volatility can be estimated consistently and efficiently. Such estimators include the multi-scale realized volatility estimator (Zhang, 2006, 2011), the pre-averaging realized volatility estimator (Christensen et al., 2010; Jacod et al., 2009), and the kernel realized volatility estimator (Barndorff-Nielsen et al., 2008). To identify the jump locations in noisy high-frequency data, Fan and Wang (2007) and Zhang et al. (2016) proposed wavelet methods to detect jumps and applied the MSRV method to jump-adjusted data. They demonstrated that the estimator of jump variation has the convergence rate of $m^{-1/4}$, which further helps the estimator of integrated volatility achieve the optimal convergence rate of $m^{-1/4}$. In this paper, we let be the estimator of jump variation for the $i$th day and be the corresponding estimator of daily integrated volatility that is robust to microstructure noises and price jumps, where both estimators can achieve the convergence rate $m^{-1/4}$.
3.2 Quasi-maximum likelihood estimation based on high-frequency data and low-frequency structure
3.2.1 Estimation procedure
Recall that the integrated volatility over the th period can be decomposed into the realized GARCH volatility and martingale difference as described in Proposition 1 (a). We harness this information for making inferences on the true parameter . Specifically, using the likelihood of the standard GARCH model and the low-frequency structure of the realized GARCH-Itô model, we define the following quasi-likelihood function
[TABLE]
Under some technical conditions, the impact of the martingale difference term is negligible in the asymptotic sense. Therefore, the realized volatility estimators ’s based on data from (3.1) can be considered as the observed value for ’s and are employed as the proxy. To harness the proposed quasi-likelihood function (3.2), we first need to evaluate the realized GARCH term . Recall the iterative relationship in the realized GARCH term as described in Proposition 1 (a):
[TABLE]
The initial is selected to be that is given in Proposition 1 (c). Specifically, we take
[TABLE]
The true integrated volatilities and jump variations are not observed, so we use their estimators and , respectively. Specifically, let
[TABLE]
With the realized GARCH volatility estimator in (3.3), the quasi-likelihood function (3.2) is updated to the following:
[TABLE]
We estimate the true parameter by maximizing the quasi-likelihood function in (3.4),
[TABLE]
and call the maximizer in (3.5) the quasi-maximum likelihood estimator based on high-frequency data and low-frequency structure combined (QMLE-HL).
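The estimation steps above can be sketched as follows: build the conditional volatilities from the realized estimators, evaluate a Gaussian-type quasi-likelihood of the kind used for standard GARCH models with the realized volatility as the proxy, and maximize over the parameter. The recursion form, the `-(1/2n)` scaling, the parameter names, and the toy data are assumptions for illustration; the crude search over a handful of candidates stands in for a proper numerical optimizer.

```python
import math

def garch_path(theta, rv, jv, h0):
    """Assumed linear recursion; h[i] is the conditional volatility for day i+1."""
    omega, gamma, alpha, beta = theta
    h = [h0]
    for rv_prev, jv_prev in zip(rv, jv):
        h.append(omega + gamma * h[-1] + alpha * rv_prev + beta * jv_prev)
    return h

def quasi_loglik(theta, rv, jv, h0):
    """GARCH-style quasi-likelihood with realized volatility as the proxy."""
    h = garch_path(theta, rv, jv, h0)
    n = len(rv)
    # zip stops at n terms, so the final out-of-sample forecast is not used.
    return -sum(math.log(h_i) + rv_i / h_i for h_i, rv_i in zip(h, rv)) / (2 * n)

# Toy data (hypothetical realized volatilities and jump variations).
rv = [1.2, 0.8, 1.5, 0.9, 1.1]
jv = [0.0, 0.1, 0.0, 0.2, 0.0]

# Intercept-only special case: the conditional volatility is constant, and the
# quasi-likelihood is maximized when that constant equals the average of rv.
candidates = [0.5, 0.8, 1.1, 1.4, 2.0]
best_omega = max(
    candidates,
    key=lambda w: quasi_loglik((w, 0.0, 0.0, 0.0), rv, jv, h0=w),
)
```

In the intercept-only case the search selects the candidate equal to the sample mean of `rv`, which matches the closed-form maximizer of `log h + rv / h` and gives a simple correctness check before fitting the full parameter vector.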
3.2.2 Asymptotic theory
This section establishes the consistency and asymptotic distribution for the proposed estimator . We first define some notations. For any given random variable and , define . For a matrix , let . Let ’s be positive generic constants whose values are free of , , and , and may change from occurrence to occurrence. To investigate the asymptotic behaviors of proposed estimation method, we require the following technical assumptions.
Assumption 1.
(a) Let
[TABLE]
where are known positive constants.
(b) We have and .
(c) There exist some fixed constants and such that , and and as .
(d) One of the following conditions is satisfied.
(d1) There exists a positive constant such that for any , where .
(d2) a.s. for any .
(e) and .
(f) For any , a.s.
(g) is a stationary ergodic process.
Remark 2.
The parameters of interest are related to volatilities (the 2nd moment); thus, to study their asymptotic behaviors, we require some finite 4th moment conditions such as Assumption 1 (b) and (d). These conditions are therefore not restrictive. Assumption 1 (c) is a well-known key condition in high-frequency data based volatility analysis. Under the finite 4th moment condition, Kim et al. (2016) showed that the realized volatility estimators satisfy Assumption 1 (e). Finally, the stationary ergodic condition Assumption 1 (g) is used to obtain asymptotic normality for the QMLE-HL.
The following theorems establish the convergence rate and asymptotic normality for the QMLE-HL defined in (3.5).
Theorem 1.
Under Assumption 1 (a)-(f) (except for in Assumption 1 (c)), we have
[TABLE]
Theorem 2.
Under Assumption 1, we have as ,
[TABLE]
where
[TABLE]
and
[TABLE]
Remark 3.
Theorem 1 shows that the convergence rate of is $m^{-1/4} + n^{-1/2}$. The $n^{-1/2}$ rate comes from the usual parametric convergence rate based on the low-frequency structure, while the $m^{-1/4}$ rate is due to the high-frequency volatility and jump variation estimation and is known as the optimal convergence rate for estimating integrated volatilities in the presence of market microstructure noises and price jumps. Theorem 2 provides the asymptotic normal distribution for . When deriving the asymptotic normality, the condition in Assumption 1 (c) is imposed so that the high-frequency estimation errors of order $m^{-1/4}$ are negligible in comparison with the low-frequency estimation errors of order $n^{-1/2}$. When the condition is not satisfied, the asymptotic normality may depend on , which is the quantity related to high-frequency estimation. For example, if is some martingale difference sequence, we can relax the condition to . We also note that if the true stock prices are observed (i.e., without the microstructure noises), we only need the typical condition instead of to obtain the asymptotic normality (see Todorov (2009)).
Remark 4.
We note that when replacing in Assumption 1 (e) by for some positive constant , the convergence rate in Theorem 1 will change to . On the other hand, the condition in Assumption 1 (c) will be relaxed to for deriving the asymptotic normality in Theorem 2.
3.3 Quasi-maximum likelihood estimation based on high-frequency data, low-frequency structure, and additional option data
3.3.1 Estimation procedure
In this section, we discuss how to incorporate additional option data information in parameter estimation. The famous Black-Scholes model indicates that option prices are determined by several factors such as time to expiration, strike price, underlying asset price, and its volatility, so one can deduce the volatility from option data. For example, the VIX presents the stock market’s general expectation of volatility. However, we usually find that the VIX is different from the historical nonparametric realized volatility. This may be because of the jumps in stock prices and the wedge between the risk-neutral and statistical probabilities. Recently, Todorov (2019) proposed a nonparametric volatility estimator based on a portfolio of noisy short-dated option contracts with different strike prices. This estimator is robust to price jumps and does not require any assumption on the wedge between risk-neutral and statistical probabilities. Specifically, let be the time to expiration for an option contract, be the th log strike price, where and for . Let be the true option price given expiration and log-strike . Due to observation errors in empirical derivatives pricing, the observed option price obeys
[TABLE]
where the noises ’s are random variables with mean zero and satisfy the technical conditions in Todorov (2019). Given this set-up, Todorov (2019) proposed the following nonparametric volatility estimator
[TABLE]
where
[TABLE]
is the real part of a complex number , and is a tuning parameter.
Under some technical conditions, as goes to zero, this nonparametric volatility estimator converges to the true spot volatility (Todorov, 2019). However, option contracts from traditional data sources such as OptionMetrics are often quoted at the market open or close on each trading day so that the minimum choice of is business day. In this sense, may contain integrated volatility for the remaining period from time . Todorov (2019) also showed in his empirical study that the estimates ’s are closely related to the jump-robust realized type volatility estimates ’s. Based on his results, we assume that the nonparametric volatility estimator and the conditional daily integrated volatility have the following linear relationship:
[TABLE]
where and are the intercept and slope coefficients, respectively. Moreover, ’s are martingale differences with mean zero and variance , and they are independent of the price process and the microstructure component.
Let and . Note that corresponds to the first four coordinates of and . We generalize (3.4) to propose the following joint quasi-likelihood function based on high-frequency and option data for estimating the true parameter
[TABLE]
We maximize in (3.7) to obtain parameter estimators, that is,
[TABLE]
where is the parameter space of . We call the proposed estimator (or ) in (3.8) the quasi-maximum likelihood estimator based on high-frequency data, low-frequency structure, and additional option data combined (QMLE-HLO).
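One plausible way to read the joint construction is as the sum of the high-frequency quasi-likelihood and a Gaussian quasi-likelihood for the residual of the assumed linear relationship between the option-based estimator and the conditional volatility. The sketch below follows that reading. The functional form, the ordering of the seven parameters, the names `a`, `b`, `s2`, and the toy data are all assumptions and may differ from the paper's equation (3.7).

```python
import math

def garch_path(theta, rv, jv, h0):
    """Assumed linear recursion; h[i] is the conditional volatility for day i+1."""
    omega, gamma, alpha, beta = theta
    h = [h0]
    for rv_prev, jv_prev in zip(rv, jv):
        h.append(omega + gamma * h[-1] + alpha * rv_prev + beta * jv_prev)
    return h

def joint_quasi_loglik(phi, rv, jv, nv, h0):
    """High-frequency term plus a Gaussian term for nv_i - a - b * h_i.

    phi = (omega, gamma, alpha, beta, a, b, s2): the first four coordinates
    drive the recursion; a, b, s2 are the hypothetical intercept, slope and
    residual variance of the option relationship.
    """
    *theta, a, b, s2 = phi
    h = garch_path(theta, rv, jv, h0)
    n = len(rv)
    hf_term = sum(math.log(h_i) + rv_i / h_i for h_i, rv_i in zip(h, rv))
    opt_term = sum(
        math.log(s2) + (nv_i - a - b * h_i) ** 2 / s2 for h_i, nv_i in zip(h, nv)
    )
    return -(hf_term + opt_term) / (2 * n)

# Toy data: option-based estimates generated exactly from the linear relation.
rv = [1.2, 0.8, 1.5, 0.9, 1.1]
jv = [0.0, 0.1, 0.0, 0.2, 0.0]
theta = (0.1, 0.3, 0.4, 0.2)
a_true, b_true, s2 = 0.05, 0.9, 0.04
h = garch_path(theta, rv, jv, h0=1.0)
nv = [a_true + b_true * h_i for h_i in h[: len(rv)]]

ql_true = joint_quasi_loglik((*theta, a_true, b_true, s2), rv, jv, nv, 1.0)
ql_shifted = joint_quasi_loglik((*theta, a_true + 0.5, b_true, s2), rv, jv, nv, 1.0)
```

Because the option term penalizes squared residuals, moving the intercept away from the value that generated `nv` lowers the joint quasi-likelihood, which is the mechanism through which the option data can inform the shared recursion parameters.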
3.3.2 Asymptotic theory
To establish the asymptotic behaviors of the proposed estimation method, we require the following additional assumptions.
Assumption 2.
(a) Let
[TABLE]
where are known positive constants.
(b) .
(c) is a stationary ergodic process.
The following theorems establish the convergence rate and asymptotic normality for the QMLE-HLO defined in (3.8).
Theorem 3.
Under Assumption 1 (a)–(f) (except for in Assumption 1 (c)) and Assumption 2 (a)–(b), we have
[TABLE]
Theorem 4.
Under Assumption 1 and Assumption 2, we have as ,
[TABLE]
where
[TABLE]
[TABLE]
and for . Here denotes an -by- matrix of zeros.
Remark 5.
Theorem 3 shows that the convergence rate for the QMLE-HLO is the same as that of the QMLE-HL. Theorem 4 provides the asymptotic normal distribution for the QMLE-HLO.
4 Simulation study
In this section, we conducted a simulation study to check the finite-sample performance of the estimators and given by (3.5) and (3.8), respectively, and to investigate the prediction performance of the realized GARCH volatilities and , which was also compared with the performance of the GARCH volatilities used in Kim and Wang (2016). Here is defined in (3.3). The true log prices , , , , were generated from the proposed realized GARCH-Itô model defined in (2.1) and (2.3) with the following set of parameters , , , , , , and . For the jump process, we took the intensity to be 26 and generated such that , where and follows the normal distribution with mean zero and standard deviation 0.001. Each jump was further assigned to be either positive or negative at random. The chosen parameters resulted in the following target parameter for modeling the dynamics in conditional integrated volatilities. We note that the parameter was scaled up by a factor of 10000 compared with its empirical counterpart, while the remaining parameters were unchanged. This scaling was done to avoid generating negative values for the instantaneous volatilities due to the U-shaped intra-day pattern. Initial values for the simulation were chosen to be and . For the high-frequency data ’s from (3.1), market microstructure noises were added to the simulated log prices ’s between integer times, and the noises were modeled by i.i.d. normal random variables with mean [math] and standard deviation . For the option model described in (3.6), we took , , , where the intercept and standard deviation were scaled up by a factor of roughly 10000 compared with their empirical estimates. We took and . For each combination of and , we repeated the simulation procedure 2000 times. We followed the procedure described in Fan and Wang (2007) to detect the jump locations, estimate the jump variations, and compute the jump-adjusted MSRV estimators. Model parameter estimators were obtained by maximizing the proposed quasi-likelihood functions and defined in (3.4) and (3.7), respectively.
Table 1 reports the mean squared errors (MSEs) for the jump parameters and . We find that the MSEs decrease as the number of high-frequency observations increases for each , and larger often helps to locate the jumps and to estimate the parameters and better. Table 2 presents the MSEs for the QMLE-HL and QMLE-HLO. The proposed estimation procedures show good finite-sample performance and support the theoretical results derived in Section 3. For each estimation method, the MSEs decrease as the number of low-frequency or high-frequency observations increases. When comparing the two methods on their common parameters, the QMLE-HLO has smaller MSEs than the QMLE-HL. Thus, it is reasonable to conclude that additional option data help to enhance the estimation of model parameters.
The main motivation for the proposed model is to predict future volatilities by taking advantage of the autoregressive structure imposed at the low frequency. We therefore examined the finite sample performance of the proposed predictors and , where and are defined in (3.5) and (3.8), respectively, and is given by (3.3). For comparison, we also investigated the prediction performance of the unified GARCH-Itô model proposed by Kim and Wang (2016), and denote the predictor by . Specifically, we evaluated the mean squared prediction errors (MSPEs) by
[TABLE]
where is one of the following: , , or . As a benchmark, we also considered the prediction of using . We set the initial forecast origin to be and expanded the observation window by one low-frequency period at a time. Each time, the model parameters were re-estimated and the predictors were obtained.
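The expanding-window evaluation described above can be sketched as follows. This is an illustrative outline under stated assumptions: `expanding_window_mspe`, `fit_predict`, and `last_value_benchmark` are hypothetical names, the one-step-ahead squared-error form of the MSPE is assumed because the display equation is not recoverable from the text, and the benchmark shown (carrying the previous period's estimate forward) is only one plausible reading of the benchmark predictor.

```python
import numpy as np

def expanding_window_mspe(observed, target, fit_predict, first_origin):
    """One-step-ahead MSPE with an expanding estimation window.

    observed     : array of daily inputs available to the forecaster
    target       : array of the quantities being predicted (same length)
    fit_predict  : callable taking observed[:origin] and returning a
                   forecast for period `origin` (re-estimated each time)
    first_origin : index of the first forecast origin
    """
    observed = np.asarray(observed, dtype=float)
    target = np.asarray(target, dtype=float)
    sq_errors = []
    for origin in range(first_origin, len(target)):
        # Only data strictly before the forecast origin is used.
        forecast = fit_predict(observed[:origin])
        sq_errors.append((forecast - target[origin]) ** 2)
    return float(np.mean(sq_errors))

def last_value_benchmark(history):
    """Hypothetical benchmark: use the most recent observed value."""
    return history[-1]

if __name__ == "__main__":
    obs = np.array([1.0, 2.0, 3.0, 4.0])
    print(expanding_window_mspe(obs, obs, last_value_benchmark, first_origin=2))
```

A model-based predictor would plug in as another `fit_predict` callable that re-estimates the parameters on each window before forecasting.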
Table 3 summarizes the MSPEs and Figure 1 presents the log MSPEs against the number of high-frequency observations. Overall, the MSPE for the realized GARCH-Itô approach decreases as the number of low-frequency or high-frequency observations increases. Moreover, the QMLE-HLO method presents the best performance regarding the MSPE. That is, the numerical results indicate that utilizing information contained in an additional data source can improve both the estimation and prediction performance of the proposed methodology. On the other hand, the unified GARCH-Itô model is not capable of explaining the rich dynamics in order to predict the conditional integrated volatilities. This may be because it takes into account neither the realized volatility nor the jump variation as an innovation. The benchmark method does not perform well because the realized GARCH-Itô model has rich dynamics that cannot be fully captured by the jump-adjusted MSRV method.
5 Empirical analysis
In this section, we illustrate the proposed estimation methods with second-by-second trading data for the S&P500 stock index and option data quoted at the market opening on each trading day, where the S&P500 stock index is the underlying asset. The data sets were obtained from the TAQ and the CBOE databases, respectively. We examined the period from January 3rd, 2017 to December 31st, 2018 so that the number of low-frequency periods is . The high-frequency data are available between the open and close of the market so that the number of high-frequency observations for a full trading day is . We followed the procedure given in Fan and Wang (2007) to detect jumps, as well as to compute the jump variation estimates ’s and jump-adjusted MSRV estimates ’s. We estimated the intensity by the daily averaged number of price jumps, and the parameter by the sample median of all squared price jumps because the sample median better described the center of the distribution formed by squared jumps. The estimated values are and . For the option data, we followed the procedure presented in Todorov (2019), as that empirical study covered a similar period and also considered the S&P500 index. Specifically, we took the option contracts whose time to expiration ranges from 1 to 2 business days and skipped the contracts that were settled on a holiday. The average number of strikes per date was and the values of the tuning parameters were set to be the same as in Todorov (2019). Denote the option-based nonparametric volatility estimates by ’s. Figure 2 displays the auto- and cross-correlation functions (Brockwell and Davis, 2016) for the ’s, ’s, and ’s, which provide promising evidence for explaining the rich dynamics with these innovations. The QMLE-HL estimates are , and , and the QMLE-HLO estimates are . The parameter denotes the intercept term in the realized GARCH volatility dynamics while the parameter denotes the intercept term in model (3.6). 
Their small estimated values reflect the overall level of daily volatilities that can be seen in Figure 3.
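The two jump summaries used above, the daily averaged number of jumps and the sample median of squared jumps, are simple to compute once jumps have been detected. The sketch below assumes the detection step has already produced a list of jump sizes per trading day; the function name `jump_summary` and the input layout are hypothetical.

```python
import numpy as np

def jump_summary(jumps_by_day):
    """Summaries of detected price jumps.

    jumps_by_day : non-empty list with one entry per trading day, each a
                   sequence of detected jump sizes (possibly empty)
    Returns (average number of jumps per day, median of squared jumps).
    """
    n_days = len(jumps_by_day)
    counts = [len(j) for j in jumps_by_day]
    intensity_hat = sum(counts) / n_days
    if sum(counts) == 0:
        # No jumps detected at all: the median is undefined.
        return intensity_hat, float("nan")
    squared = np.concatenate(
        [np.asarray(j, dtype=float) ** 2 for j in jumps_by_day]
    )
    return intensity_hat, float(np.median(squared))

if __name__ == "__main__":
    days = [[0.001, -0.002], [], [0.003]]
    print(jump_summary(days))
```

Using the median rather than the mean makes the summary less sensitive to a few very large squared jumps, which matches the reasoning given in the text.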
Figure 3 displays the jump-adjusted MSRV estimates, the option-based nonparametric volatility estimates, and the realized GARCH volatility estimates from the QMLE-HL and the QMLE-HLO. For comparison, we also present the GARCH volatilities adopted in the unified GARCH-Itô model (Kim and Wang, 2016). Figure 3 shows that the nonparametric jump-adjusted MSRV and the option-based nonparametric volatility estimates are both volatile, and the realized GARCH volatility estimates from the QMLE-HL and QMLE-HLO methods can account for these dynamics well. Moreover, compared with the unified GARCH-Itô estimates, the proposed realized GARCH-Itô estimates are closer to the jump-adjusted MSRV estimates. This may be because the realized GARCH-Itô model includes realized volatilities and jump variations as innovations while the unified GARCH-Itô model uses squared daily log returns as innovations. That is, the proposed structure in the realized GARCH-Itô model helps to capture the market dynamics promptly.
To investigate the prediction performance of the proposed methodologies, we employed the MSPE criterion again. Denote the forecast origin by . To further examine the dependence on the split point, we took , where each value corresponds to the last trading day of June, July, August, September, October, and November of 2018. Since the exact conditional daily integrated volatilities are unknown for empirical data, we used the jump-adjusted MSRV estimates instead and evaluated the following MSPE:
[TABLE]
where is one of the following: , , , or , and is defined in (3.3).
Table 4 summarizes the MSPEs from the realized GARCH-Itô, the unified GARCH-Itô, and the jump-adjusted MSRV estimates. Overall, the proposed realized GARCH-Itô estimates outperform the other methods in terms of the MSPE across various split points. When comparing the realized GARCH-Itô estimates, the QMLE-HLO presents smaller MSPE than the QMLE-HL. The empirical results indicate that the realized GARCH-Itô model holds advantages in predicting future volatilities as it utilizes the autoregressive structure in daily integrated volatilities and emphasizes high-frequency based information by using both realized volatilities and jump variations as innovations. Moreover, incorporating option-based nonparametric volatility estimates could help to predict future volatilities.
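To make the autoregressive structure concrete, the following sketch computes a conditional daily volatility path driven by lagged realized volatility and jump variation, and evaluates a Gaussian-type quasi-likelihood on it. The exact forms of (3.3) and (3.4) are not recoverable from this text, so the linear recursion, the objective, and all names (`conditional_volatility`, `neg_quasi_loglik`, `omega`, `gamma`, `alpha`, `beta`, `h0`) are assumptions for illustration rather than the paper's definitions.

```python
import numpy as np

def conditional_volatility(params, rv, jv, h0):
    """Assumed recursion: h[n] = omega + gamma*h[n-1] + alpha*rv[n-1] + beta*jv[n-1]."""
    omega, gamma, alpha, beta = params
    rv = np.asarray(rv, dtype=float)
    jv = np.asarray(jv, dtype=float)
    h = np.empty(len(rv))
    h[0] = h0  # initial value supplied by the user
    for n in range(1, len(rv)):
        h[n] = omega + gamma * h[n - 1] + alpha * rv[n - 1] + beta * jv[n - 1]
    return h

def neg_quasi_loglik(params, rv, jv, h0):
    """Negative Gaussian-type quasi-log-likelihood (assumed form)."""
    h = conditional_volatility(params, rv, jv, h0)
    if np.any(h <= 0):
        return np.inf  # reject parameter values giving non-positive volatility
    rv = np.asarray(rv, dtype=float)
    return 0.5 * float(np.mean(np.log(h) + rv / h))

if __name__ == "__main__":
    rv = np.array([1.0, 2.0, 3.0])
    jv = np.array([0.5, 0.0, 0.5])
    theta = (0.1, 0.5, 0.3, 0.2)
    print(conditional_volatility(theta, rv, jv, h0=1.0))
    print(neg_quasi_loglik(theta, rv, jv, h0=1.0))
```

In practice `neg_quasi_loglik` could be passed to a numerical optimizer such as `scipy.optimize.minimize`; a joint version with option-based estimates would add a second term to the objective, which is not sketched here.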
6 Conclusion
In this paper, we introduce a novel realized GARCH-Itô model based on a jump-diffusion process which embeds the discrete realized GARCH model structure (Hansen et al., 2012) in its instantaneous volatility process. When the model is restricted to the low-frequency period, it employs an autoregressive-type structure to explain the co-dynamics in the integrated volatilities and jump variations. Model parameters in the realized GARCH-Itô model are estimated by maximizing a quasi-likelihood function. To improve the statistical performance of the proposed estimating approach and to incorporate additional information from option data, we also connect the nonparametric volatility estimator proposed by Todorov (2019) with the conditional integrated volatility from the proposed model. A joint quasi-likelihood function is then adopted, and the numerical analysis shows that this method helps to better account for the market dynamics.
We also leave some open issues for future study. For example, we may observe some heterogeneous variance in model (3.6). One possible approach is to generalize the homogeneous variance in (3.6) to a heterogeneous variance, for example by replacing by , where parameter is used to adjust the level of heteroscedasticity with corresponding to the homogeneous case. We would then replace by in the quasi-likelihood given by (3.7) and estimate jointly with the other parameters by maximizing . Moreover, it is important to further explore the optimal approach to combining and modeling the return and option data for volatility estimation.
Appendix A
Let and be generic constants whose values are free of , , , and and may change from occurrence to occurrence.
A.1 Proof of Proposition 1
Proof of Proposition 1. For , let
[TABLE]
By Itô’s lemma, we have
[TABLE]
Then simple algebraic manipulations show
[TABLE]
Since
[TABLE]
we have
[TABLE]
where , and are defined in (2.8). Thus, we have
[TABLE]
where . Since the integrand of is predictable, is a martingale difference. Proposition 1 (b) and (c) follow immediately from the result of Proposition 1 (a).
A.2 Proof of Theorem 1
Maximizing proposed in Section 3.2 is equivalent to maximizing
[TABLE]
We focus on defined above in this proof. Define
[TABLE]
To ease notations, we denote derivatives of any given function at by
[TABLE]
Lemma 1 in Kim and Wang (2016) shows that the dependence of on the initial value decays exponentially. Thus, we may use the true initial value during the rest of the proofs.
Lemma 1.
Under Assumption 1 (a)-(f), we have
(a) and ;
(b) for any ,
[TABLE]
for any , where .
Proof of Lemma 1. The statements can be shown similarly to the proof of Lemma 2 in Kim and Wang (2016).
Lemma 2.
Under Assumption 1 (a)-(d), we have
(a) there exists a neighborhood of such that
[TABLE]
for any where ;
(b) is a positive definite matrix for .
Proof of Lemma 2. The proof is in the online Appendix.
Lemma 3.
Under Assumption 1 (a)-(f), we have
[TABLE]
Proof of Lemma 3. The proof is in the online Appendix.
Proposition 2.
Under Assumption 1 (a)-(d), there is a unique maximizer of and as , in probability.
Proof of Proposition 2. The statement can be shown similarly to the proof of Theorem 1 in Kim and Wang (2016), together with the result of Lemma 3.
Proof of Theorem 1. By the mean value theorem and Taylor expansion, there exists between and such that
[TABLE]
If which is a positive definite matrix by Lemma 2 (b), the convergence rate of is the same as that of . Thus, it is enough to show
[TABLE]
and
[TABLE]
First consider . Similarly to the proof of Theorem 2 in Kim and Wang (2016), we can show that
[TABLE]
By applying Itô’s lemma and Itô’s isometry, we can show for any ,
[TABLE]
where the last inequality is due to Lemma 1 (b). Similarly to the proof of Theorem 2 in Kim and Wang (2016), together with the results of Lemma 2 and Proposition 2, we can show
[TABLE]
A.3 Proof of Theorem 2
Proof of Theorem 2. By the mean value theorem and Taylor expansion, we have for some between and ,
[TABLE]
where the second equality is due to (A.4). By the ergodic theorem and the result in the proof of Theorem 1, we have
[TABLE]
and is a positive definite matrix. For any , let
[TABLE]
Then is a martingale difference with .
Since ’s are stationary and ergodic processes, is also stationary and ergodic. By the martingale central limit theorem and the Cramér-Wold device, we have
[TABLE]
Therefore, by Slutsky’s theorem, we conclude that
[TABLE]
A.4 Proof of Theorem 3
Maximizing is equivalent to maximizing
[TABLE]
where . We focus on defined above in this proof. Define
[TABLE]
and
[TABLE]
[TABLE]
and
[TABLE]
[TABLE]
and
[TABLE]
Lemma 4.
Under Assumption 1 (a)–(f) and Assumption 2 (a)–(b),
(a) there exists a neighborhood around such that
[TABLE]
for any , where ;
(b) is a positive definite matrix for .
Proof of Lemma 4. The proof is in the online Appendix.
Lemma 5.
Under Assumption 1 (a)–(f) and Assumption 2 (a)–(b), we have
[TABLE]
Proof of Lemma 5. The proof is in the online Appendix.
Proposition 3.
Under Assumption 1 (a)–(f) and Assumption 2 (a)–(b), there exists a unique maximizer for . As , in probability, where is a vector of true parameters.
Proof of Proposition 3. According to the definition of , we have
[TABLE]
Then, similarly to the proof of Theorem 1 in Kim and Wang (2016), we can show the uniqueness of the solution of , which together with Lemma 5 implies Proposition 3.
Proof of Theorem 3. By the mean value theorem and Taylor expansion, we have
[TABLE]
where is between and . According to Lemma 4 (b), is a positive definite matrix. If , then the convergence rate of is the same as the convergence rate of .
By arguments similar to those in the proof of Theorem 1, we can show
[TABLE]
We have
[TABLE]
The arguments in the proof of Theorem 1 show that the first term on the right-hand side of (A.6) is . Since is independent of , the second term on the right-hand side of (A.6) is also . Thus, the convergence rate of is .
Similar to the proof of Theorem 1, we can show
[TABLE]
Therefore, the statement is proved.
A.5 Proof of Theorem 4
Proof of Theorem 4. Since the mean value theorem and Taylor expansion provide
[TABLE]
where is between and , we have
[TABLE]
where the equality can be shown similarly to the proof of Theorem 1. Since is independent of and is stationary and ergodic, by the Cramér-Wold device and the martingale central limit theorem, we have
[TABLE]
On the other hand, we have
[TABLE]
Therefore, by Slutsky’s theorem, we have
[TABLE]
Acknowledgements
The research of Xinyu Song was supported by the Fundamental Research Funds for the Central Universities (2018110128), China Scholarship Council (201806485017), and National Natural Science Foundation of China (Grant No. 11871323). The research of Donggyu Kim was supported in part by KAIST Settlement/Research Subsidies for Newly-hired Faculty grant G04170049 and KAIST Basic Research Funds by Faculty (A0601003029). The research of Huiling Yuan was supported by the State Scholarship Fund. The research of Xiangyu Cui was supported by National Natural Science Foundation of China (71671106). The research of Zhiping Lu was supported by Natural Science Foundation of Shanghai (17ZR1409000) and the 111 Project (B14019). The research of Yong Zhou was supported by the National Natural Science Foundation of China (71931004 and 91546202). The research of Yazhen Wang was supported in part by NSF Grants DMS-15-28735, DMS-17-07605, and DMS-19-13149.
We thank the Associate Editor, Viktor Todorov, and two anonymous referees for many constructive suggestions that have significantly improved the paper.
This research was performed using the compute resources and assistance of the UW-Madison Center For High Throughput Computing (CHTC) in the Department of Computer Sciences. The CHTC is supported by UW-Madison, the Advanced Computing Initiative, the Wisconsin Alumni Research Foundation, the Wisconsin Institutes for Discovery, and the National Science Foundation, and is an active member of the Open Science Grid, which is supported by the National Science Foundation and the U.S. Department of Energy’s Office of Science.
References
- Admati, A. R. and Pfleiderer, P. (1988). A theory of intraday patterns: Volume and price variability. The Review of Financial Studies, 1(1):3–40.
- Aït-Sahalia, Y., Fan, J., and Xiu, D. (2010). High-frequency covariance estimates with noisy and asynchronous financial data. Journal of the American Statistical Association, 105(492):1504–1517.
- Aït-Sahalia, Y., Jacod, J., and Li, J. (2012). Testing for jumps in noisy high frequency data. Journal of Econometrics, 168(2):207–222.
- Ait-Sahalia, Y. and Yu, J. (2009). High frequency market microstructure noise estimates and liquidity measures. Annals of Applied Statistics, 3(1):422–457.
- Andersen, T. G., Bollerslev, T., and Diebold, F. X. (2007). Roughing it up: Including jump components in the measurement, modeling, and forecasting of return volatility. The Review of Economics and Statistics, 89(4):701–720.
- Andersen, T. G., Bollerslev, T., Diebold, F. X., and Labys, P. (2003). Modeling and forecasting realized volatility. Econometrica, 71(2):579–625.
- Andersen, T. G., Bollerslev, T., et al. (1997). Intraday periodicity and volatility persistence in financial markets. Journal of Empirical Finance, 4(2-3):115–158.
- Andersen, T. G., Thyrsgaard, M., and Todorov, V. (2019). Time-varying periodicity in intraday volatility. Journal of the American Statistical Association, 114(528):1695–1707.
