Bayesian Nonparametric Adaptive Spectral Density Estimation for Financial Time Series

Nick James; Roman Marchant; Richard Gerlach; Sally Cripps

arXiv:1902.03350·q-fin.ST·February 12, 2019

TL;DR

This paper introduces a Bayesian nonparametric method for adaptive spectral density estimation in financial time series, effectively distinguishing non-stationarity from long-range dependency.

Contribution

It develops a novel Bayesian framework with reversible jump MCMC for jointly modeling non-stationarity and dependency in financial data, allowing non-parametric spectral estimation.

Findings

1. The method performs well across a range of simulated data generating processes.

2. Real data analysis shows the presence of both long-range dependency and non-stationarity.

3. The paper provides a new approach for analysing complex financial time series.

Equations (53)

$y_{t}=\sum_{s=1}^{K}y_{t}^{(s)}\delta(t,A_{s,K})$

$p(\mathbf{y}\mid\mathbf{F}_{K},K,\bm{\xi}_{K})=\prod_{s=1}^{K}\cdots$

$x_{s}(\nu_{k})=\frac{1}{n_{s}}\sum_{t=1}^{n_{s}}\cdots(\cos(2\pi\nu_{k}t)-i\sin(2\pi\nu_{k}t)),$

$I_{s}(\nu_{k})=\big|x_{s}(\nu_{k})\bar{x}_{s}(\nu_{k})\big|.$

$\mathbf{x}_{s}\sim\prod_{k=1}^{n_{s}}\frac{1}{\pi f_{s}(\nu_{k})}\exp\left(-\frac{I_{s}(\nu_{k})}{f_{s}(\nu_{k})}\right).$

$\log(I_{s}(\nu_{k}))=\log(f_{s}(\nu_{k}))+\epsilon_{k};\quad\epsilon_{k}\sim\log(\mathrm{Exp}(1))$

$w_{s}(\nu_{k})=g_{s}(\nu_{k})+\epsilon_{k},$

$h_{s}(\nu_{k})=\tau_{s}W(\nu_{k})$

$\mathbf{h}_{s}=(h_{s}(\nu_{1}),\dots,h_{s}(\nu_{n_{s}}))\sim N(0,\tau_{s}^{2}\Omega)$

$\Pr(K)=\frac{1}{S}$

$\Pr(\bm{\xi}_{K}\mid K)=\prod_{s=1}^{K-1}\Pr(\xi_{s,K}\mid\xi_{s-1},K),$

$x_{r}$ , $x_{i}$

$x_{(0,r)}$ , $x_{(0,i)}$

$x_{(1:\frac{n_{s}}{2}-1,\,r)}$ , $x_{(1:\frac{n_{s}}{2}-1,\,i)}$

$x_{(\frac{n_{s}}{2},\,r)}$ , $x_{(\frac{n_{s}}{2},\,i)}$

$x_{(\frac{n}{2}+1:n-1,\,r)}$ , $x_{(\frac{n}{2}+1:n-1,\,i)}$

$E[f(\nu,t)\mid\mathbf{y}]=\sum_{K=1}^{S}\sum_{j=1}^{p^{(K,T)}}\{f(\nu,t)\mid\mathbf{y},K,\bm{\xi}_{K}\}\times\Pr(\bm{\xi}_{S}\mid K,\mathbf{y})\Pr(K\mid\mathbf{y})$

$E[f(\nu,t)\mid\mathbf{y},K,\bm{\xi}_{K}]=\int E[f(\nu,t)\cdots$

$y_{t}$ , $\sigma_{t}^{2}$

$y_{t}\mid s_{t}=j$ , $\sigma_{jt}^{2}\mid s_{t}=j$

$SKL$ , $MSE$

Taxonomy

Topics: Complex Systems and Time Series Analysis · Time Series Analysis and Forecasting · Stock Market Forecasting Methods

Full text

Abstract

Discrimination between non-stationarity and long-range dependency is a difficult and long-standing issue in modelling financial time series. This paper uses an adaptive spectral technique which jointly models the non-stationarity and dependency of financial time series in a non-parametric fashion assuming that the time series consists of a finite, but unknown number, of locally stationary processes, the locations of which are also unknown. The model allows a non-parametric estimate of the dependency structure by modelling the auto-covariance function in the spectral domain. All our estimates are made within a Bayesian framework where we use a Reversible Jump Markov Chain Monte Carlo (RJMCMC) algorithm for inference. We study the frequentist properties of our estimates via a simulation study, and present a novel way of generating time series data from a nonparametric spectrum. Results indicate that our techniques perform well across a range of data generating processes. We apply our method to a number of real examples and our results indicate that several financial time series exhibit both long-range dependency and non-stationarity.

Keywords: Bayesian Nonparametrics, Spectral Density Estimation, Reversible Jump Markov Chain Monte Carlo, Financial Time Series

1 Introduction

Modelling the volatility of financial time series has been the subject of much interest since the deregulation of world financial markets, which began in the late 1970s. It is a difficult task. First, financial time series are often non-stationary, by which we mean that their statistical properties change over time, making the development of statistical models problematic. Second, even stationary financial time series exhibit non-standard features such as volatility clustering (Mandelbrot, 1963) and related kurtosis. Third, the signal-to-noise ratio is low, making it difficult to detect any underlying trends. The primary contribution of this paper is to work in the spectral domain to capture and distinguish between features of time series data, such as non-stationarity and long-range dependency, and to compare estimates of these features with estimates obtained using parametric time domain models.

Models for the time-varying nature of volatility in financial markets began with Engle (1982), who introduced the AutoRegressive Conditionally Heteroskedastic (ARCH) model. Bollerslev (1986) extended this model to the more parsimonious Generalized ARCH (GARCH), while Taylor (1982, 1986a, 1986b) developed the Stochastic Volatility (SV) framework over the same period. These first generation volatility models are conditionally Gaussian, with the dynamic volatility component meant to account for the leptokurtosis present in most financial return series. Bollerslev (1987) allowed for conditionally Student-t returns in a GARCH process, increasing the degree of leptokurtosis the model can capture. Subsequent extensions of GARCH allow for various salient features of observed financial returns, e.g. the leverage effect, whereby volatility is higher in falling than in rising markets, and non-stationary aspects, including the time-varying nature of the conditional distribution of returns and possible structural breaks. The EGARCH (Nelson, 1991), GJR-GARCH (Glosten et al., 1993), T-GARCH (Zakoian, 1994) and T-SV (So et al., 2002) models all attempt to capture the leverage effect, while the non-stationary aspects are addressed by Markov Switching GARCH (MS-GARCH) (Cai, 1994; Hamilton & Susmel, 1994; Gray, 1996; Haas et al., 2004) and MS-SV models (So et al., 1998).

Bayesian estimation has been prominent in this area, especially for SV models where the likelihood, without conditioning on the latent stochastic process, does not exist in closed form and simulation-based and/or data augmentation methods, including Markov Chain Monte Carlo (MCMC), are standard. As high frequency data became increasingly available, volatility modelling moved first to directly model realized measures, such as realized variance (Andersen et al., 2003), and then to extensions of GARCH and SV processes that allow realized measures as inputs driving volatility changes in the model, e.g. the GARCH-X model of Hwang & Satchell (2007). More recently, Hansen et al. (2011) developed the realized GARCH framework, which adds a measurement equation capturing the contemporaneous relationship between the latent volatility and the realized measure. All these models make assumptions about the distribution of the noise and have parametric representations of the evolution of the volatility. Although some methods explicitly model regime shifts and stochastic behaviour, a model will perform poorly if its parametric form does not resemble the underlying data generating process.

Flexibly estimating the time-dependency of a phenomenon via the spectral density goes back to the 1950s (Whittle, 1957). However, it is not often applied to financial time series, despite several appealing reasons for doing so, as pointed out by Chaudhuri & Lo (2015). First, studying the frequency components of a security's return process can provide insight into previously unseen economic structure driving price movements. Second, as investment time horizons can range from microseconds to many years, time-specific risks can be accounted for in portfolio construction decisions. Third, frequency domain analysis can compare strategies that operate on different timescales, and may provide diversification across such strategies. Finally, frequency domain measurements offer more understandable representations of the complex periodic dynamics financial markets may exhibit.

In addition to these advantages, recent developments in spectral analysis, in particular the concept of local stationarity developed by Dahlhaus (1997) and built on by Rosen et al. (2009, 2012), have led to flexible nonparametric methods for estimating time-varying spectra.

We use the technique of Rosen et al. (2012) to jointly estimate a time-varying non-parametric spectrum for financial time series data, and to distinguish between non-stationarity and long-range dependency, as evidenced by volatility clustering. We use this flexible time-varying spectrum to simulate "ground truth" data in the spectral domain and convert it back into the time domain. This enables us to compare time domain models of volatility with each other and with spectral techniques. All our parameter estimates are made within a Bayesian framework where we use an MCMC algorithm for inference. Rather than modelling volatility with a conditionally stationary process as proposed by ARCH/GARCH/SV-style models, we assume that the data generating process is non-stationary, and consists of an unknown number of locally stationary processes at unknown locations.

The remainder of this paper is organised as follows. Section 2 describes the model, priors and estimation procedure. Section 3 shows validation over simulated and real-world data. Finally, Section 4 draws conclusions from our experiments.

2 Model, Priors and Estimation

2.1 Model for Non-stationary processes

Suppose $\{y_{t}\}_{t=1,\ldots,T}$ is a time series with observations from a Dahlhaus locally stationary process with evolutionary spectrum $f(\nu,t)$ , which we wish to estimate. To do this, we assume that the time series consists of $K$ piecewise stationary processes, with segment $s$ of length $n_{s}$ for $s=1,\ldots,K$ . Given a partition of $K$ segments, we define the partition points to be $\bm{\xi}_{K}=(\xi_{0,K},\xi_{1,K},\ldots,\xi_{K,K})$ , with $\xi_{0,K}=0$ and $\xi_{K,K}=T$ , so that the set $A_{s,K}$ is given by $A_{s,K}=\{t;\xi_{s-1,K}+1\leq t\leq\xi_{s,K}\}$ as in Rosen et al. (2012). Therefore, we can write

$$y_{t}=\sum_{s=1}^{K}y_{t}^{(s)}\delta(t,A_{s,K})$$

where $\delta(t,A_{s,K})=1$ if $t\in A_{s,K}$ and $\delta(t,A_{s,K})=0$ otherwise, and where the $y^{(s)}_{t}$ 's are independent stationary processes, for $s=1,\ldots,K$ , each with spectral density $f_{s,K}(\nu)$ .

The joint probability density function of a realization $\mathbf{y}=(y_{1},\ldots,y_{T})$ given the individual spectra $\mathbf{F}_{K}=(\mathbf{f}_{1,K}(\nu),\ldots,\mathbf{f}_{K,K}(\nu))$ , the number of segments $K$ , and the partition points $\bm{\xi}_{K}$ is

$$p(\mathbf{y}\mid\mathbf{F}_{K},K,\bm{\xi}_{K})=\prod_{s=1}^{K}p(\mathbf{y}^{(s)}\mid\mathbf{f}_{s,K}).$$

2.2 Priors

2.2.1 Prior for Spectra

Given a partition defined by $K$ segments and their respective partition points $\bm{\xi}_{K}$ , and a realization $\mathbf{y}^{(s)}$ , our goal is to estimate the unknown spectra $f_{s,K}(\nu)$ , for $\nu\in(-0.5,0.5)$ . To motivate a prior for $f_{s,K}(\nu)$ , we frame the problem of estimating the autocovariance structure of a time series, given by the spectrum, as a nonparametric regression problem. In effect, this turns a covariance estimation problem into a mean estimation problem, which is more parsimonious and tractable.

To elaborate, define the Discrete Fourier Transform (DFT) for segment $s$ of length $n_{s}$ , at frequency $\nu_{k}$ to be

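$$x_{s}(\nu_{k})=\frac{1}{n_{s}}\sum_{t=1}^{n_{s}}y_{t}^{(s)}\left(\cos(2\pi\nu_{k}t)-i\sin(2\pi\nu_{k}t)\right),$$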

where $\nu_{k}=k/n_{s}\,\,\forall k\in\{0,1,\ldots,(n_{s}-1)\}$ . Let the periodogram at frequency $\nu_{k}$ , $I_{s}(\nu_{k})$ , be the squared modulus of the DFT

$$I_{s}(\nu_{k})=\big|x_{s}(\nu_{k})\bar{x}_{s}(\nu_{k})\big|.$$
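As a minimal sketch of the DFT and periodogram just defined, the following computes both for one segment with NumPy. The function name `periodogram`, the toy white-noise segment and the $1/\sqrt{n_{s}}$ scaling are assumptions of this illustration rather than details taken from the paper.

```python
import numpy as np

def periodogram(y_seg):
    """Return Fourier frequencies nu_k = k / n_s and the periodogram I_s(nu_k)."""
    y_seg = np.asarray(y_seg, dtype=float)
    n_s = y_seg.size
    # np.fft.fft evaluates sum_t y_t * (cos(2*pi*nu_k*t) - i*sin(2*pi*nu_k*t))
    # for t = 0..n_s-1; shifting the time index to 1..n_s only changes the phase,
    # so the squared modulus below is unaffected.
    x = np.fft.fft(y_seg) / np.sqrt(n_s)   # assumed 1/sqrt(n_s) scaling
    nu = np.arange(n_s) / n_s
    return nu, np.abs(x) ** 2              # |x * conj(x)| = |x|^2

rng = np.random.default_rng(0)
y = rng.standard_normal(512)               # toy segment, not real data
nu, I = periodogram(y)
```

With this scaling the periodogram ordinates sum to the sum of squares of the segment (Parseval's identity), which gives a quick sanity check on any implementation.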

Whittle (1957) showed that the distribution of $\mathbf{x}_{s}=\left(x_{s}(\nu_{1}),\ldots,x_{s}(\nu_{n_{s}})\right)$ , under certain regularity conditions, is complex normal so that

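$$\mathbf{x}_{s}\sim\prod_{k=1}^{n_{s}}\frac{1}{\pi f_{s}(\nu_{k})}\exp\left(-\frac{I_{s}(\nu_{k})}{f_{s}(\nu_{k})}\right).$$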

This representation suggests that the $I_{s}(\nu_{k})$ are independent and exponentially distributed with mean $f_{s}(\nu_{k})$ , and therefore

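$$\log\left(I_{s}(\nu_{k})\right)=\log\left(f_{s}(\nu_{k})\right)+\epsilon_{k},\qquad\epsilon_{k}\sim\log(\mathrm{Exp}(1)).$$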

Letting $w_{s}(\nu_{k})=\log\left(I_{s}(\nu_{k})\right)$ and $g_{s}(\nu_{k})=\log\left(f_{s}(\nu_{k})\right)$ we have

$$w_{s}(\nu_{k})=g_{s}(\nu_{k})+\epsilon_{k},$$
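The error term in this regression is the log of a unit exponential variable, so it is neither Gaussian nor zero-mean. The short simulation below illustrates this; it relies on the standard facts that $\log(\mathrm{Exp}(1))$ has mean $-\gamma\approx-0.577$ (with $\gamma$ the Euler-Mascheroni constant) and variance $\pi^{2}/6$. The sample size and seed are arbitrary choices for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw the regression errors eps_k = log(E_k), with E_k ~ Exp(1).
eps = np.log(rng.exponential(scale=1.0, size=200_000))

# Compare Monte Carlo moments with the known values -gamma and pi^2 / 6.
print("mean:", eps.mean(), "theory:", -np.euler_gamma)
print("var: ", eps.var(), "theory:", np.pi ** 2 / 6)
```

A practical consequence is that a naive least-squares smooth of the log periodogram is biased downwards by about 0.577 on the log scale unless the error mean is accounted for.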

To place a prior on the unknown function $g_{s}(\nu_{k})$ we decompose it into its linear and non-linear components so that $g_{s}(\nu_{k})=\alpha_{s0}+h_{s}(\nu_{k})$ and place a Gaussian Process prior over the unknown function $h_{s}(\nu_{k})$ ; see, for example, Wahba (1990). Specifically we assume

$$h_{s}(\nu_{k})=\tau_{s}W(\nu_{k})$$

or equivalently,

$$\mathbf{h}_{s}=\left(h_{s}(\nu_{1}),\ldots,h_{s}(\nu_{n_{s}})\right)\sim N(0,\tau_{s}^{2}\Omega)$$

where $W(.)$ is a Wiener process, $\tau_{s}^{2}$ is a smoothing parameter and the $(i,j)$th element of $\Omega$ is $\omega_{ij}=\mbox{cov}(W(\nu_{i}),W(\nu_{j}))=\min(\nu_{i},\nu_{j})$ .

For computational convenience we write $\mathbf{h}_{s}$ as a linear combination of basis functions by performing an eigenvalue decomposition $\Omega=QDQ^{\prime}$ . Specifically we let $X=QD^{1/2}$ be the design matrix and $\bm{\beta}_{s}\sim N(0,\tau^{2}_{s}I_{n_{s}})$ be the vector of regression coefficients, so that $\mathbf{h}_{s}=X\bm{\beta}_{s}$ has the required distribution. We follow Wood et al. (2002) and Rosen et al. (2009) and keep only those basis functions corresponding to the 30 largest eigenvalues, for computational speed.
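The basis construction described above can be sketched as follows. The function name `wiener_basis` and the frequency grid are hypothetical choices for illustration; only the covariance $\min(\nu_{i},\nu_{j})$, the design matrix $X=QD^{1/2}$ and the truncation to 30 basis functions come from the text.

```python
import numpy as np

def wiener_basis(nu, n_basis=30):
    """Design matrix X = Q D^{1/2} built from the leading eigenpairs of Omega."""
    omega = np.minimum.outer(nu, nu)           # Omega_ij = min(nu_i, nu_j)
    evals, evecs = np.linalg.eigh(omega)       # eigenvalues in ascending order
    idx = np.argsort(evals)[::-1][:n_basis]    # indices of the largest eigenvalues
    d = np.clip(evals[idx], 0.0, None)         # guard against tiny negative values
    return evecs[:, idx] * np.sqrt(d)          # scale each eigenvector column

nu = np.arange(1, 101) / 200                   # hypothetical grid on (0, 0.5]
X = wiener_basis(nu)

# If beta ~ N(0, tau^2 I), then X @ beta has covariance tau^2 * X @ X.T,
# which approximates tau^2 * Omega when the discarded eigenvalues are small.
omega = np.minimum.outer(nu, nu)
rel_err = np.linalg.norm(X @ X.T - omega) / np.linalg.norm(omega)
```

Because the eigenvalues of this covariance decay quickly, the truncated basis reproduces $\Omega$ closely on this grid, which is the usual justification for keeping only a small number of basis functions.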

2.2.2 Prior for Partition

The partition is defined by the number of locally stationary segments $K$ and, given $K$ , the partition points $\bm{\xi}_{K}$ . The prior on the partition factorises as $\Pr(K,\bm{\xi}_{K})=\Pr(\bm{\xi}_{K}\mid K)\Pr(K)$ , with components as follows:

$$\Pr(K)=\frac{1}{S}$$

where $S$ is the upper limit for the number of segments; in the experiments which follow it is typically set to 30. Given $K$ , we decompose the prior on $\bm{\xi}_{K}$ into a sequence of discrete uniform priors, so that

$$\Pr(\bm{\xi}_{K}\mid K)=\prod_{s=1}^{K-1}\Pr(\xi_{s,K}\mid\xi_{s-1,K},K),$$

where $\Pr(\xi_{s,K}=t\mid\xi_{s-1,K},K)=1/p_{s,K}$ for $s=1,\ldots,K-1$ , and $p_{s,K}$ is the number of available locations for partition point $\xi_{s,K}$ , equal to $T-\xi_{s-1,K}-(K-s+1)t_{\min}+1$ . The quantity $t_{\min}$ is a user-chosen number. It represents the minimum number of observations deemed sufficient for the Whittle likelihood approximation to hold. In this paper we set it to 50; we note, however, that this choice is arbitrary, and there is a substantial literature discussing the quality of the Whittle approximation.

The prior in Equation 11 states that the first partition point is equally likely to occur at any point in the time series, subject to the constraint that there are at least $t_{\min}$ observations in each of the $K$ segments. The prior on subsequent partition points is similar and states that, conditional on the previous partition point, the next partition point is equally likely to occur in any available location, again subject to the same constraint; see Rosen et al. (2012) for details.
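The sequential uniform prior on the partition points can be sketched as below. The helper names `n_locations` and `sample_partition` are hypothetical, and the example values of $T$ and $K$ are arbitrary; the count of available locations follows the expression for $p_{s,K}$ given in the text, and $t_{\min}=50$ is the value used in the paper.

```python
import numpy as np

def n_locations(T, xi_prev, K, s, t_min):
    """Number of available locations p_{s,K} for partition point xi_{s,K}."""
    return T - xi_prev - (K - s + 1) * t_min + 1

def sample_partition(T, K, t_min, rng):
    """Draw (xi_0, ..., xi_K) from the prior and return its log prior given K."""
    xi = [0]
    log_prior = 0.0
    for s in range(1, K):
        p = n_locations(T, xi[-1], K, s, t_min)
        if p < 1:
            raise ValueError("T is too short for K segments of length t_min")
        lo = xi[-1] + t_min                    # earliest admissible location
        xi.append(int(rng.integers(lo, lo + p)))  # uniform over p locations
        log_prior -= np.log(p)
    xi.append(T)
    return np.array(xi), log_prior

rng = np.random.default_rng(1)
xi, log_prior = sample_partition(T=1000, K=4, t_min=50, rng=rng)
```

For example, with $T=1000$ , $K=4$ and $t_{\min}=50$ , the first partition point has $1000-0-4\times50+1=801$ admissible locations, and every sampled segment contains at least 50 observations.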

2.3 Generation of Temporal Data

A contribution of this paper is to use the time-varying spectra estimated as in Rosen et al. (2012) to generate a time series, without assuming a time domain data generating process. This is achieved using the result that the DFTs of a realization of a process are approximately normally distributed if the joint cumulants of that process, of orders greater than 2, are absolutely summable (Brillinger, 1975).

Let

$$\mathbf{x}_{r}\quad\text{and}\quad\mathbf{x}_{i}$$

be the real and imaginary components of the DFT for a set of realizations from a locally stationary process $s$ of length $n_{s}$ . The distributions of these quantities for a zero-mean process are

[TABLE]

where $\delta(.)$ is the Dirac delta function. If $n$ is even then

[TABLE]

To ensure symmetry we set

[TABLE]

Thus, given a time-varying spectrum $f(\nu,t)$ , for $\xi_{s-1,S}<t\leq\xi_{s,S}$ we generate $\mathbf{x}_{r}$ and $\mathbf{x}_{i}$ , form $\mathbf{x}=\mathbf{x}_{r}+i\mathbf{x}_{i}$ , and apply the inverse DFT to generate the time series corresponding to each locally stationary process. Concatenating these series gives a time domain realization from a non-stationary process.
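A minimal sketch of this generation step is given below. It is not the authors' implementation: it assumes that, at interior Fourier frequencies, the real and imaginary DFT components are independent $N(0,f(\nu_{k})/2)$ draws, that the components at frequency zero and at the Nyquist frequency are real with variance $f(\nu_{k})$ , and that the DFT carries a $1/\sqrt{n}$ scaling. The function name and the two example spectra are hypothetical.

```python
import numpy as np

def simulate_from_spectrum(f, n, rng):
    """Simulate a real series of even length n with spectral density f on [0, 0.5]."""
    if n % 2:
        raise ValueError("this sketch assumes an even segment length")
    nu = np.arange(n // 2 + 1) / n              # frequencies 0, 1/n, ..., 1/2
    fv = np.asarray(f(nu), dtype=float)
    x_r = rng.normal(scale=np.sqrt(fv / 2))     # real components (assumed law)
    x_i = rng.normal(scale=np.sqrt(fv / 2))     # imaginary components
    # The zero and Nyquist frequencies must be real for a real-valued series.
    x_r[0] = rng.normal(scale=np.sqrt(fv[0]))
    x_r[-1] = rng.normal(scale=np.sqrt(fv[-1]))
    x_i[0] = x_i[-1] = 0.0
    x = x_r + 1j * x_i
    # irfft imposes the conjugate symmetry of the remaining frequencies.
    return np.fft.irfft(x * np.sqrt(n), n=n)

rng = np.random.default_rng(2)
flat = lambda nu: np.full_like(nu, 2.0)                        # white noise, variance 2
ar1 = lambda nu: 1.0 / (1 - 1.6 * np.cos(2 * np.pi * nu) + 0.64)  # AR(1), phi = 0.8
# Two locally stationary segments concatenated into one non-stationary series.
y = np.concatenate([simulate_from_spectrum(flat, 1024, rng),
                    simulate_from_spectrum(ar1, 2048, rng)])
```

Using `np.fft.irfft` means only the non-negative frequencies need to be generated; the symmetric half of the DFT is filled in implicitly, which guarantees a real-valued output.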

2.4 Estimation

In this paper we take a Bayesian approach and estimate the unknown time-varying spectrum by its posterior mean

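$$E[f(\nu,t)\mid\mathbf{y}]=\sum_{K=1}^{S}\sum_{j=1}^{p^{(K,T)}}E\left[f(\nu,t)\mid\mathbf{y},K,\bm{\xi}_{K}\right]\times\Pr(\bm{\xi}_{K}\mid K,\mathbf{y})\Pr(K\mid\mathbf{y})$$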

where the sum is over all possible partitions and

[TABLE]

We use Reversible Jump MCMC (RJMCMC) to perform the required transdimensional integration, see (Rosen et al., 2012) for details.

3 Experiments

This section validates the use of more flexible, adaptive non-parametric models for estimating the spectrum of a financial time series and its volatility. The experimental setup is as follows: we evaluate the goodness of fit of different techniques on data with a known generative process, and on real-world data consisting of the daily returns and squared returns of the NASDAQ Index from 2002-2018 and the GBP:USD exchange rate from 2010-2018. Section 3.1 presents details on the data generation processes and evaluation of results for a known time-varying spectral density. Section 3.3 shows the results of fitting different models to the returns and squared returns of the NASDAQ and GBP:USD.

3.1 Simulated Data

To compare the performance of various models for financial returns and volatility, in terms of the ability of the model to recover the true data generating process, we simulate data using three models for time series. The first model is a stationary process, while the second and third models are non-stationary processes. The first model is a GARCH(1,1) process. The second model is a regime-switching GARCH(1,1) process (Haas et al., 2004; Ardia, 2016) and the third is the AdaptSpec model of Rosen et al. (2012). The data generating process of a GARCH(1,1) model (Figure 1(d)) is given by

[TABLE]

where $\eta_{t-1}=y_{t-1}-\mu$ . We set $\mu=0,\alpha_{0}=1,\alpha_{1}=0.1,\beta_{1}=0.1$ , so that the process is stationary with unconditional variance $\sigma^{2}_{uc}=\frac{\alpha_{0}}{(1-\alpha_{1}-\beta_{1})}$ . Figure 1(a) shows a sample spectrum and Figure 1(d) the associated realisation in time.
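With these parameter values the unconditional variance is $1/(1-0.1-0.1)=1.25$ . The sketch below simulates such a process, assuming the standard GARCH(1,1) recursion $\sigma_{t}^{2}=\alpha_{0}+\alpha_{1}\eta_{t-1}^{2}+\beta_{1}\sigma_{t-1}^{2}$ with Gaussian innovations; the innovation distribution, the function name and the initialisation at the unconditional variance are assumptions of this illustration.

```python
import numpy as np

def simulate_garch11(T, mu=0.0, alpha0=1.0, alpha1=0.1, beta1=0.1, rng=None):
    """Simulate y_t = mu + sigma_t * z_t with z_t ~ N(0, 1) (assumed)."""
    rng = np.random.default_rng() if rng is None else rng
    var_uc = alpha0 / (1.0 - alpha1 - beta1)   # unconditional variance
    y = np.empty(T)
    sigma2 = np.empty(T)
    sigma2[0] = var_uc                         # start at the stationary level
    y[0] = mu + np.sqrt(sigma2[0]) * rng.standard_normal()
    for t in range(1, T):
        eta = y[t - 1] - mu                    # eta_{t-1} = y_{t-1} - mu
        sigma2[t] = alpha0 + alpha1 * eta ** 2 + beta1 * sigma2[t - 1]
        y[t] = mu + np.sqrt(sigma2[t]) * rng.standard_normal()
    return y, sigma2

y, sigma2 = simulate_garch11(20_000, rng=np.random.default_rng(3))
print(y.var())   # should be close to the unconditional variance of 1.25
```

Since $\alpha_{1}+\beta_{1}=0.2$ is far from one, volatility shocks decay quickly in this configuration, so the simulated series is only mildly heteroskedastic.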

The second model we generate data from is a Regime-Switching GARCH model (Figure 1(e)) as in Ardia (2016) and Haas et al. (2004). Specifying a model which allows for regime-switching is one way of accounting for non-stationarity. For each point in time $t\in\{1,2,\ldots,T\}$ , a latent state variable $s_{t}$ determines the regime from which the observation is generated. Let $\Pr(s_{t}=j|\mathbf{y})$ be the probability that an observation at time $t$ was generated by regime $j$ , for $j=1,\ldots,N_{R}$ , where $N_{R}$ is the number of possible regimes. Our Regime-Switching model is the following (Bauwens et al., 2014; Haas et al., 2004)

[TABLE]

For our simulation we set $N_{R}=2$ . Define $K_{R}$ to be the number of segments generated by the $N_{R}$ regimes, so that $K_{R}\geq N_{R}$ . The locations of the regime switches are defined by the cutpoints $\mathbf{c}=(c_{1},\ldots,c_{K_{R}})$ . Let $\mathbf{r}=(r_{1},\ldots,r_{K_{R}})$ be an indicator vector denoting the regime which generates the data in segment $k$ , so that $r_{k}=j$ if segment $k$ was generated by regime $j$ . For our simulation we set $K_{R}=3$ , $\mathbf{c}=(1000,3000,5000)$ and $\mathbf{r}=(1,2,1)$ . The parameters of our regime-switching model are $\alpha_{0,1}=1$ , $\alpha_{1,1}=0.1$ , $\beta_{1,1}=0.1$ , $\alpha_{0,2}=1$ , $\alpha_{1,2}=0.3$ and $\beta_{1,2}=0.2$ .
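The simulation design above can be sketched as follows. This is a simplified illustration, not the specification of Haas et al. (2004) or the MSGARCH package: it treats the cutpoints as segment end points, uses a single variance recursion whose parameters switch with the regime, sets the mean to zero and assumes Gaussian innovations. The names `PARAMS` and `simulate_regime_garch` are hypothetical.

```python
import numpy as np

# regime -> (alpha0, alpha1, beta1), using the parameter values from the text
PARAMS = {1: (1.0, 0.1, 0.1), 2: (1.0, 0.3, 0.2)}

def simulate_regime_garch(cutpoints, regimes, params=PARAMS, rng=None):
    """Piecewise GARCH(1,1): regime r_k is active up to cutpoint c_k (assumed)."""
    rng = np.random.default_rng() if rng is None else rng
    T = cutpoints[-1]
    starts = [0] + list(cutpoints[:-1])
    s = np.empty(T, dtype=int)                 # regime label for every time point
    for a, b, r in zip(starts, cutpoints, regimes):
        s[a:b] = r
    y = np.empty(T)
    sigma2 = np.empty(T)
    a0, a1, b1 = params[s[0]]
    sigma2[0] = a0 / (1.0 - a1 - b1)           # start at the regime's stationary level
    y[0] = np.sqrt(sigma2[0]) * rng.standard_normal()
    for t in range(1, T):
        a0, a1, b1 = params[s[t]]              # parameters of the active regime
        sigma2[t] = a0 + a1 * y[t - 1] ** 2 + b1 * sigma2[t - 1]
        y[t] = np.sqrt(sigma2[t]) * rng.standard_normal()
    return y, sigma2, s

y, sigma2, s = simulate_regime_garch((1000, 3000, 5000), (1, 2, 1),
                                     rng=np.random.default_rng(4))
```

Under these assumptions regime 1 has unconditional variance $1/(1-0.2)=1.25$ and regime 2 has $1/(1-0.5)=2$ , so the middle segment of the simulated series is visibly more volatile than the outer two.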

The third model for generating data is now described. Using the model in Rosen et al. (2012) we obtained an estimate of the posterior mode of the number of locally stationary processes for the NASDAQ daily returns from 2002 to March 2018, denoted by $\hat{K}_{NAD}$ , and an estimate of the posterior mean of the spectra (Figure 1(c)) corresponding to those locally stationary processes. We generated 50 realizations (Figure 1(f)) of the real and imaginary components of the DFTs, $\mathbf{x}_{s,r}$ and $\mathbf{x}_{s,i}$ respectively, each of length $n_{s,\hat{K}_{NAD}}$ . The inverse DFT was then applied to obtain 50 time series $\mathbf{y}_{s}$ , each of length $n_{s,\hat{K}_{NAD}}$ , for $s=1,\ldots,\hat{K}_{NAD}$ , as described in Section 2.3. These $\hat{K}_{NAD}$ time series were concatenated, so that 50 realizations of a non-stationary process, of length $\sum_{s=1}^{\hat{K}_{NAD}}n_{s,\hat{K}_{NAD}}$ , were obtained.

In what follows we shall refer to these three data generating processes as GARCH, Regime and AdaptSpec.

3.2 Metrics to measure performance

To assess the relative performance of the GARCH, Regime, and AdaptSpec models we use Mean Squared Error (MSE) and Symmetric Kullback-Leibler (SKL) divergence. We define the quantities as follows

[TABLE]

where $f(\nu,t)$ is the true time-varying spectrum and $\hat{f}(\nu,t)$ is an estimate of this true spectrum. In what follows we use the subscripts $G$ , $R$ , and $AD$ to refer to the GARCH, Regime and AdaptSpec models respectively. Plots of the true log spectra ${f}_{G}(\nu,t)$ , ${f}_{R}(\nu,t)$ and ${f}_{AD}(\nu,t)$ used to generate the data, along with an example of a realization, appear in Figure 1.

Boxplots of the $\log(SKL)$ and $\log(MSE)$ for all three estimators and all three data generating models appear in Figure 3. We chose to plot the log of these validation metrics, rather than the metrics themselves, because the differences between the values of the $SKL$ and $MSE$ for the three estimators are very large.

As expected, when data are generated from a particular model, the estimates obtained from the method which assumes that particular model provide the best fit (except in certain circumstances with the Regime model, which will be discussed later). However, the plots also show that the estimates obtained from the AdaptSpec model when the data are generated from the GARCH or Regime models are always the next best. For example, in Figure 3, where the true model is a single GARCH model, which is the same as a Regime model where the number of regimes is equal to one, AdaptSpec outperforms the estimate obtained using the Regime model. In other words, the improvement gained by using a flexible model when flexibility is required exceeds the loss from using a flexible model when flexibility is not required.

The performance of the Regime model when data are generated from a GARCH model warrants further explanation. Our experience of using the model of Ardia (2016) shows that unless the true number of regimes is equal to the user-set number of regimes, results are highly variable. Part of the issue is an over-identification problem. If a single GARCH model is the truth but one estimates the spectrum using a regime-switching model where the number of regimes is greater than one, then there are infinitely many different combinations which could recover the truth. While this should not necessarily present a problem for the estimated fit or prediction (as opposed to parameter inference), it does. This appears to be due to the fact that the probability of being in a particular regime can change abruptly on a daily basis. These estimated probabilities, in turn, are very sensitive to the specification of the particular type of GARCH model assumed to generate data in the different regimes.

For example, we reproduced the "smoothed" probabilities obtained for the time series of the daily returns for the Swiss Market Index, which was analyzed by Ardia (2016). These probabilities are the blue line in Figure 6, and are estimated using a Regime model assuming two GJR-GARCH processes. However, if we assume that the underlying data generating process for the regimes is a GARCH(1,1) rather than a GARCH(1,1) with a GJR variance specification (Haas et al., 2004), then we obtain the estimated smoothed probabilities given by the red line in Figure 6. The difference is striking.

These results also explain why AdaptSpec performs well across a range of data generating processes: AdaptSpec is a non-parametric model, so by estimating the dependency in the frequency domain we avoid making any assumptions about the data generating process in the time domain.

3.3 Real Examples: NASDAQ, GBP:USD

It is well known that the returns of many financial assets are non-normally distributed and exhibit volatility clustering. Whether this volatility clustering is evidence of long-range dependence in a stationary process, or is attributable to non-stationarity, is less clear. In this section we attempt to answer this question by estimating the potentially time-varying spectrum of a financial time series' actual and squared returns. The time-varying spectrum of the actual return series is a non-parametric estimate of the evolution of the second moment of the return series' distribution, while the time-varying spectrum of the squared return series is a non-parametric estimate of the evolution of the fourth moment. We choose the NASDAQ daily returns and the GBP:USD exchange rate daily returns from 2002-2018 and 2010-2018 respectively to demonstrate the technique.

Figures 5(a) and 5(b) show the actual return series for the NASDAQ index and its estimated time-varying spectrum, while panels 5(c) and 5(d) show the squared returns for the NASDAQ index and the corresponding estimated time-varying spectrum. Figure 4 is an analogous plot for the GBP:USD exchange rate.

Figure 5 provides several insights into the stationarity and dependency of the NASDAQ returns. First, the series is definitely non-stationary. The posterior mode of the number of locally stationary segments for the return series is 12. Second, it would appear that the market for the NASDAQ index is weak-form inefficient at several points in time. A weak-form efficient market is characterised by having zero autocorrelation in the first moment of the return distribution, and hence a flat spectrum. Panel 5(b) shows several periods of time where the assumption of weak-form efficiency is violated; of particular note is the spectrum during the Global Financial Crisis (GFC) in 2008-2009, which shows a clear peak. Panel 5(d) shows that the volatility clustering is not removed even after accounting for non-stationarity. If non-stationarity accounted for volatility clustering then we would expect the locally stationary spectra of the squared returns to be flat. However, panel 5(d) shows that there is still strong positive correlation of the squared returns, as evidenced by the peak in power at low frequency for most of the time periods.

Figure 4 paints a similar picture for the GBP:USD exchange rate; the time series is clearly non-stationary, showing an overall increase in variability at the time of the Brexit vote with an accompanying dependency in the first moment of the series at that time, indicating violations of weak-form efficiency. However, the time-varying spectral density of the GBP:USD squared returns, as seen in 4(d), distinguishes the behaviour of the GBP:USD's volatility from that of the NASDAQ Index. In particular, it indicates that non-stationarity drives the volatility clustering behaviour of the returns. This is clear because, unlike the NASDAQ squared returns spectrum 5(d), the GBP:USD squared returns spectrum 4(d) is predominantly flat within any candidate segment, suggesting that the larger non-stationary process is in fact piecewise stationary.

4 Conclusions

Our experiments indicate that, given a non-stationary data generating process, nonparametric models outperform parametric models that assume a constant structure over time. Our simulations demonstrate that there is less estimation error in applying a flexible method such as AdaptSpec to a parametric data generating process than in applying a parametric model to a non-stationary data generating process. For validation, we generate "ground truth" data in the spectral domain, and compare the resulting estimates from time domain models with those from spectral analysis techniques. The time series we generate by converting our ground truth spectrum into the time domain strongly resembles many financial time series (such as the NASDAQ), and illustrates the need for flexible nonparametric models to capture the complex, non-stationary structure of the underlying time series.

Bibliography (29)

1. Andersen, T., Bollerslev, T., Diebold, F., and Labys, P. Modeling and forecasting realized volatility. Econometrica, 71(2):579–625, 2003.
2. Ardia, D. Markov-Switching GARCH Models in R: The MSGARCH Package. Journal of Statistical Software, 2016.
3. Bauwens, L., Backer, B., and Dufays, A. A Bayesian Method of Change-Point Estimation with Recurrent Regimes: Application to GARCH Models. Journal of Empirical Finance, 29:207–229, 2014.
4. Bollerslev, T. Generalized Autoregressive Conditional Heteroskedasticity. Journal of Econometrics, 31:307–327, 1986.
5. Bollerslev, T. A Conditionally Heteroskedastic Time Series Model for Speculative Prices and Rates of Return. The Review of Economics and Statistics, 69(3), 1987.
6. Brillinger, D. Time Series: Data Analysis and Theory. Holt, Rinehart, and Winston, 1975.
7. Cai, J. A Markov model of switching-regime ARCH. Journal of Business & Economic Statistics, 12:309–316, 1994.
8. Chaudhuri, A. and Lo, A. Spectral analysis of stock-return volatility, correlation and beta. In IEEE Signal Processing and Signal Processing Education Workshop, 2015.