Multivariate Fractional Components Analysis

Tobias Hartl; Roland Weigand

arXiv:1812.09149·econ.EM·January 30, 2019


TL;DR

This paper introduces a new framework for analyzing multivariate fractional time series with diverse cointegration properties, applicable to high-dimensional data, and demonstrates its effectiveness in modeling realized covariance matrices.

Contribution

It develops a novel fractional components analysis method that handles nonstationary processes with varying fractional orders and cointegration strengths in high dimensions.

Findings

1. Orthogonal short- and long-memory components fit realized covariance data well.

2. The proposed method shows competitive out-of-sample performance.

3. Applicable to high-dimensional time series with diverse fractional properties.

Abstract

We propose a setup for fractionally cointegrated time series which is formulated in terms of latent integrated and short-memory components. It accommodates nonstationary processes with different fractional orders and cointegration of different strengths and is applicable in high-dimensional settings. In an application to realized covariance matrices, we find that orthogonal short- and long-memory components provide a reasonable fit and competitive out-of-sample performance compared to several competing methods.


Tables

Table 1: Estimation results for different specifications of the models estimated in section 5.3. We show the combinations of $s_{j}$, $j=0,\ldots,q$, with the best values of the BIC for $q=1$ (above), $q=2$ (middle) and $q=3$ (below).

$s_{1}$	$s_{2}$	$s_{3}$	$s_{0}$	log-lik	$d^{(1)}$	$d^{(2)}$	$d^{(3)}$	BIC
12			3	-19572.2	0.368			-17.116
10			4	-19553.8	0.390			-17.106
11			4	-19590.6	0.390			-17.101
12			4	-19620.7	0.383			-17.094
10			5	-19601.8	0.411			-17.087
11			5	-19636.4	0.406			-17.080
2	9		2	-19573.8	0.631	0.338		-17.157
2	10		1	-19538.0	0.596	0.319		-17.156
1	10		2	-19530.9	0.653	0.370		-17.146
2	10		2	-19605.0	0.551	0.303		-17.143
3	9		1	-19546.7	0.619	0.315		-17.139
2	8		3	-19575.9	0.634	0.353		-17.134
2	2	7	2	-19600.7	0.639	0.422	0.304	-17.129
2	3	7	2	-19666.6	0.565	0.417	0.252	-17.122
2	3	6	2	-19607.5	0.634	0.407	0.288	-17.121
2	4	6	2	-19676.5	0.634	0.412	0.234	-17.121
3	3	5	2	-19617.4	0.629	0.398	0.272	-17.119
3	2	6	2	-19601.0	0.618	0.395	0.292	-17.115

Table 2: P-values of diagnostic tests for the residuals from the DOFC model (11) estimated in section 5.3. We conducted Ljung-Box tests for residual correlation (LB), ARCH-LM tests for conditional heteroskedasticity (CH), each with different lags, and Jarque-Bera tests (JB) for deviations from normality.

	LB5	LB10	LB22	CH5	CH10	CH22	JB
$e_{1, t}$	0.944	0.848	0.101	0.083	0.359	0.373	0.000
$e_{2, t}$	0.022	0.008	0.032	0.000	0.001	0.001	0.000
$e_{3, t}$	0.191	0.110	0.253	0.008	0.018	0.006	0.000
$e_{4, t}$	0.474	0.459	0.109	0.043	0.038	0.152	0.000
$e_{5, t}$	0.000	0.004	0.002	0.000	0.004	0.000	0.000
$e_{6, t}$	0.035	0.197	0.063	0.000	0.000	0.000	0.000
$e_{7, t}$	0.091	0.054	0.178	0.741	0.142	0.382	0.002
$e_{8, t}$	0.071	0.075	0.103	0.587	0.569	0.509	0.000
$e_{9, t}$	0.208	0.365	0.295	0.109	0.280	0.212	0.219
$e_{10, t}$	0.108	0.117	0.459	0.861	0.915	0.717	0.001
$e_{11, t}$	0.326	0.090	0.092	0.207	0.436	0.877	0.000
$e_{12, t}$	0.468	0.477	0.442	0.538	0.033	0.037	0.175
$e_{13, t}$	0.080	0.158	0.800	0.571	0.318	0.060	0.000
$e_{14, t}$	0.235	0.162	0.026	0.080	0.167	0.079	0.000
$e_{15, t}$	0.242	0.328	0.072	0.001	0.011	0.011	0.102
$e_{16, t}$	0.354	0.541	0.589	0.272	0.180	0.367	0.000
$e_{17, t}$	0.000	0.000	0.000	0.057	0.039	0.003	0.369
$e_{18, t}$	0.158	0.376	0.480	0.245	0.326	0.349	0.000
$e_{19, t}$	0.557	0.514	0.849	0.003	0.011	0.019	0.001
$e_{20, t}$	0.685	0.882	0.942	0.412	0.216	0.790	0.000
$e_{21, t}$	0.122	0.014	0.055	0.600	0.446	0.256	0.000

Table 3: Estimated parameters along with bootstrap mean and standard errors from bootstrap (SE.boot), sandwich (SE.sand) and information matrix (SE.info) as described in Hartl and Weigand (2018) for the DOFC model (11) estimated in section 5.3.

	Estimate	Mean	SE.boot	SE.sand	SE.info
$d_{1}$	0.6308	0.6361	0.0190	0.0217	0.0178
$d_{2}$	0.3382	0.3334	0.0094	0.0116	0.0086
$ϕ_{1}$	0.2468	0.2360	0.0345	0.0417	0.0348
$ϕ_{2}$	0.0768	0.0636	0.0370	0.0419	0.0402
$h_{1}$	0.2028	0.1844	0.1122	0.1106	0.0759
$h_{2}$	0.3858	0.3727	0.0522	0.0551	0.0321
$h_{3}$	0.3289	0.3309	0.0930	0.0957	0.0714
$h_{4}$	0.1758	0.1638	0.1222	0.1371	0.0861
$h_{5}$	0.7649	0.7618	0.0558	0.0676	0.0482
$h_{6}$	0.2459	0.2413	0.0772	0.0810	0.0588
$h_{7}$	0.0615	0.0611	0.0037	0.0079	0.0027
$h_{8}$	0.0746	0.0739	0.0032	0.0063	0.0026
$h_{9}$	0.0799	0.0793	0.0033	0.0060	0.0027
$h_{10}$	0.0778	0.0771	0.0034	0.0060	0.0028
$h_{11}$	0.0725	0.0718	0.0036	0.0072	0.0030
$h_{12}$	0.0563	0.0557	0.0036	0.0062	0.0025
$h_{13}$	0.0545	0.0543	0.0032	0.0063	0.0026
$h_{14}$	0.0509	0.0505	0.0031	0.0060	0.0025
$h_{15}$	0.0570	0.0564	0.0056	0.0077	0.0045
$h_{16}$	0.0739	0.0733	0.0033	0.0059	0.0029
$h_{17}$	0.0889	0.0880	0.0036	0.0053	0.0032
$h_{18}$	0.0441	0.0438	0.0040	0.0082	0.0030
$h_{19}$	0.0919	0.0910	0.0038	0.0059	0.0035
$h_{20}$	0.0621	0.0615	0.0037	0.0064	0.0031
$h_{21}$	0.0601	0.0595	0.0035	0.0060	0.0032

Table 4: Bootstrap $t$-ratios for fractional components loadings ($\varLambda^{(1)}$ and $\varLambda^{(2)}$) and nonfractional loadings ($\varGamma$) from the DOFC model (11) estimated in section 5.3.

	$Λ^{(1)}$		$Λ^{(2)}$									$Γ$
$y_{1, t}$	17.4		11.9									3.6
$y_{2, t}$	22.7	1.6	1.8	-11.9								0.4	-0.7
$y_{3, t}$	13.1	-1.6	3.4	-6.1	-15.6							1.1	1.3
$y_{4, t}$	11.9	-2.9	2.1	-3.3	-5.2	16.2						0.6	-1.6
$y_{5, t}$	12.5	-11.1	4.4	-10.7	-1.1	1.1	-3.4					1.6	-1.7
$y_{6, t}$	18.9	1.2	2.0	-6.0	-2.3	3.1	-3.1	6.8				1.0	1.7
$y_{7, t}$	4.1	6.0	5.6	-8.0	-1.1	2.0	-4.7	-1.5	4.9			-0.5	-2.7
$y_{8, t}$	3.5	3.2	5.4	-10.2	-1.2	2.4	-1.6	0.5	4.4	7.4		0.9	0.2
$y_{9, t}$	4.7	4.2	2.8	-7.4	-1.0	2.2	-5.8	-2.2	-0.1	3.4	-3.0	3.9	-0.5
$y_{10, t}$	2.7	4.9	2.5	-9.0	-2.9	2.0	-3.3	-0.9	2.5	3.6	4.2	6.1	-4.0
$y_{11, t}$	4.0	5.1	3.5	-9.9	0.6	2.7	-2.8	-1.7	7.1	-1.8	-2.5	3.2	1.7
$y_{12, t}$	2.9	3.8	9.7	-9.8	-1.0	2.5	-1.3	0.2	0.5	3.5	3.5	-2.5	3.0
$y_{13, t}$	3.9	4.3	5.5	-7.0	-0.9	2.2	-5.7	-2.7	-3.1	0.6	-1.0	0.2	2.7
$y_{14, t}$	2.0	4.4	5.1	-9.8	-2.6	2.6	-2.3	-0.8	-1.8	-0.8	6.1	2.6	-3.4
$y_{15, t}$	3.4	6.0	6.2	-12.5	0.5	2.5	-1.6	0.6	1.4	-7.2	-0.4	-0.3	3.5
$y_{16, t}$	4.4	3.1	5.6	-10.6	-1.1	2.3	-1.7	-0.4	-2.9	5.4	-2.2	1.2	2.2
$y_{17, t}$	3.2	4.4	5.7	-12.5	-1.5	2.8	0.5	1.7	-0.8	3.9	4.2	3.8	-0.9
$y_{18, t}$	2.5	3.1	6.6	-12.6	0.0	2.8	0.4	0.3	2.3	2.2	1.2	1.5	10.2
$y_{19, t}$	3.7	5.2	3.6	-8.9	-1.9	1.7	-2.6	-0.9	-5.1	2.1	-0.7	6.2	-1.9
$y_{20, t}$	3.7	4.0	3.4	-8.1	0.1	2.6	-4.7	-3.2	-2.4	0.2	-2.2	3.9	6.4
$y_{21, t}$	1.6	4.0	3.1	-9.9	-1.4	3.0	-1.2	-0.5	-0.3	-2.1	4.5	8.0	1.4

Table 5: Out-of-sample risks for $h=1$ as described in section 5.4. In different rows, we consider the fractional components (FC) and several benchmark models, namely a diagonal vector ARMA(2,1) and a diagonal vector ARFIMA(1,$d$,1) model, the conditional autoregressive Wishart (CAW) model of Golosnoy et al. (2012), a dynamic correlation specification (CAW-DCC) of Bauwens et al. (2012), and additive and multiplicative components Wishart models as proposed by Jin and Maheu (2013). Asterisks denote the best performing model (∗∗∗), models in the 80% model confidence set (∗∗) and additional models in the 90% model confidence set (∗). As loss functions, we consider the Frobenius norm (LF), the Stein norm (LS), the predictive densities (LD), the minimum-variance portfolio variance (LMV) and the L3-Loss (L3).

$h = 1$	LF		LS		L3		LMV		LD
FC	84.28	^∗∗∗	0.9660	^∗∗∗	1807	^∗∗∗	0.7905	^∗∗∗	8.1319	^∗∗∗
ARMA	85.09	^∗∗	0.9950		1823	^∗∗	0.7916		8.1533
ARFIMA	86.22	^∗∗	0.9955		1829	^∗∗	0.7911	^∗∗	8.1586
ARFIMA.chol	87.82	^∗∗	1.0830		1860	^∗∗	0.7920	^∗	8.1723	^∗∗
CAW.diag	85.97	^∗∗	1.0254		1843	^∗∗	0.7930		8.1869
CAW.dcc	86.32	^∗∗	1.0037		1866	^∗∗	0.7928		8.3021
CAW.acomp	85.77	^∗∗	1.0268		1814	^∗∗	0.7932		8.2482
CAW.mcomp	90.72	^∗∗	1.0301		1904	^∗∗	0.7929		8.2478

Table 6: Out-of-sample risks for $h=5$ as described in section 5.4. For details on the abbreviations see table 5.

$h = 5$	LF		LS		L3		LMV		LD
FC	135.28	^∗∗∗	1.3766	^∗∗∗	2463	^∗∗∗	0.8011	^∗∗∗	8.2490	^∗∗∗
ARMA	134.43	^∗∗	1.4046		2498	^∗∗	0.8025		8.2688
ARFIMA	135.12	^∗∗	1.3974		2492	^∗∗	0.8015	^∗∗	8.2664
ARFIMA.chol	140.43	^∗∗	1.5348		2557	^∗∗	0.8021	^∗	8.3113	^∗∗
CAW.diag	137.34	^∗∗	1.4356		2612	^∗∗	0.8038		8.3184
CAW.dcc	137.77	^∗∗	1.4094		2627	^∗∗	0.8039		8.4150
CAW.acomp	139.22	^∗∗	1.4443		2558	^∗∗	0.8028		8.3311
CAW.mcomp	142.26	^∗∗	1.4489		2590	^∗∗	0.8028		8.3399

Table 7: Out-of-sample risks for $h=10$ as described in section 5.4. For details on the abbreviations see table 5.

$h = 10$	LF		LS		L3		LMV		LD
FC	170.07	^∗∗	1.7033	^∗∗	2837	^∗∗	0.8102	^∗∗	8.3118	^∗∗∗
ARMA	172.03	^∗∗	1.6985	^∗∗	2890	^∗∗	0.8088	^∗	8.3519
ARFIMA	168.55	^∗∗∗	1.6716	^∗∗∗	2837	^∗∗∗	0.8076	^∗∗∗	8.3372
ARFIMA.chol	173.43	^∗∗	1.8455		2900	^∗∗	0.8103	^∗∗	8.3893
CAW.diag	176.50	^∗∗	1.7399	^∗∗	2980	^∗∗	0.8110	^∗∗	8.4105
CAW.dcc	178.11	^∗∗	1.7120	^∗∗	2986	^∗∗	0.8101	^∗∗	8.4973
CAW.acomp	177.37	^∗∗	1.7366	^∗∗	2947	^∗∗	0.8093	^∗∗	8.4146
CAW.mcomp	181.04	^∗∗	1.7265	^∗∗	3009	^∗∗	0.8096	^∗∗	8.4248

Table 8: Out-of-sample risks for $h=20$ as described in section 5.4. For details on the abbreviations see table 5.

$h = 20$	LF		LS		L3		LMV		LD
FC	199.42	^∗∗∗	2.0461	^∗∗	3144	^∗∗∗	0.8225	^∗∗	8.3778	^∗∗∗
ARMA	208.07	^∗∗	2.0980	^∗∗	3231		0.8224	^∗∗	8.4314
ARFIMA	200.33	^∗∗	2.0305	^∗∗∗	3162	^∗∗	0.8209	^∗∗∗	8.4049
ARFIMA.chol	203.22	^∗∗	2.1910	^∗∗	3201	^∗∗	0.8219	^∗∗	8.4738
CAW.diag	214.60	^∗	2.1580	^∗∗	3326	^∗	0.8241	^∗∗	8.5034
CAW.dcc	217.52		2.1698	^∗∗	3331		0.8231	^∗∗	8.6158
CAW.acomp	211.33		2.1165	^∗∗	3289	^∗	0.8214	^∗∗	8.5028
CAW.mcomp	209.78	^∗∗	2.0858	^∗∗	3282	^∗∗	0.8225	^∗∗	8.5158

Equations

$$y_{t}=\varLambda x_{t}+u_{t},\qquad t=1,\ldots,n.$$

$$\Delta^{d_{j}}x_{jt}=\xi_{jt},\qquad j=1,\ldots,s.$$

$$\Delta^{d}=(1-L)^{d}=\sum_{j=0}^{\infty}\pi_{j}(d)L^{j},\qquad\pi_{0}(d)=1,\qquad\pi_{j}(d)=\frac{j-1-d}{j}\,\pi_{j-1}(d),\quad j\geq 1,$$

$$x_{jt}=\Delta_{+}^{-d_{j}}\xi_{jt}=\sum_{i=0}^{t-1}\pi_{i}(-d_{j})\xi_{j,t-i},\qquad j=1,\ldots,s.$$

$$\varPhi(L)u_{t}=\varTheta(L)e_{t},\qquad t=1,\ldots,n,$$

$$\xi_{t}\sim N\!I\!D(0,\varSigma_{\xi}),\qquad e_{t}\sim N\!I\!D(0,\varSigma_{e})\qquad\text{and}\qquad E(\xi_{t}e_{t}^{\prime})=\varSigma_{\xi e},$$

$$y_{t}=\varLambda^{(1)}x_{t}^{(1)}+\ldots+\varLambda^{(q)}x_{t}^{(q)}+u_{t}.$$

$$\varLambda_{\bot}^{(1)^{\prime}}y_{t}=\varLambda_{\bot}^{(1)^{\prime}}\varLambda^{(2)}x_{t}^{(2)}+\ldots+\varLambda_{\bot}^{(1)^{\prime}}\varLambda^{(q)}x_{t}^{(q)}+\varLambda_{\bot}^{(1)^{\prime}}u_{t}$$

$$\Delta^{d^{(1)}}y_{t}=\alpha\beta^{\prime}L_{d^{(1)}-d^{(2)}}\Delta^{d^{(2)}}y_{t}+\kappa_{t},$$

$$\kappa_{t}:=M(\varLambda^{(1)}\xi_{t}^{(1)}+\Delta^{d^{(1)}}u_{t})-\alpha\beta^{\prime}(\varLambda^{(2)}\xi_{t}^{(2)}+\Delta^{d^{(2)}}u_{t})$$

$$(1-\phi L)\Delta^{d}y_{t}=(1-\phi L)\xi_{t}+\Delta^{d}e_{t}.$$

$$\Delta^{d_{1}}y_{t}^{(1)}=\varLambda^{(1,1)}\xi_{t}^{(1)}+\varLambda^{(1,2)}\Delta^{d_{1}-d_{2}}\xi_{t}^{(2)}+\ldots+\varLambda^{(1,q)}\Delta^{d_{1}-d_{q}}\xi_{t}^{(q)}+\Delta^{d_{1}}u_{t}\ (:=\omega_{t}^{(1)}),$$

$$\Delta^{d^{(j)}}y_{t}^{(j)}=-B^{(j,1)}\Delta^{d^{(j)}}y_{t}^{(1)}-\ldots-B^{(j,j-1)}\Delta^{d^{(j)}}y_{t}^{(j-1)}+\omega_{t}^{(j)},$$

$$By_{t}=(\Delta_{+}^{-d_{1}}\omega_{t}^{(1)^{\prime}},\ldots,\Delta_{+}^{-d_{q}}\omega_{t}^{(q)^{\prime}})^{\prime},$$

$$y_{t}=\varLambda^{(1)}x_{t}^{(1)}+\ldots+\varLambda^{(q)}x_{t}^{(q)}+\varGamma z_{t}+\varepsilon_{t},$$

$$(1-\phi_{j1}L-\ldots-\phi_{jk}L^{k})z_{jt}=\zeta_{jt},\qquad j=1,\ldots,s_{0},$$

$$\xi_{t}\sim N\!I\!D(0,I),\qquad\zeta_{t}\sim N\!I\!D(0,I)\qquad\text{and}\qquad\varepsilon_{t}\sim N\!I\!D(0,H),$$

$$y_{t}=(\log(X_{11,t}),\ldots,\log(X_{66,t}),Z_{21,t},Z_{31,t},\ldots,Z_{65,t})^{\prime},$$

$$Z_{ij,t}=0.5\,[\log(1+R_{ij,t})-\log(1-R_{ij,t})],\qquad R_{ij,t}=\frac{X_{ij,t}}{\sqrt{X_{ii,t}X_{jj,t}}}.$$

$$y_{t}=\varLambda^{(1)}x_{t}^{(1)}+\varLambda^{(2)}x_{t}^{(2)}+u_{t},$$

$$\varLambda_{\bot}^{(1)^{\prime}}\Delta^{d^{(2)}}y_{t}=\varLambda_{\bot}^{(1)^{\prime}}\varLambda^{(2)}\xi_{t}^{(2)}+\varLambda_{\bot}^{(1)^{\prime}}\Delta^{d^{(2)}}u_{t}\quad\text{and}\quad\varLambda_{\bot}^{(2)^{\prime}}\Delta^{d^{(1)}}y_{t}=\varLambda_{\bot}^{(2)^{\prime}}\varLambda^{(1)}\xi_{t}^{(1)}+\varLambda_{\bot}^{(2)^{\prime}}\Delta^{d^{(1)}}u_{t}.$$

$$N:=\varLambda^{(2)}(\varLambda_{\bot}^{(1)^{\prime}}\varLambda^{(2)})^{-1}\varLambda_{\bot}^{(1)^{\prime}}\quad\text{and}\quad M:=\varLambda^{(1)}(\varLambda_{\bot}^{(2)^{\prime}}\varLambda^{(1)})^{-1}\varLambda_{\bot}^{(2)^{\prime}},$$

$$\Delta^{d^{(1)}}y_{t}=M(\varLambda^{(1)}\xi_{t}^{(1)}+\Delta^{d^{(1)}}u_{t})+\Delta^{d^{(1)}-d^{(2)}}\Delta^{d^{(2)}}Ny_{t}.$$

$$\Delta^{d^{(j)}}y_{t}^{(j)}=\varLambda^{(j,1)}\Delta^{d^{(j)}}x_{t}^{(1)}+\ldots+\varLambda^{(j,q)}\Delta^{d^{(j)}}x_{t}^{(q)}+\Delta^{d^{(j)}}u_{t}^{(j)}.$$

$$\Delta^{d^{(j)}}y_{t}^{(j)}=\varLambda^{(j,1:(j-1))}\Delta^{d^{(j)}}x_{t}^{(1:(j-1))}+\tilde{\omega}_{t}^{j},$$

$$\Delta^{d^{(j)}}y_{t}^{(1:(j-1))}=\varLambda^{(1:(j-1),1:(j-1))}\Delta^{d^{(j)}}x_{t}^{(1:(j-1))}+\check{\omega}_{t}^{j},$$

$$\Delta^{d^{(j)}}x_{t}^{(1:(j-1))}=(\varLambda^{(1:(j-1),1:(j-1))})^{-1}\Delta^{d^{(j)}}y_{t}^{(1:(j-1))}-(\varLambda^{(1:(j-1),1:(j-1))})^{-1}\check{\omega}_{t}^{j}.$$

$$\omega_{t}^{(j)}=\tilde{\omega}_{t}^{j}-\varLambda^{(j,1:(j-1))}(\varLambda^{(1:(j-1),1:(j-1))})^{-1}\check{\omega}_{t}^{j},$$

$$\omega_{t}^{(j)}=-\varLambda^{(j,1:(j-1))}(\varLambda^{(1:(j-1),1:(j-1))})^{-1}\varLambda^{(1:(j-1),j:q)}\Delta^{d^{(j)}}x_{t}^{(j:q)}-\varLambda^{(j,1:(j-1))}(\varLambda^{(1:(j-1),1:(j-1))})^{-1}\Delta^{d^{(j)}}u_{t}^{(1:(j-1))}+\varLambda^{(j,j:q)}\Delta^{d^{(j)}}x_{t}^{(j:q)}+\Delta^{d^{(j)}}u_{t}^{(j)}.$$


Full text

Multivariate Fractional Components Analysis

Tobias Hartl

University of Regensburg, 93053 Regensburg, Germany

Institute for Employment Research (IAB), 90478 Nuremberg, Germany

Roland Weigand (corresponding author, e-mail: [email protected])

AOK Bayern, 93055 Regensburg, Germany

(January 2019)

Abstract.

We propose a setup for fractionally cointegrated time series which is formulated in terms of latent integrated and short-memory components. It accommodates nonstationary processes with different fractional orders and cointegration of different strengths and is applicable in high-dimensional settings. In an application to realized covariance matrices, we find that orthogonal short- and long-memory components provide a reasonable fit and competitive out-of-sample performance compared to several competing methods.

Keywords.

Long memory, fractional cointegration, state space, unobserved components, factor model, realized covariance matrix.

JEL-Classification.

C32, C51, C53, C58.

1 Introduction

Multivariate fractional integration and cointegration models have proven valuable in a wide range of empirical applications from macroeconomics and finance. They generalize the standard concept of cointegration by allowing for non-integer orders of integration both for the observations and for equilibrium errors; see Gil-Alana and Hualde (2008) for a literature review. In the field of macroeconomics, such models have turned out to be relevant in analyses of purchasing power parity beginning with Cheung and Lai (1993), of the relation between unemployment and input prices (Caporale and Gil-Alana, 2002) and of broader models for economic fluctuations (Morana, 2006). The empirical finance literature has considered fractional cointegration, e.g., for analysing international bond returns (Dueker and Startz, 1998), for modeling co-movements of stock return volatilities (Beltratti and Morana, 2006), for assessing the link between realized and implied volatility (Nielsen, 2007) and for quantifying risk in strategic asset allocation problems (Schotman et al., 2008). From a methodological point of view, semiparametric techniques for inference on the cointegration rank, the cointegration space and memory parameters have been very popular among empirical researchers, although the development of optimal parametric inferential methods for models with triangular or fractional vector error correction representations has recently made considerable progress (see, e.g., Robinson and Hualde, 2003; Avarucci and Velasco, 2009; Lasak, 2010; Johansen and Nielsen, 2012).

Despite their flexibility and their computationally simple treatment, semiparametric models are limited in scope since they aim to describe low-frequency properties only and are hence not appropriate for impulse response analysis and forecasting. While semiparametric techniques have been developed to cope with multivariate processes of different integration orders and multiple fractional cointegration relations of different strengths (Chen and Hurvich, 2006; Hualde and Robinson, 2010; Hualde, 2009), there seems to be a lack of parametric models of such generality. Furthermore, the usual error correction and triangular models with their abundant parametrization are not deemed appropriate for time series of dimension, say, larger than five.

In this paper, we propose new models for multivariate fractionally integrated and cointegrated time series which are formulated in terms of latent purely fractional and additive short-memory components. With a “type II” definition of fractional integration (Robinson, 2005), this approach allows for a flexible modeling of possibly nonstationary time series of different fractional integration orders. It permits cointegration relations of different strengths as well as polynomial cointegration (multicointegration in the terminology of Granger and Lee, 1989), i.e., cointegration between the levels of some time series and their (fractional) differences, and guarantees a clear representation of the long-run characteristics. Consequently, our model is among the most general setups regarding its integration and cointegration properties, compared to popular existing models for cointegrated processes. The unobserved components formulation benefits the modeling of relatively high-dimensional time series. For this situation we propose a parsimonious parametrization based on dimension reduction and dynamic orthogonal components in the spirit of Pan and Yao (2008) and Matteson and Tsay (2011).

In contrast to our parametric approach, latent fractional components have mostly been studied in semiparametric frameworks. Ray and Tsay (2000) use semiparametric memory estimators and canonical correlations to infer the existence of common fractional components, Morana (2004) proposes a frequency domain principal components estimator, Morana (2007) estimates components of a single fractional integration order by univariate permanent-transitory (or persistent-transitory) decompositions followed by a principal component analysis of the permanent (or persistent) components and Luciani and Veredas (2015) estimate their fractional factor model by fitting long-memory models to the principal components of a large panel of time series. In a setup closest to ours, Chen and Hurvich (2006) suggest a semiparametric frequency domain methodology to identify and estimate cointegration subspaces which annihilate fractional components of different memory.

Recent parametric frameworks competing with ours have either been much more restrictive, or have a different focus, e.g., on numerical simulation-based estimation methods, or on panel data analysis. Using a Bayesian approach, Hsu et al. (1998) discuss a bivariate process sharing one stationary long-memory component, while Mesters et al. (2016) consider simulated maximum likelihood estimation of models with one or more latent stationary ARFIMA components. On the other hand, Ergemen and Velasco (2017) as well as Ergemen (2017) focus on the elimination of common fractional components and are motivated as alternatives to prior unit root testing. Contrary to our approach, they eliminate the common factor structure, which they treat as a nuisance.

As the second main contribution of this paper, our model is applied to forecasting daily realized covariance matrices. In this setup, the strengths of our approach become apparent. Realized covariance modelling typically deals with high-dimensional processes with strong persistence and a pronounced co-movement in the low-frequency dynamics. In time series of log variances and z-transformed correlations for six US stocks, we find that common orthogonal short- and long-memory components with two different fractional integration orders provide a reasonable fit. Since the dimension of the dataset is reduced to a smaller number of latent processes, our model becomes a factor model. A pseudo out-of-sample study shows that the fractional components model provides superior forecasting accuracy compared to several competitor methods. In addition to the favorable forecast properties, our methods can be applied to study the cointegration properties of stock market volatilities. These are of particular importance for longer-term portfolio hedging and the analysis of systematic risk.

The paper is organized as follows. Section 2 introduces the general setup and clarifies its integration and cointegration properties. In section 3, its relation to existing models for multivariate integrated time series is discussed. In section 4, a specific model appropriate for relatively high-dimensional processes is considered. The empirical application to realized covariance matrices and a pseudo out-of-sample assessment are contained in section 5 before section 6 concludes.

2 The general setup

We consider a linear model for a $p$-dimensional observed time series $y_{t}$, which we label a fractional components (FC) setup,

$$y_{t}=\varLambda x_{t}+u_{t},\qquad t=1,\ldots,n. \tag{1}$$

The model is formulated in terms of the latent processes $x_{t}$ and $u_{t}$, where $\varLambda$ will always be assumed to have full column rank and the components of the $s$-dimensional $x_{t}$ are fractionally integrated noise according to

$$\Delta^{d_{j}}x_{jt}=\xi_{jt},\qquad j=1,\ldots,s.$$

In principle, $s>p$ is possible, but we only consider cases where $s\leq p$ here. For a generic scalar $d$, the fractional difference operator is defined by

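$$\Delta^{d}=(1-L)^{d}=\sum_{j=0}^{\infty}\pi_{j}(d)L^{j},\qquad\pi_{0}(d)=1,\qquad\pi_{j}(d)=\frac{j-1-d}{j}\,\pi_{j-1}(d),\quad j\geq 1,$$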

where $L$ denotes the lag or backshift operator, $Lx_{t}=x_{t-1}$. We adopt a nonstationary type II solution of these processes (Robinson, 2005) and hence treat $d_{j}\geq 0.5$ alongside the asymptotically stationary case $d_{j}<0.5$ in a continuous setup, by setting starting values to zero, $x_{jt}=0$ for $t\leq 0$. Nonzero initial values have been considered for observed fractional processes by Johansen and Nielsen (2012), but are not straightforwardly handled for our unobserved processes. The solution is based on the truncated operator $\Delta_{+}^{-d_{j}}$ (Johansen, 2008) and given by

$$x_{jt}=\Delta_{+}^{-d_{j}}\xi_{jt}=\sum_{i=0}^{t-1}\pi_{i}(-d_{j})\xi_{j,t-i},\qquad j=1,\ldots,s.$$

Without loss of generality let the components be arranged such that $d_{1}\geq\ldots\geq d_{s}$.
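The recursion for $\pi_{j}(d)$ and the truncated type II solution can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation; the function names `frac_diff_coeffs` and `type2_fractional_noise` as well as the chosen values of $d$, the sample size and the seed are assumptions made for the example.

```python
import numpy as np

def frac_diff_coeffs(d: float, n: int) -> np.ndarray:
    """Coefficients pi_0(d), ..., pi_{n-1}(d) of (1 - L)^d.

    Uses pi_0 = 1 and pi_j = (j - 1 - d) / j * pi_{j-1} for j >= 1.
    """
    pi = np.empty(n)
    pi[0] = 1.0
    for j in range(1, n):
        pi[j] = (j - 1 - d) / j * pi[j - 1]
    return pi

def type2_fractional_noise(d: float, xi: np.ndarray) -> np.ndarray:
    """Type II solution x_t = sum_{i=0}^{t-1} pi_i(-d) xi_{t-i}, zero for t <= 0."""
    n = len(xi)
    psi = frac_diff_coeffs(-d, n)
    # Truncated convolution: only shocks dated 1, ..., t enter x_t.
    return np.convolve(psi, xi)[:n]

# Hypothetical example values (not taken from the paper).
rng = np.random.default_rng(0)
xi = rng.standard_normal(500)
x = type2_fractional_noise(0.4, xi)

# Applying the truncated difference operator with the same d recovers the shocks.
xi_back = np.convolve(frac_diff_coeffs(0.4, len(x)), x)[: len(x)]
```

For $d=1$ the coefficients $\pi_{i}(-1)$ all equal one, so the type II solution reduces to a cumulative sum of the shocks started at zero, which is a convenient sanity check.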

We assume $d_{j}>0$ for all $j$ in what follows, so that $x_{t}$ governs the long-term characteristics of the observations $y_{t}$. These are complemented by additive short-run dynamics which we describe by stationary vector ARMA specifications for $u_{t}$ in the general case. This ARMA process is given by

$$\varPhi(L)u_{t}=\varTheta(L)e_{t},\qquad t=1,\ldots,n,$$

where $\varPhi(L)$ and $\varTheta(L)$ are a stable vector autoregressive polynomial and an invertible moving average polynomial, respectively. The disturbances $\xi_{t}$ and $e_{t}$ jointly follow a Gaussian white noise ($N\!I\!D$) sequence such that

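$$\xi_{t}\sim N\!I\!D(0,\varSigma_{\xi}),\qquad e_{t}\sim N\!I\!D(0,\varSigma_{e})\qquad\text{and}\qquad E(\xi_{t}e_{t}^{\prime})=\varSigma_{\xi e},$$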

where at this stage, before turning to identified and empirically relevant model specifications below, we do not consider restrictions on the joint covariance matrix, but only require $\varSigma_{\xi}$ to have strictly positive entries on the main diagonal.
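To make the setup concrete, the following sketch simulates one path from a small FC model. It is illustrative only and rests on simplifying assumptions that are not part of the general setup: $u_{t}$ follows a VAR(1) (so $\varTheta(L)=I$), both shock covariance matrices are identity matrices, and $\xi_{t}$ and $e_{t}$ are independent. The names `type2_fi` and `simulate_fc` and all parameter values are hypothetical.

```python
import numpy as np

def type2_fi(d, xi):
    """Type II fractionally integrated noise with zero starting values."""
    n = len(xi)
    psi = np.ones(n)
    for i in range(1, n):
        psi[i] = psi[i - 1] * (i - 1 + d) / i  # pi_i(-d)
    return np.convolve(psi, xi)[:n]

def simulate_fc(Lambda, d, Phi, n, rng):
    """Simulate y_t = Lambda x_t + u_t with u_t = Phi u_{t-1} + e_t.

    Assumptions for illustration: identity shock covariances,
    xi_t independent of e_t, and u_t started at zero.
    """
    p, s = Lambda.shape
    xi = rng.standard_normal((n, s))
    e = rng.standard_normal((n, p))
    x = np.column_stack([type2_fi(d[j], xi[:, j]) for j in range(s)])
    u = np.zeros((n, p))
    u[0] = e[0]
    for t in range(1, n):
        u[t] = Phi @ u[t - 1] + e[t]
    return x @ Lambda.T + u, x, u

# Hypothetical parameter values: p = 3 series, s = 2 fractional components.
Lambda = np.array([[1.0, 0.0], [0.5, 1.0], [0.2, 0.3]])
d = [0.8, 0.4]          # ordered so that d_1 >= d_2
Phi = 0.3 * np.eye(3)   # stable VAR(1) coefficient matrix
y, x, u = simulate_fc(Lambda, d, Phi, n=300, rng=np.random.default_rng(1))
```

The additive structure is visible in the last line of `simulate_fc`: the long-memory part `x @ Lambda.T` and the short-memory part `u` are generated separately and then summed.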

Some remarks regarding the general FC setup are in order. The model as given in (1) is not identified without further restrictions on the loading matrix $\varLambda$, on the vector ARMA coefficients and on the noise covariance matrix. While restrictions on $\varSigma_{\xi}$ and $\varLambda$ may be based on results in dynamic factor analysis as will be seen below, choosing specific parametrizations for $u_{t}$ will depend on characteristics of the data and on the purpose of the empirical analysis. Identified vector ARMA structures like the echelon form (see Lütkepohl, 2005, chapter 12) can be used for a rich parametrization, while a multivariate structural time series approach as described in Harvey (1991) integrates nicely with the unobserved components framework considered in this paper and allows for more restricted parametrizations, e.g., by individual or common stochastic cycle components. Below, we introduce a parsimonious model well-suited to relatively high dimensions which is conceptually based on dimension reduction and orthogonal components.

For a characterization of the integration and cointegration properties of our model, we adapt the definitions of these concepts from Hualde and Robinson (2010), which prove useful here. Hence, a generic scalar process $\rho_{t}$ is called integrated of order $\delta$, or $I(\delta)$, if it can be written as $\rho_{t}=\sum_{i=1}^{l}\Delta_{+}^{-\delta_{i}}\nu_{it}$, where $\delta=\max_{i=1,\ldots,l}\{\delta_{i}\}$ and $\nu_{t}=(\nu_{1t},\ldots,\nu_{lt})^{\prime}$ is a finite-dimensional covariance stationary process with a spectral density matrix that is continuous and nonsingular at all frequencies. A vector process $\tau_{t}$ is called $I(\delta)$ if $\delta$ is the maximum integration order of its components. We call the process $\tau_{t}$ cointegrated if there exists a nonzero vector $\beta$ such that $\beta^{\prime}\tau_{t}$ is $I(\gamma)$, where $\delta-\gamma>0$ will be referred to as the strength of the cointegration relation. The number of linearly independent cointegration relations with possibly differing $\gamma$ is called the cointegration rank of $\tau_{t}$.

By these definitions, $x_{jt}$ is clearly $I(d_{j})$, while both $x_{t}$ and $y_{t}$ are integrated of order $d_{1}$. We observe at least two different integration orders in the individual series of $y_{t}$ whenever $\varLambda_{i1}=0$ for some $i$ and $d_{1}>d_{2}$. More generally, $y_{it}\sim I(d_{j})$ if $\varLambda_{i1}=\ldots=\varLambda_{i,j-1}=0$ but $\varLambda_{ij}\neq 0$.
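The rule linking the zero pattern of the loading matrix to the integration order of each observed series can be written down directly. The helper `series_orders` and the example matrix are hypothetical; rows without any nonzero loading are assigned order zero here, on the assumption that only the short-memory part $u_{it}$ remains for them.

```python
import numpy as np

def series_orders(Lambda, d):
    """Integration order of each y_it given loadings and d_1 >= ... >= d_s.

    y_it is I(d_j) for the first column j with a nonzero loading in row i.
    Rows with only zero loadings get order 0 (assumption: u_it is I(0)).
    """
    d = np.asarray(d, dtype=float)
    orders = []
    for row in np.asarray(Lambda, dtype=float):
        nonzero = np.flatnonzero(row != 0.0)
        orders.append(d[nonzero[0]] if nonzero.size else 0.0)
    return np.array(orders)

# Hypothetical loadings for p = 4 series and s = 2 components.
Lambda = np.array([[1.0, 0.5],
                   [0.3, 0.0],
                   [0.0, 2.0],
                   [0.0, 0.0]])
orders = series_orders(Lambda, d=[0.8, 0.4])
```

In this example the first two series load on the most persistent component and inherit $d_{1}=0.8$, the third series is $I(0.4)$, and the fourth has no fractional component.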

To state the cointegration properties of the FC setup (1), we assume that $s\leq p$, so that all fractional components are reflected by the integration and cointegration structure of $y_{t}$, and that $\varSigma_{\xi}$ is nonsingular. It is useful to identify all $q$ groups of $x_{jt}$ with identical integration orders and denote their respective sizes by $s_{1},\ldots,s_{q}$, such that $d_{s_{1}+\ldots+s_{j-1}+1}=\ldots=d_{s_{1}+\ldots+s_{j}}$ and $s=\sum_{j=1}^{q}s_{j}$. Of course, if $q=s$, then $s_{1}=\ldots=s_{q}=1$ and all components of $x_{t}$ have mutually different integration orders, while for $q=1$ it holds that $s=s_{1}$ and we observe $d_{1}=\ldots=d_{s}$.
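The grouping of components by integration order amounts to a run-length count over the ordered vector $(d_{1},\ldots,d_{s})$. The sketch below shows one way to obtain $q$, the distinct orders and the group sizes $s_{1},\ldots,s_{q}$; the function name `group_sizes`, the tolerance and the example values are assumptions.

```python
import numpy as np

def group_sizes(d, tol=1e-12):
    """Distinct orders d^(1) > ... > d^(q) and sizes s_1, ..., s_q.

    Expects a nonempty vector sorted as d_1 >= ... >= d_s.
    """
    d = np.asarray(d, dtype=float)
    # A new group starts at position 0 and wherever the order changes.
    starts = np.flatnonzero(np.r_[True, np.abs(np.diff(d)) > tol])
    sizes = np.diff(np.r_[starts, d.size])
    return d[starts], sizes

# Hypothetical example with s = 5 components in q = 2 groups.
distinct, sizes = group_sizes([0.63, 0.63, 0.34, 0.34, 0.34])
q = len(sizes)
```

The two limiting cases from the text follow immediately: all orders distinct gives sizes of one throughout, and a single common order gives one group of size $s$.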

To keep notation simple, for a generic matrix $A$ for which a specific grouping of rows and columns is clear from the context, we denote by $A^{(i,j)}$ the block from intersecting the $i$-th group of rows with the $j$-th group of columns. A stacking of several groups of rows $i,\ldots,j$ and columns $k,\ldots,l$ is indicated by $A^{(i:j,k:l)}$. For a grouping in only one dimension we write $A^{(i)}$ or $A^{(i:j)}$, where it shall be clear from the context whether a grouping of rows or columns is considered. Furthermore, we denote the column space of a generic $k\times l$ matrix $A$ by $sp(A)\subseteq\mathbb{R}^{k}$ and its orthogonal complement by $sp^{\bot}(A)$. Further, for $k>l$, the $k\times(k-l)$ orthogonal complement of $A$ will be denoted by $A_{\bot}$, which spans the $(k-l)$-dimensional space $sp^{\bot}(A)$.

According to the grouping of equal individual integration orders in $x_{t}$, we may therefore rewrite the FC process (1) as

$$y_{t}=\varLambda^{(1)}x_{t}^{(1)}+\ldots+\varLambda^{(q)}x_{t}^{(q)}+u_{t}.$$

Here, $\varLambda^{(j)}$ is a $p\times s_{j}$ submatrix of $\varLambda$ consisting of columns $\varLambda_{\cdot i}$ for which $s_{1}+\ldots+s_{j-1}<i\leq s_{1}+\ldots+s_{j}$, and $x_{t}^{(j)}$ is an $s_{j}$-dimensional subprocess of $x_{t}$ corresponding to components with memory parameter $d^{(j)}:=d_{s_{1}+\ldots+s_{j-1}+1}=\ldots=d_{s_{1}+\ldots+s_{j}}$. Whenever $s_{1}<p$, there exist $p-s_{1}$ linearly independent linear combinations $\beta_{i}^{\prime}y_{t}\sim I(\gamma_{i})$ with $\gamma_{i}<d_{1}$, so that fractional cointegration occurs. Due to our definition of cointegration, this may be a trivial case where a single component $y_{it}$ with integration order smaller than $d_{1}$ is selected. Since

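$$\varLambda_{\bot}^{(1)^{\prime}}y_{t}=\varLambda_{\bot}^{(1)^{\prime}}\varLambda^{(2)}x_{t}^{(2)}+\ldots+\varLambda_{\bot}^{(1)^{\prime}}\varLambda^{(q)}x_{t}^{(q)}+\varLambda_{\bot}^{(1)^{\prime}}u_{t}$$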

is integrated of order $d^{(2)}$, the columns of $\varLambda^{(1)}_{\bot}$ qualify as cointegration vectors and ${\cal S}^{(1)}:=sp^{\bot}(\varLambda^{(1)})$ is the $(p-s_{1})$-dimensional cointegration space of $y_{t}$.
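That the columns of $\varLambda^{(1)}_{\bot}$ remove the most persistent components can be verified numerically for any full-column-rank loading block. The sketch below builds one particular orthonormal basis of $sp^{\bot}(\varLambda^{(1)})$ from a singular value decomposition; any other basis of the same space would serve equally well. The function name `orth_complement` and the example matrices are hypothetical.

```python
import numpy as np

def orth_complement(A):
    """Orthonormal basis of sp^perp(A) for a k x l matrix A, k > l, full column rank."""
    A = np.asarray(A, dtype=float)
    k, l = A.shape
    # With full column rank, the first l left singular vectors span sp(A),
    # so the remaining k - l span its orthogonal complement.
    U, _, _ = np.linalg.svd(A, full_matrices=True)
    return U[:, l:]

# Hypothetical loadings: p = 4, s_1 = 1 and s_2 = 2.
Lambda1 = np.array([[1.0], [0.5], [0.2], [0.0]])
Lambda2 = np.array([[0.0, 0.3], [1.0, 0.1], [0.4, 1.0], [0.2, 0.5]])

beta = orth_complement(Lambda1)      # p x (p - s_1) cointegration vectors
residual_loading = beta.T @ Lambda1  # numerically zero: x^(1) is annihilated
remaining = beta.T @ Lambda2         # loadings on x^(2) that survive
```

Since `residual_loading` vanishes while `remaining` in general does not, the combinations `beta.T @ y_t` are driven by $x_{t}^{(2)}$ and $u_{t}$ only, which matches the reduction of the integration order from $d^{(1)}$ to $d^{(2)}$.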

Whenever $s_{1}+s_{2}<p$, there are subspaces of ${\cal S}^{(1)}$ forcing a stronger reduction in integration orders. More generally, it holds that $\varLambda_{\bot}^{(1:j)^{\prime}}y_{t}\sim I(d^{(j+1)})$ whenever $\sum_{i=1}^{j}s_{i}<p$ and where we set $d^{(j+1)}=0$ for $j>s$. Analogously to Hualde and Robinson (2010), for $s=p$ and $j=1,\ldots,q-1$, we call ${\cal S}^{(j)}:=sp^{\bot}(\varLambda^{(1:j)})$ the $j$-th cointegration subspace of $y_{t}$, for which ${\cal S}^{(q-1)}\subset\ldots\subset{\cal S}^{(1)}$. For $p>s$, ${\cal S}^{(q)}\subset{\cal S}^{(q-1)}$ is a further such subspace. Cointegration vectors in ${\cal S}^{(q)}$ cancel all fractional components and hence reduce the integration order from $d_{1}$ to zero, the strongest reduction possible in our setup.

Besides this general pattern of cointegration relations, our model features an interesting special case with so-called polynomial cointegration, that is, cointegration relations where lagged observations nontrivially enter a cointegration relation. To see this possibility, consider a bivariate example similar to Granger and Lee (1989), where $q=p=2$ and $\xi_{1t}=\xi_{2t}$ , so that $\varSigma_{\xi}$ is singular and $x_{2t}=\Delta^{d_{1}-d_{2}}x_{1t}$ . Augmenting the variables by a fractional difference as $\tilde{y}_{t}:=(y_{1t},y_{2t},\Delta^{d_{1}-d_{2}}y_{2t})^{\prime}$ , we obtain a three-dimensional system where levels of $y_{t}$ enter a nontrivial cointegration relation with a fractional difference to achieve a reduction in integration order from $d_{1}$ to $\max\{2d_{2}-d_{1},0\}<d_{2}$ . Hence, our setup complements the model of Johansen (2008, section 4), which was the first to handle polynomial cointegration in a fractional setup, and the results of Carlini and Santucci de Magistris (2018), who derive a Granger representation for the fractional VECM of Granger (1986) under polynomial cointegration.

3 Relations to other cointegration models

In this section, we clarify the relation of the fractional components model (1) to popular existing representations for cointegrated processes and show how our model can be represented in alternative ways brought forward in the literature. While our model is among the most general setups with respect to its integration and cointegration properties, the additive modeling of short-run dynamics is new to the literature and gives rise to distinct parametrizations not possible within other representations in a similarly convenient way.

Error correction models.

The most popular representation of cointegrated systems in the $I(1)$ setting is the vector error correction form. In the fractionally integrated case, such models have been considered since an early mention by Granger (1986), more recently by, e.g., Avarucci and Velasco (2009), Lasak (2010) and Johansen and Nielsen (2012). In terms of the integration and cointegration properties, the fractional error correction setups are typically restricted to the special case with $q=2$ and $s=p$ , such that the observed variables are integrated of order $d^{(1)}$ and there exist $p-s_{1}$ cointegration relations with errors of order $d^{(2)}$ .

Defining the fractional lag operator $L_{b}:=1-\Delta^{b}$ (Johansen, 2008), we are able to derive the error correction representation for this special case of our model; see appendix A. It is given by

[TABLE]

where we find $\alpha\beta^{\prime}=-\varLambda^{(2)}(\varLambda^{(1)^{\prime}}_{\bot}\varLambda^{(2)})^{-1}\varLambda^{(1)^{\prime}}_{\bot}$ to precede the error correction term, while

[TABLE]

is integrated of order zero and $M$ is defined in (13).

The model differs both from the models of Avarucci and Velasco (2009) and from the representation of Johansen (2008) in the way short-run dynamics are modeled. The literature has considered (fractional) lags of differenced variables and possibly of error correction terms in the VECM representation. Our setup, in contrast, generates autocorrelated $\kappa_{t}$ by filtering the latent $u_{t}$ with fractional difference operators. Hence, adding lags of $\Delta^{d^{(1)}}y_{t}$ in the model (6) is only an approximate solution and achieving a desired approximation quality may require estimating a large number of parameters.

As we have discussed above, Johansen (2008) proposes a polynomially cointegrated generalization of his $\mathrm{VAR}_{d,b}$ model which allows terms integrated of orders $d$ , $d-b$ and $d-2b$ in the Granger representation (Johansen, 2008, theorem 9). Even compared to that specification, our model allows for more general patterns of integration orders and cointegration strengths, since we only assume $d_{j}>0$ for all $j$ . More in line with the generality envisaged in this paper, Tschernig et al. (2013, equation 14) present a model with error correction term and different integration orders, while Lasak and Velasco (2014) sequentially fit error correction models to test for cointegration relations of possibly different strengths.

Vector ARFIMA.

An interesting special case of (1) occurs for $s=p$ and $\varLambda=I$ , where each series $y_{it}$ is driven by a single fractional component and $y_{it}\sim I(d_{i})$ . This resembles standard vector ARFIMA models with possibly different integration orders; see, e.g., Lobato (1997) who labels the popularly termed vector ARFIMA class considered here as “model A”. A frequently used submodel is the fractionally integrated vector autoregressive model discussed by Nielsen (2004). The main difference to these approaches is our additive modeling of short-run dynamics, whereas in the vector ARFIMA setup weakly dependent vector ARMA instead of white noise processes are passed through the fractional integration filters.

Our model belongs to the class of vector ARFIMA processes for integer $d_{j}\in\{1,2,\ldots\}$ , but not for general fractional integration orders. For the case of integer $d_{j}$ , note that $(x_{t}^{\prime},u_{t}^{\prime})^{\prime}$ is a finite-order vector ARMA process, and hence $y_{t}$ as a linear combination is itself in the ARMA class; see Lütkepohl (1984). For general fractional integration orders, a similar conclusion does not hold. To see this, consider a stylized univariate case of our model with $p=s=1$ , where $\Delta^{d}x_{t}=\xi_{t}$ and $(1-\phi L)u_{t}=e_{t}$ . First note that $(\Delta^{d}x_{t},u_{t})^{\prime}$ has an ARMA structure, and hence $(x_{t},u_{t})^{\prime}$ is a vector ARFIMA process. Expanding $(1-\phi L)\Delta^{d}x_{t}=(1-\phi L)\xi_{t}$ and $(1-\phi L)\Delta^{d}u_{t}=\Delta^{d}e_{t}$ , we can write the sum, belonging to the fractional components model class, as

[TABLE]

The right hand side of this expression is not a finite-order MA process in general, as it has nonzero autocorrelations for all lags, and hence, the process does not belong to the ARFIMA class for non-integer $d$ .
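The source of the infinite-order structure is the term $\Delta^{d}e_{t}$ . As a minimal sketch (the function name and the chosen values of $d$ are illustrative assumptions), the expansion coefficients of $\Delta^{d}=(1-L)^{d}$ can be computed recursively and compared for an integer and a non-integer order:

```python
def frac_diff_coeffs(d, n):
    """Coefficients pi_0..pi_{n-1} of (1 - L)^d, using pi_j = pi_{j-1} * (j - 1 - d) / j."""
    pi = [1.0]
    for j in range(1, n):
        pi.append(pi[-1] * (j - 1 - d) / j)
    return pi

# Integer order: the filter is the finite polynomial 1 - L
integer_case = frac_diff_coeffs(1.0, 8)

# Non-integer order: every coefficient is nonzero, so Delta^d e_t loads on all past shocks
fractional_case = frac_diff_coeffs(0.4, 8)
```

For $d=1$ all coefficients beyond the first lag vanish, whereas for $d=0.4$ none do, which is why no finite-order MA polynomial can reproduce the right hand side for non-integer $d$ .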

Triangular representations.

The models discussed so far have restricted integration or cointegration properties as compared to our model. Even in the most general setup of Johansen (2008), the integration orders are restricted to be $d$ , $d-b$ , $d-2b$ for polynomial cointegration. In contrast, Hualde (2009) and Hualde and Robinson (2010) have proposed a very flexible model which adapts the triangular form of Phillips (1991) and its generalization to processes with multiple unit roots (Stock and Watson, 1993) to the fractional cointegration setup.

To derive the triangular representation for our model, we assume that the variables in $y_{t}$ are ordered in a way that $\varLambda^{(1:j,1:j)}$ is nonsingular for $j=1,\ldots,q$ and restrict attention to the case $s=p$ for notational convenience. The variables are partitioned according to the groups of different integration orders in $x_{t}$ as $y_{t}^{(j)}:=(y_{s_{1}+\ldots+s_{j-1}+1},\ldots,y_{s_{1}+\ldots+s_{j}})^{\prime}$ , $j=1,\ldots,q$ . The first block in the triangular system is

[TABLE]

where $\omega_{t}^{(1)}$ is integrated of order zero. The general expression for the $j$ -th block of the triangular system is derived in appendix A for $j=2,\ldots,q$ , and given by

[TABLE]

where also $\omega_{t}^{(j)}$ is integrated of order zero for $j=2,\ldots,q$ . By inverting the fractional difference operators we obtain

[TABLE]

where $B$ has a block triangular structure such that $B^{(i,i)}=I$ and $B^{(i,j)}=0$ for $i<j$ . A re-ordering of the variables in $y_{t}$ yields the representation of Hualde and Robinson (2010).

This representation allows for a semiparametric cointegration analysis of our model using the methods of Hualde (2009) and Hualde and Robinson (2010). However, our model differs significantly from straightforward parametrizations of the triangular system, e.g., from assuming a vector ARMA process for $\omega_{t}$ , since in our setup $\omega_{t}$ as stated in (16) generally contains fractional differences that cannot be represented within the ARMA framework.

State space approaches.

Bauer and Wagner (2012) have presented a state space canonical form for multiple frequency unit root processes of different (integer-valued) integration orders. Their discussion is based on unit root vector ARMA models which are separated in pure unit root structures and short-term dynamics. Although the analogy to our model is striking, there are notable differences between their unit root and our fractional setup. Firstly, as discussed in the paragraph on vector ARFIMA models (see (7)), the fractional components setup (1) is not nested within a general class comparable to the vector ARMA models, which form the basis of the discussion in Bauer and Wagner (2012). Secondly, in their setting, the introduction of different integration orders is achieved by repeated summation of lower order integrated processes which themselves enter the observations to achieve polynomial cointegration. This is in contrast to the continuous treatment of integration orders in our (type II) fractional setup.

However, fractional components models could be constructed to straightforwardly extend the setup of Bauer and Wagner (2012). Using the fractional lag operator $L_{b}=1-\Delta^{b}$ instead of $L$ in the short-run dynamic specification (4), a stable vector ARMA$_{b}$ process can be defined by $\tilde{\varPhi}(L_{b})\tilde{u}_{t}=\tilde{\varTheta}(L_{b})e_{t}$ under suitable stability conditions (Johansen, 2008, corollary 6). Then, replacing $u_{t}$ by $\tilde{u}_{t}$ in the model setup (1) with $d_{j}$ restricted to some multiple of $b$ ( $d_{j}=i_{j}b$ , $i_{j}\in\{1,2,\ldots\}$ ), the process $y_{t}$ is in the class of vector ARMA$_{b}$ models itself, while unit roots in the vector autoregressive polynomial generate the fractional $I(d_{j})$ processes. Such a framework could be treated analogously to Bauer and Wagner (2012), but the restriction that all integration orders are multiples of $b$ makes such a framework somewhat less flexible than ours.

4 A dimension-reduced orthogonal components specification

So far, we have considered a general modeling setup and discussed its integration and cointegration properties as well as its relation to existing approaches in the literature. We now turn to the discussion of a specific model from this class which bears potential for parsimonious modeling of long- and short-run dynamics in relatively high-dimensional applications. Besides its general interest, this will be the workhorse specification for the empirical application to realized covariance modeling in section 5.

To introduce the model and emphasize its restrictions as compared to (1), we decompose the short-term dependent process $u_{t}$ into an autocorrelated component, $\varGamma z_{t}$ , where $z_{t}$ is a vector of $s_{0}$ mutually uncorrelated components with $s+s_{0}\leq p$ , and a Gaussian white noise component $\varepsilon_{t}$ . We label the result the dynamic orthogonal fractional components (DOFC) model,

[TABLE]

where $x_{t}$ is generated by purely fractional processes (2) as above, while

[TABLE]

are $s_{0}$ univariate stationary autoregressive processes of order $k$ . Regarding the noise processes $\xi_{t}$ , $\zeta_{t}$ and $\varepsilon_{t}$ , we assume mutual independence over leads and lags,

[TABLE]

where $H$ is diagonal with entries $h_{i}>0$ , $i=1,\ldots,p$ . Note that for $s+s_{0}<p$ the DOFC model is a factor model as it allows for dimension reduction.

The model as specified in (11) and below is not identifiable without further information. Considering $\tilde{y}_{t}:=\Delta^{d^{(1)}}y_{t}$ instead of $y_{t}$ to meet the assumptions of Heaton and Solo (2004), their theorem 4 suggests that groups of common components $\Delta^{d^{(1)}}x_{t}^{(1)}$ , …, $\Delta^{d^{(1)}}x_{t}^{(q)}$ , $\Delta^{d^{(1)}}z_{t}$ can be disentangled (up to rotations within these groups) through their different shapes in spectral densities whenever $d^{(1)}>\ldots>d^{(q)}>0$ . Still, there exist observationally equivalent structures with $\tilde{\varLambda}^{(j)}=\varLambda^{(j)}M^{-1}$ and $\tilde{x}_{t}^{(j)}=Mx_{t}^{(j)}$ which satisfy the model restrictions for orthonormal $M$ . Hence, we impose further restrictions on the loading matrices. As is standard practice in dynamic factor analysis, we set the upper triangular elements to zero such that $\varLambda_{rl}^{(j)}=0$ for $r<l$ , $j=1,\ldots,q$ , and $\varGamma_{rl}=0$ for $r<l$ . Certain observables are thus assumed not to be influenced by certain factors.
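The rotation problem and the role of the zero restrictions can be checked in a small example. This is a sketch only: the $3\times 2$ loading matrix, the rotation angle and the helper functions are hypothetical and chosen for illustration, with one group of $s_{j}=2$ components.

```python
import math

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

theta = 0.7  # arbitrary rotation angle
M = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]  # orthonormal, so M^{-1} = M'

Lam = [[1.0, 0.0],    # satisfies the restriction Lambda_{rl} = 0 for r < l
       [0.5, 2.0],
       [-1.0, 0.3]]
x = [[0.4], [-1.2]]   # one realization of the two components

Lam_tilde = matmul(Lam, transpose(M))  # Lambda M^{-1}
x_tilde = matmul(M, x)                 # M x

# Both structures imply the same common component Lambda x
common = matmul(Lam, x)
common_tilde = matmul(Lam_tilde, x_tilde)
max_diff = max(abs(c[0] - ct[0]) for c, ct in zip(common, common_tilde))

# The rotated loadings violate the zero restriction above the diagonal
upper_element = Lam_tilde[0][1]
```

Here `max_diff` is zero up to rounding while `upper_element` equals $\sin\theta\neq 0$ , so the rotated structure is observationally equivalent but ruled out by the triangular restriction.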

The model (11) is very parsimonious considering that it includes both a rich fractional structure as well as short-run dynamics with co-dependence. This is achieved by combining three sources of parsimony which have been brought forward in the statistical time series literature. Firstly, there are $p-s-s_{0}\geq 0$ white noise linear combinations of $y_{t}$ . A strict inequality implies a reduced dimension in the dynamics of $y_{t}$ which is characteristic for so-called statistical factor models; see Pan and Yao (2008), Lam et al. (2011) and Lam and Yao (2012). In contrast, the model (1) does not belong to this class in general, since it allows for $s\geq p$ and general forms of autocorrelation in $u_{t}$ . Secondly, all cross-sectional correlation stems from the common components which is a familiar feature from classical factor analysis (Anderson and Rubin, 1956). Thirdly, both the fractional and the nonfractional components are mutually orthogonal for all leads and lags.

Combined with semiparametric techniques of fractional integration and cointegration analysis, existing methods for statistical factor and dynamic orthogonal components analysis (Matteson and Tsay, 2011) can be used to justify the model assumptions and may be useful in the course of model specification. For final model inference, maximum likelihood estimation based on a state space representation is the preferred method. Both steps will be illustrated in the empirical application of the next section.

5 An application to realized covariance modeling

We apply the fractional components approach to the modeling and forecasting of multivariate realized stock market volatility which has recently received considerable interest in the financial econometrics literature.

5.1 Data and recent approaches

We use the dataset of Chiriac and Voev (2011) which comprises realized variances and covariances from six US stocks, namely (1) American Express Inc., (2) Citigroup, (3) General Electric, (4) Home Depot Inc., (5) International Business Machines and (6) JPMorgan Chase & Co for the period from 2000-01-01 to 2008-07-30 ( $n=2156$ ). The data are available from http://qed.econ.queensu.ca/jae/2011-v26.6/chiriac-voev.

Different transformations of the realized covariance matrices have been applied to fit dynamic models to data of this kind. Weigand (2014) discusses these transforms and considers a general framework nesting several previously applied approaches. His results suggest that applying linear models to a multivariate time series of log realized variances along with z-transformed realized correlations is a reasonable choice in practice. We follow this approach and base our empirical study on the 21-dimensional time series

[TABLE]

where $X_{t}$ is the $6\times 6$ realized covariance matrix at period $t$ , and the z-transforms are

[TABLE]
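As a rough sketch of how a single realized covariance matrix maps into the 21-dimensional vector, the following assumes that the z-transform is the Fisher transform $z=\frac{1}{2}\log\frac{1+r}{1-r}$ and that correlations are stacked row by row below the diagonal. Both the stacking order and the function name are assumptions made for illustration, not taken from the paper.

```python
import math

def transform_covariance(X):
    """Map a p x p covariance matrix to p log variances followed by p(p-1)/2 z-transformed correlations."""
    p = len(X)
    log_var = [math.log(X[i][i]) for i in range(p)]
    z_corr = []
    for i in range(p):
        for j in range(i):
            r = X[i][j] / math.sqrt(X[i][i] * X[j][j])
            z_corr.append(math.atanh(r))  # atanh(r) = 0.5 * log((1 + r) / (1 - r))
    return log_var + z_corr

# Toy 6 x 6 covariance matrix with equal correlations of 0.5 (illustrative values only)
sd = [1.0, 1.5, 2.0, 0.8, 1.2, 2.5]
X_toy = [[sd[i] * sd[j] * (1.0 if i == j else 0.5) for j in range(6)] for i in range(6)]
y_toy = transform_covariance(X_toy)
```

For six assets this yields $6+15=21$ elements, matching the dimension of the series used in the empirical study.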

All time series (grey) of log variances and their maxima and minima for a given day $t$ (black) are depicted in figure 1, while z-transformed correlations are shown in figure 2.

Recent approaches to modeling realized covariance matrices have successfully used long-memory specifications (Chiriac and Voev, 2011), or found co-movements between the processes well-represented by dynamic factor structures; see Bauer and Vorkink (2011) and Gribisch (2013). In the related problem of forecasting univariate realized variances, factor models with long-memory dynamics have already been proposed. While Beltratti and Morana (2006) use frequency-domain principal components techniques to assess the low-frequency co-movements, Luciani and Veredas (2015) apply time-domain principal components to their high-dimensional series and apply fractional integration techniques to both estimated factors and idiosyncratic components. Recently, Asai and McAleer (2015) have considered long-memory factor dynamics also for the modeling of realized covariance matrices, where again a semiparametric factor approach precedes a long-memory analysis in their two-step approach.

Our fractional components model DOFC (11), applied to the time series (12), offers various advantages to researchers and practitioners in the field. (a) Our methods offer new insights into the integration and cointegration properties of stock market volatilities, for which fractional components structures of different integration orders have not been investigated so far. (b) Fractional cointegration between variances and correlations is of particular interest for the understanding of longer-term portfolio hedging and systemic risk assessment, but has not found attention in the existing literature. (c) Our state space approach for variances and correlations also features other relevant aspects of volatility modeling. It offers a separation into short-term and long-term components in the spirit of Engle and Lee (1999), directly accounts for measurement noise, and is applicable in datasets of higher dimensions. The parameter-driven state space approach enabled by our specification yields (d) practicability in case of missing values, while it (e) straightforwardly carries over to stochastic volatility frameworks for daily return data in the spirit of Harvey et al. (1994).

5.2 Preliminary analysis and model specification

We investigate whether the constraints imposed in the DOFC model (11) are reasonable for the dataset under investigation. Semiparametric methods are used to assess these restrictions and to obtain reasonable starting values for the parametric estimation of our model.

The model (11) implies that there are $s+s_{0}$ components which govern the dynamics of $y_{t}$ , and hence, for $p>s+s_{0}$ , there is a dimension reduction in terms of the autocorrelation characteristics. Pan and Yao (2008) study time series with such properties and propose a sequential test to infer the dynamic dimension of the process, allowing for nonstationarity of the autocorrelated components. The algorithm sequentially finds the least serially correlated linear combinations of $y_{t}$ , subsequently testing the null of no autocorrelation of these linear combinations. We apply 3 lags when detecting autocorrelations in what follows.

Applying this approach to our dataset, we do not reject the null for eight linear combinations which can hence be treated as white noise. For the ninth such combination, the p-value for the multivariate Ljung-Box test drops from 0.1935 to 0.0002, so that the white noise hypothesis is rejected for reasonable significance levels. We conclude that there are $s+s_{0}=21-8=13$ components which account for the dynamic properties of the process. Pan and Yao (2008) also propose an estimator for the space of dynamic components $(x_{t}^{\prime},z_{t}^{\prime})^{\prime}$ . We call these estimates (rotated by principal components) the factors in what follows.

Our model implies that $(x_{t}^{\prime},z_{t}^{\prime})^{\prime}$ and hence a suitable rotation of the factors can be modelled as $s+s_{0}$ univariate time series which are mutually orthogonal at all leads and lags. This corresponds to the notion of dynamic orthogonal components as introduced by Matteson and Tsay (2011) who provide methods to test for the presence of such a structure and to estimate the appropriate rotation. Using first differences of the factors to achieve stationarity as required by Matteson and Tsay (2011) for suitable values of $d_{j}$ , we find highly significant cross-correlations of the raw factors (the test statistic takes the value 4198.94 for a level 0.01 critical value of 625.80) while a dynamic orthogonal structure is not rejected for the rotated series, with a test statistic of 445.55 and a corresponding p-value close to one. The test result also holds if the test is conducted in levels. In what follows, the dynamic orthogonal components are computed from the factors in levels which slightly outperforms the difference-approach in simulations with fractional processes. [Footnote 2: Results are available from the authors upon request.]

Due to their dynamic orthogonality, the rotation of Matteson and Tsay (2011) identifies the single processes in $(x_{t}^{\prime},z_{t}^{\prime})^{\prime}$ up to scale, sign and order. A preliminary analysis of the integration orders of $x_{t}$ can hence be carried out by a univariate treatment of these series. We investigate these integration orders by the exact local Whittle estimator allowing for an unknown mean (Shimotsu, 2010).

A possible grouping of components with equal integration orders is assessed by the methods proposed by Robinson and Yajima (2002), with the modifications for possibly nonstationary integration orders by Nielsen and Shimotsu (2007). The specific-to-general approach of Robinson and Yajima (2002) sequentially tests for existence of $j=1,2,\ldots$ groups of equal integration orders. The sequence is terminated if for some $j^{*}$ there is a grouping for which within-group equality is not rejected, and for $j^{*}>1$ the grouping with highest p-value is selected. In our application, we restrict attention to possible groupings where, for $\hat{d}_{i_{1}}>\hat{d}_{i_{2}}>\hat{d}_{i_{3}}$ , there is no group including both $i_{1}$ and $i_{3}$ but not $i_{2}$ . For the tests of equal integration orders within the sequential approach, we consider the Wald test proposed by Nielsen and Shimotsu (2007), jointly testing all hypothesized equalities for a given grouping. We choose $m=\lfloor n^{0.5}\rfloor=46$ as bandwidth and set the trimming parameter $h$ to zero, since the dynamic orthogonal components structure does not permit fractional cointegration.
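The bandwidth choice and the restriction to groupings that respect the ordering of the estimated integration orders can be sketched as follows. The helper name is hypothetical, and the enumeration only lists candidate groupings; it does not implement the Wald test itself.

```python
import math
from itertools import combinations

n = 2156                  # sample size of the dataset
m = math.floor(n ** 0.5)  # bandwidth m = floor(n^0.5)

def contiguous_groupings(k, groups):
    """All splits of indices 0..k-1 (components sorted by estimated d) into `groups` contiguous blocks."""
    out = []
    for cuts in combinations(range(1, k), groups - 1):
        bounds = (0,) + cuts + (k,)
        out.append([list(range(bounds[i], bounds[i + 1])) for i in range(groups)])
    return out

# Candidate groupings of the 13 dynamic orthogonal components
n_two_groups = len(contiguous_groupings(13, 2))
n_three_groups = len(contiguous_groupings(13, 3))
sizes_three = [[len(block) for block in g] for g in contiguous_groupings(13, 3)]
```

Requiring contiguity means that no group can contain the components with the largest and smallest estimates without the one in between, which keeps the number of candidate groupings for $j$ groups at $\binom{12}{j-1}$ in the case of 13 components.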

The estimated integration orders for the 13 dynamic orthogonal components range from 0.0087 to 0.7328 and indicate that some of the components may have short memory while others behave like stationary or nonstationary fractionally integrated processes. We clearly reject equality of all integration orders, while also each of the groupings in two groups can be rejected on a 0.01 significance level. For three groups, we do not reject the hypothesis of equal integration orders within groups. The sequential test for groups with equal memory yields $j^{*}=3$ with a p-value of 0.3181, where groups of three ( $\hat{d}^{(1)}=0.6717$ ), seven ( $\hat{d}^{(2)}=0.3448$ ) and three ( $\hat{d}^{(3)}=0.0523$ ) components are identified, respectively. The hypothesis that $d^{(3)}=0$ is not rejected. We may therefore treat the members of the third group as short-range dependent and belonging to $z_{t}$ . Thus, $s_{1}=3$ , $s_{2}=7$ and $s_{0}=3$ appear as a reasonable specification for model (11) due to the preliminary analysis.

We obtain starting values for the parametric estimator from this procedure. Firstly, $d$ and $\phi$ are estimated from the dynamic orthogonal components. Secondly, from regressing observed data on standardized estimated orthogonal components with unit innovation variance, we obtain starting values for $h$ , $\varLambda$ and $\varGamma$ , while certain columns of the latter matrices are rotated to satisfy the zero restrictions.

In very high-dimensional cases, the approach of Pan and Yao (2008) is not applicable, but Lam et al. (2011) and Lam and Yao (2012) provide feasible methods for stationary settings and comment on possible extensions to nonstationarity. In cases where the dynamic orthogonal components specification (11) is not appropriate, but the general setup (1) is, a specification search and preliminary estimates for the integration and cointegration parameters of the more general model could be based on the algorithm of Hualde (2009) which is capable of identifying and estimating cointegration subspaces by semiparametric methods.

5.3 A parametric fractional components analysis

We proceed with maximum likelihood estimation of the fractional components model using the EM algorithm for the state space representation. Although the exact state space representation is easily obtained using the type II definition of fractional integration, the state dimension grows linearly with $n$ , so that exact estimation becomes computationally infeasible. Instead, the latent fractionally integrated components are mapped to approximating ARMA(3,3) dynamics as described and justified by Hartl and Weigand (2018). There, we show by simulation that low-order ARMA approximations (with parameters depending both on $d_{j}$ and on $n$ ) provide an excellent approximation performance and outperform truncated moving average and autoregressive representations by large amounts.

We note that an asymptotic theory for maximum likelihood estimation in the fractionally cointegrated state space setup is not available. Certain functions of the parameter estimates are expected to exhibit nonstandard asymptotic behavior, especially in the nonstationary case $d_{j}>0.5$ for some $j$ . However, normal and mixed normal asymptotics have been established and conventional tests and confidence intervals have been justified in different parametric fractional cointegration settings as well as in state space models with common unit root components (Chang et al., 2009, 2012). We thus use standard parameter tests in what follows, bearing the preceding caveats in mind.

Constant terms are included by a further column $c$ in the observation matrix and estimated along with the free elements of $\varLambda$ and $\varGamma$ . Setting the autoregressive order of $z_{t}$ to one and using starting values as described above, we estimate models with $q\in\{1,2,3\}$ groups of equal integration orders $d^{(j)}>0$ and additional autoregressive components. The Bayesian information criterion (BIC) is used to select sizes $s_{0},\ldots,s_{q}$ and the value of $q$ with appropriate in-sample fit. [Footnote 3: Instead of estimating all reasonable combinations of $s_{0}$ , …, $s_{q}$ for each $q$ , we begin by the optimal grouping for a given $q$ obtained from the semiparametric methods of the previous section. From this specification, denoted as $s^{\{0\}}_{j}$ , $j=0,\ldots,q$ , we estimate all models characterized by $s_{j}\in\{s^{\{0\}}_{j}-1,s^{\{0\}}_{j},s^{\{0\}}_{j}+1\}$ , $j=0,\ldots,q$ , given that they satisfy $s+s_{0}-1\geq s_{j}^{\{0\}}\geq 1$ . The model with the least value of the BIC is selected and its indices denoted as $s^{\{1\}}_{j}$ , and again models with indices close to $s^{\{1\}}_{j}$ are estimated and compared. This process is iterated until $s^{\{i\}}_{j}=s^{\{i-1\}}_{j}$ holds for all $j=0,\ldots,q$ . As a result, also the number of white noise combinations may differ from 8, the result of the semiparametric analysis in the previous section.] We apply the BIC even if consistency is not established in this fractional setting. We expect that existing results hold for specification choices not involving the fractional components, while it is not clear to what extent the results of Chang et al. (2012) carry over to the fractional setup. There, consistency of the BIC is shown for the number of stochastic trends in a unit root state space model.
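The iterative specification search over $s_{0},\ldots,s_{q}$ described above can be sketched as a local search. This is an illustrative outline only: the admissibility check ( $s_{j}\geq 1$ and $\sum_{j}s_{j}\leq p$ ) is a simplifying assumption, and `toy_bic` is a made-up stand-in for the BIC of an estimated model, not a result from the paper.

```python
from itertools import product

def neighbourhood_search(start, bic, p):
    """Move to the best model among s_j in {s_j - 1, s_j, s_j + 1} until the indices stop changing."""
    current = tuple(start)
    while True:
        candidates = [
            cand for cand in product(*[(s - 1, s, s + 1) for s in current])
            if min(cand) >= 1 and sum(cand) <= p  # assumed admissibility conditions
        ]
        best = min(candidates, key=bic)
        if bic(best) >= bic(current):  # no strict improvement: s^{i} = s^{i-1}
            return current
        current = best

def toy_bic(sizes):
    """Hypothetical criterion with its minimum at (s_0, s_1, s_2) = (3, 4, 5)."""
    target = (3, 4, 5)
    return sum((s - t) ** 2 for s, t in zip(sizes, target))

selected = neighbourhood_search(start=(3, 3, 7), bic=toy_bic, p=21)
```

In practice each call to the criterion would require a full maximum likelihood fit, which is the reason for exploring only neighbouring specifications rather than all combinations.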

We complement the semiparametric results of the previous section by a parametric specification search. After diagnostic checking of the selected model, we will take a closer look at its parameter estimates and implied long-run characteristics. The best models for each $q$ are shown in table 1, where estimated integration orders are given along with the log-likelihood (log-lik) and the BIC. Regarding the integration orders, we find that for $q>1$ estimates of $d^{(1)}$ are always above 0.5 suggesting nonstationarity of at least $s_{1}$ series in $y_{t}$ . Overall, the models with $q=2$ are superior, in particular the grouping in $s_{1}=2$ and $s_{2}=9$ fractional and $s_{0}=2$ nonfractional components. This specification is similar to the one selected by the semiparametric approach and also suggests a dynamic dimension of $s+s_{0}=13$ . Interestingly, the same specification with full noise covariance matrix $H$ is inferior ( $BIC=-16.626$ ) as is the model with a full vector autoregressive matrix $\varPhi$ ( $BIC=-17.150$ ). Furthermore, considering more lags in $z_{t}$ does not sufficiently improve the fit ( $BIC=-17.155$ for $k=2$ , $BIC=-17.046$ for $k=3$ and $BIC=-17.139$ for $k=4$ ).

We conduct several diagnostic tests on standardized model residuals $e_{it}=v_{it}/\sqrt{F_{ii,t}}$ , where $v_{t}$ and $F_{t}$ are filtered residuals and forecast error covariance matrices, respectively. The residuals corresponding to log variances and z-transformed correlations for the first three assets are plotted in figure 3, while residual autocorrelations are depicted in figure 4, autocorrelations of squared residuals in figure 5 and histograms of the residuals along with the normal density in figure 6. The visual inspection shows some but no overwhelming evidence against the model assumptions. Autocorrelations of both residuals and squared residuals are generally below 0.1 in absolute value and mostly within the $\pm$ 2 standard error bands which are shown as horizontal lines. Some deviations from normality are visible, but not the sort of skewness and fat tails observed for models of untransformed residual variances and covariances.
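The standardization and the autocorrelation check can be sketched for a single series as follows. The function names and the toy inputs are illustrative assumptions, and the band $\pm 2/\sqrt{n}$ is the usual approximate two standard error band for white noise, which is assumed here to correspond to the horizontal lines in the figures.

```python
import math

def standardize(v, F):
    """Standardized residuals e_t = v_t / sqrt(F_t) for one series (F_t is the t-th forecast error variance)."""
    return [vt / math.sqrt(Ft) for vt, Ft in zip(v, F)]

def autocorr(e, lag):
    """Sample autocorrelation of e at the given lag."""
    n = len(e)
    mean = sum(e) / n
    denom = sum((x - mean) ** 2 for x in e)
    num = sum((e[t] - mean) * (e[t - lag] - mean) for t in range(lag, n))
    return num / denom

def outside_band(e, max_lag):
    """Lags at which the sample autocorrelation exceeds the approximate +-2/sqrt(n) band."""
    band = 2.0 / math.sqrt(len(e))
    return [lag for lag in range(1, max_lag + 1) if abs(autocorr(e, lag)) > band]

# Toy inputs (not from the paper): filtered residuals and their forecast error variances
v_toy = [2.0, -3.0, 2.0, -3.0]
F_toy = [4.0, 9.0, 4.0, 9.0]
e_toy = standardize(v_toy, F_toy)
rho1 = autocorr(e_toy, 1)
```

Applying `autocorr` to squared residuals gives the corresponding check for remaining conditional heteroskedasticity.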

Table 2 presents the diagnostic tests on standardized residuals. The p-values are shown for the Ljung-Box test (LM) and the ARCH-LM test for conditional heteroskedasticity (CH) for lag lengths 5, 10 and 22. Additionally, the Jarque-Bera test result (JB) is shown in the last column. The null of no autocorrelation is not rejected at the 0.01 level for all but two or three residuals, depending on lag length. Clear evidence of conditional heteroskedasticity is found for the residuals of the log variance series, that is $e_{2t}$ , $e_{3t}$ , $e_{5t}$ , and $e_{6t}$ , where also the normality assumption is clearly rejected, but also for a few correlation series such as $e_{15,t}$ or $e_{19,t}$ . A more flexible data transformation like the matrix Box-Cox approach of Weigand (2014) would typically ameliorate these findings, but we do not follow this approach further here.

Estimates of several of the model parameters are shown in table 3. Along with the maximum likelihood estimates, we also show the mean of the estimators from a model-based bootstrap resampling exercise with 1000 iterations and generally find a low bias for the corresponding estimates. We also show standard errors, obtained in three ways, namely by the bootstrap (SE.boot), using the information matrix (Harvey, 1991, section 3.4.5), denoted by SE.info, and by the sandwich form of White (1982), labelled SE.sand in the table. The different methods of computing standard errors give similar results, except for the variance parameters $h_{i}$ , where the sandwich estimates are large compared to the others. Overall, including the parameters not shown in the table, the median ratio between bootstrap and sandwich standard errors is 1.31, while a typical sandwich estimate is 1.20 times as large as the corresponding estimate from the information matrix. We hence use the bootstrap methods in order to avoid a possible underestimation of the variances and spurious inference.

The estimated memory parameters $d^{(1)}$ and $d^{(2)}$ exhibit a marked difference in the integration orders of fractional components. The two series in the first group are the cause of significant nonstationarity in our dataset. The second group of nine series introduces stationary long-memory persistence. In contrast, the nonfractional components in $z_{t}$ are only mildly autocorrelated, with small but significant autoregression parameters. Figure 7 gives a visual impression of the factor dynamics, showing full sample (smoothed) estimates of the two nonstationary components (above), of the first two stationary long-memory components (middle) and of the short-memory components (below). The $\pm$ 2 standard error confidence intervals suggest a relatively precise estimation of the components. The different persistence of the three groups is clearly visible.

We turn to a discussion of the cointegration properties of the estimated system. In our preferred specification with a cointegration rank of $p-s_{1}=19$ , and an 11-dimensional cointegration subspace, the loadings of fractional components provide an easier interpretation than the corresponding cointegration vectors, although the latter can be easily obtained and suitably normalized.

With the abovementioned caveat that asymptotic properties are not available for this fractional cointegration setting, we show $t$ -ratios for constants, for fractional loadings and for nonfractional loadings in table 4, where the bootstrap standard errors are used. The $t$ -ratios for $\varLambda^{(1)}$ suggest that each of the series in $y_{t}$ is influenced by the nonstationary components, and hence all components of $y_{t}$ are nonstationary themselves. The first component loads very significantly on all variances with the same sign and can hence be interpreted as the main common risk factor. The second component represents joint common nonstationarity of the correlations, which is negatively associated with the IBM return variances. Except for those corresponding to the first, the second and the fourth stationary components with their equal signs, the columns of $\varLambda^{(2)}$ have a rather mixed pattern. Like the nonstationary factors, the $I(d^{(2)})$ components also affect variance and correlation dynamics at the same time and therefore induce fractional cointegration between log variances and z-transformed correlations.

The finding of nonstationary fractional components affecting variances and correlations at the same time is new to the literature and may have remarkable consequences on portfolio selection and hedging opportunities, even at longer horizons. These effects should also be relevant to systemic risk measures as considered by central banks and regulators worldwide. To shed further light on the practical value of our approach, we turn to an evaluation of the forecasting precision in a real-world scenario in the following section.

5.4 An out-of-sample comparison

We assess the forecasting performance of our model by means of an out-of-sample comparison. To avoid dependence of the forecasts on the out-of-sample periods, we conduct a semiparametric specification search along the lines of section 5.2 for the first estimation sample only, i.e. for $y_{t}$ , $t=1,\ldots,1508$ , while $t=1509,\ldots,2156$ is reserved for prediction and therefore not used for selecting the specification. In this way, the model for the forecasting comparison includes $s_{1}=2$ , $s_{2}=7$ and $s_{0}=3$ components of different integration orders. Rather than conducting comprehensive comparisons of a wide range of available methods, which is beyond the scope of this paper, we select straightforward and simple benchmark models which have performed well in previous studies.

We choose the same out-of-sample setup as in Weigand (2014). Thus, for each $T^{\prime}\in[1508;2156-h]$ , various competing models are estimated for a rolling sample with $n=1508$ observations, $y_{T^{\prime}-1507},\ldots,y_{T^{\prime}}$ . From these estimates, forecasts of $y_{T^{\prime}+h}$ , $h=1,5,10,20$ , are computed. Also in line with Weigand (2014), we compute bias-corrected forecasts of the realized covariance matrices $\hat{X}_{T^{\prime}+h|T^{\prime}}$ by the simulation-based technique discussed there. We evaluate the forecasting accuracy using the ex-post available data of the respective period.
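The rolling scheme can be made concrete with a small sketch that enumerates, for a given horizon $h$, the estimation window and the forecast target for every origin $T^{\prime}\in[1508;2156-h]$. The function name `rolling_windows` and its keyword arguments are hypothetical; time indices are 1-based as in the text.

```python
def rolling_windows(h, first_origin=1508, last_obs=2156, window=1508):
    """Yield (window_start, origin, target) for each forecast origin T'.

    The estimation sample is y_{T'-window+1}, ..., y_{T'} and the
    forecast target is y_{T'+h}; all indices are 1-based.
    """
    for origin in range(first_origin, last_obs - h + 1):
        yield origin - window + 1, origin, origin + h


for h in (1, 5, 10, 20):
    windows = list(rolling_windows(h))
    first, last = windows[0], windows[-1]
    print(f"h={h}: {len(windows)} origins, first={first}, last={last}")
```

Every window contains exactly $n=1508$ observations, and the number of forecast origins shrinks by one for each additional step of the horizon, since the last target must not exceed $t=2156$.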

The forecasting precision is assessed using different loss functions defined in appendix B. We consider the Frobenius norm $LF_{T^{\prime},h}$ (17), the Stein norm $LS_{T^{\prime},h}$ (18) and the asymmetric loss $L3_{T^{\prime},h}$ (19); see Laurent et al. (2011) and Laurent et al. (2013). Additionally, the ex-ante minimum variance portfolio is computed from the forecast and its realized variance $LMV_{T^{\prime},h}$ (20) used as a loss with obvious economic relevance. Furthermore, we assess density forecasts $f_{r}$ of the daily returns using covariance matrices, which are evaluated at the daily returns $r_{T^{\prime}+h}$ in a logarithmic scoring rule $LD_{T^{\prime},h}$ (21).
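For orientation, the sketch below implements common textbook forms of three of these losses: the squared Frobenius distance, the Stein loss $\mathrm{tr}(\hat{X}^{-1}X)-\log|\hat{X}^{-1}X|-n$, and the realized variance of the ex-ante minimum variance portfolio with weights proportional to $\hat{X}^{-1}\iota$. These forms are assumptions made for illustration; the exact definitions and scalings used in the paper are those in (17), (18) and (20) of appendix B. The function names are hypothetical and NumPy is assumed to be available.

```python
import numpy as np


def frobenius_loss(forecast, realized):
    """Sum of squared element-wise forecast errors (assumed form)."""
    diff = realized - forecast
    return float(np.sum(diff ** 2))


def stein_loss(forecast, realized):
    """tr(F^{-1} X) - log det(F^{-1} X) - n (assumed form)."""
    n = forecast.shape[0]
    ratio = np.linalg.solve(forecast, realized)  # F^{-1} X without explicit inverse
    _, logdet = np.linalg.slogdet(ratio)
    return float(np.trace(ratio) - logdet - n)


def min_variance_loss(forecast, realized):
    """Realized variance of the portfolio that is minimum-variance ex ante."""
    ones = np.ones(forecast.shape[0])
    raw = np.linalg.solve(forecast, ones)
    weights = raw / raw.sum()  # weights sum to one
    return float(weights @ realized @ weights)


# Hypothetical 2x2 example: identity forecast, diagonal realization
F = np.eye(2)
X = np.diag([2.0, 0.5])
losses = (frobenius_loss(F, X), stein_loss(F, X), min_variance_loss(F, X))
```

The Frobenius and Stein losses vanish when the forecast equals the realization, whereas the minimum variance loss is bounded below by the variance of the ex-post optimal portfolio.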

As benchmarks, we consider two linear models for the log variance and z-transformed correlation series $y_{t}$ , namely a diagonal vector ARMA(2,1) and a diagonal vector ARFIMA(1, $d$ ,1) model, which have been found to perform well by Weigand (2014). Additionally, the diagonal vector ARFIMA(1, $d$ ,1) model is applied to the Cholesky factors of the covariance matrices (Chiriac and Voev, 2011). Furthermore, we consider models with a conditional Wishart distribution, namely the conditional autoregressive Wishart (CAW) model of Golosnoy et al. (2012), a dynamic correlation specification (CAW-DCC) of Bauwens et al. (2012), and additive and multiplicative components Wishart models as proposed by Jin and Maheu (2013). For further details on the comparison models consult appendix B.

For each loss function and horizon $h$ , we compute the average losses (risks) for all models and obtain model confidence sets of Hansen et al. (2011), bootstrapping the max- $t$ statistic with a block length of $\max\{5,h\}$ . In tables 5, 6, 7 and 8, we present the risks for $h=1,5,10,20$ . The best performing model (∗∗∗) as well as members of the 80% model confidence set (∗∗) and models contained in the 90% but not in the 80% set (∗) are indicated.

The fractional components model is among the best competitors for all horizons and loss functions. It has lowest risks for almost all setups. Exceptions occur for $h\geq 10$ where the ARFIMA model for log variances and z-correlations performs best in some cases. Overall, the ARFIMA model on $y_{t}$ appears as second best in terms of forecasting precision.

The DOFC model is always contained in the 80% model confidence set whereas all other models are rejected at least in some cases. For the Stein loss and the minimum-variance loss, the DOFC model is significantly superior to most competitors for small horizons, while with the Frobenius and asymmetric loss, rejections of other models are achieved for $h=10$ and $h=20$ .

The performance of the fractional components model in terms of density forecasting is noteworthy. In each case there, our model is either the single member or one of two models in the confidence set and hence significantly outperforms most of the competitors. Since the behaviour of future daily returns is usually more important than the realized measures themselves, this finding is particularly strong from a practitioner’s perspective.

Overall, we find a very good forecast performance of the model proposed in this paper. Although for some criteria and horizons statistical significance is lacking, the model yields very precise forecasts in relation to different competitors for all considered horizons and for several ways to measure this precision.

6 Conclusion

We have suggested a general setup and a parsimonious model with very general fractional integration and cointegration properties. We discussed the usefulness of our approach for multivariate realized volatility modeling. In our application it was shown to provide a reasonable in-sample fit and competitive out-of-sample forecasting accuracy.

Several questions remain for further research. From an empirical point of view, we have shown the relevance of a very restricted specification in financial econometrics, but the general setup we introduced has a broader scope. Fractional components models with rich short-run dynamics may be considered for models of smaller dimension. In several empirical setups, fractional integration and cointegration have been found relevant, so that dynamic modeling, forecasting, identification of structural shocks and impulse response analyses in a corresponding framework are a fruitful direction of ongoing research.

Acknowledgements

The research of Roland Weigand has mostly been done at the Institute of Economics and Econometrics of the University of Regensburg and at the Institute for Labour Market Research (IAB) in Nuremberg. Very valuable comments by Rolf Tschernig, by Enzo Weber and by participants of the Interdisciplinary Workshop on Multivariate Time Series Modeling 2011 in Louvain La Neuve, at the Statistische Woche 2011 in Leipzig, and of research seminars at the Universities of Regensburg, Augsburg and Bielefeld are gratefully acknowledged. The authors are also thankful to Niels Aka for providing R codes to estimate model confidence sets. Tobias Hartl gratefully acknowledges support through the projects TS283/1-1 and WE4847/4-1 financed by the German Research Foundation (DFG).

Appendix A Details on alternative representations

In this appendix we provide more details on the derivation of the alternative representations of the fractional components model (1) which we discuss in section 3.

To derive the error correction representation (6), we start from the FC setup with $q=2$ and $s=p$ ,

[TABLE]

from which we note that

[TABLE]

We define

[TABLE]

and make use of $I=N+M$ (Johansen, 2008), to obtain

[TABLE]

Adding and subtracting $\Delta^{d^{(2)}}Ny_{t}$ on the right side of (14) and using the decomposition $N=-\alpha\beta^{\prime}$ yields (6).

Next, we consider the triangular representation; see (8) and (9). The first block, (8), is easily obtained. Since $\varLambda^{(1,1)}$ is nonsingular and we also assumed a nonsingular covariance matrix of the white noise sequence $\xi_{t}$ , we find that the first term on the right is $I(0)$ with positive definite spectral density while the other terms have integration orders lower than zero, leading to $\omega_{t}^{(1)}\sim I(0)$ . To arrive at the $j$ -th block of the system, consider the expression for $y_{t}^{(j)}$ ,

[TABLE]

Since $\Delta^{d^{(j)}}x_{t}^{(i)}$ is integrated of order zero or lower for $i\geq j$ , we can write

[TABLE]

where $\tilde{\omega}_{t}^{j}\sim I(0)$ . To substitute for the latent variables in this expression, consider

[TABLE]

with $\check{\omega}_{t}^{j}\sim\;I(0)$ which we can solve for

[TABLE]

Substituting this expression into (A) yields the general expression (9) for the $j$ -th block of the triangular system for $j=2,\ldots,q$ , where

[TABLE]

which can be stated in greater detail as

[TABLE]

This process is the sum of several negatively integrated processes plus a white noise process

[TABLE]

so that we conclude that $\omega_{t}^{(j)}$ is $I(0)$ with positive definite spectral density at zero frequency.
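The integration-order arguments in this appendix rest on the fractional difference operator $\Delta^{d}=(1-L)^{d}$, whose binomial expansion has coefficients $\pi_{0}=1$ and $\pi_{k}=\pi_{k-1}(k-1-d)/k$. The following minimal sketch applies this filter to a finite sample, truncating the expansion at the first observation; this truncation is an assumption made for illustration, and the function names are hypothetical.

```python
def frac_diff_weights(d, n_terms):
    """First n_terms coefficients of (1 - L)^d via the standard recursion."""
    weights = [1.0]
    for k in range(1, n_terms):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights


def frac_diff(x, d):
    """Apply (1 - L)^d to x, treating pre-sample values as zero (assumption)."""
    w = frac_diff_weights(d, len(x))
    return [sum(w[k] * x[t - k] for k in range(t + 1)) for t in range(len(x))]


# d = 1 reproduces ordinary first differences (with the first value kept)
levels = [1.0, 3.0, 6.0, 10.0]
first_diff = frac_diff(levels, 1.0)

# For fractional d the weights decay slowly instead of cutting off
half_weights = frac_diff_weights(0.5, 4)
```

For integer $d$ the weights vanish after lag $d$, while for fractional $d$ they decay hyperbolically, which is the source of the long memory discussed in the main text.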

We arrive at the representation (10) where $B$ is partitioned into blocks according to

[TABLE]

In case $p>s$ , we have

[TABLE]

and the representation (10) is changed to

[TABLE]

where $B$ is extended by the $p-s$ rows $(B^{(q+1,1)},\ldots,B^{(q+1,q)},I)$ .

Appendix B Details on the out-of-sample comparison

In this section we give further details on the out-of-sample evaluation of section 5.4. We state the loss functions to evaluate the forecasts as well as the specifications of the benchmark models and their estimation.

For given forecasted realized covariance matrices $X_{T^{\prime}+h|T^{\prime}}$ and realizations $X_{T^{\prime}+h}$ , the loss functions considered in this paper are the Frobenius norm ( $LF_{T^{\prime},h}$ ), the Stein distance ( $LS_{T^{\prime},h}$ ), the asymmetric loss ( $L3_{T^{\prime},h}$ ), the realized variance of the ex-ante minimum variance portfolio ( $LMV_{T^{\prime},h}$ ), and the negative log-score of density forecasts $f_{r}$ ( $LD_{T^{\prime},h}$ ), given by

[TABLE]

As comparison models we consider three linear models in transformed covariance matrices, namely the diagonal vector ARMA(2,1) model

[TABLE]

for the log variance and z-correlation series $y_{t}$ , a diagonal vector ARFIMA(1, $d$ ,1) model

[TABLE]

for $y_{t}$ and the same model (23) applied to Cholesky factors. The same model orders have been used by Chiriac and Voev (2011) and Weigand (2014) and were found to compete favorably with other choices. The dynamic parameters of these models are estimated by Gaussian quasi maximum likelihood equation by equation, with no cross-equation restrictions such as equality of memory parameters. A full covariance matrix of the error terms is estimated from the residuals.

The other four benchmark models are based on a conditional Wishart distribution,

[TABLE]

where $\mathcal{I}_{t}$ is the information set consisting of $X_{s}$ , $s\leq t$ , $W_{n}$ denotes the central Wishart density, $\nu$ is the scalar degrees of freedom parameter and $S_{t}/\nu$ is a $(6\times 6)$ positive definite scale matrix, which is related to the conditional mean of $X_{t}$ by $E[X_{t}|{\cal I}_{t-1}]=S_{t}$ . The baseline CAW(p,q) model of Golosnoy et al. (2012) specifies the conditional mean as

[TABLE]

$C$ , $B_{j}$ and $A_{j}$ denoting $(6\times 6)$ parameter matrices, while the CAW-DCC model of Bauwens et al. (2012) employs a decomposition $S_{t}=H_{t}P_{t}H_{t}^{\prime}$ where $H_{t}$ is diagonal and $P_{t}$ is a well-defined correlation matrix. As a sparse and simple DCC benchmark we apply univariate realized GARCH( $p_{v}$ , $q_{v}$ ) specifications for the realized variances

[TABLE]

along with the ‘scalar Re-DCC’ model (Bauwens et al., 2012) for the realized correlation matrix $R_{t}$ ,

[TABLE]

The diagonal CAW( $p$ , $q$ ) and the CAW-DCC( $p$ , $q$ ) specification with $p=p_{v}=p_{c}=2$ and $q=q_{v}=q_{c}=1$ are selected since they provide a reasonable in-sample fit among various order choices. They are estimated by maximum likelihood using variance and correlation targeting.
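As a brief check of the Wishart parameterization used for these benchmarks, recall the standard result that a central Wishart matrix with $\nu$ degrees of freedom and scale matrix $\Sigma$ has mean $\nu\Sigma$. With the scale matrix $S_{t}/\nu$ stated above, this gives

$$
E[X_{t}\,|\,\mathcal{I}_{t-1}]=\nu\cdot\frac{S_{t}}{\nu}=S_{t},
$$

so that the recursions for $S_{t}$ can be read directly as models for the conditional mean of the realized covariance matrix, irrespective of the value of $\nu$.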

Bibliography

Anderson, T. W. and Rubin, H. (1956). Statistical inference in factor analysis, in J. Neyman (ed.), Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, Vol. V, University of California Press.
Asai, M. and McAleer, M. (2015). Forecasting co-volatilities via factor models with asymmetry and long memory in realized covariance, Journal of Econometrics 189 (2): 251–262.
Avarucci, M. and Velasco, C. (2009). A Wald test for the cointegration rank in nonstationary fractional systems, Journal of Econometrics 151 (2): 178–189.
Bauer, D. and Wagner, M. (2012). A state space canonical form for unit root processes, Econometric Theory 28 (6): 1313–1349.
Bauer, G. H. and Vorkink, K. (2011). Forecasting multivariate realized stock market volatility, Journal of Econometrics 160 (1): 93–101.
Bauwens, L., Storti, G. and Violante, F. (2012). Dynamic conditional correlation models for realized covariance matrices, CORE Discussion Paper 2012-60. https://EconPapers.repec.org/RePEc:cor:louvco:2012060
Beltratti, A. and Morana, C. (2006). Breaks and persistency: macroeconomic causes of stock market volatility, Journal of Econometrics 131 (1-2): 151–177.