Classical and bayesian componentwise predictors for non-compact correlated ARH(1) processes

M. Dolores Ruiz-Medina; J. Álvarez-Liébana

arXiv:1704.05630 · stat.AP · September 5, 2018


Open Access

TL;DR

This paper investigates classical and Bayesian componentwise predictors for a special class of Gaussian ARH(1) processes with non-Hilbert-Schmidt autocorrelation operators, providing theoretical analysis and simulation validation.

Contribution

It introduces and analyzes diagonal classical and Bayesian estimators for non-compact ARH(1) processes, proving their asymptotic efficiency and equivalence.

Findings

01

Both estimators are asymptotically efficient.

02

Classical and Bayesian predictors are asymptotically equivalent.

03

Simulation confirms theoretical results.

Abstract

A special class of standard Gaussian Autoregressive Hilbertian processes of order one (Gaussian ARH(1) processes), with bounded linear autocorrelation operator, which does not satisfy the usual Hilbert-Schmidt assumption, is considered. To compensate the slow decay of the diagonal coefficients of the autocorrelation operator, a faster decay velocity of the eigenvalues of the trace autocovariance operator of the innovation process is assumed. As usual, the eigenvectors of the autocovariance operator of the ARH(1) process are considered for projection, since, here, they are assumed to be known. Diagonal componentwise classical and bayesian estimation of the autocorrelation operator is studied for prediction. The asymptotic efficiency and equivalence of both estimators is proved, as well as of their associated componentwise ARH(1) plugin predictors. A simulation study is undertaken to…

Tables

Table 1. Example 1. Empirical functional mean-square errors $EFMSE_{\overline{\rho}_{T}}$.

Sample size	Classical estimator $\widehat{\rho}_{T}$	Bayes estimator $\widetilde{\rho}_{T}^{-}$
$250$	$2.13 {(10)}^{- 3}$	$2.23 {(10)}^{- 3}$
$500$	$1.24 {(10)}^{- 3}$	$1.04 {(10)}^{- 3}$
$750$	$8.44 {(10)}^{- 4}$	$7.13 {(10)}^{- 4}$
$1000$	$6.91 {(10)}^{- 4}$	$5.84 {(10)}^{- 4}$
$1250$	$5.97 {(10)}^{- 4}$	$4.72 {(10)}^{- 4}$
$1500$	$4.89 {(10)}^{- 4}$	$3.98 {(10)}^{- 4}$
$1750$	$4.13 {(10)}^{- 4}$	$3.06 {(10)}^{- 4}$
$2000$	$3.61 {(10)}^{- 4}$	$2.59 {(10)}^{- 4}$

Table 2. Example 1. Empirical functional mean-square errors $EFMSE_{\overline{\rho}_{T}\left(X_{T}\right)}$.

Sample size	Classical predictor $\widehat{\rho}_{T}(X_{T})$	Bayes predictor $\widetilde{\rho}_{T}^{-}(X_{T})$
$250$	$1.22 {(10)}^{- 3}$	$1.42 {(10)}^{- 3}$
$500$	$6.08 {(10)}^{- 4}$	$6.36 {(10)}^{- 4}$
$750$	$3.24 {(10)}^{- 4}$	$4.06 {(10)}^{- 4}$
$1000$	$3.05 {(10)}^{- 4}$	$2.77 {(10)}^{- 4}$
$1250$	$2.74 {(10)}^{- 4}$	$2.39 {(10)}^{- 4}$
$1500$	$2.07 {(10)}^{- 4}$	$1.78 {(10)}^{- 4}$
$1750$	$1.71 {(10)}^{- 4}$	$1.48 {(10)}^{- 4}$
$2000$	$1.64 {(10)}^{- 4}$	$1.42 {(10)}^{- 4}$

Table 3. Example 2. Empirical functional mean–square errors $EFMSE_{\overline{\rho}_{T}}$.

Sample size	Classical estimator $\widehat{\rho}_{T}$	Bayes estimator $\widetilde{\rho}_{T}^{-}$
$250$	$4.18 {(10)}^{- 3}$	$6.09 {(10)}^{- 3}$
$500$	$2.20 {(10)}^{- 3}$	$2.30 {(10)}^{- 3}$
$750$	$1.52 {(10)}^{- 3}$	$1.39 {(10)}^{- 3}$
$1000$	$1.14 {(10)}^{- 3}$	$1.00 {(10)}^{- 3}$
$1250$	$9.55 {(10)}^{- 4}$	$7.97 {(10)}^{- 4}$
$1500$	$7.97 {(10)}^{- 4}$	$6.64 {(10)}^{- 4}$
$1750$	$7.01 {(10)}^{- 4}$	$5.37 {(10)}^{- 4}$
$2000$	$6.22 {(10)}^{- 4}$	$5.00 {(10)}^{- 4}$

Table 4. Example 2. Empirical functional mean–square errors $EFMSE_{\overline{\rho}_{T}\left(X_{T}\right)}$.

Sample size	Classical predictor $\widehat{\rho}_{T}(X_{T})$	Bayes predictor $\widetilde{\rho}_{T}^{-}(X_{T})$
$250$	$3.25 {(10)}^{- 3}$	$3.18 {(10)}^{- 4}$
$500$	$1.59 {(10)}^{- 3}$	$1.40 {(10)}^{- 4}$
$750$	$9.47 {(10)}^{- 4}$	$8.19 {(10)}^{- 4}$
$1000$	$7.89 {(10)}^{- 4}$	$6.88 {(10)}^{- 4}$
$1250$	$7.24 {(10)}^{- 4}$	$6.10 {(10)}^{- 4}$
$1500$	$5.53 {(10)}^{- 4}$	$4.77 {(10)}^{- 4}$
$1750$	$5.31 {(10)}^{- 4}$	$4.49 {(10)}^{- 4}$
$2000$	$4.61 {(10)}^{- 4}$	$4.00 {(10)}^{- 4}$

Table 5. Example 3. Empirical functional mean-square errors $EFMSE_{\overline{\rho}_{T}}$.

Sample size	$k_{T}$	Classical estimator $\widehat{\rho}_{T}$	Bayes estimator $\widetilde{\rho}_{T}^{-}$
$250$	$3$	$1.73 {(10)}^{- 3}$	$1.52 {(10)}^{- 3}$
$500$	$4$	$9.72 {(10)}^{- 4}$	$1.01 {(10)}^{- 3}$
$750$	$5$	$6.98 {(10)}^{- 4}$	$7.10 {(10)}^{- 4}$
$1000$	$5$	$5.63 {(10)}^{- 4}$	$4.35 {(10)}^{- 4}$
$1250$	$5$	$4.49 {(10)}^{- 4}$	$2.84 {(10)}^{- 4}$
$1500$	$5$	$3.94 {(10)}^{- 4}$	$2.24 {(10)}^{- 4}$
$1750$	$6$	$3.31 {(10)}^{- 4}$	$1.84 {(10)}^{- 4}$
$2000$	$7$	$3.05 {(10)}^{- 4}$	$1.70 {(10)}^{- 4}$

Table 6. Example 3. Empirical functional mean–square errors $EFMSE_{\overline{\rho}_{T}\left(X_{T}\right)}$.

Sample size	$k_{T}$	Classical predictor $\widehat{\rho}_{T}(X_{T})$	Bayes predictor $\widetilde{\rho}_{T}^{-}(X_{T})$
$250$	$3$	$1.92 {(10)}^{- 3}$	$1.31 {(10)}^{- 3}$
$500$	$4$	$8.24 {(10)}^{- 4}$	$5.75 {(10)}^{- 4}$
$750$	$5$	$5.60 {(10)}^{- 4}$	$4.08 {(10)}^{- 4}$
$1000$	$5$	$3.52 {(10)}^{- 4}$	$2.54 {(10)}^{- 4}$
$1250$	$5$	$2.62 {(10)}^{- 4}$	$1.45 {(10)}^{- 4}$
$1500$	$5$	$2.00 {(10)}^{- 4}$	$1.02 {(10)}^{- 4}$
$1750$	$6$	$1.37 {(10)}^{- 4}$	$9.57 {(10)}^{- 5}$
$2000$	$6$	$1.13 {(10)}^{- 4}$	$8.55 {(10)}^{- 5}$

Equations

$\theta_{j}\perp\left\{X_{i,j^{\prime}},\ i\geq 1,\ j^{\prime}\neq j\right\},$

$\langle{\rm E}\left\{\theta|X_{1},\ldots,X_{n}\right\},v_{j}\rangle_{H}={\rm E}\left\{\theta_{j}|X_{1},\ldots,X_{n}\right\}={\rm E}\left\{\theta_{j}|X_{1,j},\ldots,X_{n,j}\right\}.$

$X_{n}=Y_{n}-\mu=\rho(Y_{n-1}-\mu)+\varepsilon_{n}=\rho(X_{n-1})+\varepsilon_{n},\quad n\in\mathbb{Z},$

$C$

$f\otimes g(h)=f\langle g,h\rangle_{H},\quad\forall h\in H,$

${\rm E}\left\{\|X_{n}\|_{H}^{2}\right\}<\infty,\quad n\in\mathbb{Z}.$

$\langle C^{-1}(f),C^{-1}(g)\rangle_{H}=\langle C^{-1}(C(\varphi)),C^{-1}(C(\phi))\rangle_{H}=\langle\varphi,\phi\rangle_{H}.$

$\|C^{-1}(f)\|_{H}^{2}<\infty,\quad\forall f\in H.$

$C(g)(f)$

$\rho(g)(f)$

$|\rho(g)(f)|^{2}$

$X_{n}=\sum_{k=1}^{\infty}\sqrt{C_{k}}\,\eta_{k}(n)\phi_{k},\quad\eta_{k}(n)=\frac{1}{\sqrt{C_{k}}}\langle X_{n},\phi_{k}\rangle_{H},$

${\rm E}\left\{\eta_{j}(n)\eta_{p}(n)\right\}$

$\sqrt{C_{k}}\,\eta_{k}(n)=\rho_{k}\sqrt{C_{k}}\,\eta_{k}(n-1)+\varepsilon_{k}(n),\quad k\geq 1,$

$\eta_{k}(n)=\rho_{k}\eta_{k}(n-1)+\frac{\varepsilon_{k}(n)}{\sqrt{C_{k}}},\quad k\geq 1,$

$\varepsilon_{k}(n)=\langle\varepsilon_{n},\phi_{k}\rangle_{H},\quad k\geq 1,\ n\in\mathbb{Z}.$

$\left\{a_{j}(n)=\sqrt{C_{j}}\,\eta_{j}(n),\ n\in\mathbb{Z}\right\}$

$a_{j}(n)=\sum_{k=0}^{\infty}[\rho_{j}]^{k}\varepsilon_{j}(n-k),\quad n\in\mathbb{Z}.$

${\rm E}\left\{a_{j}(n)a_{p}(n)\right\}$

$\sigma_{j}^{2}={\rm E}\left\{\varepsilon_{j}(n-k)\right\}^{2}={\rm E}\left\{\varepsilon_{j}(0)\right\}^{2}.$

${\rm E}\left\{\|X(n)\|_{H}^{2}\right\}$

$\sum_{j=1}^{\infty}\sigma_{j}^{2}={\rm E}\left\{\|\varepsilon_{n}\|_{H}^{2}\right\}<\infty.$

$C_{j}=\left[\frac{\sigma_{j}^{2}}{1-\rho_{j}^{2}}\right],\quad j\geq 1,$

$\rho_{k}=\sqrt{1-\frac{\sigma_{k}^{2}}{\lambda_{k}(C)}},\quad\sigma_{k}^{2}={\rm E}\left\{\langle\phi_{k},\varepsilon_{n}\rangle_{H}\right\}^{2},\quad\forall n\in\mathbb{Z},\ k\geq 1.$

$\eta_{k}(n)=\rho_{k}\eta_{k}(n-1)+\sqrt{1-\rho_{k}^{2}}\,\frac{\varepsilon_{k}(n)}{\sigma_{k}},\quad k\geq 1,$

$\left\{\sigma_{k}^{2},\ k\geq 1\right\},\quad\left\{C_{k},\ k\geq 1\right\}$

$\frac{\sigma_{k}^{2}}{C_{k}}\leq 1,\ k\geq 1,\quad\lim_{k\rightarrow\infty}\frac{\sigma_{k}^{2}}{C_{k}}=0,$

$\frac{\sigma_{k}^{2}}{C_{k}}=\mathcal{O}(k^{-1-\gamma}),\quad\gamma>0,\ k\rightarrow\infty.$

$\sum_{n=1}^{T}[\eta_{k}(n-1)]^{2}$

$S(T,k)=\sum_{n=1}^{T}\sum_{l=1}^{\infty}\sum_{p=1}^{\infty}[\rho_{k}]^{l}[\rho_{k}]^{p}\varepsilon_{k}(n-1-l)\varepsilon_{k}(n-1-p).$

$\inf_{T\geq 1}\frac{S(T,k)}{T\left(\sum_{n=1}^{T-1}[\varepsilon_{k}(n)]^{2}+[\varepsilon_{k}(0)]^{2}\right)}$


Taxonomy

Topics: Control Systems and Identification · Financial Risk and Volatility Modeling · Statistical Methods and Inference

Full text

Classical and bayesian componentwise predictors for non-compact correlated ARH(1) processes

M. Dolores Ruiz–Medina¹ and Javier Álvarez-Liébana¹

Summary

A special class of standard Gaussian Autoregressive Hilbertian processes of order one (Gaussian ARH(1) processes), with bounded linear autocorrelation operator, which does not satisfy the usual Hilbert-Schmidt assumption, is considered. To compensate for the slow decay of the diagonal coefficients of the autocorrelation operator, a faster decay velocity of the eigenvalues of the trace autocovariance operator of the innovation process is assumed. As usual, the eigenvectors of the autocovariance operator of the ARH(1) process are considered for projection, since, here, they are assumed to be known. Diagonal componentwise classical and Bayesian estimation of the autocorrelation operator is studied for prediction. The asymptotic efficiency and equivalence of both estimators are proved, as well as those of their associated componentwise ARH(1) plug-in predictors. A simulation study is undertaken to illustrate the theoretical results derived.

Published in REVSTAT (in press)

https://www.ine.pt/revstat/pdf/Classicalandbayesiancomponentwise.pdf

¹ Department of Statistics and O. R., University of Granada, Spain.

E-mail: [email protected]

Key words: Asymptotic efficiency; autoregressive Hilbertian processes; bayesian estimation; classical moment-based estimation; functional prediction; non-compact bounded autocorrelation operators.

1 Introduction

Functional time series theory plays a key role in the analysis of high-dimensional data (see, for example, Aue et al. [2015], Bosq [2000], Bosq and Blanke [2007]). Inference for stochastic processes can also be addressed from this framework (see Álvarez-Liébana et al. [2016] in relation to functional prediction of the Ornstein–Uhlenbeck process, in an ARH(1) process framework). Bosq [2000] addresses the problem of infinite–dimensional parameter estimation and prediction of ARH(1) processes, in the cases of known and unknown eigenvectors of the autocovariance operator. Alternative projection methodologies have been adopted, for example, in Antoniadis and Sapatinas [2003], in terms of wavelet bases, and Besse and Cardot [1996], in terms of spline bases. The book by Bosq and Blanke [2007] provides a general overview of statistical prediction, including Bayesian predictors, inference by projection and kernel methods, empirical density estimation, and linear processes in high–dimensional spaces (see also Blanke and Bosq [2015] on Bayesian prediction for stochastic processes). Recently, Bosq and Ruiz-Medina [2014] have derived new results on asymptotic efficiency and equivalence of classical and Bayes predictors for $l^{2}$–valued Poisson processes, where, as usual, $l^{2}$ denotes the Hilbert space of square summable sequences. Classical and Bayesian componentwise parameter estimators of the mean function and autocovariance operator, characterizing Gaussian measures in Hilbert spaces, are also compared in that paper in terms of their asymptotic efficiency.

The class of processes studied here could be of interest in applications, for instance, in the context of anomalous physical diffusion processes (see, for example, Gorenflo and Mainardi [2003], Meerschaert et al. [2002], Metzler and Klafter [2004], and the references therein). An interesting example of our framework corresponds to the case of a spatial fractal diffusion operator and regular innovations. Specifically, the class of standard Gaussian ARH(1) processes studied has a bounded linear autocorrelation operator, admitting a weak–sense diagonal spectral representation, in terms of the eigenvectors of the autocovariance operator. The sequence of diagonal coefficients, in such a spectral representation, displays an accumulation point at one. The singularity of the autocorrelation kernel is compensated by the regularity of the autocovariance kernel of the innovation process. Namely, the key assumption here is the summability of the quotient between the eigenvalues of the autocovariance operator of the innovation process and of the ARH(1) process. Under suitable conditions, the asymptotic efficiency and equivalence of the studied diagonal componentwise classical and Bayesian estimators of the autocorrelation operator are derived (see Theorem 4.1 below). Under the same set of conditions, the asymptotic efficiency and equivalence of the corresponding classical and Bayesian ARH(1) plug–in predictors are proved as well (see Theorem 4.2 below). Although both theorems only refer to the case of known eigenvectors of the autocovariance operator, as illustrated in the simulation study undertaken in Álvarez-Liébana et al. [2017] (see also Álvarez-Liébana [2017], Ruiz-Medina and Álvarez-Liébana [2018a]), a similar performance is obtained for the case of unknown eigenvectors, in comparison with other componentwise, kernel–based, wavelet–based penalized and nonparametric approaches adopted in the current literature (see Antoniadis and Sapatinas [2003], Besse and Cardot [1996], Bosq [2000], Guillas [2001], Mas [1999]).

Note that, for $\theta$ being the unknown parameter, in order to compute ${\rm E}\left\{\theta|X_{1},\ldots,X_{n}\right\},$ with $\left\{X_{1},\ldots,X_{n}\right\}$ denoting the functional sample, we suppose that

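$$\theta_{j}\perp\left\{X_{i,j^{\prime}},\ i\geq 1,\ j^{\prime}\neq j\right\},$$

which leads to

$$\langle{\rm E}\left\{\theta|X_{1},\ldots,X_{n}\right\},v_{j}\rangle_{H}={\rm E}\left\{\theta_{j}|X_{1},\ldots,X_{n}\right\}={\rm E}\left\{\theta_{j}|X_{1,j},\ldots,X_{n,j}\right\}.$$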

Here, for each $j\geq 1,$ $\theta_{j}=\langle\theta,v_{j}\rangle_{H},$ and $X_{i,j}=\langle X_{i},v_{j}\rangle_{H},$ for each $i=1,\dots,n,$ with $\langle\cdot,\cdot\rangle_{H}$ being the inner product in the real separable Hilbert space $H$ . Note that $\left\{v_{j},\ j\geq 1\right\}$ denotes an orthonormal basis of $H,$ diagonalizing the common autocovariance operator of $\left(X_{1},\ldots,X_{n}\right).$ We can then perform an independent computation of the respective posterior distributions of the projections $\left\{\theta_{j},\ j\geq 1\right\},$ of parameter $\theta,$ with respect to the orthonormal basis $\left\{v_{j},\ j\geq 1\right\}$ of $H.$

Finally, some numerical examples are considered to illustrate the results derived on asymptotic efficiency and equivalence of moment–based classical and Beta–prior–based Bayes diagonal componentwise parameter estimators, and the associated ARH(1) plug–in predictors.

2 Preliminaries

The preliminary definitions and results needed in the subsequent development are introduced in this section. We first refer to the usual class of standard ARH(1) processes introduced in Bosq [2000].

Definition 2.1

Let $H$ be a real separable Hilbert space. A sequence $Y=\left\{Y_{n},\ n\in\mathbb{Z}\right\}$ of $H$ –valued random variables on a basic probability space $(\Omega,\mathcal{A},\mathcal{P})$ is called an autoregressive Hilbertian process of order one, associated with $(\mu,\varepsilon,\rho),$ if it is stationary and satisfies

$$X_{n}=Y_{n}-\mu=\rho(Y_{n-1}-\mu)+\varepsilon_{n}=\rho(X_{n-1})+\varepsilon_{n},\quad n\in\mathbb{Z},\qquad(1)$$

where $\varepsilon=\left\{\varepsilon_{n},\ n\in\mathbb{Z}\right\}$ is a Hilbert–valued white noise in the strong sense (i.e., a zero–mean stationary sequence of independent $H$–valued random variables with ${\rm E}\left\{\|\varepsilon_{n}\|^{2}_{H}\right\}=\sigma^{2}<\infty,$ for every $n\in\mathbb{Z}$), and $\rho\in\mathcal{L}(H),$ with $\mathcal{L}(H)$ being the space of bounded linear operators on $H.$ For each $n\in\mathbb{Z},$ $\varepsilon_{n}$ and $X_{n-1}$ are assumed to be uncorrelated.

If there exists an integer $j_{0}\geq 1$ such that $\|\rho^{j_{0}}\|_{\mathcal{L}(H)}<1,$ then the ARH(1) process in (1) is standard, and there exists a unique stationary solution to equation (1) admitting a MAH($\infty$) representation (see [Bosq, 2000, Theorem 3.1, p. 74]).

The autocovariance and cross–covariance operators are given, for each $n\in\mathbb{Z}$ , by

[TABLE]

where, for $f,g\in H,$

$$f\otimes g(h)=f\langle g,h\rangle_{H},\quad\forall h\in H,$$

defines a Hilbert–Schmidt operator on $H.$ The operator $C$ is assumed to be in the trace class. In particular,

$${\rm E}\left\{\|X_{n}\|_{H}^{2}\right\}<\infty,\quad n\in\mathbb{Z}.$$

It is well-known that, from equations (1)–(2), for all $h\in H,$ $D(h)=\rho C(h)$ (see, for example, Bosq [2000]). However, since $C$ is a nuclear or trace operator, its inverse operator is an unbounded operator in $H.$ Different methodologies have been adopted to overcome this problem in the current literature on ARH(1) processes. In particular, here, we consider the case where $C(H)=H,$ under Assumption A2 below, since $C$ is assumed to be strictly positive. That is, its eigenvalues are strictly positive and the kernel space of $C$ is trivial. In addition, they are assumed to have multiplicity one. Therefore, for any $f,g\in H,$ there exist $\varphi,\phi\in H$ such that $f=C(\varphi)$ and $g=C(\phi),$ and

$$\langle C^{-1}(f),C^{-1}(g)\rangle_{H}=\langle C^{-1}(C(\varphi)),C^{-1}(C(\phi))\rangle_{H}=\langle\varphi,\phi\rangle_{H}.$$

In particular,

$$\|C^{-1}(f)\|_{H}^{2}<\infty,\quad\forall f\in H.$$

Assumption A1. The operator $\rho$ in (1) is self–adjoint with $\|\rho\|_{\mathcal{L}(H)}<1.$

Assumption A2. The operator $C$ is strictly positive, and its positive eigenvalues have multiplicity one. Furthermore, $C$ and $\rho$ admit the following diagonal spectral decompositions, such that for all $f,g\in H,$

[TABLE]

where $\{C_{k},\ k\geq 1\}$ and $\{\rho_{k},\ k\geq 1\}$ are the respective systems of eigenvalues of $C$ and $\rho,$ and $\{\phi_{k},\ k\geq 1\}$ is the common system of orthonormal eigenvectors of the autocovariance operator $C.$

Remark 2.1

As commented before, we consider here the case where the eigenvectors $\left\{\phi_{k},\ k\geq 1\right\}$ of the autocovariance operator $C$ are known. Thus, under Assumption A2, the natural way to formulate a componentwise estimator of the autocorrelation operator $\rho$ is in terms of the respective estimators of its diagonal coefficients $\left\{\rho_{k},\ k\geq 1\right\},$ computed from the respective projections of the observed functional data, $\left(X_{0},\ldots,X_{T}\right),$ into $\{\phi_{k},\ k\geq 1\}$ . We adopt here a moment–based classical and Beta–prior–based Bayesian approach in the estimation of such coefficients $\left\{\rho_{k},\ k\geq 1\right\}.$

From the Cauchy–Schwarz inequality, applying Parseval's identity,

[TABLE]

Thus, equation (4) holds in the weak sense.

From Assumption A2, the projection of $X_{n}$ into the common eigenvector system $\{\phi_{k},\ k\geq 1\}$ leads to the following series expansion in $\mathcal{L}^{2}_{H}(\Omega,\mathcal{A},\mathcal{P}):$

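$$X_{n}=\sum_{k=1}^{\infty}\sqrt{C_{k}}\,\eta_{k}(n)\phi_{k},\quad\eta_{k}(n)=\frac{1}{\sqrt{C_{k}}}\langle X_{n},\phi_{k}\rangle_{H},\qquad(5)$$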

and, for each $j,p\geq 1,$ and $n>0,$

[TABLE]

where the last equality is obtained from the orthonormality of the eigenvectors $\{\phi_{k},\ k\geq 1\}.$ Hence, under Assumptions A1–A2, the projection of equation (1) into the elements of the common eigenvector system $\{\phi_{k},\ k\geq 1\}$ leads to the following infinite-dimensional system of equations:

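$$\sqrt{C_{k}}\,\eta_{k}(n)=\rho_{k}\sqrt{C_{k}}\,\eta_{k}(n-1)+\varepsilon_{k}(n),\quad k\geq 1,\qquad(6)$$

or equivalently,

$$\eta_{k}(n)=\rho_{k}\eta_{k}(n-1)+\frac{\varepsilon_{k}(n)}{\sqrt{C_{k}}},\quad k\geq 1,\qquad(7)$$

where

$$\varepsilon_{k}(n)=\langle\varepsilon_{n},\phi_{k}\rangle_{H},\quad k\geq 1,\ n\in\mathbb{Z}.$$

Thus, for each $j\geq 1,$

$$\left\{a_{j}(n)=\sqrt{C_{j}}\,\eta_{j}(n),\ n\in\mathbb{Z}\right\}$$

defines a standard AR(1) process. Its moving average representation of infinite order is given by

$$a_{j}(n)=\sum_{k=0}^{\infty}[\rho_{j}]^{k}\varepsilon_{j}(n-k),\quad n\in\mathbb{Z}.\qquad(8)$$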

Specifically, under Assumption A2,

[TABLE]

where

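$$\sigma_{j}^{2}={\rm E}\left\{\varepsilon_{j}(n-k)\right\}^{2}={\rm E}\left\{\varepsilon_{j}(0)\right\}^{2}.$$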

From equation (9), under Assumptions A1–A2,

[TABLE]

with, as before,

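$$\sum_{j=1}^{\infty}\sigma_{j}^{2}={\rm E}\left\{\|\varepsilon_{n}\|_{H}^{2}\right\}<\infty.$$

Equation (10) leads to the identity

$$C_{j}=\left[\frac{\sigma_{j}^{2}}{1-\rho_{j}^{2}}\right],\quad j\geq 1,$$

from which, we obtain

$$\rho_{k}=\sqrt{1-\frac{\sigma_{k}^{2}}{\lambda_{k}(C)}},\quad\sigma_{k}^{2}={\rm E}\left\{\langle\phi_{k},\varepsilon_{n}\rangle_{H}\right\}^{2},\quad\forall n\in\mathbb{Z},\ k\geq 1.$$

Under (11), equation (7) can also be rewritten as

$$\eta_{k}(n)=\rho_{k}\eta_{k}(n-1)+\sqrt{1-\rho_{k}^{2}}\,\frac{\varepsilon_{k}(n)}{\sigma_{k}},\quad k\geq 1,$$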
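The standardized recursion can be checked numerically. The following is a minimal Python sketch, assuming the coordinates follow $\eta_{k}(n)=\rho_{k}\eta_{k}(n-1)+\sqrt{1-\rho_{k}^{2}}\,\varepsilon_{k}(n)/\sigma_{k}$ with Gaussian innovations, so that $\varepsilon_{k}(n)/\sigma_{k}$ is standard normal. The function name `simulate_eta`, the burn-in length and the coefficient values are hypothetical choices made only for illustration.

```python
import numpy as np

def simulate_eta(rho, T, burn_in=500, seed=0):
    """Simulate standardized coordinates eta_k(n), one column per k (illustrative)."""
    rng = np.random.default_rng(seed)
    rho = np.asarray(rho, dtype=float)
    # z[n, k] plays the role of eps_k(n) / sigma_k: independent N(0, 1) draws
    z = rng.standard_normal((T + burn_in, rho.size))
    eta = np.empty_like(z)
    eta[0] = z[0]                          # start from the stationary N(0, 1) law
    scale = np.sqrt(1.0 - rho ** 2)        # innovation scale sqrt(1 - rho_k^2)
    for n in range(1, T + burn_in):
        eta[n] = rho * eta[n - 1] + scale * z[n]
    return eta[burn_in:]                   # drop the burn-in segment

# Hypothetical diagonal coefficients, the last one close to one
eta = simulate_eta(rho=[0.3, 0.8, 0.95], T=20000)
print(eta.var(axis=0))                     # each entry should be close to 1
```

Under these assumptions every coordinate keeps unit variance whatever the value of $\rho_{k}$, which is why the $\eta_{k}(n)$ can be treated as standard Gaussian variables even when $\rho_{k}$ approaches one.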

Assumption A2B. The sequences

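$$\left\{\sigma_{k}^{2},\ k\geq 1\right\},\quad\left\{C_{k},\ k\geq 1\right\}$$

satisfy

$$\frac{\sigma_{k}^{2}}{C_{k}}\leq 1,\ k\geq 1,\quad\lim_{k\rightarrow\infty}\frac{\sigma_{k}^{2}}{C_{k}}=0,\quad\frac{\sigma_{k}^{2}}{C_{k}}=\mathcal{O}(k^{-1-\gamma}),\quad\gamma>0,\ k\rightarrow\infty.$$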

Equation (13) means that $\left\{\sigma^{2}_{k},\ k\geq 1\right\}$ and $\left\{C_{k},\ k\geq 1\right\}$ are both summable sequences, with faster decay to zero of the sequence $\left\{\sigma^{2}_{k},\ k\geq 1\right\}$ than the sequence $\left\{C_{k},\ k\geq 1\right\},$ leading, from equations (11)–(12), to the definition of $\left\{\rho_{k}^{2},\ k\geq 1\right\}$ as a sequence with accumulation point at one.
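As a numerical illustration of Assumption A2B, the following minimal Python sketch builds one hypothetical pair of sequences with the stated decay and recovers $\rho_{k}$ from $\rho_{k}^{2}=1-\sigma_{k}^{2}/C_{k}$. The power laws $C_{k}=k^{-\delta}$ and $\sigma_{k}^{2}/C_{k}=c\,k^{-1-\gamma}$, the function name and the parameter values are assumptions made here for illustration; they are not the settings of the examples in Section 5.

```python
import numpy as np

def a2b_sequences(n_terms=200, delta=1.5, gamma=0.5, c=0.5):
    """Hypothetical sequences compatible with Assumption A2B (illustrative only)."""
    k = np.arange(1, n_terms + 1, dtype=float)
    C = k ** (-delta)                   # assumed eigenvalues of C (summable for delta > 1)
    ratio = c * k ** (-1.0 - gamma)     # assumed quotient sigma_k^2 / C_k, summable and <= 1
    sigma2 = ratio * C                  # eigenvalues of the innovation autocovariance
    rho = np.sqrt(1.0 - ratio)          # from C_k = sigma_k^2 / (1 - rho_k^2)
    return C, sigma2, rho

C, sigma2, rho = a2b_sequences()
print(rho[:3], rho[-1])                 # coefficients increase towards one
print(np.sum(sigma2 / C))               # partial sum of the summable quotient
print(np.sum(rho ** 2))                 # grows with n_terms: rho is not Hilbert-Schmidt
```

Since $\rho_{k}\rightarrow 1,$ the partial sums of $\rho_{k}^{2}$ grow roughly linearly with the number of terms, which is the non-Hilbert–Schmidt behaviour that the assumption is designed to accommodate.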

Remark 2.2

Under Assumption A2B, Assumption A3 below holds.

For each $k\geq 1,$ from equations (6)–(8),

[TABLE]

where

[TABLE]

Hence, $\displaystyle\sum_{n=1}^{T}[\varepsilon_{k}(n-1)]^{2}+S(T,k)\geq 0,$ for every $T\geq 1,$ and $k\geq 1.$

Assumption A3. There exists a sequence of real-valued independent random variables $\left\{\widetilde{M}(k),\ k\geq 1\right\}$ such that

[TABLE]

with

[TABLE]

Remark 2.3

Note that the mean value of

[TABLE]

is of order $\frac{T\sigma_{k}^{2}}{1-(\rho_{k})^{2}},$ and the mean value of

[TABLE]

is of order $T(T-1)\sigma_{k}^{2}.$ Hence, for the almost sure boundedness of the inverse of

[TABLE]

by a suitable sequence of random variables with summable $l$ –moments, for $l=1,2,3,4,$ the eigenvalues of operator $\rho$ must be close to one but strictly less than one. As commented in Remark 2.2, from Assumption A2B, this condition is satisfied in view of equation (12).

Assumption A4. ${\rm E}\left\{\eta_{j}(m)\eta_{k}(n)\right\}=\delta_{j,k},$ with, as before, $\delta_{j,k}$ denoting the Kronecker delta function, for every $m,n\in\mathbb{Z},$ and $j,k\geq 1.$

Remark 2.4

Assumption A4 implies that the cross–covariance operator $D$ admits a diagonal spectral decomposition in terms of the system of eigenvectors $\left\{\phi_{k},\ k\geq 1\right\}.$ Thus, under Assumption A4, the diagonal spectral decompositions (3)–(4) also hold.

The classical diagonal componentwise estimator $\widehat{\rho}_{T}$ of $\rho$ considered here is given by

[TABLE]

From equations (6)–(7) and (11), for each $k\geq 1,$

[TABLE]
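The displayed formulas for $\widehat{\rho}_{T}$ are not reproduced in this text, so the Python sketch below assumes the usual moment-based AR(1) form applied coordinate by coordinate, $\widehat{\rho}_{k,T}=\sum_{i=1}^{T}X_{i-1,k}X_{i,k}\big/\sum_{i=1}^{T}X_{i-1,k}^{2},$ with $X_{i,k}=\langle X_{i},\phi_{k}\rangle_{H}.$ This may differ in detail from equation (15); the function name and the simulated input are hypothetical.

```python
import numpy as np

def classical_diagonal_estimator(X):
    """X has shape (T + 1, k_T): row i holds the projections X_{i,k} = <X_i, phi_k>."""
    X = np.asarray(X, dtype=float)
    num = np.sum(X[:-1] * X[1:], axis=0)    # sum_i X_{i-1,k} X_{i,k}
    den = np.sum(X[:-1] ** 2, axis=0)       # sum_i X_{i-1,k}^2
    return num / den

# Hypothetical check on simulated coordinates with known diagonal coefficients
rng = np.random.default_rng(0)
rho_true = np.array([0.5, 0.8, 0.9])
T = 5000
X = np.zeros((T + 1, rho_true.size))
for i in range(1, T + 1):
    X[i] = rho_true * X[i - 1] + rng.standard_normal(rho_true.size)
print(classical_diagonal_estimator(X))      # should lie near rho_true
```

Because the eigenvectors are assumed known, each coordinate is estimated separately from its own scalar AR(1) series, and no estimate of the inverse of $C$ is required.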

Remark 2.5

It is important to note that, for instance, unconditional bases, like wavelets, provide the spectral diagonalization of an extensive family of operators, including pseudodifferential operators, and in particular, Calderón–Zygmund operators (see Kyriazis and Petrushev [2001], Meyer and Coifman [1997]). Therefore, the diagonal spectral representations (3)–(4), in Assumption A2, hold for a wide class of autocovariance and cross-covariance operators, for example, in terms of wavelets. When the autocovariance and the cross–covariance operators are related by a continuous function, the diagonal spectral representations (3)–(4) are also satisfied (see [Dautray and Lions, 1990, pp. 119, 126 and 140]). Assumption A2 has been considered, for example, in [Bosq, 2000, Theorem 8.5, pp. 215–216; Theorem 8.7, p. 221], to establish strong consistency, although, in this book, a different set of conditions is assumed. Thus, Assumptions A1–A2 have already been used (e.g., in Bosq [2000], Álvarez-Liébana et al. [2017], Ruiz-Medina and Álvarez-Liébana [2018a]), and Assumptions A2B, A3 and A4 appear in Ruiz-Medina et al. [2016]. Assumption A2B is needed since the usual assumption on the Hilbert–Schmidt property of $\rho,$ made by several authors, is not considered here. At the same time, as commented before, Assumption A2B implies Assumption A3.

The following lemmas will be used in the derivation of the main results of this paper, Theorems 4.1 and 4.2, obtained in the Gaussian ARH(1) context.

Lemma 2.1

Let $\left\{\mathcal{X}_{i},\ i=1,\dots,n\right\},$ be the values of a standard zero–mean autoregressive process of order one (AR(1) process) at times $i=1,\dots,n,$ and

[TABLE]

with $\mathcal{X}_{1}$ representing the random initial condition. Assume that $|\rho|<1,$ and that the innovation process is white noise. Then, as $n\rightarrow\infty,$

[TABLE]

The proof of Lemma 2.1 can be found in [Hamilton, 1994, p. 216].

Lemma 2.2

Let $\mathcal{X}_{1}$ and $\mathcal{X}_{2}$ be two normally distributed random variables having correlation $\rho_{\mathcal{X}_{1}\mathcal{X}_{2}},$ and with means $\mu_{1}$ and $\mu_{2},$ and variances $\sigma_{1}^{2}$ and $\sigma_{2}^{2},$ respectively. Then, the following identities hold:

[TABLE]

(see, for example, Aroian [1947], Ware and Lad [2003]).

Lemma 2.3

For each $k\geq 1,$ the following limit is obtained:

[TABLE]

(see, for example, Bartlett [1946]).

3 Bayesian diagonal componentwise estimation

Now let us denote by $R$ the functional random variable on the basic probability space $(\Omega,\mathcal{A},\mathcal{P}),$ characterized by the prior distribution for $\rho.$ In our case, we assume that $R$ is of the form

[TABLE]

where, for $k\geq 1,$ $R_{k}$ is a real–valued random variable such that $R(\phi_{j})(\phi_{k})=\delta_{j,k}R_{k},$ almost surely, for every $j\geq 1.$ In the following, $R_{k}$ is assumed to follow a beta distribution with shape parameters $a_{k}>0$ and $b_{k}>0$ ; i.e., $R_{k}\sim\mathcal{B}(a_{k},b_{k}),$ for every $k\geq 1.$ We also assume that $R$ is independent of the functional components of the innovation process $\left\{\varepsilon_{n},\ n\in\mathbb{Z}\right\},$ and that the random variables $\left\{R_{k},\ k\geq 1\right\},$ are globally independent. That is, for each $f,g\in H,$

[TABLE]

Thus,

[TABLE]

where the last identity is understood in the weak–sense; i.e., in the sense of equation (19).
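To make the diagonal prior concrete, the following minimal Python sketch draws independent $R_{k}\sim\mathcal{B}(a_{k},b_{k})$ for finitely many coordinates. The truncation to three coordinates, the shape parameters and the function name are hypothetical; the sketch samples only the prior and does not implement the Bayes estimator defined later in this section.

```python
import numpy as np

def sample_diagonal_prior(a, b, n_draws, seed=0):
    """Draw independent R_k ~ Beta(a_k, b_k); returns shape (n_draws, len(a))."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Column k holds draws of R_k; the columns are mutually independent
    return rng.beta(a, b, size=(n_draws, a.size))

# Hypothetical shape parameters for the first three coordinates (b_k > 1)
a = np.array([2.0, 4.0, 8.0])
b = np.array([2.0, 2.0, 2.0])
R = sample_diagonal_prior(a, b, n_draws=10000)
print(R.mean(axis=0))                   # compare with the prior means a / (a + b)
print(a / (a + b))
```

Each draw of the row vector $(R_{1},R_{2},R_{3})$ corresponds to one realization of a truncated diagonal operator acting as $f\mapsto\sum_{k}R_{k}\langle f,\phi_{k}\rangle_{H}\phi_{k}.$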

In the definition of $R$ from $\{R_{j},\ j\geq 1\},$ we can then apply the Kolmogorov extension Theorem under the condition

[TABLE]

(see, for example, Khoshnevisan [2007]).

As in the real–valued case (see Supplementary Material 7), considering $b_{j}>1,$ for each $j\geq 1,$ the Bayes estimator of $\rho$ is defined by (see Case 2 in Supplementary Material 7)

[TABLE]

with, for every $j\geq 1,$

[TABLE]

where

[TABLE]

4 Asymptotic efficiency and equivalence

In this section, sufficient conditions are derived to ensure the asymptotic efficiency and equivalence of the diagonal componentwise estimators of $\rho$ formulated in the classical (see equation (15)), and in the Bayesian (see equations (20)–(22)) frameworks.

Theorem 4.1

Under Assumptions A1–A2, A2B, A3 and A4, let us assume that the ARH(1) process $X$ satisfies, for each $j\geq 1,$ and, for every $T\geq 2,$

[TABLE]

That is, $\{\varepsilon_{j}(i),\ i\geq 1\}$ and $\{X_{i-1,j},\ i\geq 0\}$ are almost surely positively empirically correlated. In addition, for every $j\geq 1,$ the hyper–parameters $a_{j}$ and $b_{j}$ of the beta prior distribution, $\mathcal{B}(a_{j},b_{j}),$ are such that $a_{j}+b_{j}\geq 2.$ Then, the following identities are obtained:

[TABLE]

where $\widehat{\rho}_{T}$ is defined in equation (15), and $\widetilde{\rho}_{T}^{-}$ is defined from equations (20)–(22), considering

[TABLE]

with, as before, for each $j\geq 1,$

[TABLE]

and $\alpha_{j,T}$ and $\beta_{j,T}$ are given in (22), for every $T\geq 2.$

Proof. Under Assumptions A1–A2, from Remark 8.1 and Corollary 8.1 in Supplementary Material 8, for each $j\geq 1,$ and for $T$ sufficiently large,

[TABLE]

Also, under (23),

[TABLE]

which is equivalent to

[TABLE]

for every $j\geq 1.$

From (26), to obtain the following a.s. inequality:

[TABLE]

it is sufficient that

[TABLE]

which is equivalent to

[TABLE]

That is, keeping in mind that

[TABLE]

condition (28) can also be expressed as

[TABLE]

i.e.,

[TABLE]

for $j\geq 1.$ Since, for each $j\geq 1,$

[TABLE]

it is sufficient that

[TABLE]

holds, in order to ensure that inequality (27) is satisfied. Furthermore, from Remark 8.1 and Corollary 8.1, in Supplementary Material 8, for each $j\geq 1,$ $\beta_{j,T}\rightarrow\infty,$ and

[TABLE]

Also, from the same remark and corollary, we have that

[TABLE]

Thus, for each $j\geq 1,$ the upper bound, in (29), diverges as $T\rightarrow\infty,$ which means, that, for $T$ sufficiently large, inequality (27) holds, if $a_{j}+b_{j}\geq 2,$ for each $j\geq 1.$ Now, from (27), under Assumption A3, for each $j\geq 1,$

[TABLE]

Furthermore, for each $j\geq 1,$ $\beta_{j,T}\rightarrow\infty,$ and $\beta_{j,T}=\mathcal{O}(T),$ as $T\rightarrow\infty,$ almost surely. Hence,

[TABLE]

From equation (25), we then have that, for each $j\geq 1,$

[TABLE]

almost surely. Thus, the almost sure convergence, as $T\rightarrow\infty,$ of $\widetilde{\rho}_{j,T}^{-}$ and $\widehat{\rho}_{j,T}$ to the same limit is obtained, for every $j\geq 1.$

From equation (30),

[TABLE]

Since ${\rm E}\left\{\widetilde{M}^{2}(j)\right\}<\infty,$ applying the Dominated Convergence Theorem, from equation (32), considering (18) we obtain, for each $j\geq 1,$

[TABLE]

Under Assumptions A3, from (30), for each $j\geq 1,$ and for every $T\geq 1,$

[TABLE]

with

[TABLE]

Applying again the Dominated Convergence Theorem (with integration performed with respect to a counting measure), we obtain from (33), keeping in mind relationship (12),

[TABLE]

in view of equation (13) in Assumption A2B. That is, equation (24) holds.

$\blacksquare$

Theorem 4.2

Under the conditions of Theorem 4.1,

[TABLE]

Here,

[TABLE]

Proof.

From equation (LABEL:A5:eqfconv), for every $j,k\geq 1,$

[TABLE]

In addition, from equation (32), for every $j,k\geq 1,$

[TABLE]

with

[TABLE]

under Assumption A3. Applying the Dominated Convergence Theorem from (36), the almost sure convergence in (35) implies the convergence in mean to zero, as $T\rightarrow\infty.$ Furthermore, under Assumption A3, for $T\geq 2,$

[TABLE]

From (37), for every $T\geq 2,$

[TABLE]

Equation (38) means that the rate of convergence to zero, as $T\rightarrow\infty,$ of the functional sequence $\left\{\widetilde{\rho}^{-}_{T}-\widehat{\rho}_{T},\ T\geq 2\right\}$ in the space $\mathcal{L}_{\mathcal{S}(H)}^{4}(\Omega,\mathcal{A},P)$ is of order $T^{-2}.$

From the definition of the norm in the space of bounded linear operators, applying the Cauchy–Schwarz inequality, we obtain

[TABLE]

From the orthogonal expansion (5) of $X_{T}$ , in terms of the independent real–valued standard Gaussian random variables $\left\{\eta_{k}(T),\ k\geq 1\right\},$ we have

[TABLE]

From equations (38)–(40),

[TABLE]

Thus, $\widetilde{\rho}_{T}^{-}(X_{T})$ and $\widehat{\rho}_{T}(X_{T})$ have the same limit in the space $\mathcal{L}_{H}^{2}(\Omega,\mathcal{A},\mathcal{P}).$

We now prove the approximation by ${\rm Tr}\left(C\left(I-\rho^{2}\right)\right)$ of the limit, in equation (34). Consider

[TABLE]

where

[TABLE]

From Lemmas 2.1– 2.2 (see the last identity in equation (17)), for each $k\geq 1,$ and for $T$ sufficiently large,

[TABLE]

Under Assumption A3, from equations (14)–(16), for every $k\geq 1,$

[TABLE]

From equations (41)–(43),

[TABLE]

since

[TABLE]

by the trace property of $C.$ Here, we have applied the Cauchy–Schwarz inequality to obtain, for a certain constant $L>0,$

[TABLE]

from the trace property of $C,$ and since

[TABLE]

under Assumption A3.

From equations (18) and (44), one can get, applying the Dominated Convergence Theorem,

[TABLE]

where we have considered that

[TABLE]

$\blacksquare$

5 Numerical examples

This section illustrates the theoretical results on the asymptotic efficiency and equivalence of the proposed classical and Bayesian diagonal componentwise estimators of the autocorrelation operator, and of the associated ARH(1) plug–in predictors. Under the conditions of Theorem 4.1, three examples of standard zero–mean Gaussian ARH(1) processes are generated, corresponding to different rates of convergence to zero of the eigenvalues of the autocovariance operator. The truncation order $k_{T}$ in Examples 1–2 (see Sections 5.1–5.2) is fixed; i.e., it does not depend on the sample size $T$ (see equations (46)–(47) below). In Example 3 (see Section 5.3), by contrast, $k_{T}$ is selected such that

[TABLE]

Specifically, in the first two examples, the choice of $k_{T}$ reflects a compromise between the sample size and the number of parameters to be estimated. To this end, the value $k_{T}=5$ is fixed, independently of $T.$ This is the number of parameters that can be estimated efficiently for most of the sample sizes $T$ studied. In Example 3, the truncation parameter $k_{T}$ is defined as a fractional power of the sample size. Note that Example 3 corresponds to the fastest decay velocity of the eigenvalues of the autocovariance operator. Hence, the lowest truncation order for a given sample size must be selected according to the truncation rule (45).

The generation of $N=1000$ realizations of the functional values $\left\{X_{t},\ t=0,1,\dots,T\right\}$ , for

[TABLE]

denoting as before the sample size, is performed for each of the ARH(1) processes defined in the three examples below. Based on these realizations, and on the sample sizes studied, the truncated empirical functional mean–square errors of the classical and Bayes diagonal componentwise parameter estimators of the autocorrelation operator $\rho$ are computed as follows:

[TABLE]

where $\overline{\rho}_{j,T}^{\omega}$ can be the classical $\widehat{\rho}_{j,T}$ or the Bayes $\widetilde{\rho}_{j,T}$ diagonal componentwise estimator of the autocorrelation operator, and $\omega\in\Omega$ denotes the sample point associated with each of the $N=1000$ generated realizations of each functional value of the ARH(1) process $X.$
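The Monte Carlo procedure described above can be sketched as follows. This is a minimal illustration, not the paper's code: it assumes the classical diagonal estimator takes the usual ratio form (empirical lag-one cross-moment over empirical second moment, which may differ from the paper's definition by a normalizing factor), and the values of `rho`, `sigma2`, the sample sizes and the reduced number of replicates `N=200` are hypothetical choices made only for illustration.

```python
import numpy as np

def simulate_diagonal_arh1(rho, sigma2, T, N, rng):
    """Simulate N paths of k independent AR(1) coordinates X_{t,j}, t = 0..T."""
    k = rho.size
    C = sigma2 / (1.0 - rho**2)                 # stationary variances C_j
    X = np.empty((N, T + 1, k))
    X[:, 0, :] = rng.normal(0.0, np.sqrt(C), size=(N, k))
    for t in range(1, T + 1):
        eps = rng.normal(0.0, np.sqrt(sigma2), size=(N, k))
        X[:, t, :] = rho * X[:, t - 1, :] + eps
    return X

def classical_diagonal_estimator(X):
    """Assumed ratio form: sum_i X_{i,j} X_{i+1,j} / sum_i X_{i,j}^2."""
    num = np.sum(X[:, :-1, :] * X[:, 1:, :], axis=1)
    den = np.sum(X[:, :-1, :] ** 2, axis=1)
    return num / den

def efmse(rho, sigma2, T, N=200, seed=0):
    """Truncated empirical mean-square error, averaged over N replicates."""
    rng = np.random.default_rng(seed)
    X = simulate_diagonal_arh1(rho, sigma2, T, N, rng)
    rho_hat = classical_diagonal_estimator(X)
    return float(np.mean(np.sum((rho_hat - rho) ** 2, axis=1)))

# Hypothetical parameters for k_T = 5 components (not the paper's values).
rho = np.array([0.5, 0.6, 0.7, 0.8, 0.9])
sigma2 = 1.0 / np.arange(1, 6) ** 2
errors = {T: efmse(rho, sigma2, T) for T in (100, 2000)}
```

Under these assumptions the error should shrink roughly like $1/T$, which is the reference curve used in the figures.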

On the other hand, as assumed in the previous section,

[TABLE]

for each $k\geq 1$ . Thus, parameters $\left(a_{k},b_{k}\right)$ are defined as follows:

[TABLE]

where

[TABLE]

with $\left\{\rho_{k}^{2},\ k\geq 1\right\}$ being a random sequence whose elements tend to concentrate around one as $k\rightarrow\infty.$ From (49), since

[TABLE]

Assumption A2B is satisfied. In addition, condition (23) is verified in the generations performed in the Gaussian framework.

Example 1

Let us assume that the eigenvalues of the autocovariance operator of the ARH(1) process $X$ are given by

[TABLE]

Thus, $C$ is a strictly positive self-adjoint trace operator, where

[TABLE]

are generated from (48)–(50).

Tables 1–2 display the values of the empirical functional mean–square errors, given in (46)–(47), associated with $\widehat{\rho}_{T}$ and $\widetilde{\rho}^{-}_{T},$ and with the corresponding ARH(1) plug–in predictors, with, as before,

[TABLE]

considering $k_{T}=5.$ The respective graphical representations are displayed in Figures 1–2, where, for comparative purposes, the values of the curve $1/T$ are also drawn for the finite sample sizes (51).

Example 2

In this example, the eigenvalues of the autocovariance operator of the ARH(1) process decay slightly more slowly than in Example 1. Specifically,

[TABLE]

Thus, $C$ is a strictly positive self-adjoint trace operator, where $\left\{\rho_{k}^{2},\ k\geq 1\right\}$ and $\left\{\sigma_{k}^{2},\ k\geq 1\right\}$ are generated, as before, from (48)-(50).

Tables 3–4 show the values of the empirical functional mean–square errors, associated with $\widehat{\rho}_{T}$ and $\widetilde{\rho}^{-}_{T},$ and with the corresponding ARH(1) plug–in predictors, respectively. Figures 3–4 provide the graphical representations in comparison with the values of the curve $1/T$ for $T$ given in (51), with, as before, $k_{T}=5$ .

Example 3

It is well known that the inverse of the autocovariance operator $C$ becomes more singular as the eigenvalues of $C$ converge to zero faster, as in this example. Specifically, here,

[TABLE]

As before, $\left\{\rho_{k}^{2},\ k\geq 1\right\}$ and $\left\{\sigma_{k}^{2},\ k\geq 1\right\}$ are generated from (48)-(50). The truncation order $k_{T}$ satisfies

[TABLE]

(see also the simulation study undertaken in Álvarez-Liébana et al. [2017], for the case of $\rho$ being a Hilbert–Schmidt operator). In particular, (52) holds for $\frac{1}{2}-\frac{2}{\alpha}>0.$ Thus, $\alpha>4,$ and we consider $\alpha=4.1,$ i.e., $k_{T}=\lceil T^{1/4.1}\rceil.$
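The truncation rule $k_{T}=\lceil T^{1/\alpha}\rceil$ with $\alpha=4.1$ is easy to evaluate directly. The sketch below checks the stated condition $\frac{1}{2}-\frac{2}{\alpha}>0$ and computes $k_{T}$ for a few sample sizes; the function name `truncation_order` and the sample sizes are hypothetical, chosen for illustration rather than taken from (51).

```python
import math

def truncation_order(T, alpha=4.1):
    """Return ceil(T**(1/alpha)), requiring 1/2 - 2/alpha > 0 (i.e. alpha > 4)."""
    if 0.5 - 2.0 / alpha <= 0:
        raise ValueError("alpha must exceed 4 so that 1/2 - 2/alpha > 0")
    return math.ceil(T ** (1.0 / alpha))

# Hypothetical sample sizes, for illustration only.
sample_sizes = [250, 1000, 15000]
orders = {T: truncation_order(T) for T in sample_sizes}
```

The truncation order grows very slowly with $T$, which is consistent with the remark that the fastest eigenvalue decay calls for the lowest truncation order at a given sample size.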

Tables 5–6 show the empirical functional mean–square errors associated with $\widehat{\rho}_{T}$ and $\widetilde{\rho}^{-}_{T},$ and with the corresponding ARH(1) plug–in predictors, respectively. As before, Figures 5–6 provide the graphical representations, and the values of the curve $1/T,$ for $T$ in (51), with the aim of illustrating the rate of convergence to zero of the truncated empirical functional mean quadratic errors.

In Examples 1–2 (Sections 5.1–5.2), where a common fixed truncation order is considered, the largest values of the empirical functional mean–square errors occur at the smallest sample sizes, for which the number $k_{T}=5$ of parameters to be estimated is too large. The performance at those sample sizes is slightly worse in Example 2 (Section 5.2), where the eigenvalues of the autocovariance operator $C$ decay more slowly than in Example 1. Note that, on the other hand, when the eigenvalues of $C$ decay more slowly, a larger truncation order is required to explain a given percentage of the functional variance. For the fastest rate of convergence to zero of the eigenvalues of $C,$ in Example 3, a suitable truncation order $k_{T},$ depending on the sample size $T,$ is fitted to compensate for the singularity of the inverse covariance operator $C^{-1};$ this yields a slightly better performance than in the previous cases, where a fixed truncation order is studied.
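The link between eigenvalue decay and the truncation order needed to explain a given share of the functional variance can be illustrated numerically. The sketch below assumes hypothetical polynomial eigenvalues $C_{k}=k^{-\delta}$ (these are not the eigenvalues of Examples 1–3) and approximates the trace by a finite sum over `K` terms; `min_order_for_variance` is an illustrative helper name.

```python
import numpy as np

def min_order_for_variance(decay, share=0.95, K=10**6):
    """Smallest k such that sum_{j<=k} j**(-decay) reaches `share` of the
    (truncated) trace sum_{j<=K} j**(-decay)."""
    j = np.arange(1, K + 1, dtype=float)
    eig = j ** (-decay)
    cum = np.cumsum(eig) / eig.sum()
    # searchsorted gives the first 0-based index with cum >= share.
    return int(np.searchsorted(cum, share) + 1)

k_fast = min_order_for_variance(2.0)   # faster decay, C_k = k^{-2}
k_slow = min_order_for_variance(1.5)   # slower decay, C_k = k^{-3/2}
```

With these assumed eigenvalues, the slower decay requires a markedly larger truncation order to reach the same 95% share.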

6 Final comments

This paper addresses the case where the eigenvectors of $C$ are known, in relation to the asymptotic efficiency and equivalence of $\widehat{\rho}_{j,T}$ and $\widetilde{\rho}_{j,T}^{-},$ and the associated plug-in predictors. However, as shown in the simulation study undertaken in Álvarez-Liébana et al. [2017], a similar performance is obtained in the case where the eigenvectors of $C$ are unknown (see also Bosq [2000] in relation to the asymptotic properties of the empirical eigenvectors of $C$ ).

In the cited references in the ARH(1) framework, the autocorrelation operator is usually assumed to belong to the Hilbert–Schmidt class. Here, in the absence of the compactness assumption (in particular, of the Hilbert–Schmidt assumption) on the autocorrelation operator $\rho,$ singular autocorrelation kernels can be considered. As noted in Section 1, the singularity of $\rho$ is compensated by the regularity of the autocovariance kernel of the innovation process, as reflected in Assumption A2B.

Theorem 4.1 establishes sufficient conditions for the asymptotic efficiency and equivalence of the proposed classical and Bayes diagonal componentwise parameter estimators of $\rho,$ as well as of the associated ARH(1) plug-in predictors (see Theorem 4.2). The simulation study illustrates the fact that the truncation order $k_{T}$ should be selected according to the rate of convergence to zero of the eigenvalues of the autocovariance operator, and depending on the sample size $T.$ A fixed truncation order, independent of $T,$ has nevertheless also been tested in Examples 1–2, where a compromise is found between the rate of convergence to zero of the eigenvalues and the rate of increase of the sample sizes.

7 Supplementary Material: Bayesian estimation of real–valued autoregressive processes of order one

In this section, we consider the Beta–prior–based Bayesian estimation of the autocorrelation coefficient $\rho$ in a standard AR(1) process. Namely, the generalized maximum likelihood estimator of this parameter is computed when a beta prior is assumed for $\rho.$ In the ARH(1) framework, we have adopted this estimation procedure in the approximation of the diagonal coefficients $\{\rho_{k},\ k\geq 1\}$ of operator $\rho$ with respect to $\{\phi_{k}\otimes\phi_{k},\ k\geq 1\},$ in a Bayesian componentwise context. Note that we also denote by $\rho$ the autocorrelation coefficient of an AR(1) process, since there is no risk of confusion here.

Let $\{X_{n},\ n\in\mathbb{Z}\}$ be an AR(1) process satisfying

[TABLE]

where $0<\rho<1,$ and $\{\varepsilon_{n},\ n\in\mathbb{Z}\}$ is a real–valued Gaussian white noise; i.e., $\varepsilon_{n}\sim\mathcal{N}(0,\sigma^{2}),$ $n\in\mathbb{Z},$ are independent Gaussian random variables, with $\sigma>0.$ Here, we will use the conditional likelihood, and assume that $(x_{1},\dots,x_{n})$ are observed for $n$ sufficiently large to ensure that the effect of the random initial condition is negligible. A beta distribution with shape parameters $a>0$ and $b>0$ is considered as prior distribution on $\rho,$ i.e., $\rho\sim\mathcal{B}(a,b).$ Hence, the distribution of $(x_{1},\dots,x_{n},\rho)$ has density

[TABLE]

where

[TABLE]

is the beta function.

We first compute the solution to the equation

[TABLE]

where

[TABLE]

Thus, the following equation must be solved:

[TABLE]

Case 1

Considering $a=b=1,$ and $\sigma^{2}=1,$ we obtain the solution

[TABLE]

Case 2

The general case where $b>1$ is more intricate, since the solutions are $\widetilde{\rho}_{n}=0,$ and

[TABLE]

Case 3

For $\sigma^{2}=a=1,$ we have

[TABLE]
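For the case $\sigma^{2}=a=1,$ the posterior mode can be checked numerically. The sketch below is an illustration under stated assumptions rather than the paper's derivation: it takes the standard conditional Gaussian likelihood with a $\mathcal{B}(1,b)$ prior, so that the derivative of the log-posterior is $s_{\times}-\rho\, s_{2}-(b-1)/(1-\rho),$ with $s_{\times}=\sum_{i}x_{i-1}x_{i}$ and $s_{2}=\sum_{i}x_{i-1}^{2}.$ Setting it to zero gives a quadratic whose smaller root lies below one. The names `s_cross`, `s_square` and the simulated data are hypothetical, and the closed form may be written differently in the paper.

```python
import numpy as np

def simulate_ar1(rho, n, rng, sigma=1.0):
    """Stationary Gaussian AR(1) path x_1, ..., x_n."""
    x = np.empty(n)
    x[0] = rng.normal(0.0, sigma / np.sqrt(1.0 - rho**2))
    for i in range(1, n):
        x[i] = rho * x[i - 1] + rng.normal(0.0, sigma)
    return x

def score(r, s_cross, s_square, b):
    """Derivative in r of the log-posterior, assuming sigma^2 = 1 and a = 1."""
    return s_cross - r * s_square - (b - 1.0) / (1.0 - r)

def mode_closed_form(s_cross, s_square, b):
    """Smaller root of s_square*r^2 - (s_cross + s_square)*r + s_cross - (b-1) = 0."""
    disc = (s_cross - s_square) ** 2 + 4.0 * s_square * (b - 1.0)
    return ((s_cross + s_square) - np.sqrt(disc)) / (2.0 * s_square)

def mode_bisection(s_cross, s_square, b, tol=1e-12):
    """Root of the (decreasing) score on (0, 1); needs s_cross > b - 1."""
    lo, hi = 0.0, 1.0 - 1e-12
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid, s_cross, s_square, b) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = np.random.default_rng(0)
x = simulate_ar1(rho=0.7, n=500, rng=rng)      # hypothetical data
s_cross = float(np.sum(x[:-1] * x[1:]))
s_square = float(np.sum(x[:-1] ** 2))
b = 2.0                                        # hypothetical prior shape
rho_closed = mode_closed_form(s_cross, s_square, b)
rho_numeric = mode_bisection(s_cross, s_square, b)
```

For $b=1$ the prior is uniform and the mode reduces to the ratio $s_{\times}/s_{2};$ for $b>1$ the mode is pulled slightly below that ratio.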

8 Supplementary Material 2: strong–ergodic AR(1) processes

This section collects some strong–ergodicity results applied in this paper for real–valued weakly dependent random sequences. In particular, their application to the AR(1) case is considered.

A real–valued stationary process $\left\{Y_{n},\ n\in\mathbb{Z}\right\}$ is strongly–ergodic (or ergodic in the almost sure sense), with respect to ${\rm E}\left\{f\left(Y_{0},\ldots,Y_{n-1}\right)\right\}$ if, as $n\rightarrow\infty,$

[TABLE]

In particular, the following lemma provides a sufficient condition for strong ergodicity for all second–order moments (see, for example, [Stout, 1974, Theorem 3.5.8] and [Billingsley, 1995, p. 495]).

Lemma 8.1

Let $\left\{\widetilde{\varepsilon}_{n},\ n\in\mathbb{Z}\right\}$ be an i.i.d. sequence of real–valued random variables. If $f:\ \mathbb{R}^{\infty}\longrightarrow\mathbb{R}$ is a measurable function, then

[TABLE]

is a stationary and strongly–ergodic process for all second–order moments.

Lemma 8.1 is now applied to the invertible AR(1) case, when the innovation process is white noise.

Remark 8.1

If $\left\{Y_{n},\ n\in\mathbb{Z}\right\}$ is a real–valued zero–mean stationary AR(1) process

[TABLE]

where $\left\{\widetilde{\varepsilon}_{n},\ n\in\mathbb{Z}\right\}$ is strong white noise, we can define the measurable (even continuous) function

[TABLE]

such that, from Lemma 8.1 and for each $n\in\mathbb{Z}$ ,

[TABLE]

is a stationary and strongly–ergodic process for all second–order moments.
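The idea behind Remark 8.1 is that an AR(1) value is a fixed function of the current and past innovations. A minimal sketch, assuming the usual geometric-weights representation $Y_{n}=\sum_{k\geq 0}\rho^{k}\,\widetilde{\varepsilon}_{n-k}$ and, for the finite-sample check, a zero pre-history (so the sum stops at the first available innovation); the values of `rho` and `n` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
rho, n = 0.6, 200
eps = rng.normal(size=n)                 # i.i.d. innovations

# AR(1) recursion started from zero pre-history: Y_0 = eps_0.
y_rec = np.empty(n)
y_rec[0] = eps[0]
for i in range(1, n):
    y_rec[i] = rho * y_rec[i - 1] + eps[i]

# Same values written as a function of the innovations:
# Y_i = sum_{k=0}^{i} rho^k * eps_{i-k}.
weights = rho ** np.arange(n)
y_ma = np.array([np.dot(weights[: i + 1], eps[i::-1]) for i in range(n)])

max_gap = float(np.max(np.abs(y_rec - y_ma)))
```

The two constructions agree up to floating-point rounding, which is the finite-sample counterpart of writing the process as $f$ applied to the innovation sequence.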

In the results derived in this paper, Remark 8.1 is applied, for each $j\geq 1,$ to the real–valued zero–mean stationary AR(1) processes

[TABLE]

with $\{X_{n},\ n\in\mathbb{Z}\}$ now representing an ARH(1) process.

Corollary 8.1

Under Assumptions A1–A2, for each $j\geq 1,$ let us consider the real–valued zero–mean stationary AR(1) process $\left\{X_{n,j}=\left\langle X_{n},\phi_{j}\right\rangle_{H},\ n\in\mathbb{Z}\right\}$ , such that, for each $n\in\mathbb{Z}$

[TABLE]

Here, $\left\{\varepsilon_{n,j},\ n\in\mathbb{Z}\right\}$ is a real-valued strong white noise, for any $j\geq 1$ . Thus, for each $j\geq 1$ , $\left\{X_{n,j},\ n\in\mathbb{Z}\right\}$ is a stationary and strongly-ergodic process for all second-order moments. In particular, for any $j\geq 1$ , as $n\rightarrow\infty,$

[TABLE]
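The almost sure convergence of sample second-order moments can be observed on a simulated component. The sketch below assumes a single stationary Gaussian AR(1) coordinate with hypothetical parameters `rho=0.6` and `sigma=1`, for which the stationary variance is $\sigma^{2}/(1-\rho^{2})$ and the lag-one covariance is $\rho\,\sigma^{2}/(1-\rho^{2});$ the helper `sample_second_moments` is illustrative only.

```python
import numpy as np

def sample_second_moments(rho, sigma, n, seed=0):
    """Return (mean of X_i^2, mean of X_i * X_{i+1}) over i = 0..n-1."""
    rng = np.random.default_rng(seed)
    x = np.empty(n + 1)
    x[0] = rng.normal(0.0, sigma / np.sqrt(1.0 - rho**2))  # stationary start
    eps = rng.normal(0.0, sigma, size=n)
    for i in range(n):
        x[i + 1] = rho * x[i] + eps[i]
    return float(np.mean(x[:-1] ** 2)), float(np.mean(x[:-1] * x[1:]))

rho, sigma = 0.6, 1.0                      # hypothetical values
var_limit = sigma**2 / (1.0 - rho**2)      # 1.5625
cov_limit = rho * var_limit                # 0.9375
var_hat, cov_hat = sample_second_moments(rho, sigma, n=200_000)
```

For large $n$ both sample moments should sit close to their theoretical limits, and their ratio close to $\rho.$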

Acknowledgments

This work has been supported in part by projects MTM2012–32674 and MTM2015–71839–P (co-funded by Feder funds), of the DGI, MINECO, Spain.

Bibliography

Álvarez-Liébana [2017] Álvarez-Liébana, J.: Functional time series: a review and comparative study. Submitted (2017)
Álvarez-Liébana et al. [2016] Álvarez-Liébana, J.; Bosq, D.; Ruiz-Medina, M. D.: Consistency of the plug-in functional predictor of the Ornstein-Uhlenbeck in Hilbert and Banach spaces. Statist. Probab. Lett. 117 (2016), pp. 12–22. DOI: doi.org/10.1016/j.spl.2016.04.023
Álvarez-Liébana et al. [2017] Álvarez-Liébana, J.; Bosq, D.; Ruiz-Medina, M. D.: Asymptotic properties of a componentwise ARH(1) plug-in predictor. J. Multivariate Anal. 155 (2017), pp. 12–34. DOI: doi.org/10.1016/j.jmva.2016.11.009
Antoniadis and Sapatinas [2003] Antoniadis, A.; Sapatinas, T.: Wavelet methods for continuous-time prediction using Hilbert-valued autoregressive processes. J. Multivariate Anal. 87 (2003), pp. 133–158. DOI: doi.org/10.1016/S0047-259X(03)00028-9
Aroian [1947] Aroian, L. A.: The probability function of a product of two normal distributed variables. Ann. Math. Statist. 18 (1947), pp. 256–271. DOI: doi.org/10.1214/aoms/1177730442
Aue et al. [2015] Aue, A.; Norinho, D.; Hörmann, S.: On the prediction of stationary functional time series. J. Amer. Statist. Assoc. 110 (2015), pp. 378–392. DOI: doi.org/10.1080/01621459.2014.909317
Bartlett [1946] Bartlett, M. S.: On the theoretical specification and sampling properties of autocorrelated time series. Supplement to J. Roy. Stat. Soc. 8 (1946), pp. 27–41. URL: http://www.jstor.org/stable/2983611
Besse and Cardot [1996] Besse, P. C.; Cardot, H.: Approximation spline de la prévision d’un processus fonctionnel autorégressif d’ordre 1. Canad. J. Statist. 24 (1996), pp. 467–487. DOI: doi.org/10.2307/3315328
