Asymptotics for Spherical Functional Autoregressions

Alessia Caponera; Domenico Marinucci

arXiv:1907.05802·math.ST·July 15, 2019

TL;DR

This paper studies spherical functional autoregressive processes, establishing consistency, a central limit theorem, and weak convergence results for their estimation, supported by numerical validation.

Contribution

It introduces new asymptotic results for spherical functional autoregressions, including consistency and limit theorems, under specific regularity conditions.

Findings

01

Established consistency in sup and mean-square norms

02

Proved a quantitative central limit theorem in Wasserstein distance

03

Validated results with numerical experiments

Abstract

In this paper, we investigate a class of spherical functional autoregressive processes, and we discuss the estimation of the corresponding autoregressive kernels. In particular, we first establish a consistency result (in sup and mean-square norm), then a quantitative central limit theorem (in Wasserstein distance), and finally a weak convergence result, under more restrictive regularity conditions. Our results are validated by a small numerical investigation.

Equations

$$T(x)=\sum_{\ell=0}^{\infty}T_{\ell}(x),\qquad T_{\ell}(x)=\sum_{m=-\ell}^{\ell}a_{\ell,m}Y_{\ell,m}(x),$$

$$\Delta_{\mathbb{S}^{2}}Y_{\ell,m}=-\ell(\ell+1)Y_{\ell,m},\qquad \Delta_{\mathbb{S}^{2}}:=\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\frac{\partial}{\partial\vartheta}\right)+\frac{1}{\sin^{2}\vartheta}\frac{\partial^{2}}{\partial\varphi^{2}};$$

$$\mathbb{E}[a_{\ell,m}a_{\ell',m'}]=C_{\ell}\,\delta_{\ell}^{\ell'}\delta_{m}^{m'};$$

$$a_{\ell,m}:=\int_{\mathbb{S}^{2}}T(x)Y_{\ell,m}(x)\,dx.$$

$$\sum_{m=-\ell}^{\ell}Y_{\ell,m}(x)Y_{\ell,m}(y)=\frac{2\ell+1}{4\pi}P_{\ell}(\langle x,y\rangle),$$

$$P_{\ell}(t)=\frac{1}{2^{\ell}\ell!}\frac{d^{\ell}}{dt^{\ell}}(t^{2}-1)^{\ell},\qquad t\in[-1,1],\ \ell\geq 0.$$

$$\int_{\mathbb{S}^{2}}\frac{2\ell+1}{4\pi}P_{\ell}(\langle x,y\rangle)\frac{2\ell+1}{4\pi}P_{\ell}(\langle y,z\rangle)\,dy=\frac{2\ell+1}{4\pi}P_{\ell}(\langle x,z\rangle).$$

$$\Gamma(x,y)=\sum_{\ell=0}^{\infty}C_{\ell}\frac{2\ell+1}{4\pi}P_{\ell}(\langle x,y\rangle),\qquad\text{for all }x,y\in\mathbb{S}^{2}.$$

$$H_{q}(x):=(-1)^{q}e^{x^{2}/2}\frac{d^{q}}{dx^{q}}e^{-x^{2}/2},\qquad x\in\mathbb{R};$$

$$G(X)=\sum_{q=0}^{\infty}J_{q}(G)\frac{H_{q}(X)}{q!},\qquad\text{for }G\text{ s.t. }\mathbb{E}G^{2}(X)<\infty,\quad J_{q}(G):=\mathbb{E}\,G(X)H_{q}(X);$$

$$L^{2}(\Omega)=\bigoplus_{q=0}^{\infty}\mathcal{H}_{q},$$

$$d_{W}(X,Y)=\sup_{h(\cdot):\,\|h\|_{Lip}\leq 1}\left|\mathbb{E}[h(X)]-\mathbb{E}[h(Y)]\right|,\qquad\text{where }\|h\|_{Lip}=\sup_{\substack{x,y\in\mathbb{R}^{d}\\ x\neq y}}\frac{|h(x)-h(y)|}{\|x-y\|},$$

$$d_{W}(X,Y)\leq\mathbb{E}\|X-Y\|.$$

$$d_{W}(Z,Z_{k})\leq\frac{1}{\sigma}\sqrt{\frac{2q-2}{3\pi q}\big(\mathbb{E}[Z_{k}^{4}]-3\sigma^{4}\big)}\ ,$$

$$\Gamma(x,t,y,s)=\Gamma_{0}(\langle x,y\rangle,t-s),\qquad\forall(x,t),(y,s)\in\mathbb{S}^{2}\times\mathbb{Z}.$$

$$T_{t}(x)=\sum_{\ell=0}^{\infty}\sum_{m=-\ell}^{\ell}a_{\ell,m}(t)Y_{\ell,m}(x),$$

$$\mathbb{E}[a_{\ell,m}(t)a_{\ell',m'}(s)]=C_{\ell}(t-s)\,\delta_{\ell}^{\ell'}\delta_{m}^{m'},\qquad t,s\in\mathbb{Z}.$$

$$\Gamma(x,t,y,s)=\sum_{\ell=0}^{\infty}C_{\ell}(t-s)\frac{2\ell+1}{4\pi}P_{\ell}(\langle x,y\rangle).$$

$$T_{t}(x)=\int_{-\pi}^{\pi}\sum_{\ell=0}^{\infty}\sum_{m=-\ell}^{\ell}\exp(-i\lambda t)Y_{\ell,m}(x)\,dW_{\ell,m}(\lambda),\qquad\text{in }L^{2}(\Omega),$$

$$\mathbb{E}[W_{\ell,m}(A)\overline{W}_{\ell,m}(B)]=\int_{A\cap B}f_{\ell}(\lambda)\,d\lambda,\qquad\text{for all }A,B\subset[-\pi,\pi],$$

$$\mathbb{E}[a_{\ell,m}(t)a_{\ell,m}(t+\tau)]=\int_{-\pi}^{\pi}\exp(i\lambda\tau)f_{\ell}(\lambda)\,d\lambda.$$

$$\{T(x,t),\ (x,t)\in\mathbb{S}^{2}\times\mathbb{Z}\}\quad\text{and}\quad\{Z(x,t),\ (x,t)\in\mathbb{S}^{2}\times\mathbb{Z}\},$$

$$\Gamma_{Z}(x,y)=\sum_{\ell=0}^{\infty}\frac{2\ell+1}{4\pi}C_{\ell;Z}P_{\ell}(\langle x,y\rangle),\qquad\sum_{\ell=0}^{\infty}\frac{2\ell+1}{4\pi}C_{\ell;Z}<\infty;$$

$$(\Phi f)(\cdot)=\int_{\mathbb{S}^{2}}k(\langle\cdot,y\rangle)f(y)\,dy,\qquad\text{for some }k(\cdot)\in L^{2}[-1,1].$$

$$k(\langle x,y\rangle)=\sum_{\ell=0}^{\infty}\phi_{\ell}\frac{2\ell+1}{4\pi}P_{\ell}(\langle x,y\rangle).$$

$$\Phi Y_{\ell,m}=\phi_{\ell}Y_{\ell,m},$$

$$T_{t}(x)-(\Phi_{1}T_{t-1})(x)-\dots-(\Phi_{p}T_{t-p})(x)-Z_{t}(x)=0,$$

$$\Phi_{j}Y_{\ell,m}=\phi_{\ell;j}Y_{\ell,m},\qquad\text{and hence}\qquad k_{j}(\langle x,y\rangle)=\sum_{\ell=0}^{\infty}\phi_{\ell;j}\frac{2\ell+1}{4\pi}P_{\ell}(\langle x,y\rangle).$$

$$(\Phi_{j}T_{t-j})(x)=\sum_{\ell=0}^{\infty}\sum_{m=-\ell}^{\ell}\phi_{\ell;j}a_{\ell,m}(t-j)Y_{\ell,m}(x),$$

$$a_{\ell,m}(t)=\phi_{\ell;1}a_{\ell,m}(t-1)+\dots+\phi_{\ell;p}a_{\ell,m}(t-p)+a_{\ell,m;Z}(t);$$

Full text

Asymptotics for Spherical Functional Autoregressions

Alessia Caponera

Department of Statistical Sciences, Sapienza University of Rome

Piazzale Aldo Moro, 5, 00185 Rome, Italy

Domenico Marinucci

Department of Mathematics, University of Rome Tor Vergata

Via della Ricerca Scientifica, 1, 00133 Rome, Italy

Abstract

In this paper, we investigate a class of spherical functional autoregressive processes, and we discuss the estimation of the corresponding autoregressive kernels. In particular, we first establish a consistency result (in sup and mean-square norm), then a quantitative central limit theorem (in Wasserstein distance), and finally a weak convergence result, under more restrictive regularity conditions. Our results are validated by a small numerical investigation.

MSC: 62M15, 60G15, 60F05, 62M40.

Keywords: Spherical Functional Autoregressions, Spherical Harmonics, Quantitative Central Limit Theorem, Wasserstein Distance, Weak Convergence.

DM acknowledges the MIUR Excellence Department Project awarded to the Department of Mathematics, University of Rome Tor Vergata, CUP E83C18000100006. We are also grateful to Pierpaolo Brutti for many insightful suggestions and conversations.

1 Introduction

In recent years, the statistical analysis of spherical isotropic random fields has attracted considerable interest. These investigations have been motivated by a wide array of applications in many different areas, including Cosmology, Astrophysics, Geophysics, and Climate and Atmospheric Sciences; see, e.g., [2, 4, 12, 7, 16, 13, 14, 23, 25, 31]. Most papers in Cosmology and Astrophysics have so far focussed on spherical random fields with no temporal dependence; the next generation of Cosmological experiments is, however, going to make the time dependence much more relevant. On the other hand, applications in Climate, Atmospheric Sciences, Geophysics, and several other areas have always been naturally modelled in terms of a double dependence in the spatial and temporal domains. In many works in these fields, attention has focussed on the definition of wide classes of space-time covariance functions, and then on the derivation of likelihood functions; the literature on these themes is vast and we make no attempt at a complete list of references, see for instance [12, 16, 20, 31] and the references therein.

Our purpose in this paper is to investigate a class of space-time processes, which can be viewed as functional autoregressions taking values in $L^{2}({\mathbb{S}^{2}})$ ; we refer to [5] for a general textbook analysis of functional autoregressions taking values in Hilbert spaces, and [1, 18, 29] for a very partial list of some important recent references.

Dealing with functional spherical autoregressions ensures some very convenient simplifications; in particular, we exploit the analytic properties of the standard orthonormal basis of $L^{2}({\mathbb{S}^{2}})$ and some natural isotropy requirements to obtain neat expressions for the autoregressive functionals, which are then estimated by a form of frequency-domain least squares. For our estimators, we are able to establish rates of consistency (in $L^{2}$ and $L^{\infty}$ norms) and a quantitative version of the Central Limit Theorem, in Wasserstein distance. In particular, we derive explicit bounds for the rate of convergence to the limiting Gaussian distribution by means of the rich machinery of Stein-Malliavin methods (see [28]); to the best of our knowledge, this is the first Quantitative Central Limit Theorem established in the framework of functional-valued stationary processes. Under stronger regularity conditions, we are able to establish a weak convergence result for the kernel estimators; our results are then illustrated by simulations.

The plan of the paper is as follows: in Section 2 we present background results on the harmonic analysis of spherical random fields and on Stein-Malliavin methods. In Section 3 we present our basic model; we show how, under isotropy, the model enjoys a number of symmetry properties which greatly simplify our approach. Our main results are then collected in Section 4, where we investigate rates of convergence and the Quantitative Central Limit Theorem; we also consider weak convergence in $C_{p}\left([-1,1]\right)$ , under stronger regularity conditions for the autoregressive kernels. Large parts of the proofs and many auxiliary lemmas, some of possible independent interest, are collected in Section 5 and in the Appendix (Supplementary Material). Finally, Section 6 provides numerical estimates on the behaviour of our procedures.

2 Background and Notation

2.1 Spectral Representation of Isotropic Random Fields on the Sphere

Let $\{T(x),\ x\in{\mathbb{S}^{2}}\}$ denote a finite variance, isotropic random field on the unit sphere ${\mathbb{S}^{2}}=\{x\in\mathbb{R}^{3}:\|x\|=1\}$ , by which we mean as usual that $T(g\cdot)\overset{d}{=}T(\cdot),\ \forall g\in SO(3)$ the standard 3-dimensional group of rotations; here the identity in distribution must be understood in the sense of stochastic processes: for notational simplicity, and without loss of generality, we will assume in the sequel that ${\mathbb{E}}[T(x)]=0$ . It is well-known that the following representation holds, in the mean-square sense:

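$$T(x)=\sum_{\ell=0}^{\infty}T_{\ell}(x),\qquad T_{\ell}(x)=\sum_{m=-\ell}^{\ell}a_{\ell,m}Y_{\ell,m}(x),$$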

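where $\{Y_{\ell,m}(\cdot),\ \ell\geq 0,\ -\ell\leq m\leq\ell\}$ is the standard basis of spherical harmonics, which satisfy (for $\varphi\in[0,2\pi)$ , $\vartheta\in[0,\pi]$ )

$$\Delta_{\mathbb{S}^{2}}Y_{\ell,m}=-\ell(\ell+1)Y_{\ell,m},\qquad \Delta_{\mathbb{S}^{2}}:=\frac{1}{\sin\vartheta}\frac{\partial}{\partial\vartheta}\left(\sin\vartheta\frac{\partial}{\partial\vartheta}\right)+\frac{1}{\sin^{2}\vartheta}\frac{\partial^{2}}{\partial\varphi^{2}};$$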

also $\{a_{\ell,m},\ \ell\geq 0,\ -\ell\leq m\leq\ell\}$ are a triangular array of zero-mean, real-valued random coefficients whose covariance structure is given by

$$\mathbb{E}[a_{\ell,m}a_{\ell',m'}]=C_{\ell}\,\delta_{\ell}^{\ell'}\delta_{m}^{m'};$$

here $\delta_{a}^{b}$ is the Kronecker delta function, and the sequence $\{C_{\ell},\ \ell\geq 0\}$ represents the angular power spectrum of the field. Throughout this paper we consider the real-valued basis of spherical harmonics, and therefore the random coefficients are real-valued random variables for all $(\ell,m)$ (we refer for instance to [24] for a more detailed discussion on spectral representations on the sphere). Note that the random coefficients $\left\{a_{\ell,m}\right\}$ can be obtained by a direct inversion formula from the map $T(\cdot),$ indeed we have

$$a_{\ell,m}:=\int_{\mathbb{S}^{2}}T(x)Y_{\ell,m}(x)\,dx.$$

Here, we recall also the following addition formula for spherical harmonics (see [24], equation 3.42) which entails that, for any $x,y\in{\mathbb{S}^{2}}$ ,

$$\sum_{m=-\ell}^{\ell}Y_{\ell,m}(x)Y_{\ell,m}(y)=\frac{2\ell+1}{4\pi}P_{\ell}(\langle x,y\rangle),$$

where $\langle x,y\rangle$ denotes the standard inner product in $\mathbb{R}^{3}$ , and $P_{\ell}(\cdot)$ represents the $\ell$ -th Legendre polynomial, defined as usual by

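$$P_{\ell}(t)=\frac{1}{2^{\ell}\ell!}\frac{d^{\ell}}{dt^{\ell}}(t^{2}-1)^{\ell},\qquad t\in[-1,1],\ \ell\geq 0.$$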

It is easy to show that $P_{\ell}(1)=1;$ moreover, the following reproducing property is satisfied, i.e.,

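$$\int_{\mathbb{S}^{2}}\frac{2\ell+1}{4\pi}P_{\ell}(\langle x,y\rangle)\frac{2\ell+1}{4\pi}P_{\ell}(\langle y,z\rangle)\,dy=\frac{2\ell+1}{4\pi}P_{\ell}(\langle x,z\rangle).$$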
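The Rodrigues formula for $P_{\ell}$ and the normalization $P_{\ell}(1)=1$ can be checked numerically. The following Python sketch is only an illustration: it assumes NumPy's `numpy.polynomial` module, and the helper names `legendre_rodrigues` and `legendre_numpy` are hypothetical, introduced here for the example.

```python
import math

import numpy as np
from numpy.polynomial import Polynomial
from numpy.polynomial import legendre as L


def legendre_rodrigues(ell: int) -> Polynomial:
    """P_ell from Rodrigues' formula, returned as a power-series polynomial."""
    base = Polynomial([-1.0, 0.0, 1.0]) ** ell           # (t^2 - 1)^ell
    deriv = base.deriv(ell) if ell > 0 else base          # d^ell / dt^ell
    return deriv / (2.0 ** ell * math.factorial(ell))


def legendre_numpy(ell: int) -> Polynomial:
    """P_ell from NumPy's Legendre basis, converted to power-series form."""
    return Polynomial(L.leg2poly([0.0] * ell + [1.0]))


if __name__ == "__main__":
    for ell in range(6):
        p = legendre_rodrigues(ell)
        same = np.allclose(p.coef, legendre_numpy(ell).coef)
        print(ell, same, p(1.0))
```

Both constructions should agree coefficient by coefficient, and each polynomial should evaluate to one at $t=1$ up to floating-point error.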

Under isotropy, from (1) and (2) the covariance function $\Gamma(x,y)={\mathbb{E}}[T(x)T(y)]$ satisfies

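$$\Gamma(x,y)=\sum_{\ell=0}^{\infty}C_{\ell}\frac{2\ell+1}{4\pi}P_{\ell}(\langle x,y\rangle),\qquad\text{for all }x,y\in\mathbb{S}^{2}.$$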

In the sequel, given any two positive sequences $\{a_{k},\ k\in\mathbb{N}\}$ , $\{b_{k},\ k\in\mathbb{N}\}$ , we shall write $a_{k}\sim b_{k}$ if $\exists c_{1},c_{2}>0$ such that $c_{1}b_{k}\leq a_{k}\leq c_{2}b_{k},\forall k\in\mathbb{N}$ . In addition, we will denote by $const$ a positive real constant, which may change from line to line; also, we use $\|\cdot\|_{L^{2}(\mathbb{S}^{2})}$ for the usual $L^{2}$ norm on the sphere, $\Lambda_{\min}(A)$ and $\Lambda_{\max}(A)$ for the minimum and maximum eigenvalues of the matrix $A$ , respectively, $\|A\|_{\text{op}}$ for the operator norm of $A$ , i.e., $\|A\|_{\text{op}}\ =\sqrt{\Lambda_{\max}(A^{\prime}A)}$ , and $\text{Tr}(A)$ for the trace of $A$ .

2.2 Hermite Polynomials and Stein-Malliavin Results

Let us recall the family of Hermite polynomials $\{H_{q}(\cdot),\ q\geq 0\}$ , defined by

$$H_{q}(x):=(-1)^{q}e^{x^{2}/2}\frac{d^{q}}{dx^{q}}e^{-x^{2}/2},\qquad x\in\mathbb{R};$$

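for instance, the first few are given by $H_{1}(x)=x$ , $H_{2}(x)=x^{2}-1$ , $H_{3}(x)=x^{3}-3x$ and $H_{4}(x)=x^{4}-6x^{2}+3$ . The sequence $\{(q!)^{-1/2}H_{q}(\cdot),\ q\geq 0\}$ is an orthonormal basis of the space of finite variance transforms of Gaussian variables, i.e.,

$$G(X)=\sum_{q=0}^{\infty}J_{q}(G)\frac{H_{q}(X)}{q!},\qquad\text{for }G\text{ s.t. }\mathbb{E}G^{2}(X)<\infty,\quad J_{q}(G):=\mathbb{E}\,G(X)H_{q}(X);$$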

more generally, for the space $L^{2}(\Omega)$ generated by any Gaussian random measure we can write the Stroock-Varadhan decomposition

$$L^{2}(\Omega)=\bigoplus_{q=0}^{\infty}\mathcal{H}_{q},$$

where $\mathcal{H}_{q}$ is the $q$ -th order Wiener chaos, i.e., the space spanned by linear combinations of $q$ -th order Hermite polynomials, see [28] for more discussions and details.
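The explicit Hermite polynomials quoted above, and the orthogonality relation ${\mathbb{E}}[H_{q}(X)H_{q^{\prime}}(X)]=q!\,\delta_{q}^{q^{\prime}}$ for standard Gaussian $X$ that underlies the orthonormal basis, can be verified with a short Python sketch. It assumes NumPy's `numpy.polynomial.hermite_e` module (the probabilists' Hermite polynomials, which match the definition used here); the function names are hypothetical.

```python
import numpy as np
from numpy.polynomial import hermite_e as He


def hermite_power_coeffs(q: int) -> np.ndarray:
    """Power-series coefficients of H_q, lowest degree first."""
    return He.herme2poly([0.0] * q + [1.0])


def gaussian_inner_product(q1: int, q2: int, deg: int = 20) -> float:
    """E[H_q1(X) H_q2(X)] for X ~ N(0, 1), via Gauss-Hermite quadrature."""
    nodes, weights = He.hermegauss(deg)        # weight function exp(-x^2 / 2)
    vals = He.hermeval(nodes, [0.0] * q1 + [1.0])
    vals = vals * He.hermeval(nodes, [0.0] * q2 + [1.0])
    # The weights integrate against exp(-x^2/2); divide by sqrt(2*pi)
    # to turn the integral into a Gaussian expectation.
    return float(np.sum(weights * vals) / np.sqrt(2.0 * np.pi))


if __name__ == "__main__":
    print(hermite_power_coeffs(4))             # coefficients of x^4 - 6x^2 + 3
    print(gaussian_inner_product(3, 3))        # close to 3! = 6
    print(gaussian_inner_product(2, 3))        # close to 0
```

The quadrature with 20 nodes is exact for polynomial integrands up to degree 39, which is more than enough for the low orders used in this check.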

We shall exploit extensively a very powerful technique, recently discovered by [27], to establish Quantitative Central Limit Theorems for sequences of random variables belonging to Wiener chaoses. To explain what we mean by a Quantitative Central Limit Theorem, we recall first the notion of Wasserstein distance, i.e., for any two $d$ -dimensional random variables $X,Y,$

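$$d_{W}(X,Y)=\sup_{h(\cdot):\,\|h\|_{Lip}\leq 1}\left|\mathbb{E}[h(X)]-\mathbb{E}[h(Y)]\right|,\qquad\text{where }\|h\|_{Lip}=\sup_{\substack{x,y\in\mathbb{R}^{d}\\ x\neq y}}\frac{|h(x)-h(y)|}{\|x-y\|},$$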

with $\|\cdot\|$ the usual Euclidean norm on $\mathbb{R}^{d}$ , where we assume that ${\mathbb{E}}|h(X)|<\infty$ , ${\mathbb{E}}|h(Y)|<\infty$ for every $h(\cdot)$ . See [28] for a discussion of the main properties of $d_{W}$ and for other examples of probability metrics; here we recall simply that

$$d_{W}(X,Y)\leq\mathbb{E}\|X-Y\|.$$

It is shown in [28] that for sequences of zero-mean scalar random variables $\left\{Z_{k},\ k\in\mathbb{N}\right\}$ belonging to $\mathcal{H}_{q}$ ( $q\geq 2$ ) such that ${\mathbb{E}}[Z_{k}^{2}]=\sigma^{2}>0$ , one has the remarkable inequality

$$d_{W}(Z,Z_{k})\leq\frac{1}{\sigma}\sqrt{\frac{2q-2}{3\pi q}\big(\mathbb{E}[Z_{k}^{4}]-3\sigma^{4}\big)}\ ,\tag{4}$$

where $Z\overset{d}{=}\mathcal{N}(0,\sigma^{2})$ (in our proof below we will actually exploit a multivariate extension of this inequality, also given in [28]). The inequality in (4) can be proved by means of the so-called Stein-Malliavin approach, which establishes a deep and surprising connection between Malliavin calculus and Stein’s equation as a tool for the investigation of limiting distributions. In particular, in view of (4) for sequences that belong to Wiener chaoses the investigation of the asymptotic behaviour of the fourth-moment is enough to investigate not only the validity of a central limit theorem, but also the rate of convergence to the Gaussian limiting distribution.
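To make the fourth-moment bound concrete, consider the illustrative sequence $Z_{k}=k^{-1/2}\sum_{i=1}^{k}H_{2}(X_{i})$ with $X_{i}$ i.i.d. standard Gaussian (an example chosen here, not taken from the paper). It belongs to the second chaos, with $\sigma^{2}=2$ and ${\mathbb{E}}[Z_{k}^{4}]=12+48/k$, so the right-hand side of the inequality equals $\sqrt{8/(\pi k)}$. The Python sketch below evaluates the bound; the function name `fourth_moment_bound` is hypothetical.

```python
import math


def fourth_moment_bound(q: int, sigma: float, fourth_moment: float) -> float:
    """Right-hand side of the fourth-moment bound for an element of the
    q-th Wiener chaos with variance sigma^2 and the given fourth moment."""
    if q < 2:
        raise ValueError("the bound is stated for chaos order q >= 2")
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # For chaos elements the excess E[Z^4] - 3 sigma^4 is non-negative;
    # clamp at zero only to guard against floating-point rounding.
    excess = max(fourth_moment - 3.0 * sigma ** 4, 0.0)
    return math.sqrt((2.0 * q - 2.0) / (3.0 * math.pi * q) * excess) / sigma


if __name__ == "__main__":
    for k in (10, 100, 1000):
        bound = fourth_moment_bound(2, math.sqrt(2.0), 12.0 + 48.0 / k)
        print(k, bound, math.sqrt(8.0 / (math.pi * k)))
```

In this example the bound decays at rate $k^{-1/2}$, the rate one would expect from a classical Berry-Esseen type result.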

3 Spherical Random Fields with Temporal Dependence

We are now ready to introduce our model of interest. As usual, by space-time spherical random fields we mean a collection of random variables $\{T(x,t),\ (x,t)\in{\mathbb{S}^{2}}\times\mathbb{Z}\}$ such that the application $T:\Omega\times{\mathbb{S}^{2}}\times\mathbb{Z}\rightarrow\mathbb{R}$ is $\Im\otimes\mathcal{B}({\mathbb{S}^{2}}\times\mathbb{Z})$ -measurable, for some probability space $(\Omega,\Im,\mathbb{P})$ . The following definition is standard:

Definition 1.

$\{T(x,t),\ (x,t)\in{\mathbb{S}^{2}}\times\mathbb{Z}\}$ is 2-weakly isotropic stationary if ${\mathbb{E}}[T(x,t)]$ is constant $\forall(x,t)\in{\mathbb{S}^{2}}\times\mathbb{Z}$ and the covariance function $\Gamma$ is a spatially isotropic and temporally stationary function on $({\mathbb{S}^{2}}\times\mathbb{Z})^{2}$ , that is, there exists $\Gamma_{0}:[-1,1]\times\mathbb{Z}\rightarrow\mathbb{R}$ such that

$$\Gamma(x,t,y,s)=\Gamma_{0}(\langle x,y\rangle,t-s),\qquad\forall(x,t),(y,s)\in\mathbb{S}^{2}\times\mathbb{Z}.$$

In particular, we will focus on Gaussian random fields, where of course weak stationarity entails strong isotropy and stationarity, i.e., the law of $T(g\cdot,\cdot+\tau)$ is the same as the law of $T(\cdot,\cdot),$ in the sense of processes, for all $g\in SO(3)$ and $\tau\in\mathbb{Z}$ . Note that, for (zero-mean) finite variance random fields, $T_{t}(\cdot)\equiv T(\cdot,t)$ is a random function of $L^{2}({\mathbb{S}^{2}})$ , $t\in\mathbb{Z}$ . Thus, for any fixed $t\in\mathbb{Z}$ , the following spectral representation holds:

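$$T_{t}(x)=\sum_{\ell=0}^{\infty}\sum_{m=-\ell}^{\ell}a_{\ell,m}(t)Y_{\ell,m}(x),\tag{5}$$

where $\{Y_{\ell,m}(\cdot),\ \ell\geq 0,\ -\ell\leq m\leq\ell\}$ are spherical harmonics, and $\{a_{\ell,m}(t),\ \ell\geq 0,\ -\ell\leq m\leq\ell\}$ are (zero-mean) random coefficients which satisfy

$$\mathbb{E}[a_{\ell,m}(t)a_{\ell',m'}(s)]=C_{\ell}(t-s)\,\delta_{\ell}^{\ell'}\delta_{m}^{m'},\qquad t,s\in\mathbb{Z}.$$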

Note that $\{C_{\ell}(0),\,\ell\geq 0\}$ corresponds to the angular power spectrum of the spherical field at a given time point, for which we will simply write $\{C_{\ell}\}$ . As for the isotropic case, for fixed $t,s\in\mathbb{Z}$ , the covariance function $\Gamma(x,t,y,s)$ is easily shown to have a spectral decomposition in terms of Legendre polynomials (Schoenberg’s Theorem, see also [4]), i.e., for every $(x,t),(y,s)\in{\mathbb{S}^{2}}\times\mathbb{Z}$ ,

$$\Gamma(x,t,y,s)=\sum_{\ell=0}^{\infty}C_{\ell}(t-s)\frac{2\ell+1}{4\pi}P_{\ell}(\langle x,y\rangle).$$

Remark 2.

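By exploiting results from [29], it would also be possible to rewrite (5) by means of the Cramér-Karhunen-Loève representation

$$T_{t}(x)=\int_{-\pi}^{\pi}\sum_{\ell=0}^{\infty}\sum_{m=-\ell}^{\ell}\exp(-i\lambda t)Y_{\ell,m}(x)\,dW_{\ell,m}(\lambda),\qquad\text{in }L^{2}(\Omega),$$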

where $\left\{W_{\ell,m}(\cdot)\right\}$ is a family of independent complex-valued Gaussian random measures, with mean zero and covariance structure

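$$\mathbb{E}[W_{\ell,m}(A)\overline{W}_{\ell,m}(B)]=\int_{A\cap B}f_{\ell}(\lambda)\,d\lambda,\qquad\text{for all }A,B\subset[-\pi,\pi],$$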

where $f_{\ell}(\cdot)$ denotes the spectral density of the process $\left\{a_{\ell,m}(t),\ t\in\mathbb{Z}\right\},$ which is introduced below and satisfies

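$$\mathbb{E}[a_{\ell,m}(t)a_{\ell,m}(t+\tau)]=\int_{-\pi}^{\pi}\exp(i\lambda\tau)f_{\ell}(\lambda)\,d\lambda.$$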

This approach is not pursued here, see also [8] for more discussion and details.

3.1 Spherical Autoregressions

Consider now two zero-mean Gaussian isotropic stationary random fields

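$$\{T(x,t),\ (x,t)\in\mathbb{S}^{2}\times\mathbb{Z}\}\quad\text{and}\quad\{Z(x,t),\ (x,t)\in\mathbb{S}^{2}\times\mathbb{Z}\},$$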

so that ${\mathbb{E}}[T^{2}(x,t)]<\infty$ and ${\mathbb{E}}[Z^{2}(x,t)]<\infty$ . Let us start from the definition of a Gaussian spherical white noise process.

Definition 3 (Gaussian Spherical White Noise).

$\{Z(x,t),\ (x,t)\in{\mathbb{S}^{2}}\times\mathbb{Z}\}$ is a sequence of independent and identically distributed Gaussian isotropic spherical random fields. That is,

a) for every fixed $t\in\mathbb{Z}$ , $Z(\cdot,t)$ is a Gaussian, zero-mean isotropic random field, with covariance function

$$\Gamma_{Z}(x,y)=\sum_{\ell=0}^{\infty}\frac{2\ell+1}{4\pi}C_{\ell;Z}P_{\ell}(\langle x,y\rangle),\qquad\sum_{\ell=0}^{\infty}\frac{2\ell+1}{4\pi}C_{\ell;Z}<\infty;$$

here, $\left\{C_{\ell;Z}\right\}$ denotes as usual the angular power spectrum of $Z(\cdot,t)$ ;

b) for every $t\neq s,$ the random fields $Z(\cdot,t)$ and $Z(\cdot,s)$ are independent.

Remark 4.

Note that we are defining the field as a collection of random variables defined on every pair $(x,t)\in\mathbb{S}^{2}\times\mathbb{Z}.$ Alternatively, one could view the fields as random elements in a Hilbert space (in our case, corresponding to $L^{2}(\mathbb{S}^{2}),$ see [5]); the two approaches are equivalent here, because throughout this paper we will always be dealing with mean-square continuous random fields.

Definition 5.

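A spherical isotropic kernel operator is an application $\Phi:L^{2}(\mathbb{S}^{2})\rightarrow L^{2}(\mathbb{S}^{2})$ which satisfies

$$(\Phi f)(\cdot)=\int_{\mathbb{S}^{2}}k(\langle\cdot,y\rangle)f(y)\,dy,\qquad\text{for some }k(\cdot)\in L^{2}[-1,1].$$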

The following representation holds, in the $L^{2}$ sense, for the kernel associated to $\Phi$ :

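$$k(\langle x,y\rangle)=\sum_{\ell=0}^{\infty}\phi_{\ell}\frac{2\ell+1}{4\pi}P_{\ell}(\langle x,y\rangle).\tag{6}$$

The coefficients $\{\phi_{\ell},\ \ell\geq 0\}$ correspond to the eigenvalues of the operator $\Phi$ and the associated eigenfunctions are the family of spherical harmonics $\left\{Y_{\ell,m}\right\}$ , yielding

$$\Phi Y_{\ell,m}=\phi_{\ell}Y_{\ell,m},$$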

Thus, it holds $\sum_{\ell}(2\ell+1)\phi_{\ell}^{2}<\infty$ , and hence this operator is Hilbert-Schmidt (see, e.g., [19]). In this paper, we shall also consider trace class operators, namely such that $\sum_{\ell}(2\ell+1)|\phi_{\ell}|<\infty$ , for which the representation (6) holds pointwise for every $x,y\in{\mathbb{S}^{2}}$ .
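For a finite set of eigenvalues, the kernel expansion in Legendre polynomials and the two summability conditions (Hilbert-Schmidt and trace class) are straightforward to evaluate. The Python sketch below is an illustration under the assumption that the eigenvalue sequence is truncated at a finite multipole; the function names `isotropic_kernel` and `summability` are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre as L


def isotropic_kernel(phi, z):
    """Evaluate k(z) = sum_l phi_l (2l+1)/(4 pi) P_l(z) for z in [-1, 1].

    phi: eigenvalues phi_0, ..., phi_Lmax (a truncated sequence).
    """
    phi = np.asarray(phi, dtype=float)
    ell = np.arange(phi.size)
    coeffs = phi * (2 * ell + 1) / (4.0 * np.pi)   # Legendre-series coefficients
    return L.legval(z, coeffs)


def summability(phi):
    """Return (sum (2l+1)|phi_l|, sum (2l+1) phi_l^2) for a truncated sequence."""
    phi = np.asarray(phi, dtype=float)
    weights = 2 * np.arange(phi.size) + 1
    return float(np.sum(weights * np.abs(phi))), float(np.sum(weights * phi ** 2))


if __name__ == "__main__":
    phi = [0.5, 0.3, 0.1]
    print(isotropic_kernel(phi, np.linspace(-1.0, 1.0, 5)))
    print(summability(phi))
```

Since $P_{\ell}(1)=1$, the value of the kernel at $z=1$ equals the trace-type sum $\sum_{\ell}(2\ell+1)\phi_{\ell}$ divided by $4\pi$, which gives a simple sanity check.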

Definition 6.

The collection of random variables $\left\{T(x,t),(x,t)\in\mathbb{S}^{2}\times\mathbb{Z}\right\}$ satisfies the Spherical Autoregressive process of order $p$ (written $SPHAR(p)$ ) if there exist $p$ isotropic kernel operators $\left\{\Phi_{1},\dots,\Phi_{p}\right\}$ such that

$$T_{t}(x)-(\Phi_{1}T_{t-1})(x)-\dots-(\Phi_{p}T_{t-p})(x)-Z_{t}(x)=0,\tag{7}$$

for all $(x,t)\in\mathbb{S}^{2}\times\mathbb{Z}$ , the equality holding both in the $L^{2}(\Omega)$ and in the $L^{2}(\Omega\times\mathbb{S}^{2})$ sense.

Remark 7.

It should be noted that the solution process is defined pointwise, i.e., for each $(x,t)$ there exists a random variable defined on $(\Omega,\Im,\mathbb{P})$ such that the identity (7) holds.

Let us define the eigenvalues $\left\{\phi_{\ell;j},\ \ell\geq 0,\ j=1,\dots,p\right\}$ which satisfy

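$$\Phi_{j}Y_{\ell,m}=\phi_{\ell;j}Y_{\ell,m},\qquad\text{and hence}\qquad k_{j}(\langle x,y\rangle)=\sum_{\ell=0}^{\infty}\phi_{\ell;j}\frac{2\ell+1}{4\pi}P_{\ell}(\langle x,y\rangle).$$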

Hence, for any $t\in\mathbb{Z}$ ,

$$(\Phi_{j}T_{t-j})(x)=\sum_{\ell=0}^{\infty}\sum_{m=-\ell}^{\ell}\phi_{\ell;j}a_{\ell,m}(t-j)Y_{\ell,m}(x),$$

that is $(\Phi_{j}T_{t-j})(\cdot)$ admits a spectral representation in terms of spherical harmonics with coefficients $\{\phi_{\ell;j}a_{\ell,m}(t-j),\ \ell\geq 0,\ -\ell\leq m\leq\ell\}$ . Likewise, we obtain

$$a_{\ell,m}(t)=\phi_{\ell;1}a_{\ell,m}(t-1)+\dots+\phi_{\ell;p}a_{\ell,m}(t-p)+a_{\ell,m;Z}(t);$$

to ensure identifiability, we assume that there exists at least one $\ell$ such that $\phi_{\ell;p}\neq 0,$ so that $\Pr\{(\Phi_{p}T_{t})(\cdot)\neq 0\}>0,\ t\in\mathbb{Z},$ see again [5]. Let us now define as usual the associated polynomials $\phi_{\ell}:\mathbb{C}\rightarrow\mathbb{C},\ \ell\geq 0$ :

[TABLE]

Condition 8.

The sequence of polynomials (9) is such that $|z|\leq 1+\delta\ \Rightarrow\ \phi_{\ell}(z)\neq 0$ , for some $\delta>0$ . More explicitly, there are no roots in a $\delta$ -enlargement of the unit disk, for all $\ell\geq 0$ .
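Condition 8 can be checked numerically multipole by multipole. The Python sketch below assumes that the polynomials in (9) follow the usual autoregressive convention $\phi_{\ell}(z)=1-\phi_{\ell;1}z-\dots-\phi_{\ell;p}z^{p}$; this form is an assumption made here for illustration, chosen because it reproduces the $SPHAR(1)$ requirement $|\phi_{\ell}|<\frac{1}{1+\delta}$. The function name `satisfies_condition_8` is hypothetical.

```python
import numpy as np


def satisfies_condition_8(phi_l, delta: float) -> bool:
    """Check that 1 - phi_1 z - ... - phi_p z^p has no root with |z| <= 1 + delta.

    phi_l: the coefficients [phi_{l;1}, ..., phi_{l;p}] for one multipole l.
    The polynomial form is an assumed convention, see the text above.
    """
    phi_l = np.asarray(phi_l, dtype=float)
    # np.roots expects coefficients from the highest degree down:
    # -phi_p, ..., -phi_1, 1.
    coeffs = np.concatenate((-phi_l[::-1], [1.0]))
    roots = np.roots(coeffs)
    # With no roots at all (all phi equal to zero) the condition holds trivially.
    return bool(np.all(np.abs(roots) > 1.0 + delta))


if __name__ == "__main__":
    delta = 0.5
    print(satisfies_condition_8([0.5], delta))       # root at z = 2
    print(satisfies_condition_8([0.8], delta))       # root at z = 1.25
    # The same delta must work for every multipole l simultaneously.
    print(all(satisfies_condition_8(p, delta) for p in ([0.5], [0.2, 0.1])))
```

The last line reflects the uniformity in $\ell$ required by the condition: a single $\delta$ must work for the whole sequence of polynomials.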

Remark 9.

Under Condition 8, Equation (7) admits a unique stationary isotropic solution; the proof can be given along the same lines as in [5], and it is omitted for brevity’s sake, see [8] for more discussion and details.

Example 10 ( $SPHAR(1)$ ).

The family of random variables $\{T(x,t),\ (x,t)\in\mathbb{S}^{2}\times\mathbb{Z}\}$ is a spherical $AR(1)$ process if for all pairs $(x,t)\in\mathbb{S}^{2}\times\mathbb{Z}$ it satisfies

[TABLE]

In this case, Condition 8 simply becomes $|\phi_{\ell}|<\frac{1}{1+\delta}$ , $\ell\geq 0$ .
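In harmonic space, a $SPHAR(1)$ process reduces to independent scalar autoregressions $a_{\ell,m}(t)=\phi_{\ell}a_{\ell,m}(t-1)+a_{\ell,m;Z}(t)$, one for each pair $(\ell,m)$. The Python sketch below simulates these coefficients up to a finite multipole and recovers each $\phi_{\ell}$ by pooling the $2\ell+1$ series in a least squares regression. The per-multipole estimator used here is a natural choice adopted for illustration and may differ in detail from the estimator studied in Section 4; the function names, the burn-in length and the parameter values are all assumptions.

```python
import numpy as np


def simulate_sphar1_coefficients(phi, c_z, n, rng, burn_in=200):
    """Simulate a_{l,m}(t) = phi_l a_{l,m}(t-1) + a_{l,m;Z}(t), t = 1..n.

    phi: eigenvalues phi_0..phi_Lmax; c_z: noise power spectrum C_{l;Z}.
    Returns a list whose l-th entry has shape (n, 2l+1).
    """
    coeffs = []
    for ell, (phi_l, c_l) in enumerate(zip(phi, c_z)):
        noise = rng.normal(scale=np.sqrt(c_l), size=(n + burn_in, 2 * ell + 1))
        a = np.zeros_like(noise)
        for t in range(1, n + burn_in):
            a[t] = phi_l * a[t - 1] + noise[t]
        coeffs.append(a[burn_in:])      # discard burn-in to approach stationarity
    return coeffs


def least_squares_phi(coeffs):
    """Per-multipole least squares estimate of phi_l, pooling over m and t."""
    estimates = []
    for a in coeffs:
        numerator = np.sum(a[1:] * a[:-1])
        denominator = np.sum(a[:-1] ** 2)
        estimates.append(numerator / denominator)
    return np.array(estimates)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    phi_true = [0.5, -0.3, 0.2]
    sample = simulate_sphar1_coefficients(phi_true, [1.0, 1.0, 1.0], 5000, rng)
    print(least_squares_phi(sample))
```

Higher multipoles pool more series ($2\ell+1$ of them), so for a fixed sample size their eigenvalues are typically estimated with smaller variance.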

Remark 11.

The autocovariance function of a stationary spherical $AR(1)$ process is easily seen to be given by (writing $\tau=t-s)$

[TABLE]

It is hence easy to envisage a number of parametric models for sphere-time covariances; for instance, a simple proposal is

[TABLE]

Here, the parameters $\alpha_{Z},\alpha_{\phi}$ control, respectively, the smoothness of the innovation process and the regularity of the autoregressive kernel (see [22]); the positive integer $\ell^{\ast}$ can be seen as a sort of "characteristic scale", where the power of the kernel is concentrated. More generally, we can take $\phi_{\ell}=G(\ell;\alpha_{1},\dots,\alpha_{q}),$ where $\alpha_{1},\dots,\alpha_{q}$ are fixed parameters and $G$ is any function such that

[TABLE]

uniformly over all values of $(\alpha_{1},\dots,\alpha_{q}).$

Condition 12 (Identifiability).

The Gaussian spherical white noise process $\left\{Z(x,t)\right\}$ is such that $C_{\ell;Z}>0$ for all $\ell=0,1,2,\dots.$

Remark 13.

The previous condition is an identifiability assumption; indeed, it is simple to verify from our arguments below that for $C_{\ell;Z}=0$ the component of the kernel corresponding to the $\ell$ -th multipole is not observable, i.e., the $AR(p)$ process has the same distribution whatever the value of $\phi_{\ell}.$ It is possible, however, to estimate the "sufficient" version of the kernel, i.e., its projection on the relevant subspace, such that $C_{\ell;Z}>0.$ The extension is straightforward and we avoid it just for brevity and notational simplicity. Of course, as a consequence we have that

[TABLE]

4 Main Results

Throughout this paper, we shall assume that we are able to observe the projections of the fields on the orthonormal basis $\left\{Y_{\ell,m}\right\},$ i.e., we assume to observe

[TABLE]

The estimator we shall focus on is a form of least squares regression on an increasing subset of the orthonormal system $\left\{Y_{\ell,m}\right\};$ more precisely, we shall write $k(\cdot):=(k_{1}(\cdot),\dots,k_{p}(\cdot))^{\prime}$ for the vector of nuclear kernels, and we shall consider a growing sequence of integers $L_{N},$ $L_{N}\rightarrow\infty$ as $N\rightarrow\infty,$ together with the vector of estimators

[TABLE]

where $N:=n-p$ , $N>p$ , and $\mathcal{P}_{N}^{p}$ is the Cartesian product of $p$ copies of

[TABLE]

As common in the autoregressive context, we drop the first $p$ observations when computing our estimators, in order to avoid initialization issues. We shall write $L_{N}(\cdot)$ for the function $L_{N}(\cdot):[-1,1]\rightarrow\mathbb{R}$ ,

[TABLE]

Note that

[TABLE]

on the other hand, for $z\in(-1,1)$ we have the identity (see [33, 17])

[TABLE]

it is then possible to show that (see Lemma 4 in the Supplementary Material)

[TABLE]

where $\simeq$ indicates that the ratio of left- and right-hand sides converges to unity.

For our results to follow, we need slightly stronger assumptions on the "high frequency" behaviour of the kernels $k_{j}(\cdot).$ More precisely, we shall introduce the following:

Condition 14 (Smoothness).

For all $j=1,\dots,p$ there exist positive constants $\beta_{j},\gamma_{j}$ such that

[TABLE]

We let $\beta_{\ast}=\min_{j\in\{1,\dots,p\}}\beta_{j}$ . We shall say that this condition is satisfied in the strong sense if $\beta_{j}>2,\ j=1,\dots,p$ .

Remark 15.

It is readily seen that Condition 14 leads to Hilbert-Schmidt operators, since it implies $\sum_{\ell}(2\ell+1)\phi_{\ell;j}^{2}<\infty$ , $j=1,\dots,p$ ; whereas the strong version of Condition 14 is specific to nuclear operators, since it entails $\sum_{\ell}(2\ell+1)|\phi_{\ell;j}|<\infty$ , $j=1,\dots,p$ , see again [19].

Remark 16.

Condition 14 is easily interpretable in terms of the regularity of each kernel $k_{j}(\cdot)$ . Indeed, in [22] it is shown that

[TABLE]

implies integrability of the first $\eta$ derivatives of $k_{j}(\cdot),$ i.e., $k_{j}(\cdot)$ belongs to the Sobolev space $W_{1,\eta}$ .

Our first result refers to the asymptotic consistency of the kernel estimators that we just introduced.

Theorem 17 (Consistency).

Consider $\widehat{{k}}_{N}(\cdot)$ in Equation (11). Under Conditions 8, 12 and 14, for $L_{N}\sim N^{d},\ 0<d<1$ , we have that

[TABLE]

Moreover, under Conditions 8, 12 and 14 (in the strong sense), for $L_{N}\sim N^{d},\ 0<d<\frac{1}{3}$ ,

[TABLE]

Remark 18 (Optimal choice of $d$ ).

The optimal choice of $d,$ in terms of the best convergence rates, is given by $d^{\ast}=\frac{1}{2\beta_{\ast}-1},$ leading to the exponents $\frac{2-2\beta_{\ast}}{2\beta_{\ast}-1}$ and $\frac{2-\beta_{\ast}}{2\beta_{\ast}-1},$ respectively. Heuristically, the result can be explained as follows: larger values of $\beta_{\ast}$ entail higher regularity/smoothness properties of the kernels to be estimated; as usual in nonparametric estimation, more regular functions can be estimated with better convergence rates, as the bias term is controlled more efficiently. Indeed, for $d=d^{\ast}$ and $\beta_{\ast}\rightarrow\infty$ , the mean square error approximates the parametric rate $1/N$ , as expected.
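The arithmetic of this remark is easy to tabulate. The Python sketch below computes $d^{\ast}$ and the two exponents exactly, using rational arithmetic, together with the lower threshold $\frac{1}{2\beta_{\ast}-2}$ on $d$ that appears in Theorem 20. The function name `optimal_rates` and the returned dictionary keys are hypothetical; the first exponent is the one that tends to $-1$ (the parametric mean-square rate) as $\beta_{\ast}$ grows.

```python
from fractions import Fraction


def optimal_rates(beta_star):
    """Exact values of d* and the associated exponents for a given beta_*.

    Assumes beta_* > 2 (Condition 14 in the strong sense), so that
    0 < d* < 1/3 and all denominators are positive.
    """
    beta = Fraction(beta_star)
    if beta <= 2:
        raise ValueError("the strong version of Condition 14 needs beta_* > 2")
    return {
        "d_star": 1 / (2 * beta - 1),
        "first_exponent": (2 - 2 * beta) / (2 * beta - 1),
        "second_exponent": (2 - beta) / (2 * beta - 1),
        "clt_threshold": 1 / (2 * beta - 2),   # Theorem 20 requires d above this
    }


if __name__ == "__main__":
    for beta in (3, 5, 50):
        print(beta, optimal_rates(beta))
```

For every admissible $\beta_{\ast}$ the value $d^{\ast}$ lies strictly below the threshold $\frac{1}{2\beta_{\ast}-2}$, so the bandwidth that is optimal for consistency is not admissible for the Quantitative Central Limit Theorem.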

Remark 19 (Plug-in estimates).

For applications to empirical data, the optimal rate can be implemented by means of plug-in techniques, i.e., estimating (under additional regularity conditions) the value of the parameter $\beta_{\ast}$ by means of first-step estimators of the coefficients $\phi_{\ell;j}.$ Let us sketch the main ideas for this approach, omitting some details for brevity. Consider for simplicity the $SPHAR(1)$ case, and let us make Condition 14 stronger by assuming that

[TABLE]

Consider the estimator

[TABLE]

from which we can now build the pseudo log-regression model

[TABLE]

where the "regression residuals" $\left\{v_{\ell}\right\}$ are independent over $\ell$ , with asymptotically mean zero and bounded variance as $N\rightarrow\infty.$ It is then possible to study the asymptotic consistency of the OLS-like estimator (see also [32] for the related log-periodogram estimator)

[TABLE]

The optimal rates can then be consistently estimated by means of the plug-in estimates $\widehat{d}_{N}^{\ast}=\frac{1}{2\widehat{\beta}_{N}-1}.$

A more rigorous and complete investigation on these issues is currently in preparation and is not reported here for brevity’s sake.

Our second result refers to a Quantitative Central Limit Theorem for the kernel estimators. Consider $\widehat{{k}}_{N}(\cdot)$ in Equation (11) and, for any $m\in\mathbb{N}$ , any $z_{1},\dots,z_{m}\in(-1,1)$ , $z_{1}\neq\cdots\neq z_{m}$ , define the $mp\times 1$ vectors

[TABLE]

Theorem 20.

Under Conditions 8, 12 and 14 (in the strong sense), for $L_{N}\sim N^{d},\ d>\frac{1}{2\beta_{\ast}-2}$ , we have that

[TABLE]

Remark 21.

It is easy to see that the bound in Theorem 20 can also be expressed as

[TABLE]

An immediate Corollary is the following.

Corollary 22.

Under the same Conditions and notation as in Theorem 20, for any fixed $z\in[-1,1]$ , we have that

[TABLE]

Remark 23.

Plug-in procedures can be exploited to determine the choice of the "bandwidth" parameter $d$ which yields the optimal convergence rate in Wasserstein distance. As usual, the values of $d$ that guarantee asymptotic normality do not minimize the mean squared error; in fact, we have that $d^{\ast}=\frac{1}{2\beta_{\ast}-1}<\frac{1}{2\beta_{\ast}-2},$ which is the minimal value of $d$ for Theorem 20 to hold. Indeed, asymptotic Gaussianity requires undersmoothing, i.e., a value of $d$ which makes the asymptotic bias negligible, rather than of the same order as the variance. Once again the rate can be taken to approach $N^{-1/2}$ for $\beta_{\ast}\rightarrow\infty.$

For our third and final result, we need to strengthen the conditions on the regularity of the autoregressive kernels.

Condition 24.

The kernel $k_{j}(\cdot)$ admits a finite expansion in the Legendre basis, i.e., there exists an (arbitrarily large but finite) integer $L>0$ such that

[TABLE]

Condition 24 clearly implies that there exist finite integers $L_{1},\dots,L_{p}\leq L$ such that

[TABLE]

We also need to introduce, for $\ell=0,1,2,\dots,$ the $p\times p$ autocovariance matrix

[TABLE]

and we shall write $W_{p}(\cdot)$ for the zero-mean, $p$ -dimensional Gaussian process with covariance function

[TABLE]

We are now able to state our last Theorem.

Theorem 25.

Under Conditions 8, 12 and 24, we have that

[TABLE]

where $\Longrightarrow$ denotes weak convergence in $C_{p}[-1,1]$ (the space of continuous functions from $[-1,1]$ to $\mathbb{R}^{p},$ with the standard uniform metric).

Remark 26.

At first sight, it may look surprising that the weak convergence for the estimators in Theorem 25 occurs at a faster rate $\sqrt{N}$ than the convergence in finite-dimensional distributions of Theorem 20. This comparison, however, is misleading; indeed, in Theorem 20 we are not assuming the expansion of the kernels to be finite, and therefore we need to include a growing number of multipoles $L_{N},$ to ensure that bias terms are asymptotically negligible. On the other hand, note that weak convergence cannot hold under the conditions of Theorem 20, as the limiting finite dimensional distributions correspond to Gaussian independent random variables for any choice of fixed points $(z_{1},\dots,z_{m}):$ no Gaussian process with measurable trajectories can have these finite-dimensional distributions. The limiting distribution is characterized by the nuisance parameters $\left\{C_{\ell},C_{\ell}(1),\dots,C_{\ell}(p-1),C_{\ell;Z}\right\};$ for brevity’s sake, estimation of these parameters is deferred to future work.

5 Proofs of the Main Results

We now present the main arguments of our proofs, which are based on a number of technical results collected in the Appendix (Supplementary Material). For $\ell=0,1,2,\dots,$ it is convenient to introduce the $N(2\ell+1)$ -dimensional vectors

[TABLE]

moreover, let us consider the $N(2\ell+1)\times p$ matrix

[TABLE]

where

[TABLE]

We start from the proof of the consistency results.

Proof (Theorem 17).

It is easy to see that we have

[TABLE]

where

[TABLE]

Now, let ${r}_{N}(z)$ be the difference between the kernel and its truncated version

[TABLE]

i.e.,

[TABLE]

where the equality holds in the $L^{2}$ sense. Then,

[TABLE]

since ${\mathbb{E}}\left[\int_{-1}^{1}\left\langle\widehat{{k}}_{N}(z)-{k}_{N}(z),{r}_{N}(z)\right\rangle dz\right]=0$ , from orthogonality of Legendre polynomials.

Now notice that

[TABLE]

Then, from Lemma 2 in the Supplementary material,

[TABLE]

On the other hand,

[TABLE]

Therefore, under Condition 14 and for $L_{N}\sim N^{d},\ 0<d<1$ , we have

[TABLE]

and

[TABLE]

where $\beta_{\ast}=\min_{j\in\{1,\dots,p\}}\beta_{j}$ , as claimed.

Under the strong version of Condition 14, each kernel $k_{j}(\cdot)$ is defined for all $z\in[-1,1]$ as the pointwise limit of its expansion in terms of Legendre polynomials and

[TABLE]

by the triangle inequality. Hence, for the first component we have

[TABLE]

again in view of Lemma 2 in the Appendix (Supplementary Material) and the Cauchy-Schwarz inequality. On the other hand

[TABLE]

Therefore, again under the strong version of Condition 14 and for $L_{N}\sim N^{d},\ 0<d<\frac{1}{3}$ , we have

[TABLE]

and thus

[TABLE]

as claimed.

We are now in the position to establish the Quantitative Central Limit Theorem.

Proof (Theorem 20).

Let us recall that the minimizing estimator takes the form

[TABLE]

where

[TABLE]

We shall introduce some more notation:

[TABLE]

and

[TABLE]

Therefore

[TABLE]

Heuristically, the proof of the Quantitative Central Limit Theorem can be described as follows: in order to be able to exploit Stein-Malliavin techniques, we need to deal with variables belonging to some $q$ -th order chaos; now the ratio above does not fulfill this requirement, because $A_{\ell;N}^{-1}$ is a random quantity which does not belong to any $\mathcal{H}_{q}$ . On the other hand, componentwise we have ${B}_{\ell;N}\in\mathcal{H}_{2}$ , for each $\ell$ . We shall then show that it is possible to replace $A_{\ell;N}^{-1}$ by its (deterministic) probability limit $\Sigma_{\ell}^{-1}$ , without affecting asymptotic results; because our kernel estimators will be written as linear combinations of $\widehat{\boldsymbol{\phi}}_{\ell;N}$ , the proof can be completed by a careful investigation of multivariate fourth-order cumulants.

Let us now make the previous argument rigorous. Let ${K}_{N}$ and ${U}_{N}$ be two $mp$ -dimensional random vectors, defined as

[TABLE]

and

[TABLE]

In particular, ${\mathbb{E}}[{U_{N}}]={0}_{mp}$ and ${\mathbb{E}}[{U}_{N}{U}_{N}^{\prime}]=V_{N}$ , where $V_{N}$ is a block matrix whose generic $ij$ -th block, $i,j\in\{1,\dots,m\}$ , is given by

[TABLE]

Now, consider ${Z}\overset{d}{=}\mathcal{N}_{mp}({0}_{mp},I_{mp})$ and ${Z}_{N}\overset{d}{=}\mathcal{N}_{mp}({0}_{mp},V_{N})$ . Applying the triangle inequality twice, it follows that

[TABLE]

We recall from [28], p. 126, Equation 6.4.2 that

[TABLE]

where $\|A\|_{\text{HS}}=\sqrt{\text{Tr}(A^{\prime}A)}$ , and we observe that

[TABLE]

from Lemmas 3 and 4 in the Supplementary Material. Indeed, for every $i\in\{1,\dots,m\}$ ,

[TABLE]

the logarithmic term comes from Equation (8) in the Supplementary Lemma 4. Equation (17) entails that $V_{N}\rightarrow I_{mp}$ , thus we have $\|V_{N}^{-1}\|_{\text{op}}\|V_{N}\|_{\text{op}}^{1/2}\rightarrow 1$ , as $N\rightarrow\infty$ , and

[TABLE]

Let us recall again from [28], p. 122 (second point of Theorem 6.2.2) that

[TABLE]

where

[TABLE]

$\tilde{b}_{\ell;N}(j)$ being the $j$ -th element of $\Sigma_{\ell}^{-1}{B}_{\ell;N}$ . Moreover, for the $j$ -th element of $\Sigma_{\ell}^{-1}{B}_{\ell;N}$ we have

[TABLE]

see Equation (4) in Lemma 1. In addition,

[TABLE]

in view of the independence across different multipoles $\ell$ . Therefore

[TABLE]

Thus, we have

[TABLE]

and

[TABLE]

Now, consider the decomposition

[TABLE]

Without loss of generality, we shall focus on the case $m=1$ ; the more general argument is basically identical, with a slightly more cumbersome notation. For $z\in(-1,1)$ ,

[TABLE]

and then, also using Hilb’s equation,

[TABLE]

where for the second inequality we have exploited the Supplementary Lemma 2. Likewise,

[TABLE]

From Equations (5) and (5),

[TABLE]

In the end, combining Equations (18), (19) and (22), it holds that

[TABLE]

Note that the constant in this bound may depend on the choice of $m$ and $z_{1},\dots,z_{m}$ .

We can now give the proof of the third (and final) result.

Proof (Theorem 25).

We have that, for $z\in[-1,1]$ ,

[TABLE]

Then,

[TABLE]

and hence

[TABLE]

in view of the Supplementary Lemma 2. Then the second part of the sum in (5) goes to zero in probability. Since the sum (over $\ell$ ) has independent components, we just need to prove that, for each $\ell=0,1,2,\dots,L$ , $\{{B}_{\ell;N}P_{\ell}(\cdot),\ N>1\}$ forms a tight sequence. Using the tightness criterion given in [3], Equation 13.14 on page 143, it is sufficient to show that, for $z_{1}\leq z\leq z_{2}$ ,

[TABLE]

Convergence of the finite-dimensional distributions is standard and we omit the details, which are close to those given in the proof of the previous Theorem. Thus the sequence converges weakly to a zero-mean multivariate Gaussian process with covariance function $\Gamma_{{k}_{L}}(z,z^{\prime})=\sum_{\ell=0}^{L}C_{\ell;Z}\Gamma_{\ell}^{-1}\frac{2\ell+1}{16\pi^{2}}P_{\ell}(z)P_{\ell}(z^{\prime})$ .
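For illustration, the limiting covariance function can be evaluated numerically in the scalar case $p=1$. The sketch below rests on an assumption of ours: for a stationary $AR(1)$ coefficient process, $\Gamma_{\ell}$ reduces to the variance $C_{\ell}=C_{\ell;Z}/(1-\phi_{\ell}^{2})$, so that $C_{\ell;Z}\Gamma_{\ell}^{-1}=1-\phi_{\ell}^{2}$. The function name `limiting_covariance` is hypothetical.

```python
import numpy as np
from numpy.polynomial.legendre import legvander


def limiting_covariance(z, z_prime, phi):
    """Evaluate sum_l (1 - phi_l^2) (2l+1)/(16 pi^2) P_l(z) P_l(z') for p = 1.

    phi holds the AR(1) coefficients phi_0, ..., phi_L (assumed |phi_l| < 1).
    """
    phi = np.asarray(phi, dtype=float)
    degree = phi.size - 1
    ells = np.arange(degree + 1)
    weights = (1.0 - phi ** 2) * (2 * ells + 1) / (16.0 * np.pi ** 2)
    # legvander returns the values P_0(x), ..., P_degree(x) along the last axis.
    p_z = legvander(np.atleast_1d(z), degree)[0]
    p_z_prime = legvander(np.atleast_1d(z_prime), degree)[0]
    return float(np.sum(weights * p_z * p_z_prime))


if __name__ == "__main__":
    phi = [0.0] + [0.5 * ell ** -3.0 for ell in range(1, 11)]
    print(limiting_covariance(0.3, -0.2, phi))
```

Since the sum is finite under Condition 24, no truncation error is involved; only the nuisance parameters need to be known or estimated.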

6 Some Numerical Evidence

In this section, we present some short numerical results to illustrate the models and methods that we discussed in this paper.

We stress first that random fields on the sphere cross time can be very conveniently generated by combining the general features of Python with the HEALPix software (see [15] and https://healpix.sourceforge.io). More precisely, HEALPix (which stands for Hierarchical Equal Area and iso-Latitude Pixelation) is a multi-purpose computer software package for a high resolution numerical analysis of functions on the sphere, based on a clever tessellation scheme: the spherical surface is hierarchically partitioned into curvilinear quadrilaterals of equal area (at a given resolution), distributed on lines of constant latitude, as suggested in the name. In particular, we shall make use of healpy, which is a Python package based on the HEALPix C++ library. HEALPix was developed to efficiently process Cosmic Microwave Background data from Cosmology experiments (like Planck, [30]), but it is now used in many other branches of Astrophysics and applied sciences.

In short, HEALPix makes it possible to create spherical maps according to the spectral representation (1), accepting as input either an array of random coefficients $\left\{a_{\ell,m}\right\}$ , or the angular power spectrum $\left\{C_{\ell}\right\}$ , by means of the routines alm2map and synfast: in the latter case, random $\left\{a_{\ell,m}\right\}$ are generated according to a Gaussian zero mean distribution with variance $\left\{C_{\ell}\right\}$ . The routine is extremely efficient and can generate maps of resolution up to a few thousand multipoles in a matter of seconds on a standard laptop computer.

In our case, however, we need random fields where the random spherical harmonics coefficients have themselves a temporal dependence structure. For this reason, we implemented a simple routine in Python, to simulate Gaussian $\left\{a_{\ell,m}(t)\right\}$ processes, each following an $AR(p)$ dependence structure. These random harmonic coefficients are then uploaded into HEALPix, to generate maps such as those that are given in Figure 1. In particular, in these two cases we fixed $L_{\max}=\max(\ell)=30,200$ , respectively. Then we generated $\left\{a_{\ell,m}(t)\right\}$ according to stationary $AR(1)$ processes, with parameters $\phi_{\ell}\simeq const\times\ell^{-3}$ ; similarly, we took here $C_{\ell;Z}\simeq const\times\ell^{-2}$ . In the figure, we report the realization for the first 4 periods, simply for illustrative purposes.
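A minimal sketch of such a simulation routine is given below; it is not the code used for the figures. It assumes real-valued harmonic coefficients (so that each multipole carries $2\ell+1$ of them), starts from $\ell=1$ because $\ell^{-\beta}$ is undefined at $\ell=0$, and uses a burn-in period to approximate stationarity. The function name, the constants `phi_const` and `cz_const`, and the burn-in length are our own choices; the conversion to the complex coefficient layout expected by HEALPix is not shown.

```python
import numpy as np


def simulate_sphar1_coefficients(l_max, n_times, beta=3.0, phi_const=0.5,
                                 cz_const=1.0, burn_in=200, seed=0):
    """Simulate a_{l,m}(t) = phi_l * a_{l,m}(t-1) + innovation, for l = 1..l_max.

    Returns (coeffs, phis): coeffs[l] has shape (n_times, 2l+1), phis[l] = phi_l.
    """
    rng = np.random.default_rng(seed)
    coeffs, phis = {}, {}
    for ell in range(1, l_max + 1):
        phi = phi_const * ell ** (-beta)      # phi_l ~ const * l^(-beta)
        c_z = cz_const * ell ** (-2.0)        # innovation variance C_{l;Z} ~ const * l^(-2)
        total = n_times + burn_in
        innovations = rng.normal(scale=np.sqrt(c_z), size=(total, 2 * ell + 1))
        path = np.empty_like(innovations)
        path[0] = innovations[0]
        for t in range(1, total):
            path[t] = phi * path[t - 1] + innovations[t]
        coeffs[ell] = path[burn_in:]          # discard the burn-in period
        phis[ell] = phi
    return coeffs, phis


if __name__ == "__main__":
    coeffs, phis = simulate_sphar1_coefficients(l_max=30, n_times=4)
    print(coeffs[30].shape, phis[30])
```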

We are now in the position to use simulations to validate the previous results. In our first Tables 1-3, we report for $B=1000$ Monte Carlo replications the values of the "variance" and "bias" terms, i.e., the first and second summand in the mean square equation (16); the second term is actually deterministic, and it is reported to illustrate the approximation one obtains by cutting the expansion to a finite multipole value. In the third column, we report the actual (squared) $L^{2}$ error. On the left-hand side, we fix the number of multipoles to be exploited in the reconstruction of the kernel; on the right-hand side, we consider a sort of "oracle" estimator, where the number of multipoles grows with the optimal rate $N^{\frac{1}{2\beta_{\ast}-1}}$ . As before, we took $C_{\ell;Z}\simeq const\times\ell^{-2}$ , $\phi_{\ell}\simeq const\times\ell^{-\beta}$ for $\beta=2,2.5,3$ ; for $N=100,300,700$ the left-hand side uses $L_{N}\sim N^{0.6}$ , while the right-hand side takes $L_{N}\sim N^{\frac{1}{2\beta_{\ast}-1}}$ , as explained above.
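The two summands can be computed for a single replication along the following lines. This is a sketch under stated assumptions: the $SPHAR(1)$ estimator is taken to be the pooled OLS ratio over $m$ and $t$, and the kernel is assumed to be normalized as $k(z)=\sum_{\ell}\phi_{\ell}\frac{2\ell+1}{4\pi}P_{\ell}(z)$, which is consistent with the factor $\frac{2\ell+1}{16\pi^{2}}$ in the limiting covariance of Theorem 25. Under this normalization, orthogonality of the Legendre polynomials gives the weight $\frac{2\ell+1}{8\pi^{2}}$ for each squared coefficient error. The function names are hypothetical.

```python
import numpy as np


def ols_phi(a):
    """Pooled OLS estimate of phi_l from an array a of shape (N+1, 2l+1)."""
    numerator = np.sum(a[1:] * a[:-1])
    denominator = np.sum(a[:-1] ** 2)
    return float(numerator / denominator)


def l2_error_terms(phi_hat, phi_true, l_n):
    """Return (estimation term, truncation term) of the squared L2 error.

    phi_hat: dict l -> estimate, for multipoles up to l_n;
    phi_true: dict l -> true coefficient, for all multipoles considered.
    Uses int_{-1}^{1} P_l(z)^2 dz = 2 / (2l + 1).
    """
    def weight(ell):
        return (2 * ell + 1) / (8.0 * np.pi ** 2)

    estimation = sum(weight(ell) * (phi_hat[ell] - phi_true[ell]) ** 2
                     for ell in phi_hat if ell <= l_n)
    truncation = sum(weight(ell) * phi_true[ell] ** 2
                     for ell in phi_true if ell > l_n)
    return float(estimation), float(truncation)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    a = np.zeros((501, 5))
    for t in range(1, 501):
        a[t] = 0.3 * a[t - 1] + rng.normal(size=5)
    print(ols_phi(a))
```

Averaging the first returned value over Monte Carlo replications approximates the "variance" term, while the second value is deterministic given the true coefficients.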

We note how the estimators perform very efficiently, and show the errors scale approximately as $N^{-\alpha}$ , where ${\alpha}\approx\frac{2\beta_{\ast}-2}{2\beta_{\ast}-1}$ , as predicted by our computations, see Remark 18. In particular, for $\beta_{\ast}=2$ our results predict an upper bound for the $L^{2}$ error decaying as $N^{-0.67},$ whereas simulations show a decay in the order of $N^{-0.66};$ for $\beta_{\ast}=2.5$ we have $N^{-0.75}$ and $N^{-0.82},$ and finally for $\beta_{\ast}=3$ the predicted upper bound is in the order of $N^{-0.80}$ , while the observed decay is in the order of $N^{-0.92}.$
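An observed decay rate of this kind can be obtained by a least-squares fit on the log-log scale. The sketch below is our own illustration and uses synthetic errors that follow an exact power law; it does not reproduce the values in the tables, and the function name is hypothetical.

```python
import numpy as np


def fitted_decay_exponent(sample_sizes, errors):
    """Slope of log(error) against log(N), i.e. the exponent a in error ~ N^a."""
    slope, _intercept = np.polyfit(np.log(sample_sizes), np.log(errors), 1)
    return float(slope)


if __name__ == "__main__":
    sizes = [100, 300, 700]
    synthetic_errors = [n ** -0.8 for n in sizes]  # synthetic, for illustration only
    print(fitted_decay_exponent(sizes, synthetic_errors))
```

With only three sample sizes the fitted slope is a rough summary, so small discrepancies between predicted and observed exponents should be read with caution.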

[TABLE]

We can now focus quickly on the main result of our paper, dealing with the Quantitative Central Limit Theorem, in Wasserstein distance; the latter is computed using the Python routine scipy.stats.wasserstein_distance. We consider again a model where the autoregressive parameter and the angular power spectra are exactly the same as in the previous settings, in particular taking $\beta=3$ and $d=0.3,0.4$ ; we fix $L_{\max}=100$ for the number of components under the null hypothesis. Under these circumstances, we evaluate (univariate) Wasserstein distances for the kernel estimators at $m=9$ different locations, performing $B=1000$ Monte Carlo replications.

The results are reported in Table 4; here we take $B=1000$ Monte Carlo replications and $N=10^{2},10^{3},10^{4}$ for the cardinality of the time-domain observations. It should be noted that huge sample sizes are quite common when dealing with sphere cross time data, see, e.g., the NCEP/NCAR reanalysis datasets [21] for atmospheric research.

Again, we note that simulations track the theoretical predictions closely. More precisely, for the theoretical upper bounds we expect $d_{W}$ to decay as $N^{\frac{1}{2}+d(1-\beta_{*})}$ , leading to $N^{-0.1}$ in the setting of Table 4, $N^{-0.3}$ for Table 5, whereas estimates from the simulations give as worst rates $N^{-0.18}$ and $N^{-0.3}$ , respectively.
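The predicted exponents, and a simple empirical proxy for the univariate Wasserstein distance from a standard Gaussian, can be sketched with the standard library alone. Note the assumptions: the distance below couples the sorted sample with Gaussian quantiles at the midpoints $(i+\tfrac{1}{2})/n$, which is an approximation of our own and not the routine used for the tables; both function names are hypothetical.

```python
from statistics import NormalDist


def predicted_rate_exponent(d, beta_star):
    """Exponent 1/2 + d * (1 - beta_star) of the theoretical upper bound on d_W."""
    return 0.5 + d * (1.0 - beta_star)


def wasserstein_to_standard_normal(sample):
    """Approximate W1 distance between an empirical sample and N(0, 1).

    Sorted observations are matched with standard normal quantiles at the
    midpoints (i + 0.5) / n; this is a quantile-coupling approximation.
    """
    n = len(sample)
    if n == 0:
        raise ValueError("empty sample")
    standard = NormalDist()
    ordered = sorted(sample)
    quantiles = [standard.inv_cdf((i + 0.5) / n) for i in range(n)]
    return sum(abs(x - q) for x, q in zip(ordered, quantiles)) / n


if __name__ == "__main__":
    for d in (0.3, 0.4):
        print(d, predicted_rate_exponent(d, 3.0))
```

In a Monte Carlo study, `sample` would collect the standardized kernel estimates at a fixed location $z$ across the $B$ replications.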

[TABLE]

Clearly a full assessment of these procedures would require a much deeper numerical investigation; these preliminary results, however, seem rather encouraging for future developments.

DM acknowledges the MIUR Excellence Department Project awarded to the Department of Mathematics, University of Rome Tor Vergata, CUP E83C18000100006. We are also grateful to Pierpaolo Brutti for many insightful suggestions and conversations.

1 Appendix

Throughout this Appendix, we assume that Conditions 8 and 13 hold. Under these assumptions the proof that Equation (7) admits a unique stationary and isotropic solution can be given along the same lines as in [5] and it is omitted for brevity’s sake; see [8] for more discussion and details. Note that, under these two Conditions, the variance $C_{\ell}$ can be written in terms of the coefficients $\phi_{\ell;j}$ , $j=1,\dots,p$ , the autocorrelations $\rho_{\ell}(j)=C_{\ell}(j)/C_{\ell}$ , $j=1,\dots,p$ , and the error variance $C_{\ell;Z}$ ; namely

[TABLE]

Hence,

[TABLE]

and there exists a positive constant $\phi^{\ast}$ such that, uniformly over $\ell$ ,

[TABLE]

Recall that $C_{\ell;Z}/C_{\ell}$ and $C_{\ell}/C_{\ell;Z}$ are (in absolute value) bounded by positive constants since both converge to 1 as $\ell\rightarrow\infty$ . Now, we denote with $g_{\ell}(\lambda)$ the correlation spectral density

[TABLE]

where $\rho_{\ell}(\cdot):=C_{\ell}(\cdot)/C_{\ell}$ is the autocorrelation function, and we recall that $\Sigma_{\ell}$ is the $p\times p$ matrix of autocorrelations, with $ij$ -th element $\rho_{\ell}(i-j)$ . Since $g_{\ell}(\cdot)$ is a continuous symmetric function on $[-\pi,\pi]$ , it follows that (see [34])

[TABLE]

where $\underline{g_{\ell}}$ and $\overline{g_{\ell}}$ are the minimum and maximum of $g_{\ell}(\cdot)$ in $[-\pi,\pi]$ , respectively; $\lambda_{\min}(\Sigma_{\ell})$ and $\lambda_{\max}(\Sigma_{\ell})$ are the minimum and maximum eigenvalues of $\Sigma_{\ell}$ , respectively. Moreover, because we assumed $g_{\ell}(\lambda)>0,\ \forall\lambda\in[-\pi,\pi]$ , from (2) we conclude that the minimum eigenvalue is strictly positive (and hence bounded away from zero) and $\Sigma_{\ell}$ is positive definite (and then invertible). Since $\Sigma_{\ell}$ is a $p\times p$ real symmetric positive definite matrix, then

[TABLE]

where $\|A\|_{\text{op}}\ =\sqrt{\lambda_{\max}(A^{\prime}A)}$ , and $\text{Tr}(A)$ is the trace of $A$ . In addition,

[TABLE]

since

[TABLE]

Then, from Equation (1), we can conclude that, uniformly over $\ell$ ,

[TABLE]
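The eigenvalue bounds for $\Sigma_{\ell}$ can be checked numerically in the $AR(1)$ case, where $\rho_{\ell}(\tau)=\phi_{\ell}^{|\tau|}$. The sketch below works with the symbol $f(\lambda)=\sum_{\tau}\rho_{\ell}(\tau)e^{-i\tau\lambda}=\frac{1-\phi_{\ell}^{2}}{1-2\phi_{\ell}\cos\lambda+\phi_{\ell}^{2}}$, whose range is $\left[\frac{1-|\phi_{\ell}|}{1+|\phi_{\ell}|},\frac{1+|\phi_{\ell}|}{1-|\phi_{\ell}|}\right]$; depending on the normalization adopted for $g_{\ell}$, a factor $2\pi$ may relate $f$ and $g_{\ell}$, and this normalization is an assumption of ours. The function names are hypothetical.

```python
import numpy as np


def ar1_autocorrelation_matrix(phi, p):
    """p x p Toeplitz matrix with (i, j) entry phi^{|i - j|}."""
    idx = np.arange(p)
    return float(phi) ** np.abs(idx[:, None] - idx[None, :])


def ar1_symbol_range(phi):
    """Minimum and maximum over lambda of (1 - phi^2) / (1 - 2 phi cos(lambda) + phi^2)."""
    r = abs(phi)
    if r >= 1.0:
        raise ValueError("stationarity requires |phi| < 1")
    return (1.0 - r) / (1.0 + r), (1.0 + r) / (1.0 - r)


if __name__ == "__main__":
    sigma = ar1_autocorrelation_matrix(0.6, 5)
    eigenvalues = np.linalg.eigvalsh(sigma)   # sigma is real symmetric
    print(eigenvalues.min(), eigenvalues.max(), ar1_symbol_range(0.6))
```

All eigenvalues fall inside the range of the symbol for every matrix size $p$, which is the property exploited to bound $\|\Sigma_{\ell}^{-1}\|_{\text{op}}$ uniformly.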

The first result below will be exploited to prove convergence in probability of the denominator for our estimators, while the second part gives the fourth-cumulant bound which is crucial for Stein-Malliavin arguments. We recall here for convenience the equalities

[TABLE]

where

[TABLE]

and

[TABLE]

Lemma 1.

For any integers $\ell\geq 0$ , $N>p$ , there exists $M>0$ such that

[TABLE]

and

[TABLE]

where $\widetilde{b}_{\ell;N}(i)=\sum s_{\ell}(i,j)b_{\ell;N}(j)$ is the $i$ -th element of the $p$ -dimensional vector $\widetilde{B}_{\ell;N}=\Sigma_{\ell}^{-1}{B}_{\ell;N},$ and $s_{\ell}(i,j)$ are the elements of the inverse matrix $\Sigma_{\ell}^{-1}$ .

The following result shows that replacing $A_{\ell;N}$ by its expected value $\Sigma_{\ell}$ in the definition of the OLS-like estimator $\widehat{\phi}_{\ell;N}$ does not have any asymptotic effect, as $N\rightarrow\infty$ .

Lemma 2.

For any integers $\ell\geq 0$ and $N>7+p$ , there exist generic positive constants such that

[TABLE]

and

[TABLE]

The following results entail that $\lim_{N\rightarrow\infty}V_{N}=I_{mp}$ ; actually the Propositions below give also a uniform rate of convergence.

Lemma 3.

If $\|\boldsymbol{\phi}_{\ell}\|\leq\frac{\gamma}{\ell^{\beta}},\ \beta>1,\ \ell>0$ ,

[TABLE]

The next result is technical; given the huge amount of work which has taken place on Legendre polynomials, we expect that the statement could be known already, but we failed to locate a reference and therefore we report a full proof for the sake of completeness.

Lemma 4.

Let $z=\cos\theta,\ \theta\in(0,\pi)$ ,

[TABLE]

on the other hand, for $\theta,\theta^{\prime}\in(0,\pi),\ \theta\neq\theta^{\prime},$ as $L\rightarrow\infty$ ,

[TABLE]

We can now start with the proof of these Lemmas.

Proof (Lemma 1).

Let us start by observing that the $ij$ -th element of $A_{\ell;N}$ , denoted by $a_{\ell;N}(i,j)$ , has expected value

[TABLE]

Now, we have

[TABLE]

Now observe that $\rho_{\ell}^{2}(\cdot)=(C_{\ell}(\cdot)/C_{\ell})^{2}$ , the squared autocorrelation function of the process, is nonnegative and summable; that is, there exists $\rho_{\ell}^{*}\in\mathbb{R}^{+}$ so that $\sum_{\tau=-\infty}^{+\infty}\rho^{2}_{\ell}(\tau)=\rho_{\ell}^{*}<\infty,$ and

[TABLE]

in view of the Cauchy-Schwarz inequality. Thus, it holds that

[TABLE]

On the other hand,

[TABLE]

and

[TABLE]

since

[TABLE]

and

[TABLE]

see also [8]. Then, $\rho_{\ell}^{*}\leq const$ , uniformly over $\ell$ .

In conclusion, uniformly over $\ell$ and $N$ ,

[TABLE]

$M>0$ .

Let us now focus on the elements of $\widetilde{B}_{\ell;N}=\Sigma_{\ell}^{-1}{B}_{\ell;N}$ ; they are given by

[TABLE]

These elements can be shown to satisfy the following properties:

(i) ${\mathbb{E}}\left[\widetilde{b}_{\ell;N}(i)\right]=\sum_{j=1}^{p}s_{\ell}(i,j){\mathbb{E}}[b_{\ell;N}(j)]=0$ ;

(ii) ${\mathbb{E}}\left[\widetilde{b}_{\ell;N}(i)\widetilde{b}_{\ell;N}(j)\right]=s_{\ell}(i,j)\frac{C_{\ell;Z}}{C_{\ell}}$ , since

[TABLE]

and because

[TABLE]

(iii) $\text{Cum}_{4}\left[\widetilde{b}_{\ell;N}(i)\right]=\frac{6}{N(2\ell+1)}\left(s_{\ell}(i,i)\frac{C_{\ell;Z}}{C_{\ell}}\right)^{2}$ .

To compute $\text{Cum}_{4}\left[\widetilde{b}_{\ell;N}(i)\right]$ we use once again the multilinearity property of cumulants, the real expansion and the diagram formula, so that we obtain:

[TABLE]

with $\text{Cum}[b_{\ell;N}(j_{1}),b_{\ell;N}(j_{2}),b_{\ell;N}(j_{3}),b_{\ell;N}(j_{4})]=\text{Cum}(j_{1},j_{2},j_{3},j_{4})$ given by

[TABLE]

Hence,

[TABLE]

as claimed.

Proof (Lemma 2).

First, rewrite

[TABLE]

Since

[TABLE]

we have

[TABLE]

where

[TABLE]

and, from (3),

[TABLE]

By definition,

[TABLE]

Since $X^{\prime}_{\ell;N}X_{\ell;N}$ is a real symmetric $p\times p$ matrix,

[TABLE]

$X_{\ell;N}^{\prime}X_{\ell;N}$ can be seen as the sum of $2\ell+1$ independent matrices, i.e.

[TABLE]

where $X_{\ell,m;N}$ is an $N\times p$ matrix, defined by (recalling that $n=N+p$ )

[TABLE]

Then,

[TABLE]

Now recall that $\Sigma_{\ell}$ is the $p\times p$ matrix of autocorrelations; similarly we define $\Sigma_{\ell;N}$ as the $N\times N$ matrix of autocorrelations. Both are invertible since we assumed that the spectral density

[TABLE]

is a continuous positive function.

$X_{\ell,m;N}$ is a zero-mean Gaussian matrix with ${\mathbb{E}}[X_{\ell,m;N}X^{\prime}_{\ell,m;N}]=pC_{\ell}\Sigma_{\ell;N}$ and ${\mathbb{E}}[X^{\prime}_{\ell,m;N}X_{\ell,m;N}]=NC_{\ell}\Sigma_{\ell}$ , therefore it can be written as $X_{\ell,m;N}=(C_{\ell}\Sigma_{\ell;N})^{1/2}Z_{\ell,m;N}$ , where $Z_{\ell,m;N}$ is a zero-mean Gaussian matrix with independent rows. If $\Sigma_{\ell;N}=P\Lambda P^{\prime}$ , where $P$ is an orthogonal matrix of eigenvectors and $\Lambda$ is the diagonal matrix of eigenvalues, then

[TABLE]

where $Z^{\prime}_{\ell,m;N}Z_{\ell,m;N}$ is a Wishart random matrix with $N$ degrees of freedom. The same argument applies to all $2\ell+1$ components of (1), so that

[TABLE]

The summation in (1) includes $2\ell+1$ independent Wishart random matrices, each with $N$ degrees of freedom and $\Sigma_{\ell}$ as scale matrix; hence $Z^{\prime}_{\ell;N}Z_{\ell;N}$ is a Wishart random matrix with $N(2\ell+1)$ degrees of freedom and $\Sigma_{\ell}$ as scale matrix, and $\lambda_{\min}(Z^{\prime}_{\ell;N}Z_{\ell;N})$ is its minimum eigenvalue. Furthermore, this result guarantees the invertibility of the matrix $X^{\prime}_{\ell;N}X_{\ell;N}$ .

By the standard inequality on trace and operator norms for matrices, we obtain that

[TABLE]

For $N(2\ell+1)>7+p$ the fourth moment of the trace of an inverse Wishart matrix is given in [26]:

[TABLE]

where $\eta=\frac{N(2\ell+1)}{2}-\frac{p+1}{2}$ , and

[TABLE]

If $\lambda_{\ell;1},\dots,\lambda_{\ell;p}$ are the eigenvalues of $\Sigma_{\ell}$ , we have

[TABLE]

$k\geq 1$ . Then, for $2\eta>7+p$ ,

[TABLE]

and

[TABLE]

In addition,

[TABLE]

for every $\ell\geq 0$ and $N>7+p$ . Thus, (6) holds.

The second part of this Lemma follows easily, indeed

[TABLE]

in view of the bounds that we just established on the fourth-moments of the norms of $A_{\ell,N}^{-1}$ and $B_{\ell,N}$ .

Proof (Lemma 3).

We first need to prove that $\lim_{\ell\to\infty}\Sigma_{\ell}=I_{p}$ , where we recall that $\Sigma_{\ell}$ is the matrix of autocorrelations $\rho_{\ell}(i-j)$ . For $i=j$ , $\rho_{\ell}(i-j)=1$ , for all $\ell$ ; on the other hand, for $i\neq j$ ,

[TABLE]

and

[TABLE]

For $\ell>0$ ,

[TABLE]

Moreover, since

[TABLE]

and

[TABLE]

we have

[TABLE]

as claimed.

The last proof is for the technical Lemma on summation of squared Legendre polynomials.

Proof (Lemma 4).

For $\ell\geq 1$ , by Hilb’s asymptotics (see [33],[35]), it holds that

[TABLE]

with $\alpha=\frac{\theta}{2}+\frac{\pi}{4}$ . Then,

[TABLE]

In view of the standard identities

[TABLE]

and

[TABLE]

we have

[TABLE]

hence,

[TABLE]

Also, it holds that if $\lim_{k\to\infty}a_{k}=A$ , $|A|<\infty$ , then $\lim_{k\to\infty}\frac{1}{n}\sum_{k=1}^{n}a_{k}=A$ . As a consequence,

[TABLE]

Likewise, for $\theta,\theta^{\prime}\in(0,\pi),\ \theta\neq\theta^{\prime}$ ,

[TABLE]

As before, we have

[TABLE]

hence,

[TABLE]

In addition, since $\sum_{\ell=1}^{L}\ell^{-1}=\mathcal{O}(\log L)$ , we can then conclude that

[TABLE]

as $L\to\infty$ .

Remark 5.

Note that (8) does not converge pointwise if $\theta$ or $\theta^{\prime}=0;$ for instance, for $\theta=\theta^{\prime}=0$ we have $\frac{1}{L+1}\sum_{\ell=0}^{L}(2\ell+1)=L+1,$ whereas for $\theta\neq 0$ , $\theta^{\prime}=0$ (4) oscillates among given constants.
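The endpoint behaviour described in this remark is easy to verify numerically. The sketch below evaluates the normalized sum $\frac{1}{L+1}\sum_{\ell=0}^{L}(2\ell+1)P_{\ell}(\cos\theta)P_{\ell}(\cos\theta^{\prime})$, a form that we infer from the expression displayed in the remark and that is therefore an assumption; Legendre polynomials are computed by the standard three-term recurrence. For interior points, a heuristic reading of Hilb's asymptotics suggests that the diagonal sum at $\theta=\theta^{\prime}$ stabilizes near $\frac{2}{\pi\sin\theta}$; this value is our own back-of-the-envelope expectation, not a restatement of Lemma 4.

```python
import math


def legendre_values(x, l_max):
    """Return [P_0(x), ..., P_{l_max}(x)] via (l+1) P_{l+1} = (2l+1) x P_l - l P_{l-1}."""
    values = [1.0]
    if l_max >= 1:
        values.append(x)
    for ell in range(1, l_max):
        values.append(((2 * ell + 1) * x * values[ell] - ell * values[ell - 1]) / (ell + 1))
    return values


def normalized_legendre_sum(theta, theta_prime, l_max):
    """Compute (1 / (L+1)) * sum_{l=0}^{L} (2l+1) P_l(cos theta) P_l(cos theta')."""
    p = legendre_values(math.cos(theta), l_max)
    q = legendre_values(math.cos(theta_prime), l_max)
    total = sum((2 * ell + 1) * p[ell] * q[ell] for ell in range(l_max + 1))
    return total / (l_max + 1)


if __name__ == "__main__":
    for big_l in (10, 100, 1000):
        print(big_l,
              normalized_legendre_sum(0.0, 0.0, big_l),                   # grows like L + 1
              normalized_legendre_sum(math.pi / 2, math.pi / 2, big_l))   # stays bounded
```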

Bibliography

[1] Aue, A., van Delft, A. (2017) Testing for stationarity of functional time series in the frequency domain, arXiv preprint: 1701.01741.
[2] Baldi, P., Kerkyacharian, G., Marinucci, D., Picard, D. (2009) Asymptotics for spherical needlets, Annals of Statistics, 37, no. 3, 1150–1171.
[3] Billingsley, P. (1999) Convergence of Probability Measures, second edition, Wiley Series in Probability and Statistics.
[4] Berg, C., Porcu, E. (2017) From Schoenberg coefficients to Schoenberg functions, Constructive Approximations, 45, no. 2, 217–241.
[5] Bosq, D. (2000) Linear processes in function spaces. Theory and applications. Lecture Notes in Statistics, 149, Springer-Verlag, New York.
[6] Cammarota, V., Marinucci, D. (2015) On the limiting behaviour of needlets polyspectra, Ann. Inst. Henri Poincaré Probab. Stat., 51, no. 3, 1159–1189.
[7] Cammarota, V., Marinucci, D. (2018) A quantitative central limit theorem for the Euler-Poincaré characteristic of random spherical eigenfunctions, Annals of Probability, 46, no. 6, 3188–3228.
[8] Caponera, A. (2019) Statistical Inference for Spherical Functional Autoregressions, PhD Thesis, Sapienza University of Rome.
