On estimation and prediction in spatial functional linear regression model

Stéphane Bouka; Sophie Dabo-Niang; Guy Martial Nkiet

arXiv:1908.02143·math.ST·August 7, 2019

TL;DR

This paper develops a smoothing spline estimator for spatial functional linear regression, providing finite sample bounds for variance and prediction error, supported by simulation studies.

Contribution

It introduces a novel estimator for the spatial functional linear model and derives finite sample bounds under spatial dependence.

Findings

01. Finite sample variance bounds for the estimator.

02. Prediction error bounds established.

03. Simulation results validate theoretical findings.

Abstract

We consider a spatial functional linear regression, where a scalar response is related to a square integrable spatial functional process. We use a smoothing spline estimator for the functional slope parameter and establish a finite sample bound for the variance of this estimator under mixing spatial dependence. Then, we give a bound of the prediction error. Finally, we illustrate our results by simulations.

Equations

$$Y_{\mathbf{i}}=\beta_{0}+\int_{I}\beta(t)X_{\mathbf{i}}(t)\,dt+\epsilon_{\mathbf{i}},\quad\mathbf{i}\in\mathbb{Z}^{d}$$

$\beta(t)$

with $\beta$

$$\mathbb{E}\left(\inf_{f\in{\mathcal{L}}_{r}}\sup_{t}\left|X(t)-f(t)\right|^{2}\right)\leq C_{2}\,r^{-2q}.$$

$$Var\left(\frac{1}{n^{d}}\sum_{i=1}^{n^{d}}\left\langle X_{\mathbf{i}_{i}}-\mathbb{E}(X),\zeta_{j}\right\rangle\left\langle X_{\mathbf{i}_{i}}-\mathbb{E}(X),\zeta_{\ell}\right\rangle\right)\leq\frac{C_{3}}{n^{d}}\,\mathbb{E}\left(\left\langle X-\mathbb{E}(X),\zeta_{j}\right\rangle^{2}\right)\mathbb{E}\left(\left\langle X-\mathbb{E}(X),\zeta_{\ell}\right\rangle^{2}\right)$$

$$\Gamma u:=\mathbb{E}\left(\left\langle u,X-\mathbb{E}(X)\right\rangle\left(X-\mathbb{E}(X)\right)\right),$$

$$\alpha({\mathcal{U}},{\mathcal{V}})=\sup\left\{\left|\mathbb{P}(A\cap B)-\mathbb{P}(A)\mathbb{P}(B)\right|,\ A\in{\mathcal{U}},\ B\in{\mathcal{V}}\right\},$$

$$\alpha_{1,\infty}(u)=\sup\left\{\alpha\left(\sigma(Z_{\mathbf{i}}),F_{\Lambda}\right),\ \mathbf{i}\in\mathbb{Z}^{d},\ \Lambda\subset\mathbb{Z}^{d},\ \delta(\Lambda,\{\mathbf{i}\})\geq u\right\},$$

$$Cov\left(X_{\mathbf{i}_{i}}(t),X_{\mathbf{i}_{j}}(u)\right)=g(|t-u|)\,\varPsi\left(\delta(\{\mathbf{i}_{i}\},\{\mathbf{i}_{j}\})\right)\quad\text{and}\quad\varPsi(0)=1$$

$$\|u\|_{\Gamma}^{2}:=\left\langle\Gamma u,u\right\rangle,\quad u\in L^{2}([0,1]).$$

$$\|\mathbf{u}\|_{\Gamma_{n,p}}^{2}:=\frac{1}{p}\,\mathbf{u}^{T}\left(\frac{1}{n^{d}p}\mathbf{X}^{T}\mathbf{X}\right)\mathbf{u}.$$

$$\mathbb{E}_{\epsilon}\left(\left\|\widehat{\boldsymbol{\beta}}-\mathbb{E}_{\epsilon}(\widehat{\boldsymbol{\beta}})\right\|_{\Gamma_{n,p}}^{2}\right)\leq\left(\frac{\sigma_{\epsilon}^{2}}{n^{d}}+\frac{c\ln n}{n^{d}}\right)\left(m+\lfloor\rho^{-1/(2m+2q+1)}\rfloor(2+C.C_{0})\right)$$

$$\left\|\widehat{\beta}-\beta\right\|_{\Gamma}^{2}=O_{p}\left(\rho+\left(n^{d}\rho^{1/(2m+2q+1)}\right)^{-1}\ln n+n^{-d(2q+1)/2}\right)$$

$$\delta\left(\{\mathbf{i}_{0}\},\{\mathbf{i}_{1},...,\mathbf{i}_{n}\}\right)\geq\lfloor n^{2d/\theta}\rfloor$$

$$\widehat{Y}_{\mathbf{i}_{0}}=\widehat{\beta_{0}}+\left\langle\widehat{\beta},X_{\mathbf{i}_{0}}\right\rangle\quad\text{and}\quad Y^{*}_{\mathbf{i}_{0}}=\beta_{0}+\left\langle\beta,X_{\mathbf{i}_{0}}\right\rangle$$

$$\mathbb{E}\left(\left(\widehat{Y}_{\mathbf{i}_{0}}-Y^{*}_{\mathbf{i}_{0}}\right)^{2}\,\Big|\,\widehat{\beta_{0}},\widehat{\beta}\right)=O_{p}\left(n^{-d/(2m+2q+2)}\right)$$

$$X_{\mathbf{i}_{\ell}}(t)=\sum_{k=1}^{15}\xi_{\mathbf{i}_{\ell},k}B_{k}(t)+\Lambda_{\mathbf{i}_{\ell}}(t).$$

$Y_{\mathbf{i}_{\ell}}$

$$snr=\frac{\mathbb{E}\left[\left\langle\beta,X\right\rangle^{2}\right]}{\mathbb{E}\left[\left\langle\beta,X\right\rangle^{2}\right]+\sigma_{\epsilon}^{2}},$$

$$\mathcal{M}=\left(\frac{1}{n^{d}p}\mathbf{X}^{T}\mathbf{X}+\rho\mathbf{A}_{m}\right)^{-1}\left(\frac{1}{n^{d}p}\mathbf{X}^{T}\mathbf{X}\right);$$

$\mathcal{M}$

$\mathcal{M}$

$tr(\mathcal{M}^{2})$

$$\Theta=\mathbf{X}\left(\frac{1}{n^{d}p}\mathbf{X}^{T}\mathbf{X}+\rho\mathbf{A}_{m}\right)^{-1}\left(\frac{1}{n^{d}p}\mathbf{X}^{T}\mathbf{X}\right)\left(\frac{1}{n^{d}p}\mathbf{X}^{T}\mathbf{X}+\rho\mathbf{A}_{m}\right)^{-1}\mathbf{X}^{T},$$

$\mathbb{E}_{\epsilon}\left(\left\|\widehat{\boldsymbol{\beta}}-\mathbb{E}_{\epsilon}(\widehat{\boldsymbol{\beta}})\right\|_{\Gamma_{n,p}}^{2}\right)$

$\mathbb{E}\left(\tau_{i}^{2}\right)$

$$\sum_{\substack{j=1\\ j\neq i}}^{n^{d}}\mathbb{E}\left(\epsilon_{\mathbf{i}_{i}}\epsilon_{\mathbf{i}_{j}}\right)=\sum_{\substack{j=1\\ 0<\delta(\{\mathbf{i}_{j}\},\{\mathbf{i}_{i}\})\leq Q_{n}}}^{n^{d}}\mathbb{E}\left(\epsilon_{\mathbf{i}_{i}}\epsilon_{\mathbf{i}_{j}}\right)+\sum_{\substack{j=1\\ \delta(\{\mathbf{i}_{j}\},\{\mathbf{i}_{i}\})>Q_{n}}}^{n^{d}}\mathbb{E}\left(\epsilon_{\mathbf{i}_{i}}\epsilon_{\mathbf{i}_{j}}\right).$$

$\sum_{\substack{j=1\\ j\neq i}}^{n^{d}}\mathbb{E}\left(\epsilon_{\mathbf{i}_{i}}\epsilon_{\mathbf{i}_{j}}\right)$

$$\sum_{\substack{j=1\\ j\neq i}}^{n^{d}}\mathbb{E}\left(\epsilon_{\mathbf{i}_{i}}\epsilon_{\mathbf{i}_{j}}\right)\leq\sigma_{\epsilon}^{2}\ln(n)+K_{1},$$

$\mathbb{E}\left(\tau_{i}^{2}\right)$

Taxonomy

Topics: Spatial and Panel Data Analysis · Statistical Methods and Inference · Spondyloarthritis Studies and Treatments

Full text

On estimation and prediction in spatial functional linear regression model

Stéphane BOUKA$^{1}$, Sophie DABO-NIANG$^{2,3}$ and Guy Martial NKIET$^{1}$

$^{1}$ URMI, Université des Sciences et Techniques de Masuku, Franceville, Gabon.

$^{2}$ Laboratoire LEM, CNRS 9221, Université de Lille, France.

$^{3}$ INRIA-MODAL, Lille, France.

**Abstract.** We consider a spatial functional linear regression, where a scalar response is related to a square integrable spatial functional process. We use a smoothing spline estimator for the functional slope parameter and establish a finite sample bound for variance of this estimator under mixing spatial dependence. Then, we give a bound of the prediction error. Finally, we illustrate our results by simulations.

**AMS 1991 subject classifications:** 60G60; 62F12.

**Key words:** Functional linear regression; spatial functional process; mixing spatial dependence

1 Introduction

Consider the following spatial functional linear regression model where the spatial scalar response $(Y_{\mathbf{i}}\in\mathbb{R},\ \mathbf{i}\in D\subset\mathbb{Z}^{d})$ is related to a square integrable spatial functional process $(X_{\mathbf{i}}\in{\mathcal{F}},\ \mathbf{i}\in D\subset\mathbb{Z}^{d})$ through

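$$Y_{\mathbf{i}}=\beta_{0}+\int_{I}\beta(t)X_{\mathbf{i}}(t)\,dt+\epsilon_{\mathbf{i}},\quad\mathbf{i}\in\mathbb{Z}^{d},\tag{1}$$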

where $\beta_{0}$ is a constant, $I$ is the domain of $X_{\mathbf{i}}$, ${\mathcal{F}}$ is a space of functions endowed with a semi-norm, $\beta$ is an unknown function representing the slope function, and $\left(\epsilon_{\mathbf{i}}\right)_{\mathbf{i}\in\mathbb{Z}^{d}}$ is a centered spatial random noise with variance $\sigma^{2}_{\epsilon}>0$. Functional linear regression with functional or scalar response has been the focus of various investigations. There exist many contributions in this field for non-spatial data; recent references are [1], [6], [7], [8], [15], [16], [17], [18], [23]. This work is motivated by a large number of applications for which the data are of a spatial nature. For example, non-parametric prediction from kriging methods for geostatistical functional data was tackled in [3], [4], [11], [12], [13], [14] and [19], whereas spatial autoregressive functional models were considered in [20, 21]. In this paper, we are interested in the estimation of the slope function $\beta$ in model (1). To the best of our knowledge, this problem has not yet been considered for the basic spatial functional linear regression model, but only for non-spatial data (e.g., [7]) or for the spatial linear regression model with derivatives (see [5]). The paper is organized as follows. Section 2 is devoted to the construction of the estimator that will be used. Assumptions and main results are stated in Section 3, and a simulation study is given in Section 4. The proofs are postponed to Section 5.

2 Smoothing splines estimation of slope function

In this section, we give an estimator of $\beta$ in (1) by using an approach similar to the one of [7]. Since this estimation procedure does not take into account the nature of the dependence of the data, we obtain an estimator that has the same form as that of [7]. The process $(X_{\mathbf{i}},Y_{\mathbf{i}})_{\mathbf{i}\in\mathbb{Z}^{d}}$ is defined on a probability space $(\Omega,{\mathcal{A}},\mathbb{P})$, each $(X_{\mathbf{i}},Y_{\mathbf{i}})$ having the same distribution as a pair of random variables $(X,Y)$. For $\mathbf{n}=(n,\cdots,n)$ with $n\in\mathbb{N}^{*}$, let ${\mathcal{I}}_{\mathbf{n}}:=\{1,\cdots,n\}^{d}$ be a grid of points in $\mathbb{Z}^{d}$ and consider observations $(X_{\mathbf{i}},Y_{\mathbf{i}})_{\mathbf{i}\in{\mathcal{I}}_{\mathbf{n}}}$. We assume that the random functions $X_{\mathbf{i}}$ are observed at $p$ equidistant points $t_{1},...,t_{p}\in I:=[0,1]$, with $t_{j}=\frac{j}{p}$ for all $j=1,...,p$. Using the lexicographic order, the previous sample is rewritten as $\{(X_{\mathbf{i}_{i}},Y_{\mathbf{i}_{i}})\}_{1\leq i\leq n^{d}}$; we then put $\mathbf{Y}=(Y_{\mathbf{i}_{1}}-\overline{Y},...,Y_{\mathbf{i}_{n^{d}}}-\overline{Y})^{T}$ (where $u^{T}$ denotes the transpose of $u$) and we consider the $n^{d}\times p$ matrix $\mathbf{X}$ with general term $X_{\mathbf{i}_{i}}(t_{j})-\overline{X}(t_{j})$ for $i=1,...,n^{d}$, $j=1,...,p$. Then, we consider the estimator $\widehat{\beta}$ of $\beta$ given by

[TABLE]

where $\rho>0$ is a smoothing parameter, $\mathbf{A}_{m}$ is a $p\times p$ symmetric matrix defined from B-splines (see [7] for details), $\mathbf{D}(t)=(D_{1}(t),\cdots,D_{p}(t))^{T}$ is a functional basis of the $p$-dimensional linear space $NS^{m}(t_{1},\cdots,t_{p})$ of functions $v$ having an $m$-th order derivative $v^{(m)}$ that belongs to $L^{2}([0,1])$, and $\mathbf{D}$ is the $p\times p$ matrix with general term $D_{i}(t_{j})$ for $i,j=1,\cdots,p$. To estimate the intercept $\beta_{0}$ we take $\widehat{\beta_{0}}=\overline{Y}-\left\langle\widehat{\beta},\overline{X}\right\rangle$, where $\left\langle.,.\right\rangle$ denotes the usual inner product of $L^{2}([0,1])$.
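As an illustration, the centering step that produces $\mathbf{X}$ and $\mathbf{Y}$ and the intercept formula $\widehat{\beta_{0}}=\overline{Y}-\langle\widehat{\beta},\overline{X}\rangle$ can be sketched as follows. This is a minimal sketch and not the authors' code: the function names are hypothetical, the values of $\widehat{\beta}$ at the grid points are taken as given, and the $L^{2}([0,1])$ inner product is approximated by a grid average, which is an assumption made here.

```python
def center_data(curves, responses):
    """Center discretized curves and responses.

    curves: list of n^d lists, each holding X_i(t_1), ..., X_i(t_p)
    responses: list of n^d scalars Y_i
    Returns the centered matrix X, the centered vector Y and the two means.
    """
    n_sites = len(curves)
    p = len(curves[0])
    y_bar = sum(responses) / n_sites
    x_bar = [sum(curve[j] for curve in curves) / n_sites for j in range(p)]
    X = [[curve[j] - x_bar[j] for j in range(p)] for curve in curves]
    Y = [y - y_bar for y in responses]
    return X, Y, x_bar, y_bar


def intercept_estimate(beta_hat_values, x_bar, y_bar):
    """beta0_hat = Y_bar - <beta_hat, X_bar>.

    Assumption of this sketch: the inner product is replaced by the grid
    average (1/p) * sum_j beta_hat(t_j) * X_bar(t_j).
    """
    p = len(x_bar)
    inner = sum(b * x for b, x in zip(beta_hat_values, x_bar)) / p
    return y_bar - inner
```

For instance, with two sites observed at two grid points, `center_data([[1.0, 2.0], [3.0, 4.0]], [1.0, 3.0])` returns the mean curve `[2.0, 3.0]` and the mean response `2.0`.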

3 Assumptions and main results

In this section, we first introduce the assumptions that are needed to obtain the main results of the paper, then theorems that give the rate of convergence of the estimator $\widehat{\beta}$ and also that of the prediction at a non-visited site are established.

3.1 Assumptions

Assumption 1

*$\beta$ is $m$-times differentiable and $\beta^{(m)}$ belongs to $L^{2}([0,1])$.*

Assumption 2

There exist $\kappa\in]0,1[$ , $\delta_{1}>0$ and $C_{1}>0$ such that, for any $(t,s)\in I^{2}$ , $\mathbb{P}\left(|X(t)-X(s)|\leq C_{1}|t-s|^{\kappa}\right)\geq 1-\delta_{1}.$

Assumption 3

For some $C_{2}\in\mathbb{R}_{+}^{\ast}$ and all $r\in\mathbb{N}^{\ast}$, there exist an $r$-dimensional linear subspace ${\mathcal{L}}_{r}$ of $L^{2}([0,1])$ and a real $q\in]0,1[$ such that

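$$\mathbb{E}\left(\inf_{f\in{\mathcal{L}}_{r}}\sup_{t}\left|X(t)-f(t)\right|^{2}\right)\leq C_{2}\,r^{-2q}.$$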

Assumption 4

For any $(j,\ell)\in(\mathbb{N}^{\ast})^{2}$,

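$$Var\left(\frac{1}{n^{d}}\sum_{i=1}^{n^{d}}\left\langle X_{\mathbf{i}_{i}}-\mathbb{E}(X),\zeta_{j}\right\rangle\left\langle X_{\mathbf{i}_{i}}-\mathbb{E}(X),\zeta_{\ell}\right\rangle\right)\leq\frac{C_{3}}{n^{d}}\,\mathbb{E}\left(\left\langle X-\mathbb{E}(X),\zeta_{j}\right\rangle^{2}\right)\mathbb{E}\left(\left\langle X-\mathbb{E}(X),\zeta_{\ell}\right\rangle^{2}\right),$$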

where $0<C_{3}<\infty$ and $\left\{\zeta_{j}\right\}_{j\in\mathbb{N}^{\ast}}$ is a complete orthonormal system of eigenfunctions of the operator $\Gamma$ from $L^{2}([0,1])$ to itself defined by:

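$$\Gamma u:=\mathbb{E}\left(\left\langle u,X-\mathbb{E}(X)\right\rangle\left(X-\mathbb{E}(X)\right)\right),$$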

each $\zeta_{j}$ being associated with the $j$ -th largest eigenvalue $\lambda_{j}$ .

Assumptions 1–4 are technical conditions that are similar to the ones considered in [7]. In order to give the remaining assumptions, let us first recall the notion of polynomial mixing dependence. Letting $\alpha$ be the $\alpha$ -mixing coefficient given, for two sub $\sigma$ -algebras ${\mathcal{U}}$ and ${\mathcal{V}}$ of ${\mathcal{A}}$ , by

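$$\alpha({\mathcal{U}},{\mathcal{V}})=\sup\left\{\left|\mathbb{P}(A\cap B)-\mathbb{P}(A)\mathbb{P}(B)\right|,\ A\in{\mathcal{U}},\ B\in{\mathcal{V}}\right\},$$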

we consider the strong mixing coefficient (see [10]) related to a random field $(Z_{\mathbf{i}})_{\mathbf{i}\in\mathbb{Z}^{d}}$ , defined as

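$$\alpha_{1,\infty}(u)=\sup\left\{\alpha\left(\sigma(Z_{\mathbf{i}}),F_{\Lambda}\right),\ \mathbf{i}\in\mathbb{Z}^{d},\ \Lambda\subset\mathbb{Z}^{d},\ \delta(\Lambda,\{\mathbf{i}\})\geq u\right\},$$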

where $F_{\Lambda}=\sigma(Z_{\mathbf{i}};\mathbf{i}\in{\Lambda})$ and the distance $\delta$ is defined for any subsets $\Gamma_{1}$ and $\Gamma_{2}$ of $\mathbb{Z}^{d}$ by $\delta(\Gamma_{1},\Gamma_{2})=\min\{||\mathbf{i}-\mathbf{j}||_{2},\mathbf{i}\in\Gamma_{1},\mathbf{j}\in\Gamma_{2}\}$ where $\|.\|_{2}$ is the usual Euclidean norm of $\mathbb{R}^{d}$ . Then, $(Z_{\mathbf{i}})_{\mathbf{i}\in\mathbb{Z}^{d}}$ is polynomial mixing if the related strong mixing coefficients satisfy $\alpha_{1,\infty}(u)=O(u^{-\theta})$ , $\theta>0$ .
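The distance $\delta$ between two sets of sites is simple to compute for finite sets. The sketch below is illustrative only; the function name `site_distance` is hypothetical, and sites are assumed to be given as tuples of coordinates.

```python
import math
from itertools import product


def site_distance(sites_a, sites_b):
    """delta(Gamma_1, Gamma_2): smallest Euclidean distance between a site of
    sites_a and a site of sites_b (both finite iterables of coordinate tuples)."""
    return min(math.dist(i, j) for i, j in product(sites_a, sites_b))


# Distance from the single site (0, 0) to the set {(3, 4), (1, 1)}:
# the closest site is (1, 1), so the distance is sqrt(2).
example = site_distance([(0, 0)], [(3, 4), (1, 1)])
```

With this definition, two sets sharing a site are at distance zero, and the polynomial mixing condition controls the dependence between $Z_{\mathbf{i}}$ and the field on $\Lambda$ as `site_distance(Lambda, [i])` grows.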

Assumption 5

*$\{\epsilon_{\mathbf{i}}\}_{\mathbf{i}\in\mathbb{Z}^{d}}$ is a strictly stationary, polynomial mixing random field, independent of $\{X_{\mathbf{i}}\}_{\mathbf{i}\in\mathbb{Z}^{d}}$ and such that $\sup_{\mathbf{i}\in\mathbb{Z}^{d}}\left|\epsilon_{\mathbf{i}}\right|<M_{1}$ almost surely, where $M_{1}$ is a strictly positive constant.*

Assumption 6

*$\left\{(X_{\mathbf{i}},Y_{\mathbf{i}})\right\}_{\mathbf{i}\in\mathbb{Z}^{d}}$ is a strictly stationary and polynomial mixing random field.*

Assumption 7

There exists $M_{2}>0$ such that for all $\mathbf{i}\in\mathbb{Z}^{d}$ , $\left\|X_{\mathbf{i}}\right\|<M_{2}$ almost surely.

Assumptions 5 and 6 are classical assumptions (see [2]). Assumption 7 has already been made in some works (see, e.g., [18]).

Assumption 8

*$X$ is an isotropic process such that for all $t$, $u$ in $[0,1]$,*

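$$Cov\left(X_{\mathbf{i}_{i}}(t),X_{\mathbf{i}_{j}}(u)\right)=g(|t-u|)\,\varPsi\left(\delta(\{\mathbf{i}_{i}\},\{\mathbf{i}_{j}\})\right)\quad\text{and}\quad\varPsi(0)=1,$$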

where $g$ is a positive function and $\varPsi$ is a known $\mathbb{R}_{+}$-valued decreasing function that satisfies $\sum^{\infty}_{t=1}t^{d-1}\varPsi(t)<\infty$.

The separable covariance structure stated in Assumption 8 has also been used in [17]. Examples of isotropic spatial models can be found in [9]; we may mention, for instance, the exponential spatial model.

3.2 The results

We consider the semi-norm $\|.\|_{\Gamma}$ defined by

$$\|u\|_{\Gamma}^{2}:=\left\langle\Gamma u,u\right\rangle,\quad u\in L^{2}([0,1]),$$

and the discretized empirical semi-norm defined for any $\mathbf{u}\in\mathbb{R}^{p}$ as

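$$\|\mathbf{u}\|_{\Gamma_{n,p}}^{2}:=\frac{1}{p}\,\mathbf{u}^{T}\left(\frac{1}{n^{d}p}\mathbf{X}^{T}\mathbf{X}\right)\mathbf{u}.$$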
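For a concrete reading of the discretized empirical semi-norm, recall that it is given by $\|\mathbf{u}\|_{\Gamma_{n,p}}^{2}=\frac{1}{p}\mathbf{u}^{T}\big(\frac{1}{n^{d}p}\mathbf{X}^{T}\mathbf{X}\big)\mathbf{u}$, which equals $\|\mathbf{X}\mathbf{u}\|^{2}/(n^{d}p^{2})$. The following sketch evaluates it directly from this identity; the function name is hypothetical and the matrix is assumed to be stored as a list of rows.

```python
def empirical_seminorm_sq(X, u):
    """Squared discretized empirical semi-norm of u.

    Uses the identity (1/p) u^T ((1/(N p)) X^T X) u = ||X u||^2 / (N p^2),
    where N = n^d is the number of sites (rows of X) and p the number of
    grid points (columns of X).
    """
    n_sites = len(X)
    p = len(u)
    Xu = [sum(row[j] * u[j] for j in range(p)) for row in X]
    return sum(v * v for v in Xu) / (n_sites * p * p)
```

The quantity is only a semi-norm: any vector $\mathbf{u}$ in the null space of $\mathbf{X}$ has semi-norm zero, for example `empirical_seminorm_sq([[1.0, -1.0]], [1.0, 1.0])`.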

The following theorem gives a bound of the estimator’s variance. In this theorem, $\mathbb{E}_{\epsilon}$ refers to the conditional expectation given $X_{\mathbf{i}_{1}},...,X_{\mathbf{i}_{n^{d}}}$ .

Theorem 1

Under Assumptions 1, 5, 6 and 7 with $\alpha_{1,\infty}(u)=O(u^{-\theta})$ , $\theta>d$ , for all $\rho>n^{-2md}$ , if the eigenvalues $\lambda_{x,1}\geq\lambda_{x,2}\geq...\geq\lambda_{x,p}\geq 0$ of $1/(n^{d}p)\mathbf{X}^{T}\mathbf{X}$ satisfy $\sum^{p}_{j=r+1}\lambda_{x,j}\leq C.r^{-2q}$ with $C>0$ , $q>0$ and $r:=\lfloor\rho^{-1/(2m+2q+1)}\rfloor$ , then

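$$\mathbb{E}_{\epsilon}\left(\left\|\widehat{\boldsymbol{\beta}}-\mathbb{E}_{\epsilon}(\widehat{\boldsymbol{\beta}})\right\|_{\Gamma_{n,p}}^{2}\right)\leq\left(\frac{\sigma_{\epsilon}^{2}}{n^{d}}+\frac{c\ln n}{n^{d}}\right)\left(m+\lfloor\rho^{-1/(2m+2q+1)}\rfloor(2+C.C_{0})\right),$$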

where $C_{0}>0$, $c>0$ and $\lfloor x\rfloor$ stands for the integer part of $x$.

Using Theorem 1 and arguing as in [7], we obtain the corollary below.
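To make the roles of $\rho$ and $r$ in Theorem 1 concrete, the short sketch below evaluates $r=\lfloor\rho^{-1/(2m+2q+1)}\rfloor$ and checks the condition $\rho>n^{-2md}$. The function names and the numerical values are illustrative assumptions, not taken from the paper.

```python
import math


def truncation_rank(rho, m, q):
    """r = floor(rho ** (-1 / (2m + 2q + 1))), as in the statement of Theorem 1."""
    return math.floor(rho ** (-1.0 / (2 * m + 2 * q + 1)))


def rho_is_admissible(rho, n, d, m):
    """Check the condition rho > n ** (-2 m d) required by Theorem 1."""
    return rho > n ** (-2 * m * d)


# Illustrative values (assumptions): rho = 1e-4, m = 2, q = 0.75, n = 10, d = 2.
r = truncation_rank(1e-4, 2, 0.75)
ok = rho_is_admissible(1e-4, 10, 2, 2)
```

With these values the exponent is $1/6.5$, so $r=\lfloor 10^{4/6.5}\rfloor=4$, and the condition holds since $10^{-4}>10^{-8}$.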

Corollary 1

*Under the assumptions of Theorem 1 together with Assumptions 2–4, as well as $n^{d}p^{-2\kappa}=O(1)$, $\rho\rightarrow 0$, $1/(n^{d}\rho)\rightarrow 0$ as $n,p\rightarrow\infty$, we have*

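$$\left\|\widehat{\beta}-\beta\right\|_{\Gamma}^{2}=O_{p}\left(\rho+\left(n^{d}\rho^{1/(2m+2q+1)}\right)^{-1}\ln n+n^{-d(2q+1)/2}\right).$$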

Next, we give a bound for the prediction error. For that, we make the following assumption.

Assumption 9

The non-visited site $\mathbf{i}_{0}$ is such that

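$$\delta\left(\{\mathbf{i}_{0}\},\{\mathbf{i}_{1},...,\mathbf{i}_{n}\}\right)\geq\lfloor n^{2d/\theta}\rfloor.$$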

In Assumption 9, it suffices to choose $\theta$ large enough to allow prediction at any non-visited site.

We consider the prediction $\widehat{Y}_{\mathbf{i}_{0}}$ and the "theoretical" prediction $Y^{*}_{\mathbf{i}_{0}}$ at a non-visited site $\mathbf{i}_{0}\in\mathbb{Z}^{d}$ such that $(X_{\mathbf{i}_{0}},Y_{\mathbf{i}_{0}})$ has the same distribution as $(X,Y)$. More precisely,

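$$\widehat{Y}_{\mathbf{i}_{0}}=\widehat{\beta_{0}}+\left\langle\widehat{\beta},X_{\mathbf{i}_{0}}\right\rangle\quad\text{and}\quad Y^{*}_{\mathbf{i}_{0}}=\beta_{0}+\left\langle\beta,X_{\mathbf{i}_{0}}\right\rangle.\tag{8}$$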

We are interested in bounding the prediction error between $\widehat{Y}_{\mathbf{i}_{0}}$ and $Y^{*}_{\mathbf{i}_{0}}$.

Theorem 2

Suppose that the assumptions of Corollary 1 and Assumptions 8–9 hold. If $\sum_{j\geq 1}\lambda^{1/4}_{j}<\infty$, $2q>1$, $\rho\sim n^{-d(2m+2q+1)/(2m+2q+2)}$ and $p$ is chosen sufficiently large compared to $n^{d}$, then

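$$\mathbb{E}\left(\left(\widehat{Y}_{\mathbf{i}_{0}}-Y^{*}_{\mathbf{i}_{0}}\right)^{2}\,\Big|\,\widehat{\beta_{0}},\widehat{\beta}\right)=O_{p}\left(n^{-d/(2m+2q+2)}\right).$$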
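The choice $\rho\sim n^{-d(2m+2q+1)/(2m+2q+2)}$ in Theorem 2 can be checked numerically. With this order of $\rho$, the term $(n^{d}\rho^{1/(2m+2q+1)})^{-1}$ appearing in Corollary 1 (without its $\ln n$ factor) is of the same order as $\rho$ itself, and the condition $\rho>n^{-2md}$ of Theorem 1 is satisfied. The sketch below verifies both facts for illustrative values; the function names and the values of $n$, $d$, $m$, $q$ are assumptions made for the example.

```python
def smoothing_parameter_order(n, d, m, q):
    """Order of rho in Theorem 2: n ** (-d (2m + 2q + 1) / (2m + 2q + 2))."""
    return n ** (-d * (2 * m + 2 * q + 1) / (2 * m + 2 * q + 2))


def variance_term(n, d, m, q, rho):
    """(n^d * rho^(1/(2m+2q+1)))^(-1): the second term of the bound in
    Corollary 1, written here without its ln(n) factor."""
    return 1.0 / (n ** d * rho ** (1.0 / (2 * m + 2 * q + 1)))


# Illustrative values (assumptions): n = 20, d = 2, m = 2, q = 0.75.
rho = smoothing_parameter_order(20, 2, 2, 0.75)
balanced = variance_term(20, 2, 2, 0.75, rho)  # same order as rho
```

Indeed, $n^{d}\rho^{1/(2m+2q+1)}=n^{d-d/(2m+2q+2)}=n^{d(2m+2q+1)/(2m+2q+2)}=1/\rho$, so the two terms coincide up to the logarithmic factor.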

4 A simulation study

This section presents the results of simulations made in order to evaluate the performance of the proposed methods for slope estimation and prediction in model (1). We computed estimation and prediction errors from simulated spatial data in $\mathbb{Z}^{2}$. Using the lexicographic order, we generated a sample $\{(X_{\mathbf{i}_{\ell}},Y_{\mathbf{i}_{\ell}})\}_{1\leq\ell\leq n^{2}}$ as follows: we consider the first $15$ elements $B_{1},\cdots,B_{15}$ of the B-splines basis. For $k=1,\cdots,15$, we generate a vector $(\xi_{\mathbf{i}_{1},k},\cdots,\xi_{\mathbf{i}_{n^{2}},k})^{T}$ from a normal distribution $\mathcal{N}(0,\Sigma^{1})$ in $\mathbb{R}^{n^{2}}$, where $\Sigma^{1}$ is the $n^{2}\times n^{2}$ covariance matrix with general term $\Sigma^{1}_{ij}=\exp(-3\|\mathbf{i}_{i}-\mathbf{i}_{j}\|_{2})$. Further, we generate a vector $(\Lambda_{\mathbf{i}_{1}}(t),\cdots,\Lambda_{\mathbf{i}_{n^{2}}}(t))^{T}$ from a normal distribution $\mathcal{N}(0,\Sigma^{2})$ in $\mathbb{R}^{n^{2}}$, where $\Sigma^{2}$ is the $n^{2}\times n^{2}$ covariance matrix with general term $\Sigma^{2}_{ij}=0.09$, and for $\ell=1,\cdots,n^{2}$ we take

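$$X_{\mathbf{i}_{\ell}}(t)=\sum_{k=1}^{15}\xi_{\mathbf{i}_{\ell},k}B_{k}(t)+\Lambda_{\mathbf{i}_{\ell}}(t).$$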
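The spatial part of this design can be sketched in a few lines. The code below builds the grid $\{1,\cdots,n\}^{2}$ in lexicographic order, forms the covariance matrix with general term $\exp(-3\|\mathbf{i}_{i}-\mathbf{i}_{j}\|_{2})$, and draws the coefficient vectors through a Cholesky factor. It is a minimal sketch using only the Python standard library: the helper names, the seed and the small grid size $n=3$ are assumptions made for illustration, and the B-spline basis and the term $\Lambda_{\mathbf{i}_{\ell}}$ are not reproduced.

```python
import math
import random
from itertools import product


def lexicographic_sites(n, d=2):
    """Sites of the grid {1, ..., n}^d listed in lexicographic order."""
    return list(product(range(1, n + 1), repeat=d))


def exponential_covariance(sites, decay=3.0):
    """Matrix with general term exp(-decay * ||i_i - i_j||_2)."""
    return [[math.exp(-decay * math.dist(a, b)) for b in sites] for a in sites]


def cholesky(matrix):
    """Lower-triangular L with L L^T = matrix (symmetric positive definite)."""
    size = len(matrix)
    L = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                L[i][j] = (matrix[i][j] - s) / L[j][j]
    return L


def draw_gaussian_field(chol, rng):
    """One draw from N(0, Sigma), computed as L z with z standard normal."""
    z = [rng.gauss(0.0, 1.0) for _ in chol]
    return [sum(row[k] * z[k] for k in range(i + 1)) for i, row in enumerate(chol)]


sites = lexicographic_sites(3)          # small grid, for illustration only
sigma1 = exponential_covariance(sites)  # general term exp(-3 ||i_i - i_j||_2)
chol = cholesky(sigma1)
rng = random.Random(0)                  # fixed seed: an arbitrary choice
# One vector (xi_{i_1,k}, ..., xi_{i_{n^2},k}) per basis index k = 1, ..., 15.
scores = [draw_gaussian_field(chol, rng) for _ in range(15)]
```

A library routine for multivariate normal sampling would normally replace the hand-written Cholesky factorization; it is written out here only to keep the sketch self-contained.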

Considering $1001$ equispaced points in $[0,1]$ , we compute each $Y_{\mathbf{i}_{\ell}}$ by approximating the integral in equation (1) using the rectangular method. That gives

[TABLE]

where $t_{j}=\frac{j-1}{1000}$ , $j=1,\cdots,1001$ , the vector $(\epsilon_{\mathbf{i}_{1}},\cdots,\epsilon_{\mathbf{i}_{n^{2}}})^{T}$ is generated from a normal distribution $\mathcal{N}(0,\sigma^{2}_{\epsilon}\Sigma^{1})$ with $\sigma^{2}_{\epsilon}$ controlled by the signal-to-noise ratio (snr) defined by

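$$snr=\frac{\mathbb{E}\left[\left\langle\beta,X\right\rangle^{2}\right]}{\mathbb{E}\left[\left\langle\beta,X\right\rangle^{2}\right]+\sigma_{\epsilon}^{2}},$$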

and $\beta$ is a given function. We considered two cases for the function $\beta$ given by:

Case A : $\beta(t)=[\sin(2\pi t^{3})]^{3}$ ;

Case B : $\beta(t)=(0.4-t)^{2}$ .
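The ingredients described above can be sketched as follows: the two slope functions, a rectangle-rule approximation of $\int_{0}^{1}\beta(t)X(t)\,dt$ on the $1001$ grid points, and the noise variance obtained by solving $snr=S/(S+\sigma^{2}_{\epsilon})$ for $\sigma^{2}_{\epsilon}$, with $S=\mathbb{E}[\langle\beta,X\rangle^{2}]$. This is an illustrative sketch: the function names are hypothetical, and the left-endpoint convention with step $1/1000$ is an assumption, since the exact weighting of the rectangular method is not specified here.

```python
import math

# Grid t_j = (j - 1) / 1000, j = 1, ..., 1001.
GRID = [j / 1000 for j in range(1001)]


def beta_case_a(t):
    """Case A: beta(t) = [sin(2 pi t^3)]^3."""
    return math.sin(2 * math.pi * t ** 3) ** 3


def beta_case_b(t):
    """Case B: beta(t) = (0.4 - t)^2."""
    return (0.4 - t) ** 2


def rectangle_integral(beta, curve_values, grid=GRID):
    """Left-endpoint rectangle rule for the integral of beta(t) X(t) on [0, 1].

    curve_values[j] is X evaluated at grid[j]; with the left-endpoint
    convention assumed here, the last grid point is not used.
    """
    step = grid[1] - grid[0]
    return step * sum(beta(t) * x for t, x in zip(grid[:-1], curve_values[:-1]))


def noise_variance(signal_power, snr):
    """Solve snr = S / (S + sigma^2) for sigma^2, with S = E[<beta, X>^2]."""
    return signal_power * (1.0 - snr) / snr
```

As a sanity check, for the constant curve $X\equiv 1$ and Case B the rule should approximate $\int_{0}^{1}(0.4-t)^{2}\,dt=0.28/3$, and a signal power of $1$ with $snr=5\%$ corresponds to $\sigma^{2}_{\epsilon}=19$.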

The estimator $\widehat{\beta}$ of $\beta$ in model (1) is computed by using the function "fregre.basis" of the R package fda.usc. We assess the performance of our methods through the semi-norm $\|.\|_{\Gamma}$ defined in (15) for evaluating the estimation error between $\widehat{\beta}$ and $\beta$, and through the mean squared error (MSE) for evaluating the prediction error between the prediction $\widehat{Y}_{\mathbf{i}_{0}}$ and the "theoretical" prediction $Y^{*}_{\mathbf{i}_{0}}$ at the non-visited site $\mathbf{i}_{0}=(13.5,5)$. $X_{\mathbf{i}_{0}}$ is obtained by the ordinary kriging method, and $\widehat{Y}_{\mathbf{i}_{0}}$ and $Y^{*}_{\mathbf{i}_{0}}$ are obtained as defined in (8). We take $snr=5\%,10\%$ and $n=10,15,20,25$ over $100$ replications and we obtain the following tables.

[TABLE]

Table 1: Estimation errors

[TABLE]

Table 2: Prediction errors at a non-visited site $\mathbf{i}_{0}=(13.5,5)$

Table 1 and Table 2 present, respectively, the obtained estimation errors and prediction mean squared errors for different sample sizes and snr. The site $\mathbf{i}_{0}=(13.5,5)$ is beyond the grid of size $n^{2}=10^{2}$ whereas it is inside the grid of size $n^{2}=15^{2}$ . We remark that when this point is inside the grid the prediction errors decrease as the sample size increases. Also, we see that estimation and prediction errors are small even when the sample size and the snr increase.

Conclusion

In this paper, we study asymptotic properties of a smoothing splines estimator of the slope function in a spatial functional linear regression model, where a scalar response is related to a square integrable spatial functional process. The originality of the proposed method is to consider spatially dependent data. The main difficulty is technical, especially in the proof of the bound for the prediction error, because of the spatial dependence of the data. The prediction proposed in this work is available for points inside the grid as well as for points beyond it, whereas in [5] prediction is only available for points beyond the grid. One can then see the proposed methodology as a good alternative to [7] when the available data are spatially dependent.

5 Proofs

5.1 A useful lemma

Let

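$$\mathcal{M}=\left(\frac{1}{n^{d}p}\mathbf{X}^{T}\mathbf{X}+\rho\mathbf{A}_{m}\right)^{-1}\left(\frac{1}{n^{d}p}\mathbf{X}^{T}\mathbf{X}\right);\tag{9}$$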

then we have:

Lemma 1

$tr\left(\mathcal{M}^{2}\right)\leq\textrm{tr}\left(\mathcal{M}\right)$ .

Proof. Since $\mathbf{A}_{m}$ is a symmetric nonnegative matrix, it has a square root, denoted by $\mathbf{A}_{m}^{1/2}$, that is also a symmetric nonnegative matrix. Denoting by $\mathbf{A}_{m}^{-1/2}$ the inverse of $\mathbf{A}_{m}^{1/2}$ and by $I_{p}$ the $p\times p$ identity matrix, we have:

[TABLE]

Then from the spectral decomposition $\frac{1}{n^{d}p}\mathbf{A}_{m}^{-1/2}\mathbf{X}^{T}\mathbf{X}\mathbf{A}_{m}^{-1/2}=\sum_{\ell=1}^{p}\mu_{\ell}\,u_{\ell}u_{\ell}^{T},$ where the $\mu_{\ell}$'s are the nonnegative eigenvalues and $\left\{u_{\ell}\right\}_{1\leq\ell\leq p}$ is an orthonormal basis of $\mathbb{R}^{p}$ consisting of eigenvectors, it follows:

[TABLE]

Therefore, since $tr(\mathbf{A}_{m}^{-1/2}\,u_{\ell}u_{\ell}^{T}\mathbf{A}_{m}^{1/2})=tr(u_{\ell}^{T}\mathbf{A}_{m}^{1/2}\mathbf{A}_{m}^{-1/2}\,u_{\ell})=tr(u_{\ell}^{T}\,u_{\ell})=1$ , we deduce that $tr(\mathcal{M})=\sum_{\ell=1}^{p}\frac{\mu_{\ell}}{\mu_{\ell}+\rho}.$ Finally,

[TABLE]

$\Box$
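The concluding step of this proof can be sketched as follows. This is a reconstruction consistent with the expression $tr(\mathcal{M})=\sum_{\ell=1}^{p}\frac{\mu_{\ell}}{\mu_{\ell}+\rho}$ obtained above, not a copy of the authors' display; it assumes that $\mathcal{M}^{2}$ has the same eigenvectors as $\mathcal{M}$ with each factor $\mu_{\ell}/(\mu_{\ell}+\rho)$ squared.

```latex
\begin{align*}
tr\left(\mathcal{M}^{2}\right)
  = \sum_{\ell=1}^{p}\left(\frac{\mu_{\ell}}{\mu_{\ell}+\rho}\right)^{2}
  \leq \sum_{\ell=1}^{p}\frac{\mu_{\ell}}{\mu_{\ell}+\rho}
  = tr\left(\mathcal{M}\right),
\end{align*}
```

The inequality holds term by term because $0\leq\mu_{\ell}/(\mu_{\ell}+\rho)<1$ whenever $\mu_{\ell}\geq 0$ and $\rho>0$.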

5.2 Proof of Theorem 1

Putting

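$$\Theta=\mathbf{X}\left(\frac{1}{n^{d}p}\mathbf{X}^{T}\mathbf{X}+\rho\mathbf{A}_{m}\right)^{-1}\left(\frac{1}{n^{d}p}\mathbf{X}^{T}\mathbf{X}\right)\left(\frac{1}{n^{d}p}\mathbf{X}^{T}\mathbf{X}+\rho\mathbf{A}_{m}\right)^{-1}\mathbf{X}^{T},$$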

we have

[TABLE]

where $\tau_{i}=\epsilon_{\mathbf{i}_{i}}-\overline{\epsilon}$ , with $\overline{\epsilon}=n^{-d}\sum_{j=1}^{n^{d}}\epsilon_{\mathbf{i}_{j}}$ . Putting $\sigma^{2}_{\epsilon}=\mathbb{E}\left(\epsilon^{2}_{\mathbf{i}_{i}}\right)$ , we deduce from $\tau^{2}_{i}=\epsilon^{2}_{\mathbf{i}_{i}}-2\epsilon_{\mathbf{i}_{i}}\overline{\epsilon}+\overline{\epsilon}^{2}$ and the strict stationarity that

[TABLE]

Notice that, putting $Q_{n}=\lfloor(\ln n)^{1/d}\rfloor$ , we have

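$$\sum_{\substack{j=1\\ j\neq i}}^{n^{d}}\mathbb{E}\left(\epsilon_{\mathbf{i}_{i}}\epsilon_{\mathbf{i}_{j}}\right)=\sum_{\substack{j=1\\ 0<\delta(\{\mathbf{i}_{j}\},\{\mathbf{i}_{i}\})\leq Q_{n}}}^{n^{d}}\mathbb{E}\left(\epsilon_{\mathbf{i}_{i}}\epsilon_{\mathbf{i}_{j}}\right)+\sum_{\substack{j=1\\ \delta(\{\mathbf{i}_{j}\},\{\mathbf{i}_{i}\})>Q_{n}}}^{n^{d}}\mathbb{E}\left(\epsilon_{\mathbf{i}_{i}}\epsilon_{\mathbf{i}_{j}}\right).$$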

Then, using the Cauchy-Schwarz inequality as well as Lemma $2.1\ (ii)$ in [22], we obtain, under Assumption 1:

[TABLE]

where $b_{1}$ is a positive constant. Since $\theta>d$ , this finally gives:

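$$\sum_{\substack{j=1\\ j\neq i}}^{n^{d}}\mathbb{E}\left(\epsilon_{\mathbf{i}_{i}}\epsilon_{\mathbf{i}_{j}}\right)\leq\sigma_{\epsilon}^{2}\ln(n)+K_{1},\tag{12}$$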

where $K_{1}$ is a positive constant. Therefore, from (11), it follows:

[TABLE]

Clearly, $\sum^{n^{d}}_{i=1}\Theta_{ii}=tr(\Theta)=n^{d}p\,tr(\mathcal{M}^{2})$, where $\mathcal{M}$ is defined in (9). Then, from Lemma 1 and the proof of Theorem 1 in Crambes et al. (2009) [7] (see pp. 55–56), which shows that $tr(\mathcal{M})\leq m+\rho^{-1/(2m+2q+1)}(2+C.C_{0})$ where $C$ and $C_{0}$ are positive constants, it follows:

[TABLE]

Then, we deduce from (13) and (14) that

[TABLE]

where $c_{1}$ is a positive constant. On the other hand,

[TABLE]

Then, using (12), we obtain $|\mathbb{E}\left(\tau_{i}\tau_{j}\right)|\leq|\mathbb{E}\left(\epsilon_{\mathbf{i}_{i}}\epsilon_{\mathbf{i}_{j}}\right)|+\frac{\sigma^{2}_{\epsilon}}{n^{d}}+\frac{3}{n^{d}}\left(\sigma^{2}_{\epsilon}\ln(n)+K_{1}\right)$ , and

[TABLE]

where $K_{2}$ is a positive constant. Note that $\Theta=B^{2}$ , where

[TABLE]

then

[TABLE]

and, putting $S=\frac{1}{n^{2d}p}\sum^{n^{d}}_{i=1}\sum_{\stackrel{{\scriptstyle j=1}}{{j\neq i}}}^{n^{d}}\Theta_{ij}\mathbb{E}\left(\tau_{i}\tau_{j}\right)$ , we deduce from this inequality and from (14) and (16) that

[TABLE]

$\Box$

5.3 Proof of Theorem 2

5.3.1 Lemma

Lemma 2

Under the assumptions of Theorem 2, we have

[TABLE]

5.3.2 Proofs

Proof of Lemma 2 :

We have

[TABLE]

On the one hand, from assumption 1, we have $\|\beta\|^{2}<C_{2}<\infty$ . On the other hand, for $p$ large enough, we have

[TABLE]

Set $\mathbf{V}=(V_{1}-\overline{V},\cdots,V_{n^{d}}-\overline{V})^{T}$ , where $V_{\ell}=\int^{1}_{0}\beta(t)X_{\mathbf{i}_{\ell}}(t)dt-\frac{1}{p}\sum^{p}_{j=1}\beta(t_{j})X_{\mathbf{i}_{\ell}}(t_{j})$ , $\ell=1,\cdots,n^{d}$ . Then, by definition of $\widehat{\boldsymbol{\beta}}$ , we have

[TABLE]

The first and second terms on the right-hand side of (5.3.2) are bounded as in [7] (see p. 57), that is to say

[TABLE]

Set $W=\dfrac{1}{n^{d}p}\mathbf{X}\left(\frac{1}{n^{d}p}\mathbf{X}^{T}\mathbf{X}+\rho\mathbf{A}_{m}\right)^{-2}\mathbf{X}^{T}=BB^{T}$, where $B=\dfrac{1}{\sqrt{n^{d}p}}\mathbf{X}\left(\frac{1}{n^{d}p}\mathbf{X}^{T}\mathbf{X}+\rho\mathbf{A}_{m}\right)^{-1}$. We have

[TABLE]

From Assumption 5, we have

[TABLE]

and since

[TABLE]

it follows that

[TABLE]

We then obtain the result of Lemma 2. $\Box$

Proof of Theorem 2 :

[TABLE]

Since, from assumption 7 and Lemma 2, we have

[TABLE]

and $\mathbb{E}\left(\left\langle X_{\mathbf{i}_{0}}-\mathbb{E}(X_{\mathbf{i}_{0}}),\zeta_{j}\right\rangle^{4}\right)\leq 4M^{2}_{2}\left\langle\Gamma\zeta_{j},\zeta_{j}\right\rangle$ , it follows from Lemma $2.1\ (i)$ in [22], Assumption 7 and Lemma 2 that

[TABLE]

where $C$ is a positive constant. However, we have from Lemma $2.1\ (i)$ in [22] that

[TABLE]

Since $\left\langle\Gamma\zeta_{\ell},\zeta_{j}\right\rangle=\lambda_{\ell}\left\langle\zeta_{\ell},\zeta_{j}\right\rangle=\lambda_{\ell}$ if $\ell=j$ and $0$ otherwise, it follows from Lemma 2 and Assumption 7 that

[TABLE]

where $C_{1}$ is a positive constant. From Assumption 9 and Jensen's inequality, we have

[TABLE]

where $C_{1}$ , $C_{2}$ and $C_{3}$ are positive constants. From Assumption 7 and Lemma 2, we have

[TABLE]

where $K_{1}$ , $K_{2}$ , $K_{3}$ and $K_{4}$ are positive constants. Then

[TABLE]

where $K_{7}$ and $K_{6}$ are positive constants. Applying Corollary 1 with $2q>1$ , $\rho\sim n^{-d(2m+2q+1)/(2m+2q+2)}$ , we obtain the result of Theorem 2.

$\Box$

Bibliography

[1] Aue A, Norinho DD, Hörmann S. On the Prediction of Stationary Functional Time Series. J. Amer. Statist. Assoc. 2015; 110:378–392.
[2] Biau G, Cadre B. Nonparametric Spatial Prediction. Stat. Inference Stoch. Process. 2004; 7:327–349.
[3] Bohorquez M, Giraldo R, Mateu J. Optimal sampling for spatial prediction of functional data. Stat. Methods Appl. 2016; 25:39–54.
[4] Bohorquez M, Giraldo R, Mateu J. Multivariate functional random fields: prediction and optimal sampling. Stoch. Environ. Res. Risk Assess. 2017; 31:53–70.
[5] Bouka S, Dabo-Niang S, Nkiet GM. On estimation in spatial functional regression with derivatives. C. R. Acad. Sci. Paris Ser. I 2018; 356:558–562.
[6] Comte F, Johannes J. Adaptive functional linear regression. Ann. Statist. 2012; 40:2765–2797.
[7] Crambes C, Kneip A, Sarda P. Smoothing splines estimators for functional linear regression. Ann. Statist. 2009; 37:35–72.
[8] Cuevas A. A partial overview of the theory of statistics with functional data. J. Statist. Plan. Inf. 2014; 147:1–23.