Non-parametric estimation of time varying AR(1)--processes with local stationarity and periodicity
Jean-Marc Bardet (SAMM), Paul Doukhan (AGM)

TL;DR
This paper develops a kernel-based non-parametric method for estimating time-varying AR(1) processes with local stationarity and periodicity, providing theoretical guarantees and minimax rates under mild conditions.
Contribution
It introduces a novel estimation approach for a new class of periodic, locally stationary AR(1) processes with proven asymptotic properties.
Findings
Kernel estimators reach classical minimax rates.
Establishment of central limit theorems for the estimators.
Method requires only second-order moments of noise.
Abstract
Extending the ideas of [7], this paper aims at providing a kernel-based non-parametric estimation of a new class of time-varying AR(1) processes (X_t), with local stationarity and periodic features (with a known period T), inducing the definition X_t = a_t(t/(nT)) X_{t-1} + ξ_t for t ∈ ℕ, with a_{t+T} = a_t. Central limit theorems are established for kernel estimators \hat{a}_s(u) reaching classical minimax rates and only requiring low-order moment conditions of the white noise (ξ_t)_t, up to the second order.
| Kernel | |||||||||
|---|---|---|---|---|---|---|---|---|---|
| 0.243 | 0.407 | 0.283 | 0.450 | 0.172 | 0.322 | 0.235 | 0.392 | ||
| 0.248 | 0.239 | 0.286 | 0.282 | 0.230 | 0.234 | 0.354 | 0.353 | ||
| 0.227 | 0.363 | 0.278 | 0.429 | 0.256 | 0.392 | 0.250 | 0.386 | ||
| 0.185 | 0.175 | 0.219 | 0.219 | 0.232 | 0.232 | 0.308 | 0.303 | ||
| 0.234 | 0.320 | 0.276 | 0.399 | 0.321 | 0.431 | 0.287 | 0.406 | ||
| 0.129 | 0.119 | 0.154 | 0.156 | 0.213 | 0.210 | 0.256 | 0.254 | ||
| 0.240 | 0.321 | 0.270 | 0.384 | 0.373 | 0.476 | 0.328 | 0.438 | ||
| 0.098 | 0.093 | 0.124 | 0.122 | 0.207 | 0.202 | 0.226 | 0.221 | ||
| Kernel | |||||||||
|---|---|---|---|---|---|---|---|---|---|
| 0.226 | 0.394 | 0.267 | 0.430 | 0.161 | 0.295 | 0.220 | 0.360 | ||
| 0.341 | 0.320 | 0.350 | 0.340 | 0.311 | 0.309 | 0.418 | 0.405 | ||
| 0.207 | 0.343 | 0.259 | 0.402 | 0.231 | 0.355 | 0.225 | 0.362 | ||
| 0.261 | 0.258 | 0.281 | 0.287 | 0.296 | 0.293 | 0.353 | 0.346 | ||
| 0.194 | 0.304 | 0.252 | 0.373 | 0.286 | 0.383 | 0.239 | 0.360 | ||
| 0.214 | 0.201 | 0.213 | 0.217 | 0.269 | 0.261 | 0.302 | 0.296 | ||
| 0.193 | 0.321 | 0.246 | 0.342 | 0.346 | 0.450 | 0.258 | 0.368 | ||
| 0.166 | 0.093 | 0.172 | 0.181 | 0.258 | 0.250 | 0.262 | 0.275 | ||
Jean-Marc Bardet
Paul Doukhan
SAMM EA4543, University Panthéon-Sorbonne, 90, rue de Tolbiac, 75634, Paris, France.
AGM-UMR8088, University Cergy-Pontoise, France, and CIMFAV, Valparaiso, Chile.
AMS subject classifications: 62G05, 62M10, 60F05.
Keywords: Local stationarity, Nonparametric estimation, Central limit theorem.
This paper is dedicated to the memory of Jean Bretagnolle
1 Introduction
Since the seminal paper [5], the local-stationarity property has provided new models and approaches for introducing non-stationarity in time series. The recently published handbook [7] gives a complete survey of the results obtained on this topic in recent years.
An interesting new kind of model is obtained from a natural extension of the usual ARMA processes, the so-called tvARMA(p, q) processes defined in [8], as:
[TABLE]
where and are bounded functions. This is a special case of locally stationary linear process defined by X^{(n)}_{t}=\sum_{j=0}^{\infty}\gamma_{j}\Big{(}\frac{t}{n}\Big{)}\,\xi_{t-j}. Such models have been studied in many papers, especially concerning the parametric, semi-parametric or non-parametric estimations of functions , or , or other functions depending on these functions; see, for instance references [6], [8], [7], or [12], [3], [11], [17] or [2].
For simplicity, we restrict ourselves in this first work to time-varying AR(1) processes including a periodic component:
X^{(n)}_{t}=a_{t}\Big(\frac{t}{nT}\Big)\,X^{(n)}_{t-1}+\xi_{t}, (1.2)
where T is a fixed and known integer, and (ξ_t) is a white noise. Note that given the functions a_1, …, a_T, one may even build a T-periodic sequence (a_t) through the relation a_{t+T} = a_t.
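To make the recursion concrete, here is a minimal simulation sketch of a periodic time-varying AR(1) path. It assumes standard Gaussian noise, a zero initial value, and a purely illustrative coefficient function; the names `simulate_periodic_tvar1` and `a_example` are hypothetical and do not come from the paper.

```python
import math
import random

def simulate_periodic_tvar1(n, T, a, x0=0.0, seed=0):
    """Simulate X_t = a_t(t/(nT)) X_{t-1} + xi_t for t = 1..nT, with a_{t+T} = a_t.

    `a(s, u)` is a user-supplied coefficient function (hypothetical interface)
    for the phase s in {1..T} and the rescaled time u in (0, 1].
    """
    rng = random.Random(seed)
    x = [x0]
    for t in range(1, n * T + 1):
        s = (t - 1) % T + 1          # phase of t, which enforces a_{t+T} = a_t
        u = t / (n * T)              # rescaled time t/(nT)
        x.append(a(s, u) * x[-1] + rng.gauss(0.0, 1.0))
    return x                         # x[t] holds X_t, x[0] the initial value

def a_example(s, u, T=12):
    # Illustrative choice only: a periodic factor times a smooth function of u,
    # bounded by 0.9 in absolute value so that the recursion is contractive.
    return 0.9 * math.cos(2 * math.pi * s / T) * math.cos(3 * u)

path = simulate_periodic_tvar1(n=50, T=12, a=a_example)
```

The phase is computed modulo T, so only T coefficient functions are ever evaluated, which is the periodic structure the model relies on.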
The choice of such an extension of the tvAR processes is motivated by modelling considerations: for instance, in the climatic framework, [4] considered models of air temperatures where the function of interest is written as the product of a periodic sequence and a locally varying function. This choice provides an interesting extension of more classical periodic models of air temperature such as those proposed in [14].
Other periodic representations for locally stationary processes can also be found, for instance, in the paper [19], but there the seasonal component is treated as an additive deterministic trend and is not included in the dynamics of the process, whereas it is for model (1.2).
We then study non-parametric estimators , for , from an observed trajectory . We consider kernel-based estimators which are naturally induced from covariance relationships satisfied by the process (see Section 2). Central limit theorems are established for these estimators under some regularity conditions on the functions for . The results are only obtained by assuming second-order moments on the white noise . This is a main improvement with respect to usual limit theorems on locally-stationary processes which are obtained with the assumption that any moment exists for . This is due to the new ideas developed in our proof which combines a central limit theorem for martingale increment arrays as well as an embedding in an Orlicz space (see details in Section 4).
The obtained convergence rate is optimal with respect to the minimax rate up to a logarithmic term. Simulations based on Monte-Carlo experiments illustrate the accuracy of the estimators. An application to real-life data, i.e. monthly average temperature readings in London from 1659 to 1998, shows the interest of using our new model (1.2).
This paper is also a first step towards new results for a new class of non-stationary processes. Indeed, we can extend the definition (1.2) to processes such as:
[TABLE]
where is a sequence of i.i.d. random vectors modelling for instance exogenous inputs. This more tough case is deferred to forthcoming papers.
Other time-varying models with infinite memory may also be treated, such as GARCH-type models (see for instance [9]). Note also that [10] introduced INGARCH models. Those models are GLM models, and non-stationary versions of them may also be considered. They will be studied in further works.
The structure of the paper is as follows. In Section 2, we define and study asymptotic properties of non-parametric estimators for the process (1.2). Section 3 provides the results of some Monte-Carlo experiments and real-life data application, while the proofs are reported in Section 4.
2 Asymptotic normality of a non-parametric estimator for periodic tvAR(1) processes
2.1 Definition and first properties of the process
Denote classically and . Here we consider a fixed and known period. We will write if is a multiple of .
The paper is dedicated to the simplest case , of a periodic locally stationary ARprocess, defined in (1.2) where with . Here is a sequence of i.i.d. r.v.s satisfying and for any , with independent of .
The functions , are supposed to satisfy some regularity. Hence, we provide the forthcoming definition usually made in a non-parametric framework:
Definition 2.1.
For , we denote the largest integer such that . A function is said to belong to the class where is a neighbourhood of , if and if is a -Hölderian function, i.e. there exists such as
[TABLE]
In case is an integer we simply assume that exists and is a continuous and bounded function on the neighbourhood of . As a consequence we specify the assumptions on functions using a fixed positive real number :
Assumption (A): The functions are such as:
1. (Periodicity) There exists such that for any .
2. (Contractivity) There exists .
3. (Regularity) For any , assume that .
Remark 2.1.
Note that T = 1 corresponds to the non-periodic case, and the process is then a usual tvAR(1) process as defined in (1.1).
First it is clear that the conditions on functions ensure the existence of a causal linear process for any satisfying (1.2). More precisely, we obtain the following moment relationships:
Proposition 2.1.
Let satisfy (1.2) under Assumption (A) with . Then for some convenient constant ,
1. For any and , \big|\mathbb{E}\big(X^{(n)}_{t}\big)\big|\leq\alpha^{t}\,\big|\mathbb{E}(X_{0})\big|.
2. Let . There exist functions such that if and :
[TABLE]
3. Assume and (this holds e.g. if admits a symmetric distribution).
For , there exist functions such as, for with ,
[TABLE]
Moreover, for any with ,
[TABLE]
We will now assume .
In addition to the previous proposition, another relation can be easily established. Indeed, for , with , by multiplying (1.2) by and taking the expectation:
[TABLE]
The relation (2.4) is at the origin of the definition of the following non-parametric estimators of the functions .
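The step behind this relation can be spelled out as follows. This is a sketch written with raw second moments rather than in the exact notation of (2.4); it assumes that the noise is centred and that ξ_t is independent of X^{(n)}_{t-1}, which follows from causality.

```latex
\begin{align*}
\mathbb{E}\big(X^{(n)}_{t}X^{(n)}_{t-1}\big)
  &= a_{t}\Big(\frac{t}{nT}\Big)\,\mathbb{E}\big((X^{(n)}_{t-1})^{2}\big)
     + \mathbb{E}\big(\xi_{t}\,X^{(n)}_{t-1}\big) \\
  &= a_{t}\Big(\frac{t}{nT}\Big)\,\mathbb{E}\big((X^{(n)}_{t-1})^{2}\big),
\end{align*}
```

since \mathbb{E}(\xi_{t}X^{(n)}_{t-1})=\mathbb{E}(\xi_{t})\,\mathbb{E}(X^{(n)}_{t-1})=0. The coefficient therefore appears as a ratio of a lag-one cross moment to a second moment, which is what a kernel-weighted ratio of empirical sums can estimate locally.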
2.2 Asymptotic normality of the estimator
Assume that the sample is observed for some ; this condition entails a reasonable loss of at most data and allows us for a more comprehensive study.
For each , we define I_{n,s}=\big{\{}s,s+T,\ldots,s+(n-1)T\big{\}}, a set with . Now (2.4) writes:
[TABLE]
A convolution kernel will be required in the sequel, and it satisfies one of the following two assumptions:
Assumption : Let be a Borel bounded function such that:
- •
and for any ;
- •
there exists such as .
Assumption : Let be a Borel bounded function such that:
- •
and for any ;
- •
there exists some such as , if .
Typical examples of kernel functions are and satisfying respectively Assumptions and . Note also the would exclude dealing with a regularity .
For , we also specify another condition satisfied by such a function:
Assumption ker: Let be a Borel bounded function such that:
- •
and , if ;
- •
and .
Assume that a sequence of positive bandwidths is chosen in such a way that
[TABLE]
Now, keeping in mind the expression (2.4) and following the same ideas as for the Nadaraya-Watson estimator (see [18] and [22]), for and , we set
[TABLE]
Since extremities are omitted we avoid the corresponding edge effects due to the fact that at the extremities, summations are not considered over a symmetric interval of times containing . The case does not make any contribution while the case corresponds with simple periodic behaviours and such results should be found in [14].
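As an illustration, the following sketch implements a Nadaraya-Watson-type ratio estimator in the spirit of the construction above. The exact form of (2.7) is not reproduced in this text, so the ratio below (kernel-weighted sum of X_t X_{t-1} over kernel-weighted sum of X_{t-1}^2, with t in I_{n,s}) is an assumption inferred from the moment relation obtained by multiplying (1.2) by X_{t-1}. The function names are hypothetical, and no special treatment of the extremities is attempted.

```python
import random

def epanechnikov(z):
    # Assumed standard Epanechnikov kernel on [-1, 1].
    return 0.75 * (1.0 - z * z) if abs(z) <= 1.0 else 0.0

def kernel_estimate(x, n, T, s, u, bn, kernel=epanechnikov):
    """Ratio estimate of a_s(u) from a path x[0..nT], where x[t] holds X_t."""
    num = den = 0.0
    for j in range(n):
        t = s + j * T                         # t runs over {s, s+T, ..., s+(n-1)T}
        w = kernel((t / (n * T) - u) / bn)    # kernel weight centred at u
        num += w * x[t] * x[t - 1]
        den += w * x[t - 1] ** 2
    return num / den if den > 0 else float("nan")

# Sanity check on a stationary AR(1) path with constant coefficient 0.5.
rng = random.Random(1)
n, T = 2000, 2
x = [0.0]
for t in range(1, n * T + 1):
    x.append(0.5 * x[-1] + rng.gauss(0.0, 1.0))
est = kernel_estimate(x, n, T, s=1, u=0.5, bn=0.2)
```

With a constant coefficient, every phase s and every interior point u should give an estimate close to 0.5, which makes this a convenient first check before moving to genuinely time-varying coefficients.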
Using essentially a martingale central limit theorem (the steps of the proofs are precisely detailed in Section 4), we obtain:
Theorem 2.1.
Let and Assumption (A), let satisfy Assumption or as well as Assumption ker. Then, for a sequence of positive real numbers such as ,
[TABLE]
for any , with
Note that for the classical optimal semi-parametric minimax rate is reached.
This is not the case if . In that case, another moment condition is needed in order to improve the convergence rate of .
Theorem 2.2.
*Let and Assumption (A), let satisfy Assumption or as well as Assumption ker. Moreover, suppose that with \displaystyle\beta=4-\frac{2\rho}{5\rho-4}\in\Big{[}2,\frac{10}{3}\Big{]} (Note that if ) and that admits a symmetric distribution. Then (2.8) holds for a sequence of positive real numbers such as b_{n}\,n^{\frac{1}{2\rho+1}}\begin{array}[t]{c}\stackrel{{\scriptstyle}}{{\longrightarrow}}\\ {\scriptstyle n\rightarrow+\infty}\end{array}0.
Moreover in case and if then the central limit still holds but the limit distribution is now non-centred:*
[TABLE]
*with \displaystyle\mu(u)=\frac{c^{\frac{5}{2}}}{\gamma^{(2)}_{s}(u)}\Big{(}\frac{1}{2}a_{s}^{\prime\prime}(u)\gamma_{s}^{(2)}(u)+a_{s}^{\prime}(u)(\gamma_{s}^{(2)})^{\prime}(u)\Big{)}\int_{\mathbb{R}}z^{2}K(z)\,dz. *
Remark 2.2.
*Optimal window widths write as thus the above result holds with a suboptimal window width. Moreover the symmetry assumption is discussed in Remark 4.2. Now for the case in case the derivatives of are regular around the point , then the optimal window width actually may be used and the central limit theorem again holds with a non-centred Gaussian limit.
Quote that the proposed normalisation yields the standard minimax rates , in the case of compactly supported symmetric kernel (a loss is observed for the Gaussian kernel); the obtained rates are in probability and further work is needed to prove that this is the minimax rate.
Moreover for large the convergence rate is degraded with a factor since the sample size is and thus .*
Remark 2.3.
Of course, if , Theorems 2.1 and 2.2 hold, which provide another minimax estimation of the function () requiring sharper moment and regularity conditions than the ones proposed in Theorem 4.1 of [8].
Remark 2.4.
*If is unknown we better consider an -sample and set , the proof of previous central limit theorem 2.1 provides an approach for estimating this period . First fix (typically for monthly data). Then, for each , we define an estimator for any and . It is clear that when is not a multiple of , then the sums in (2.7) that are done on the set , which depends on , is now a sum involving other with . As a consequence, is not a convergent estimator of .
Then, using a classical cross-validation, for each , we compute*
[TABLE]
Finally, define as the smallest value such as
[TABLE]
Remark 2.5.
The central limit theorem 2.1 naturally provides a test statistic for the testing problem: versus , where . Indeed, from (2.8) and Slutsky's lemma we deduce:
[TABLE]
Then if we consider
[TABLE]
this provides a natural test statistic, with the usual standard Gaussian quantiles as asymptotic thresholds.
3 Monte-Carlo experiments and an application to climatic data
3.1 Monte-Carlo experiments
In this section, numerous Monte-Carlo experiments have been made for studying the accuracy of the new non-parametric estimator .
Firstly, we considered typical functions , and such as :
- •
For , we choose \displaystyle a_{s}^{(2)}(u)=0.9\,\cos\big{(}2\pi\frac{ns}{T}\big{)}\cos(3u). Figure 1 exhibits the graph of the function and an example of its estimation (for );
- •
For , we choose \displaystyle a_{s}^{(1.5)}(u)=0.9\,\cos\big{(}2\pi\frac{ns}{T}\big{)}\frac{\int_{0}^{u}W_{t}(\omega)\,dt}{\sup_{x\in[0,1]}|W_{x}(\omega)|} where is an observed trajectory of a Wiener Brownian motion;
- •
For ρ = 0.8, we choose a_{s}^{(0.8)}(u)=0.9\,\cos\big(2\pi\frac{ns}{T}\big)\frac{B_{0.8}(\omega,u)}{\sup_{x\in[0,1]}|B_{0.8}(\omega,x)|}, where B_{0.8}(ω, ·) is an observed trajectory of a fractional Brownian motion with Hurst exponent 0.8 (Figure 2 exhibits the graph of this chosen function). It is well known that a trajectory of a fractional Brownian motion with Hurst exponent H is almost surely h-Hölderian for any h < H;
- •
For , we choose \displaystyle a_{s}^{(0.5)}(u)=0.9\,\cos\big{(}2\pi\frac{ns}{T}\big{)}\frac{W_{u}(\omega)}{\sup_{x\in[0,1]}|W_{x}(\omega)|} where is an observed trajectory of a Wiener Brownian motion.
We also consider two “typical” kernels:
- •
A bounded supported kernel, the well-known Epanechnikov kernel defined by , which is known to minimize the asymptotic MISE in the kernel density estimation frame;
- •
The unbounded supported Gaussian kernel with K_{G}(x)=\frac{1}{\sqrt{2\pi}}\exp\big{(}-\frac{x^{2}}{2}\big{)}.
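The two kernels can be compared numerically through their first moments. The sketch below assumes the standard Epanechnikov form 3/4 (1 - x^2) on [-1, 1], which is not spelled out in this text, and uses a plain midpoint rule; the helper `moment` is hypothetical.

```python
import math

def epanechnikov(z):
    # Assumed standard form: K_E(z) = 3/4 (1 - z^2) on [-1, 1], zero elsewhere.
    return 0.75 * (1.0 - z * z) if abs(z) <= 1.0 else 0.0

def gaussian(z):
    # K_G(z) = exp(-z^2 / 2) / sqrt(2 pi)
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def moment(kernel, p, lo=-10.0, hi=10.0, steps=20_000):
    """Midpoint-rule approximation of the integral of z^p K(z) over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        z = lo + (i + 0.5) * h
        total += (z ** p) * kernel(z)
    return h * total

for name, k in (("Epanechnikov", epanechnikov), ("Gaussian", gaussian)):
    print(name, [round(moment(k, p), 4) for p in (0, 1, 2)])
```

Both kernels integrate to one and are symmetric, so their first moment vanishes. Their second moments differ (1/5 for this Epanechnikov form, 1 for the Gaussian kernel), and this is the quantity \int z^{2}K(z)\,dz that enters the bias term \mu(u) of Theorem 2.2.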
We considered the cases and , and we fixed . Finally independent replications of are generated with two different cases of innovations :
- •
Firstly, the case where the probability distribution of is a Gaussian distribution, then and therefore Theorem 2.1 holds for and Theorem 2.2 holds for and .
- •
Secondly, the case where the probability distribution of is a Student (with degrees of freedom) distribution implying for any but . Then if , Theorem 2.1 holds but if and , Theorem 2.2 does not hold.
Finally, for each , each functions and kernel , and each probability distributions of , we present the results computed from replications and the following methodology:
1. For each replication , we defined with , , , and the estimators are computed.
2. For each replication and each , an estimator of the is computed:
[TABLE]
3. For each replication , we minimised an estimator of the global square root of MISE:
[TABLE]
4. Then we computed over all the replications.
5. Finally, we computed the estimator of the minimal global square root of MISE,
[TABLE]
As a consequence, and are two interesting estimators relative to Theorems 2.1 and 2.2. The first one specifies the link between the choice of an optimal bandwidth and the regularity of the functions . The second one measures the optimal convergence rate of the estimators to . All the results are printed in Tables 1 and 2.
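The methodology above can be sketched as a small Monte-Carlo loop. This is a simplified illustration and not the authors' exact procedure: the error is pooled over replications before the bandwidth is selected, whereas the text minimises within each replication and then averages. The coefficient function, the bandwidth grid, the grid of points u and the sample sizes are all assumptions chosen to keep the run short.

```python
import math
import random

def a_true(s, u, T):
    # Hypothetical smooth coefficient, bounded by 0.9 in absolute value.
    return 0.9 * math.cos(2 * math.pi * s / T) * math.cos(3 * u)

def simulate(n, T, rng):
    x = [0.0]
    for t in range(1, n * T + 1):
        s = (t - 1) % T + 1
        x.append(a_true(s, t / (n * T), T) * x[-1] + rng.gauss(0.0, 1.0))
    return x

def epanechnikov(z):
    return 0.75 * (1.0 - z * z) if abs(z) <= 1.0 else 0.0

def estimate(x, n, T, s, u, bn):
    # Assumed ratio form of the kernel estimator (weighted lag-one products
    # over weighted squares), summed over t in {s, s+T, ..., s+(n-1)T}.
    num = den = 0.0
    for j in range(n):
        t = s + j * T
        w = epanechnikov((t / (n * T) - u) / bn)
        num += w * x[t] * x[t - 1]
        den += w * x[t - 1] ** 2
    return num / den

def rmse_by_bandwidth(n, T, bandwidths, u_grid, n_rep, seed=0):
    """Root mean squared error over phases, grid points and replications."""
    rng = random.Random(seed)
    sq_err = {b: 0.0 for b in bandwidths}
    for _ in range(n_rep):
        x = simulate(n, T, rng)
        for b in bandwidths:
            for s in range(1, T + 1):
                for u in u_grid:
                    sq_err[b] += (estimate(x, n, T, s, u, b) - a_true(s, u, T)) ** 2
    count = n_rep * T * len(u_grid)
    return {b: math.sqrt(sq_err[b] / count) for b in bandwidths}

u_grid = [0.2 + 0.1 * k for k in range(7)]       # interior points only
results = rmse_by_bandwidth(n=300, T=2, bandwidths=[0.05, 0.1, 0.2, 0.4],
                            u_grid=u_grid, n_rep=5)
best_bandwidth = min(results, key=results.get)
```

Restricting the grid of points u to the interior of [0, 1] keeps the kernel windows mostly symmetric, in line with the remark on edge effects in Section 2.2.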
Moreover, for exhibiting the asymptotic normality of the estimators provided in the central limit theorem (2.8), we draw in Figure the histograms of for and from independent replications for . We also used a Jarque-Bera test to confirm the Gaussian asymptotic distribution since the p-values of this test are successively: , and . Hence, the asymptotic normality of the estimator seems to be attested by Monte-Carlo experiments.
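For readers who wish to reproduce this kind of normality check, here is a minimal sketch of the Jarque-Bera statistic written with the standard library only. The p-value uses the asymptotic chi-square distribution with two degrees of freedom, whose survival function is exp(-x/2). The samples below are synthetic and only illustrate the behaviour of the test; they are not the paper's data.

```python
import math
import random

def jarque_bera(sample):
    """Return the Jarque-Bera statistic and its asymptotic chi-square(2) p-value."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((v - mean) ** 2 for v in sample) / n
    m3 = sum((v - mean) ** 3 for v in sample) / n
    m4 = sum((v - mean) ** 4 for v in sample) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    stat = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    p_value = math.exp(-stat / 2.0)      # chi-square(2) survival function
    return stat, p_value

rng = random.Random(0)
gaussian_sample = [rng.gauss(0.0, 1.0) for _ in range(2000)]
skewed_sample = [rng.expovariate(1.0) for _ in range(2000)]

stat_g, p_g = jarque_bera(gaussian_sample)
stat_e, p_e = jarque_bera(skewed_sample)
```

A large p-value is compatible with normality, while a strongly skewed sample such as the exponential one drives the statistic up and the p-value towards zero.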
Conclusions of the simulations: Firstly, as can be deduced from Theorems 2.1 and 2.2, we observed that the larger the regularity , the smaller and therefore the larger the optimal bandwidth , and the faster the convergence rate of . Secondly, even if the choice of the optimal bandwidth differs significantly with the choice of the kernel (it is clearly smaller with the Epanechnikov kernel), the optimal convergence rate is almost the same for both kernels. Finally, in accordance with Theorem 2.2, the convergence rate is clearly slower with a heavy-tailed distribution () than with a Gaussian distribution, and this phenomenon increases when increases.
3.2 Numerical application on climatic data
We also applied our model and its estimator to an example of real data, specifically the monthly average temperature readings in London from 1659 to 1998, or 340 years. Obviously in such a case one can expect that .
First, we removed an additive seasonal and trend component (estimated by LOESS) from these data and considered the residual data. On these, a global correlogram (see Figure 4) confirms a modelling by a process of type AR() and also the presence of a periodic phenomenon of period .
As a consequence we may assume that these residual data can be modelled by the model (1.2). We then applied the estimator for and and . Figure 5 summarizes these results and shows:
- •
The crucial interest of taking a pseudo-periodic model as we defined it in (1.2);
- •
The relatively small but not negligible change in the coefficient as a function of .
4 Proofs
We first provide the proof of Proposition 2.1.
Proof of Proposition 2.1.
1. We have \mathbb{E}X_{1}^{(n)}=a_{1}\Big(\frac{1}{nT}\Big)\mathbb{E}(X_{0}) and \mathbb{E}X_{t}^{(n)}=a_{t}\Big(\frac{t}{nT}\Big)\mathbb{E}X_{t-1}^{(n)} from the relation (1.2). From Assumption (A) and since \Big|a_{1}\Big(\frac{1}{nT}\Big)\Big|\leq\alpha<1, we deduce the first item of Proposition 2.1.
2. Below, for ease of reading, we will omit the exponent (n). Set v_{t}=\mathbb{E}\big(X_{t}^{2}\big), and ; also write \alpha_{t}=a^{2}_{t}\big(\frac{t}{nT}\big). We have:
[TABLE]
thus
[TABLE]
Moreover, with for any , we have
[TABLE]
from (4.2) and since for some constant ,
[TABLE]
from Assumption (A). As a consequence of (4.3), we also obtain:
[TABLE]
Thus for other constants we derive
[TABLE]
From now on, assume that .
Now use again the definition (1.2) of the model, and by iterating (4.1), we derive:
[TABLE]
from (4.5).
Hence,
[TABLE]
Now noting that \alpha_{t-j}=a_{t-j}^{2}\big(\frac{t-j}{nT}\big), we set \widetilde{\alpha}_{t-j}=a_{t-j}^{2}\big(\frac{t}{nT}\big) for , then since and from (4.6) we derive
[TABLE]
The conclusion follows.
3. The proof mimics the case of . Denote q_{t}=a_{t}^{4}\big(\frac{t}{nT}\big), and , for . Then and
[TABLE]
Since , we have:
[TABLE]
with and this implies as previously . We also obtain for constants again denoted :
[TABLE]
Finally by iterating (4.8), we obtain:
[TABLE]
from (4.9). Hence, always following the previous case
[TABLE]
for , and this implies (2.2) from using again the regularity of the functions .
Finally, for any such that , since is a causal process and by iteration,
[TABLE]
where and \displaystyle\Big{|}\prod_{i=1}^{t-t^{\prime}}\alpha_{t^{\prime}+i}\Big{|}\leq\alpha^{2|t-t^{\prime}|}.
This completes the proof.
Now we establish a technical lemma, which we were not able to find in the past literature (even if variants of this result may be found) and that will be extremely useful in the sequel. For a bounded continuous function defined on , and a kernel function (see details below), an approximation of integral by appropriate Riemann sums yields (as for [20]’s estimator, see [21] for further developments):
[TABLE]
where , I_{n,s}=\big{\{}s,s+T,\ldots,s+(n-1)T\} with and . More precisely we would like to provide expansions of
[TABLE]
Lemma 4.1.
Let , , a bounded function. Let satisfy ker. Consider also a sequence of positive real numbers satisfying . Then, there exists depending only on , and such that for large enough
[TABLE]
Finally, if we have:
[TABLE]
Proof of Lemma 4.1. In the sequel we will denote h_{n}(v)=\frac{1}{b_{n}}H\big{(}b_{n}^{-1}(v-u)\big{)} for . Then is a Lipschitz function with .
- •
*First assume that the function is a constant. * Set for . For , we consider the sets
[TABLE]
and . Then, for large enough,
[TABLE]
But |h_{n}(v_{s+jT})|\leq\frac{C}{b_{n}}\,\exp\Big{(}-\beta\Big{|}\frac{j/n-u+s/nT}{b_{n}}\Big{|}\Big{)} from Assumption and using the usual comparison between sums and integrals for monotonic functions, we obtain:
[TABLE]
Thus
[TABLE]
because since , the above indices remain in the index set for large enough.
Then, if then and we deduce (4.11).
- •
We now turn to the case of a non-constant function . First, if , for the Taylor-Lagrange formula implies:
[TABLE]
with and . Since ,
[TABLE]
Therefore,
[TABLE]
with . Then for any , using Assumption ker and especially the relation for ,
[TABLE]
with . Here we denote for .
Now, if , we have
[TABLE]
and therefore using the previous results:
[TABLE]
from (4.15) and this implies (4.11) since and therefore is negligible with respect from .
Now, if and since and are bounded continuous Lipschitz functions, we obtain the inequality
[TABLE]
Then, using the same computations as previously (replace by ),
[TABLE]
from (4.15) and this completes the first item since is supposed to converge to [math]. The proof is now easily completed.
- •
Finally, in the case , we can use the previous case and a Taylor-Lagrange expansion of the function , implying R(u,v)=\frac{c^{(\rho)}(\theta)}{\rho!}\,\big|u-v\big|^{\rho} with and .
Then, using (• ‣ 4) and with , and
[TABLE]
from Lebesgue theorem on dominated convergence.
In the sequel we will denote the -algebra
[TABLE]
Lemma 4.2.
Let satisfy Assumption ker and be a solution of (1.2) under Assumption (A) with . Then for any , and ,
[TABLE]
Proof of Lemma 4.2. We use here a limit theorem for -mixingales established in [1]. Indeed, for , , let
[TABLE]
Then, set
[TABLE]
we have:
[TABLE]
Therefore, with defined in (4.16),
[TABLE]
But for any , we have from Assumption (A). Then,
[TABLE]
Thus, using the notations of Definition 2 in [1], it is easy to derive that is a triangular array such that (as ) since and:
[TABLE]
As a consequence,
[TABLE]
implies
[TABLE]
Now, we collect the above relations. Lemma 4.1 and Proposition 2.1 with the regularity of the function , together conclude the proof.
Lemma 4.3.
Under the conditions of Theorem 2.1, with defined in (4.30), for any ,
[TABLE]
Proof of Lemma 4.3. Since this is easy to exhibit an increasing sequence with
[TABLE]
Define as the piecewise affine function such that for and . Then the function defined by for satisfies and it is a continuous and non-decreasing function (for almost all , ) and convex function (indeed, for almost all , ). Hence, we have:
[TABLE]
Therefore,
[TABLE]
The construction of and the relation together imply:
[TABLE]
Indeed, this relationship is equivalent to
[TABLE]
But if and , then : therefore since is an increasing function and for any . Moreover, if , there exists and such as . But defined by is a convex function since a.e. As a consequence,
[TABLE]
from the construction of . Since because is a piecewise function, we finally obtain . We conclude with for any and (since ).
Hence the function is an Orlicz function and with
[TABLE]
Now Theorem 1.1 in [16] implies:
[TABLE]
Therefore , and for any since from convexity
[TABLE]
and .
Then, from the definition of and the triangular inequality
[TABLE]
with . Since for any , we finally obtain
[TABLE]
Thus (4.23) implies with the independence of and that:
[TABLE]
Now relation (4.26) with entails
[TABLE]
Thus with we have from (4.26),
[TABLE]
Again using (4.23) and with K_{t}=K\Big{(}\frac{\frac{t}{nT}-u}{b_{n}}\Big{)},
[TABLE]
As a consequence, for any ,
[TABLE]
if is large enough, from Lemma 4.1. As a consequence, for any , since g(\varepsilon\,\sqrt{nb_{n}})\begin{array}[t]{c}\stackrel{{\scriptstyle}}{{\longrightarrow}}\\ {\scriptstyle n\rightarrow+\infty}\end{array}\infty, then \mathbb{E}\Big{(}\sum_{j=1}^{n}\mathbb{E}\big{(}Y_{n,j}^{2}\mbox{\hskip 1.99997ptI1}_{\{|Y_{n,j}|>\varepsilon\}}|{\cal F}^{(s)}_{j-1}\big{)}\Big{)}\begin{array}[t]{c}\stackrel{{\scriptstyle}}{{\longrightarrow}}\\ {\scriptstyle n\rightarrow+\infty}\end{array}0. Since is a non-negative triangular array, the proof of Lemma 4.3 is complete.
Proof of Theorem 2.1. Using (1.2), write
[TABLE]
we decompose it as: , with
[TABLE]
Therefore we obtain:
[TABLE]
with
[TABLE]
We are going to derive the consistency of the estimator of , in two parts.
**1/ **
We first prove that \sqrt{nb_{n}}{M^{(n)}_{s}(u)}\Big{/}{\widehat{D}^{(n)}_{s}(u)}\begin{array}[t]{c}\stackrel{{\scriptstyle{\cal L}}}{{\longrightarrow}}\\ {\scriptstyle n\rightarrow+\infty}\end{array}{\cal N}\big{(}0,C\big{)} for some convenient constant .
Let and . For and , we denote
[TABLE]
It is clear that is a triangular array of martingale increments with respect to the σ-algebra \mathcal{F}^{(s)}_{t}=\sigma\big((\xi_{i})_{i\leq s+(t-1)T}\big). Indeed is a process, causal with respect to . This implies that is independent of and that . We are going to use a central limit theorem for triangular arrays of martingale increments, see for example [13] and more recently [15].
Denote
[TABLE]
since . Using Lemma 4.2, we obtain:
[TABLE]
is defined from (4.28) and satisfies
[TABLE]
Moreover, from Lemma 4.3, then for any ,
[TABLE]
As a consequence, the conditions of the central limit theorem for triangular arrays of martingale increments in [15] are satisfied, and this implies that \frac{\sum_{j=1}^{n}Y_{n,j}}{\sqrt{\sum_{j=1}^{n}\sigma_{n,j}^{2}}}\underset{n\to+\infty}{\overset{\mathcal{L}}{\longrightarrow}}\mathcal{N}\big(0,1\big).
Therefore Slutsky's lemma entails:
[TABLE]
**2/ **
The second term in the expansion of \sqrt{nb_{n}}\big{(}\widehat{a}_{s}(u)-a_{s}(u)\big{)} depends on the non-martingale term , see (4.29), and the consistent term , see (4.28) and (4.36). The asymptotic behavior of this second term can be first obtained following two steps.
**a. **
A first step consists in establishing an expansion of . Using Proposition 2.1 and with defined in (2.1), we have
[TABLE]
Using twice Lemma 4.1, with firstly , and secondly , we deduce:
[TABLE]
As a consequence, if b_{n}=o\big{(}n^{-1/(1+2\rho)}\big{)}, then \mathbb{E}J_{n}\begin{array}[t]{c}\stackrel{{\scriptstyle}}{{\longrightarrow}}\\ {\scriptstyle n\rightarrow+\infty}\end{array}0.
In the case , we also obtain from (4.12) and with ,
[TABLE]
with .
**b. **
Now we are going to prove a first consistency result for using the Markov Inequality. Indeed,
[TABLE]
Now using Lemma 4.1 with which also belongs in (this is clear if and, for the Lipschitz property of allows to conclude), and , we derive:
[TABLE]
Therefore, if b_{n}=o\big(n^{-\frac{1}{1+2(\rho\wedge 1)}}\big), then \mathbb{E}J_{n}\underset{n\to+\infty}{\longrightarrow}0 and \mathbb{E}|J_{n}|\underset{n\to+\infty}{\longrightarrow}0, implying from the Markov inequality that J_{n}\underset{n\to+\infty}{\overset{\mathbb{P}}{\longrightarrow}}0. Finally, since (4.36) establishes the consistency of , from Slutsky's lemma, we deduce
[TABLE]
As a consequence, the proof of the Theorem results by using the decomposition (4.27), the consistency results (4.37) and (4.43).
Proof of Theorem 2.2. We restrict to the case .
**a. **
Case \mathbb{E}\big{(}\xi_{0}^{4}\big{)}<\infty.
Denote again \displaystyle K_{t}=K\Big{(}\frac{\frac{t}{nT}-u}{b_{n}}\Big{)}, for . First remark that the symmetry assumption on ’s distribution implies \mathbb{E}\big{(}\xi_{0}\big{)}=\mathbb{E}\big{(}\xi_{0}^{3}\big{)}=0.
[TABLE]
with L_{n,s,\alpha}=\big{\{}(t,t^{\prime})\in I^{2}_{n,s},~{}\,|t-t^{\prime}|\leq\frac{\log n}{\log\alpha}\big{\}}.
Firstly, consider the first left side term of the last inequality. If then Proposition 2.1 entails for an adequate function .
Hence we also have .
Here the fact that is a function in , implies that the function defined from b(v)=\big{(}a_{s}(v)-a_{s}(u)\big{)}^{2} is in too, and again and .
Therefore, we use Lemma 4.1 to derive:
[TABLE]
with \displaystyle g_{j}(x)=\big{(}a_{s}(x)-a_{s}(u)\big{)}^{2}\prod_{i=1}^{j}\big{(}\gamma_{s}^{(4)}(x), since for large enough the above expression satisfies \big{|}{\cal O}\big{(}\frac{\log n}{nb_{n}^{2}}\big{)}\big{|}\leq 1. Using Lemma 4.1, with functions and with (quote that \max_{i\leq j}\big{(}\|g_{i}\|\vee\mbox{Lip}\,(g_{i})\big{)}={\cal O}(j)), we finally obtain:
[TABLE]
Secondly, from Proposition 2.1, for , we have
[TABLE]
Thus,
[TABLE]
from Lemma 4.1. Then, (4.44) and (4.45) provide
[TABLE]
implying \mbox{Var}\,\big{(}J_{n}\big{)}\begin{array}[t]{c}\stackrel{{\scriptstyle}}{{\longrightarrow}}\\ {\scriptstyle n\rightarrow+\infty}\end{array}0 for any such as
[TABLE]
**b. **
Case \mathbb{E}\big{(}|\xi_{0}|^{\beta}\big{)}<\infty, for some .
From its expression given in (4.29), is a quadratic form of and therefore, as is a linear process with innovations , is also a quadratic form of . As a consequence, the fourth order moment can be injected such as there exists a sequence (as ) satisfying:
[TABLE]
Now, assume only that . The innovations can be truncated at level , and write
[TABLE]
Note that the symmetry assumption entails . Define also
[TABLE]
A consequence of (4.47) is:
[TABLE]
with h(M)=\mathbb{E}\big{(}|\xi_{0}|^{2}\mbox{\hskip 1.99997ptI1}_{\{|\xi_{0}|>M\}}\big{)} which satisfies .
Moreover,
[TABLE]
But
[TABLE]
We first remark from Proposition 2.1 that for some constant . Hence, the Cauchy-Schwarz inequality shows that, for each :
[TABLE]
with \delta_{j-1,M}=\mathbb{E}\big{(}|X^{(n)}_{j-1}-X^{(n)}_{j-1,M}|^{2}\big{)}.
We are going to bound . A first simple bound is clearly , and we use it together with (4.50) and the Cauchy-Schwarz inequality in order to derive
[TABLE]
since .
Now, from (4.51), we obtain for large enough:
[TABLE]
with and, as before, h(M)=\mathbb{E}\big{(}|\xi_{0}|^{2}\,\mathbf{1}_{\{|\xi_{0}|>M\}}\big{)}. Now a careful use of (4.42) and (4.49) entails:
[TABLE]
since is a function (in the above defined sense). Finally, using the Cauchy-Schwarz inequality in (4.48), we obtain for large enough,
[TABLE]
assuming i.e. (and note that ).
Now, if \mathbb{E}\big{(}|\xi_{0}|^{\beta}\big{)}<\infty with , then using the Hölder and Markov inequalities, there exists such that
[TABLE]
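For orientation, a bound of this type on the tail function h(M) can be obtained in one line. The sketch below assumes \beta>2 and M>0 (the exact range of \beta is not restated here), and it is not claimed to coincide with the constant or the form of the display above. On the event \{|\xi_{0}|>M\} one has |\xi_{0}|^{2-\beta}\leq M^{2-\beta}, hence:

```latex
h(M)
  = \mathbb{E}\big(|\xi_0|^{2}\,\mathbf{1}_{\{|\xi_0|>M\}}\big)
  = \mathbb{E}\big(|\xi_0|^{\beta}\,|\xi_0|^{2-\beta}\,\mathbf{1}_{\{|\xi_0|>M\}}\big)
  \le M^{2-\beta}\,\mathbb{E}\big(|\xi_0|^{\beta}\big).
```

A higher moment order \beta therefore gives a faster polynomial decay of h(M) in the truncation level M.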
Since here b_{n}=o\big{(}n^{-1/(1+2\rho)}\big{)} does not yield the minimax rates, we deduce that
[TABLE]
Thus, from inequality (4.54), we deduce that the optimal choice is obtained when
[TABLE]
d.
Case .
For optimal window widths, the expansion of the bias (4.41) combined with the asymptotic expression for (4.43) yields the proposed non-centred Gaussian limit; see Remark 4.1. The same truncation step as above is also needed.
The proof is now complete.
**Remark 4.1.**
*Using the previous bound (4.38) of and the Bienaymé-Chebyshev inequality, we deduce that if b_{n}=o\big{(}n^{-1/(1+2\rho)}\big{)} then J_{n}\xrightarrow[n\rightarrow+\infty]{\mathbb{P}}0.
Moreover, if and , using the expansion (4.41) of and again the Bienaymé-Chebyshev inequality, then J_{n}\xrightarrow[n\rightarrow+\infty]{\mathbb{P}}B_{s}(u)\,c^{5/2}.
Therefore with the consistency result (4.36), for any and ,*
[TABLE]
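The convergences in probability stated in Remark 4.1 rest on a standard step, written here for a generic \varepsilon>0 (a symbol introduced only for this illustration). When \mbox{Var}(J_{n}) tends to 0, the Bienaymé-Chebyshev inequality gives:

```latex
\mathbb{P}\big(|J_n-\mathbb{E}(J_n)|\ge\varepsilon\big)
  \le \frac{\mathrm{Var}(J_n)}{\varepsilon^{2}}
  \xrightarrow[n\rightarrow+\infty]{} 0 .
```

Consequently, J_{n} converges in probability to whatever limit \mathbb{E}(J_{n}) admits: 0 in the first case of the remark, and B_{s}(u)\,c^{5/2} in the second.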
**Remark 4.2.**
For the general case, with possibly non-symmetric and , item 3. of Proposition 2.1 needs some improvement. Denote for , then and , so that (4.8) becomes
[TABLE]
*as previously .
We need to derive suitable equivalents of if . Firstly*
[TABLE]
*and in fact this term is negligible, so the proofs of Proposition 2.1 and Lemma 3. remain unchanged.
In this case the proof of the above point 2/ c. needs a simple improvement and*
[TABLE]
In this truncated setting, inequality (4.50) reads:
[TABLE]
so that the end of the proof is unchanged, up to setting .
**Remark 4.3.**
Secondly, if we even omit the condition , one also needs an asymptotic expansion for ; an expansion analogous to those of Proposition 2.1 and Lemma 3. may thus be derived. Namely, \displaystyle w^{(3)}_{t}=\gamma_{s}^{(3)}(\frac{t}{nT})+{\cal O}\big{(}\frac{1}{n}\big{)}, with
[TABLE]
Then the expression of the equivalent of is also adequately transformed up to the above relations.
Acknowledgement.
This work has been developed within the “MME-DII centre of excellence” (ANR-11-LABEX-0023-01) and with the help of PAI-CONICYT MEC Nr. 80170072.
The authors thank the referees for their fruitful comments and suggestions, which notably improved the quality of the paper. The second author wishes to thank Rainer Dahlhaus for many interesting discussions. As well, numerous discussions with Karine Bertin were extremely useful.
- [1] Andrews, D. Laws of large numbers for dependent non-identically distributed random variables. Econometric Theory 4, 3 (1988), 458–467.
- [2] Azrak, R. and Mélard, G. Asymptotic properties of quasi-maximum likelihood estimators for ARMA models with time-dependent coefficients. Statistical Inference for Stochastic Processes 9 (2006), 279–330.
- [3] Bibi, A. and Francq, C. Consistent and asymptotically normal estimators for cyclically time-dependent linear models. Annals of the Institute of Statistical Mathematics 55 (2003), 41–68.
- [4] Dacunha-Castelle, D., Huong Hoang, H. T. and Parey, S. Modeling of air temperatures: preprocessing and trends, reduced stationary process, extremes, simulation. Journal de la Société Française de Statistique 156, 2 (2015), 138–168.
- [5] Dahlhaus, R. On the Kullback-Leibler information divergence of locally stationary processes. Stochastic Processes and Applications 62 (1996), 139–168.
- [6] Dahlhaus, R. Fitting time series models to nonstationary processes. Annals of Statistics 25 (1997), 1–37.
- [7] Dahlhaus, R. Locally Stationary Processes, vol. 30. Time Series Analysis: Methods and Applications, Elsevier, 2012.
- [8] Dahlhaus, R. and Polonik, W. Empirical spectral processes for locally stationary time series. Bernoulli 15 (2009), 1–39.
