Moderate deviations in a class of stable but nearly unstable processes
Frédéric Proïa

TL;DR
This paper develops moderate deviation principles for nearly unstable autoregressive processes, providing insights into the behavior of empirical covariance and OLS estimators as the process approaches instability.
Contribution
It introduces a novel moderate deviation framework for nearly unstable AR processes, including cases with singular asymptotic variance, using truncation and deviation techniques.
Findings
Moderate deviation principle for empirical covariance depending on spectral radius
Moderate deviation for OLS estimator when asymptotic variance is invertible
Deviation results for penalized estimators in singular variance cases
Abstract
We consider a stable but nearly unstable autoregressive process of any order. The bridge between stability and instability is expressed by a time-varying companion matrix $A_n$ with spectral radius $\rho(A_n) < 1$ satisfying $\rho(A_n) \to 1$. In that framework, we establish a moderate deviation principle for the empirical covariance only relying on the elements of $A_n$ through $1 - \rho(A_n)$ and, as a by-product, we establish a moderate deviation principle for the OLS estimator when $\Gamma$, the renormalized asymptotic variance of the process, is invertible. Finally, when $\Gamma$ is singular, we also provide a compromise in the form of a moderate deviation principle for a penalized version of the estimator. Our proofs essentially rely on truncations and deviations of $m_n$–dependent sequences, with an unbounded rate $(m_n)$.
Laboratoire angevin de recherche en mathématiques, LAREMA, UMR 6093, CNRS, UNIV Angers, SFR MathSTIC, 2 Bd Lavoisier, 49045 Angers Cedex 01, France.
Key words and phrases:
Nearly unstable autoregressive process, Moderate deviation principle, OLS estimation, Asymptotic behavior, Unit root.
1. Introduction and Assumptions
Unit root issues have long been crucial in time series econometrics and have therefore attracted a great deal of research. This sudden demarcation between stability and instability is responsible for many inference problems in linear time series (see Brockwell and Davis [4] for a detailed overview of linear stochastic processes). The remarkable works of Chan and Wei [7] encompass, in a much more general context, the now well-known fact that the least squares estimator is $\sqrt{n}$–consistent with Gaussian behavior when the underlying autoregressive process is stable, whereas it is $n$–consistent with an asymmetrical distribution when the process is unstable. This rather abrupt change in the rate of convergence and in the asymptotic distribution certainly motivated the wide range of unit root testing procedures, but it also paved the way for studies based on time-varying coefficients. In a nearly unstable autoregressive process, we do not focus on a parameter $\theta$ satisfying $|\theta| < 1$ or $|\theta| = 1$ but, instead, the parameter is considered as a sequence $(\theta_n)$ such that $|\theta_n| < 1$ and $|\theta_n| \to 1$ as $n \to \infty$. This sample-size-dependent structure allows a continuity between stability and instability. For example, Phillips and Magdalinos [20] treat the case where the coefficient is in a neighborhood of the unit root with . Amongst other results, they prove a central limit theorem for the estimator at the rate , thereby making a bridge between the stable rate $\sqrt{n}$ and the unstable rate $n$. In the same vein, let us also mention the work of Chan and Wei [6], natural generalizations like the study of Phillips and Lee [19] related to vector autoregressions, or the recent unified theory of Buchmann and Chan [5], focused on nearly unstable autoregressive processes. Our paper is precisely based on the latter topic, in a sense that will be made precise in due course.
Given a parametric generating process, the precision of the estimation is usually assessed by its rate of convergence, and the deviations can be seen as a natural continuation after a central limit theorem or even a law of the iterated logarithm. Roughly speaking, they may be used to estimate the exponential decay of the probability of tail events related to the distance between the estimator and the parameter of interest. We refer to Dembo and Zeitouni [8] regarding the mathematical formalization. Since the 1980s, numerous authors have worked on large and/or moderate deviations in a time series context under many and varied hypotheses. Without claiming to be exhaustive, one can mention the studies of Donsker and Varadhan [10] and Bercu et al. [2] on stationary Gaussian processes and quadratic forms, the paper of Worms [21] on Markov chains and regression models and the one of Bercu [1] on first-order Gaussian stable, unstable and explosive processes. One can also mention the works of Mas and Menneteau [15] on Hilbertian processes, Djellout et al. [9] on non-linear functionals of moving average processes, Wu and Zhao [22] on stationary non-linear processes, Miao and Shen [16] on general autoregressive processes or, more recently, Bitseki Penda et al. [3] on first-order processes with correlated errors. The references therein may complete this concise list.
In this paper, we investigate the moderate deviations of the estimate in stable but nearly unstable autoregressions. This can be seen as a full generalization of the recent work of Miao, Wang and Yang [17], focused on the univariate case. Our proofs essentially rely on truncations and deviations of $m_n$–dependent sequences where the rate $(m_n)$ is unbounded. The main technical contributions are twofold. On the one hand, expressing the near instability directly through the sequence of spectral radii of the companion matrix seems, to the best of our knowledge, a new approach with many advantages. For example, the authors of the recent paper [5] introduce a perturbation in the Jordan canonical form of the model (see Thm. 2.1), which is a powerful idea to deal with the subject of their study, but somehow unnecessarily complex for ours. On the other hand, from a purely technical point of view, unbounded truncations have already been used to get moderate deviations (see e.g. [18] and [17]), but we will see that the vector case treated here and the specific features of the model cannot be adapted as easily to the existing tools. As a consequence, we need to redevelop a full Gärtner-Ellis reasoning to establish the deviations of our unbounded vector truncations. This quite general strategy might inspire future similar studies.
For a fixed , let the process be given for some and by
[TABLE]
where is a sequence of zero-mean i.i.d. random variables. In an equivalent way, we can consider the vector expression
[TABLE]
where is a –vectorial noise, and
[TABLE]
is the companion matrix of the autoregressive process. If has a finite variance, it is well-known that is a second-order stationary process having the causal form
[TABLE]
when , that is, when the largest modulus of its eigenvalues is less than 1 (see e.g. Thm. 11.3.1 of [4] and the fact that each eigenvalue of is the inverse of a zero of the autoregressive polynomial of the process). Since is an i.i.d. sequence, the process is strictly stationary with mean zero and variance given by
[TABLE]
where, for convenience, we will denote in the whole study
[TABLE]
the matrix with 1 at the top left and 0 elsewhere, and its first column standing for the first vector of the canonical basis of . As a consequence of the causal expression above, the initial vector is not arbitrary and has to share the distribution of the process. This also implies the relation
[TABLE]
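The objects introduced so far can be explored numerically. The sketch below is only an illustration under stated assumptions: it takes the usual companion form (autoregressive coefficients on the first row, ones on the sub-diagonal), and it assumes that the stationary variance solves the standard relation Gamma = A Gamma A^T + sigma2 * K, with K the matrix having 1 at the top left and 0 elsewhere. The helper names `companion`, `spectral_radius` and `stationary_covariance` are hypothetical and do not come from the paper.

```python
import numpy as np

def companion(theta):
    """Companion matrix of an AR(p) model with coefficients theta_1, ..., theta_p."""
    p = len(theta)
    A = np.zeros((p, p))
    A[0, :] = theta              # first row holds the autoregressive coefficients
    A[1:, :-1] = np.eye(p - 1)   # sub-diagonal of ones shifts the lagged values
    return A

def spectral_radius(A):
    """Largest modulus among the eigenvalues of A."""
    return float(np.max(np.abs(np.linalg.eigvals(A))))

def stationary_covariance(A, sigma2=1.0):
    """Solve Gamma = A Gamma A^T + sigma2 * K through vectorization.

    Assumption: K has 1 at the top left and 0 elsewhere, so that the
    vectorized equation reads (I - A kron A) vec(Gamma) = sigma2 * vec(K).
    """
    p = A.shape[0]
    K = np.zeros((p, p))
    K[0, 0] = 1.0
    B = np.eye(p * p) - np.kron(A, A)   # invertible when the spectral radius is < 1
    vec_gamma = np.linalg.solve(B, sigma2 * K.reshape(-1))
    return vec_gamma.reshape(p, p)

# Illustrative stable AR(2) model (coefficients chosen arbitrarily).
A = companion([0.5, 0.3])
rho = spectral_radius(A)
Gamma = stationary_covariance(A)

# Cross-check against a truncated version of the causal series.
K = np.zeros((2, 2))
K[0, 0] = 1.0
series = sum(
    np.linalg.matrix_power(A, k) @ K @ np.linalg.matrix_power(A.T, k)
    for k in range(200)
)
```

Solving the linear system is preferable to summing the series when the spectral radius is close to 1, since the series then converges slowly.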
As will be largely developed throughout the study, is finite for all but, as increases, . The keystone matrix obtained after a correct standardization of is the renormalized asymptotic variance of the process. Before we start, we define a matrix that will also prove to be crucial to our results,
[TABLE]
We are now going to introduce and comment on the hypotheses that will be needed, though not always simultaneously, in the whole paper. Section 2 is devoted to our main results: two statements related to the moderate deviations of the empirical covariance and the OLS estimator, a set of explicit examples and some additional comments and conclusions. Finally, in Section 3, divided into numerous subsections, we will prove all our results, step by step.
Remark.
We denote by the Euclidean vector norm and by the spectral matrix norm. Other norms may be used, in which case an appropriate subscript is added. Moreover, we will always denote by the usual inner product of the Euclidean space for any . We write for the Moore-Penrose pseudo-inverse of any matrix , whose definition and properties may be found in Sec. 0 of [12].
1.1. Hypotheses
First of all, we present the hypotheses that we retain.
- (H1) Gaussian integrability condition. There exists such that
[TABLE]
where represents the zero-mean i.i.d. sequence of variance and fourth-order moment .
- (H2) Convergence of the companion matrix. There exists a matrix such that
[TABLE]
with distinct eigenvalues , and the top right element of is non-zero.
- (H3) Spectral radius of the companion matrix. For all , . In addition,
[TABLE]
- (H4) Renormalization. We have the convergences
[TABLE]
for some matrix norm, where is a non-zero matrix and .
- (H5) Moderate deviations. The moderate deviations scale satisfies
[TABLE]
for a small .
1.2. Comments on the hypotheses
First, conceding in (H2) that the limiting matrix has distinct eigenvalues is a matter of simplification of the reasoning. Indeed, turns out to be diagonalizable for a sufficiently large , and, as a companion matrix, it is well-known that the change of basis is done via a Vandermonde matrix having numerous nice properties (more details are given in Section 3.1, and a discussion on the case of multiple eigenvalues is provided in Section 2.3). The top right element of is . So, assuming in (H2) that ensures that the limit process is still of order and that 0 cannot be an eigenvalue of , since . Moreover, note that, in (H4), the invertibility of for all is guaranteed by (H3). Indeed, (see e.g. Lem. 5.6.10 and Cor. 5.6.16 of [13]). In addition, we obviously have, for all ,
[TABLE]
so that we get
[TABLE]
giving a lower bound for . Similarly,
[TABLE]
However, an exact upper bound for these sums may be difficult to reach and may require stringent conditions on the elements of . We refer the reader to Lemma 3.1 where, under (H2) and (H3), some asymptotic upper bounds are established. We also refer to Section 2.2 where the explicit calculations in terms of some examples shall help to understand the rates involved in the hypotheses. Now for a fixed , let
[TABLE]
Clearly, . Hence, according to Prop. 2.3.15 of [11], for all , there exists a constant such that, for all , so that
[TABLE]
Letting tend to infinity, it follows from (H3) and (H4) that
[TABLE]
Finally, it will be established in good time that there is a limiting matrix such that
[TABLE]
where is the matrix norm of (H4).
Remark.
To facilitate the reading, we consider from now on that the matrix norm is identified in (H4), and we will only note in what follows.
2. Main results
This section contains two statements that constitute the main results of the paper. The first of them is quite long to establish and will need numerous technical lemmas, but the second one will essentially be deduced as a corollary of the first one. Subsequently, we provide some explicit examples for a better understanding and an easier interpretation of the hypotheses together with some graphics showing the evolution of the processes and the estimation of the autoregressive parameter. At the end of the section, we discuss the case of multiple eigenvalues. But, first, let us recall the definition of the large and moderate deviation principles (see Sec. 1.2 of [8] for more details). In what follows, a speed is considered as a positive sequence increasing to infinity.
Definition.
A sequence of random variables on a topological space satisfies a large deviation principle (LDP) with speed and rate if there is a lower semicontinuous mapping such that:
- for any closed set ,
[TABLE]
- for any open set ,
[TABLE]
In particular, if the infimum of coincides on the interior and the closure of some , then
[TABLE]
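For reference, the two bounds of this definition and the resulting limit can be written as follows in the standard formulation of Sec. 1.2 of [8]. The symbols $(Y_n)$ for the sequence, $(b_n)$ for the speed, $I$ for the rate function and $F$, $G$, $H$ for the sets are notation assumed here for illustration, and may differ from the notation of the original displays.

```latex
% Upper bound, for any closed set F:
\limsup_{n \to \infty} \frac{1}{b_n} \log \mathbb{P}(Y_n \in F) \leq -\inf_{x \in F} I(x)

% Lower bound, for any open set G:
\liminf_{n \to \infty} \frac{1}{b_n} \log \mathbb{P}(Y_n \in G) \geq -\inf_{x \in G} I(x)

% When the infimum of I coincides on the interior and the closure of H:
\lim_{n \to \infty} \frac{1}{b_n} \log \mathbb{P}(Y_n \in H) = -\inf_{x \in H} I(x)
```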
Definition.
A sequence of random variables on a topological space satisfies a moderate deviation principle (MDP) with speed and rate if there is a speed with such that satisfies a large deviation principle of speed and rate .
2.1. Moderate deviations
We now consider an observable trajectory for some fixed , and use it to provide an estimation of the parameter. It is well-known that the ordinary least squares (OLS) estimator of is given by
[TABLE]
The first result is dedicated to the empirical variance .
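As a purely illustrative complement, the following sketch simulates a nearly unstable AR(1) trajectory and computes the least squares estimate from lagged vectors. It assumes the standard OLS form (inverse of the sum of outer products of the lagged vectors, applied to the sum of lagged vectors times current values). Gaussian noise, the burn-in used to approximate a stationary start, the coefficient sequence 1 - c / n^alpha and the function names `simulate_ar` and `ols` are all choices made here, not taken from the paper.

```python
import numpy as np

def simulate_ar(theta, n, rng, burn_in=2000):
    """Simulate n + p values of an AR(p) process after discarding a burn-in period."""
    theta = np.asarray(theta, dtype=float)
    p = len(theta)
    total = n + p + burn_in
    x = np.zeros(total)
    eps = rng.standard_normal(total)   # assumption: standard Gaussian noise
    for k in range(p, total):
        # x[k - p:k][::-1] is the lagged vector (x[k-1], ..., x[k-p])
        x[k] = np.dot(theta, x[k - p:k][::-1]) + eps[k]
    return x[burn_in:]

def ols(x, p):
    """Least squares estimate of the AR(p) coefficients from a trajectory x."""
    n_obs = len(x)
    # Column j contains the lag-(j + 1) values aligned with the responses x[p:].
    Phi = np.column_stack([x[p - j - 1:n_obs - j - 1] for j in range(p)])
    y = x[p:]
    S = Phi.T @ Phi                     # sum of outer products of the lagged vectors
    return np.linalg.solve(S, Phi.T @ y)

rng = np.random.default_rng(0)
n = 5000
c, alpha = 1.0, 0.5                     # hypothetical polynomial rate toward the unit root
theta_n = 1.0 - c / n**alpha
x = simulate_ar([theta_n], n, rng)
theta_hat = ols(x, p=1)
```

With these values the coefficient is about 0.986, so the trajectory is stable but strongly persistent, which is the regime studied here.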
Theorem 2.1.
Under hypotheses (H1)–(H5), the sequence
[TABLE]
satisfies an LDP with speed and a rate function defined as
[TABLE]
where is explicitly given in (3.18) and comes from (H4).
Proof.
See Section 3.2.5. ∎
Remark.
Through vectorization, this MDP is established on in order to avoid any confusion in the notations, but we might work in as well. The associated rate function would only require a slight modification of the proof.
Remark.
To be punctilious, we may add a small to the diagonal of to ensure that it is non-singular for all without disturbing the asymptotic behavior.
When the variance given in (1.11) is invertible, we establish the MDP for the OLS in the theorem that follows. However, when it is not the case, there are some technical complications and, to reach an intermediate result, we need to introduce a penalized version of the OLS. For a small , define
[TABLE]
with possibly if is invertible, in which case it is clearly the standard OLS given above, but necessarily otherwise. Consider also the penalized version of the variance and the corrected parameter
[TABLE]
By construction, is, at worst, non-negative definite and for , turns out to be invertible. The same goes for .
Corollary 2.2.
Under hypotheses (H1)–(H5), for all , the sequence
[TABLE]
satisfies an LDP with speed and a rate function defined as
[TABLE]
where the variance is given in (1.11), is the penalized variance given in (2.3) and comes from (H4), respectively. If in addition is invertible, then the sequence
[TABLE]
satisfies an LDP with speed and a rate function defined as
[TABLE]
Proof.
See Section 3.2.6. ∎
To sum up, this result shows that, when is invertible, the OLS satisfies an MDP, and even when is singular, one may reach a compromise by getting an MDP for a penalized estimation. In the same vein, notice also that, in the invertible case,
[TABLE]
Remark.
In the stable case where , we simply have and for all . By contraction, the MDP of Corollary 2.2 coincides with the one of Thm. 3 of [21] when is invertible.
2.2. Some explicit examples
Before giving some examples, we can already note that (H5) implies . Thus, necessarily, the convergence cannot occur with an exponential rate; this is the reason why we focus on polynomial rates of the form for some in this section. Accordingly, in all the examples below, (H5) is only possible when . Thus, one cannot expect a sequence of coefficients moving too fast toward instability. The domain of validity of the speed of the MDP will be
[TABLE]
2.2.1. Univariate case with one nearly unit root
Suppose that . Then, (H2) and (H3) imply that and . We also have and (H4) can be expressed like
[TABLE]
A straightforward calculation shows that
[TABLE]
so that we can choose . The standard cases, illustrated on Figure 1, are for the positive unit root and for the negative unit root, with and . The rate function associated with Corollary 2.2 is , which corresponds to Prop. 2.1 of [17]. Indeed, their rate is associated to an LDP with the renormalization whereas our normalization is . By contraction, the asymptotic factor explains the difference.
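The renormalization in the univariate case can be checked with a short computation. The sketch below assumes the classical stationary AR(1) variance sigma2 / (1 - theta^2) and a hypothetical polynomial sequence theta_n = 1 - c / n^alpha. Multiplying the variance by 1 - |theta_n| gives sigma2 / (1 + |theta_n|), which tends to sigma2 / 2 under these assumptions. The constants `c`, `alpha` and `sigma2` are illustrative values only.

```python
import numpy as np

sigma2 = 1.0
c, alpha = 1.0, 0.5   # hypothetical constants of the polynomial rate

def renormalized_variance(n):
    """Stationary AR(1) variance multiplied by 1 - |theta_n|."""
    theta_n = 1.0 - c / n**alpha
    gamma_n = sigma2 / (1.0 - theta_n**2)   # classical stationary AR(1) variance
    return (1.0 - abs(theta_n)) * gamma_n

sample_sizes = (10**2, 10**4, 10**6)
values = np.array([renormalized_variance(n) for n in sample_sizes])
gaps = np.abs(values - sigma2 / 2.0)        # distance to the candidate limit
```

The gaps shrink as the sample size grows, in line with the algebraic simplification above.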
2.2.2. Bivariate case with one nearly unit root
Suppose now that and with . This situation occurs, for example, when
[TABLE]
whose eigenvalues are and . This is illustrated on Figure 2. For and , (H2) and (H3) are satisfied. The direct calculation gives
[TABLE]
whence we obtain
[TABLE]
so (H4) is satisfied with the 1–norm. The choice is impossible, and we finally find
[TABLE]
2.2.3. Bivariate case with two nearly unit roots
Following the same lines, suppose that and . This situation occurs, for example, when
[TABLE]
whose eigenvalues are and . This is illustrated on Figure 3. For and , (H2) and (H3) are satisfied. The direct calculation gives
[TABLE]
whence we obtain
[TABLE]
Moreover,
[TABLE]
so (H4) is satisfied with the 1–norm. The choice is possible and we finally find
[TABLE]
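The contrast between the two bivariate situations can be illustrated numerically. The sketch below builds AR(2) companion matrices from prescribed eigenvalues and compares the conditioning of the renormalized stationary variance. It assumes that the variance solves Gamma = A Gamma A^T + sigma2 * K and uses illustrative eigenvalues chosen here (one root close to 1 with a second root at 0.5, then two roots close to 1 and -1), which need not coincide with the examples of the paper. The function names are hypothetical.

```python
import numpy as np

def companion_from_roots(lam1, lam2):
    """AR(2) companion matrix whose eigenvalues are lam1 and lam2."""
    # Characteristic polynomial: z^2 - (lam1 + lam2) z + lam1 * lam2.
    return np.array([[lam1 + lam2, -lam1 * lam2], [1.0, 0.0]])

def stationary_covariance(A, sigma2=1.0):
    """Solve Gamma = A Gamma A^T + sigma2 * K (assumed relation) by vectorization."""
    p = A.shape[0]
    K = np.zeros((p, p))
    K[0, 0] = 1.0
    B = np.eye(p * p) - np.kron(A, A)
    return np.linalg.solve(B, sigma2 * K.reshape(-1)).reshape(p, p)

def renormalized(A):
    """Stationary variance multiplied by 1 minus the spectral radius."""
    rho = np.max(np.abs(np.linalg.eigvals(A)))
    return (1.0 - rho) * stationary_covariance(A)

rho_n = 1.0 - 1e-3   # illustrative spectral radius close to 1

one_root = renormalized(companion_from_roots(rho_n, 0.5))      # one nearly unit root
two_roots = renormalized(companion_from_roots(rho_n, -rho_n))  # two nearly unit roots

cond_one = np.linalg.cond(one_root)   # large: the matrix is close to rank one
cond_two = np.linalg.cond(two_roots)  # close to 1: the matrix is nearly a multiple of the identity
```

Under these assumptions, the first matrix degenerates toward a singular limit, whereas the second one stays well conditioned, which is consistent with the need for a penalization in the first case only.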
2.3. Discussion on multiple eigenvalues and conclusion
As we will see in the proof of Lemma 3.1, the distinct eigenvalues assumption (H2) is sufficient to reach our results. However, a less stringent formulation of (H2) could be:
- (H)
Convergence of the companion matrix. There exists a matrix such that
[TABLE]
and the top right element of is non-zero. In addition, there exists a rank such that, for all , is diagonalizable and the change of basis matrix satisfies and .
In general, multiple eigenvalues do not necessarily invalidate our reasoning, except when the multiplicity concerns the eigenvalues whose modulus tends to 1. Indeed, the coefficients of may grow faster in that case. Consider the simple bivariate example where
[TABLE]
Then, it is not hard to solve this linear difference equation whose characteristic roots are the eigenvalues of . In case of multiplicity, the top left term takes the form of
[TABLE]
and even if and for large enough, it follows that
[TABLE]
This invalidates our reasoning and, in that case, new approaches are needed to potentially reach the moderate deviations. From our viewpoint, this is the main weakness of the set of hypotheses. As already observed in [7], multiple unit roots located at 1 influence the rate of convergence of the OLS. We conjecture that the same phenomenon occurs here and that a larger power should come with in the renormalization.
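The faster growth caused by a double eigenvalue can be observed on a small example. The sketch below assumes an AR(2) companion matrix with a double eigenvalue `lam` (an illustrative value chosen here). In that case the top left entry of the k-th power equals (k + 1) * lam^k, and the sum of its squares equals (1 + lam^2) / (1 - lam^2)^3. This is a cubic blow-up as lam approaches 1, compared with the first-order blow-up 1 / (1 - lam^2) of a simple root.

```python
import numpy as np

lam = 0.9   # illustrative double eigenvalue

# Companion matrix with characteristic polynomial (z - lam)^2.
A = np.array([[2.0 * lam, -lam**2], [1.0, 0.0]])

ks = np.arange(0, 60)
top_left = np.array([np.linalg.matrix_power(A, int(k))[0, 0] for k in ks])
closed_form = (ks + 1) * lam**ks        # linear factor times geometric decay

# Sum of squared coefficients, truncated far in the tail, against its closed form.
big = np.arange(0, 2000)
sum_sq = np.sum(((big + 1) * lam**big) ** 2)
closed_sum = (1.0 + lam**2) / (1.0 - lam**2) ** 3
```

The additional polynomial factor is what makes a renormalization by a single power of one minus the spectral radius insufficient in this situation.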
To sum up, this study is a wide generalization of [17] and, although not complete by virtue of the latter remark, it covers most of the MDP issues for the estimation in the stable but nearly unstable case. Large deviations would undoubtedly be a very useful and challenging study to carry out, naturally extending this one. However, to the best of our knowledge, it is not even entirely treated in the stable time-invariant case , clearly revealing the complexity of the problem. A complicated but stimulating trail for future studies could rely on the exponential, and not only polynomial, neighborhood of the unit root. Along the same lines and even if it is of less practical interest, we might as well focus on the explosive side of the unit roots, where new theoretical developments are necessary.
3. Technical proofs
In all the proofs, denotes a generic positive constant that is not necessarily identical from one line to another. We will frequently use the fact that . For asymptotic equivalences, means that both and whereas stands for .
3.1. Some linear algebra tools
Thereafter, we denote by the (distinct) eigenvalues of and those of , in descending order of modulus. We start by establishing two lemmas that will prove to be very useful in what follows.
Lemma 3.1.
Under hypotheses (H2) and (H3), as tends to infinity,
[TABLE]
Proof.
The lower bounds are established in Section 1.2, in (1.8) and (1.9) precisely. For the upper bounds, fix
[TABLE]
According to Thm. 2.4.9.2 of [13], (H2) implies the existence of a rank such that, for all , the eigenvalues of satisfy
[TABLE]
and
[TABLE]
Let be a change of basis matrix in the diagonalization of . Then, since is a companion matrix, a standard choice would be
[TABLE]
This Vandermonde matrix is invertible if and only if for all (see e.g. Sec. 0.9.11 of [13]). In that case, is closely related to the Lagrange interpolating polynomials given, for , by
[TABLE]
Precisely, the –th row of contains the coefficients of in the basis of , i.e.
[TABLE]
where the relation makes it possible to identify each . Combining (3.1) and (3.2), it follows that, for all ,
[TABLE]
We also have since and since is a finite combination of sums and products of . To sum up, for all and ,
[TABLE]
Consequently,
[TABLE]
It only remains to sum over and to let tend to infinity to reach the first result. Similarly,
[TABLE]
so we get the second result by following the same lines. ∎
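The diagonalization used in this proof can be verified numerically. The sketch below assumes the companion form with the coefficients on the first row, for which the eigenvector attached to an eigenvalue lam is (lam^(p-1), ..., lam, 1). The change of basis matrix is then a transposed Vandermonde matrix. The eigenvalues are illustrative values chosen here, and the ordering of rows and columns may differ from the convention of the paper.

```python
import numpy as np

eigs = np.array([0.95, -0.6, 0.3])   # illustrative distinct eigenvalues
p = len(eigs)

# np.poly returns the monic coefficients [1, -theta_1, ..., -theta_p].
theta = -np.poly(eigs)[1:]

# Companion matrix: coefficients on the first row, ones on the sub-diagonal.
A = np.zeros((p, p))
A[0, :] = theta
A[1:, :-1] = np.eye(p - 1)

# Column i of P is (eigs[i]^(p-1), ..., eigs[i], 1), an eigenvector of A.
P = np.vander(eigs).T
D = np.diag(eigs)

# Powers of A through the diagonalization A^k = P D^k P^{-1}.
k = 25
power_direct = np.linalg.matrix_power(A, k)
power_diag = P @ np.diag(eigs**k) @ np.linalg.inv(P)
```

Since P is invertible exactly when the eigenvalues are distinct, the norm of each power is controlled by the k-th power of the spectral radius up to the factor given by the norms of P and its inverse.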
Lemma 3.2.
Under hypotheses (H2) and (H3), we have the convergence
[TABLE]
for any rate satisfying .
Proof.
Consider the rank introduced in the proof of Lemma 3.1. Then, according to the inequality (3.5),
[TABLE]
where the invertible and uniformly bounded matrices and are given in (3.3) and (3.4), respectively. We also have
[TABLE]
from the hypothesis on . It remains to let tend to infinity in the above inequality. ∎
3.2. Proofs of the main results
First of all, it is convenient to express the empirical variance of the process as
[TABLE]
where the variance is given in (1.4),
[TABLE]
and the residual term is
[TABLE]
Then, solving this generalized Sylvester equation (Lem. 2.1 of [14]) and considering the invertibility of in (1.7) which is proved at the beginning of Section 1.2, we reach the decomposition
[TABLE]
Let us now reason step by step, via some intermediate results.
3.2.1. Exponential moments of the squared initial value
We recall that, from the causal form (1.3) of the process,
[TABLE]
The following result gives an exponential moment for the correctly renormalized squared initial value.
Lemma 3.3.
Under hypothesis (H1),
[TABLE]
where is given in (1.8).
Proof.
By Cauchy-Schwarz inequality,
[TABLE]
Moreover, from Jensen’s inequality, for all ,
[TABLE]
using . Taking the expectation and choosing given in (H1), we deduce that
[TABLE]
∎
3.2.2. Exponential convergence of the residual term
The residual term in the decomposition (3.9) is given by
[TABLE]
Our next objective is to prove the exponential negligibility of this residual.
Lemma 3.4.
Under hypotheses (H1)–(H5), for all ,
[TABLE]
Proof.
First, note that
[TABLE]
Thus,
[TABLE]
where is given in (1.8), using Markov’s inequality, the reasoning in the proof of Lemma 3.3 and the fact that, from the strict stationarity of the process, and share the same distribution. Hence, for a sufficiently large ,
[TABLE]
since from (H4), from Lemma 3.1 and since, from (H2), converges. Finally, letting tend to infinity, (H1) and (H5) conclude the proof. ∎
3.2.3. The truncated sequence
In what follows, we define the rate
[TABLE]
and we note from (H3)–(H5) that
[TABLE]
Following the idea of [17], we are going to use as a truncation parameter. Consider
[TABLE]
as an approximation of in its causal form (1.3). We also define the truncated version of the summands in (3.2) as
[TABLE]
The process is strictly stationary and –dependent, according to Def. 6.4.3 of [4]. Let us study some properties of this process.
Lemma 3.5.
Under hypotheses (H1)–(H4), we can find a constant such that, for a sufficiently large ,
[TABLE]
for any rate satisfying .
Proof.
By Hölder’s inequality,
[TABLE]
Moreover, for the rank and the uniformly bounded matrices and introduced in the proof of Lemma 3.1,
[TABLE]
as soon as . Thus,
[TABLE]
Finally, (H4), (1.10) and (3.7) lead, for large values of , to
[TABLE]
It remains to choose . ∎
Lemma 3.6.
Under hypotheses (H2)–(H4), for all and ,
[TABLE]
where the covariance can be explicitly built in terms of , and . In addition,
[TABLE]
where the non-zero limiting matrix is given in (3.18).
Proof.
We will use in what follows and defined in (1.5). Let be the –algebra of the events occurring up to time . Then, it is easy to see that
[TABLE]
in virtue of (1.6). For , by direct calculation,
[TABLE]
and the same is true for since . Now for , a tedious but straightforward calculation leads to
[TABLE]
To give an explicit expression of , it suffices to observe that the truncated expression (3.14) has a variance given by
[TABLE]
so that
[TABLE]
Let us now look at the asymptotic behavior of correctly renormalized. First, we have the convergence
[TABLE]
coming from the identity and Lemma 3.2. Together with (H4), this implies
[TABLE]
In the end of the proof, we call the vectorization inverse operator (namely, in our context, the reconstruction of a matrix from its vectorization of size ). Then,
[TABLE]
Combining (3.16) with (3.17) and (H4), we have
[TABLE]
where . ∎
Remark.
As a by-product, we also obtain, following the same lines,
[TABLE]
where is given in (1.4), which proves (1.11). The variance defined above may be seen as the truncated version of .
3.2.4. The remainder of the truncation
We denote by
[TABLE]
the remainder of the truncation of in (3.2) made via (3.15). Our last preliminary objective is to establish the following lemma.
Lemma 3.7.
Under hypotheses (H1)–(H5), for all ,
[TABLE]
Proof.
Clearly, both terms in the definition of (3.19) are similar and we will only work on the first one. From the causal expression (1.3) and the truncation (3.14), we note that
[TABLE]
Thus, with given in (1.9) and applying Lem. 17 of [15] under (H1),
[TABLE]
for some and , where
[TABLE]
Our choice of in (1.9), the properties of Lemma 3.1, (3.6) and our hypotheses on the rates of convergence lead, for large enough, to
[TABLE]
and obviously . Hence, like in formula (3.11) of [17], there are some constants and such that, for all and large values of ,
[TABLE]
Going back to (3.2.4),
[TABLE]
where, for convenience, we note
[TABLE]
To sum up,
[TABLE]
This is clearly sufficient to finish the proof since, from (H4),
[TABLE]
for large enough. ∎
We are now ready to prove Theorem 2.1 and Corollary 2.2.
3.2.5. Proof of Theorem 2.1
All the technical results of the previous sections are now going to be concretely used. Consider the sequence
[TABLE]
where is given in (3.15). The process is also strictly stationary and –dependent. Like in [18] or [17, suppl. mat.], let us extract an independent sequence from this process. For , define
[TABLE]
where and where and its properties are given in (3.12). Then, is strictly stationary and –dependent. Next, for , define
[TABLE]
where and is another rate satisfying
[TABLE]
To be convinced that such a rate exists, one can use (3.13) and the fact that and when . The process is now i.i.d. and the rates satisfy
[TABLE]
The reasoning of [17, suppl. mat.] does not suit us, so we need to reformulate the establishment of the MDP. First, by a Taylor-Lagrange expansion,
[TABLE]
in which the remainder term satisfies, for any ,
[TABLE]
Now, the random variables sharing the same distribution for all , it follows from Hölder’s inequality that,
[TABLE]
for large enough, using Lemma 3.5 with stemming from (3.13), the convergence of , (H1) and treating all the terms of (3.15) similarly. Taking the expectation in (3.24) and exploiting the independence of the zero-mean process , we obtain the decomposition
[TABLE]
for we can see, as it is done in [18], that the residual term
[TABLE]
plays a negligible role in comparison to the main one. To eliminate the third-order term, we first look at the fourth-order moment of , that is
[TABLE]
A long but standard calculation shows that
[TABLE]
as tends to infinity. This result is reached using the strict stationarity of the process, the explicit expression of in terms of , the inequality (3.6) and, finally, using (H4) giving the equivalence between and . So,
[TABLE]
By Lyapunov’s inequality,
[TABLE]
for a small . Now, combining this result with (3.25) and Hölder’s inequality, for sufficiently large values of ,
[TABLE]
by (3.25), (3.23) and the properties in (3.22). The second-order term in (3.26) satisfies
[TABLE]
where we used (3.23) and the results of Lemma 3.6. The combination of (3.26), (3.27) and (3.28) together with the Gärtner-Ellis theorem (see e.g. Sec. 2.3 of [8]) shows that the sequence
[TABLE]
satisfies an LDP with speed and rate function given by the Fenchel-Legendre transform of the above logarithmic moment generating function, i.e.
[TABLE]
Note that, due to its particular structure, is only non-negative definite as soon as (by way of example, its last row and column are zero). In that case (see e.g. Ex. 1.1.4 of [12], page 212), the explicit expression of this quadratic rate function, strictly convex on its relative interior, is
[TABLE]
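As a reminder of why a pseudo-inverse appears in this kind of expression, the generic computation can be written as follows. The symbols $\Lambda$, $\Sigma$ and $\Sigma^{+}$ are notation assumed here for a quadratic limiting logarithmic moment generating function, a non-negative definite matrix and its Moore-Penrose pseudo-inverse. They stand for a generic instance and are not meant to reproduce the exact display of the paper.

```latex
% Assumed quadratic limit, with \Sigma symmetric and non-negative definite:
\Lambda(\lambda) = \frac{1}{2} \langle \lambda, \Sigma \lambda \rangle

% Fenchel-Legendre transform:
\Lambda^{*}(x) = \sup_{\lambda} \big\{ \langle \lambda, x \rangle - \Lambda(\lambda) \big\}
= \begin{cases}
\dfrac{1}{2} \langle x, \Sigma^{+} x \rangle & \text{if } x \in \mathrm{Im}(\Sigma), \\[2mm]
+\infty & \text{otherwise.}
\end{cases}

% Justification: if x \in \mathrm{Im}(\Sigma), the supremum is attained at \lambda = \Sigma^{+} x.
% Otherwise, there is \lambda_0 \in \ker(\Sigma) with \langle \lambda_0, x \rangle > 0,
% and \langle t \lambda_0, x \rangle - \Lambda(t \lambda_0) = t \langle \lambda_0, x \rangle \to +\infty as t \to +\infty.
```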
After the truncation introduced in (3.14), the decomposition (3.9) can be rewritten as
[TABLE]
where, in the remainder term , the residual of the truncation is given in (3.19) and the main residual is given in (3.11). Lemma 3.4 and Lemma 3.7 show that the first term in the right-hand side is an exponentially good approximation of the left-hand side and that, as a consequence, they share the same LDP (see Def. 4.2.10 and Thm. 4.2.13 of [8]). The contraction principle (see Thm. 4.2.1 of [8]) makes it possible to compute the rate function associated with the LDP, namely
[TABLE]
where the limiting value comes from (H4). ∎
3.2.6. Proof of Corollary 2.2
[TABLE]
Our objective is first to prove that, for all ,
[TABLE]
where is the invertible penalized variance (2.3), and then to establish an LDP for the sequence
[TABLE]
in order to obtain the announced result, via the contraction principle (Thm. 4.2.1 of [8]). On the one hand, we know from Theorem 2.1 and (3.29) that
[TABLE]
[TABLE]
and . So,
[TABLE]
It is also clear that
[TABLE]
and (1.11) shows that the second event in the right-hand side becomes impossible when increases. Hence, from the reasoning above,
[TABLE]
Now we shall use Lem. 2 of [21] to get (3.30).
On the other hand, all the work consisting in proving that the sequence (3.31) satisfies an LDP with speed has already been done in the proof of Theorem 2.1. Indeed, via the truncation (3.14),
[TABLE]
where the process forms a strictly stationary and –dependent sequence. However, apart from the renormalization, this is precisely the first column of the first term of (3.15). Thus, the calculations are similar and we find, like in Lemma 3.6,
[TABLE]
In that case, from the convergence (3.17) and the previous proof, the rate function associated with the LDP is given by
[TABLE]
The exponential negligibility of the remainder of the truncation is obtained by following the lines of Lemma 3.7. The contraction principle makes it possible to compute the rate function associated with the LDP, namely
[TABLE]
where the exponential convergence (3.30) has been combined with the LDP established on the sequence (3.31). ∎
Acknowledgements. The author thanks the associate editor and the two anonymous reviewers for the numerous comments and suggestions that clearly helped to improve the paper. He also thanks R. Garbit for the constructive discussion about the link between Vandermonde matrices and Lagrange polynomials.
References
- [1] Bercu, B. On large deviations in the Gaussian autoregressive process: stable, unstable and explosive cases. Bernoulli. 7 (2001), 299–316.
- [2] Bercu, B., Gamboa, F., and Rouault, A. Large deviations for quadratic forms of stationary Gaussian processes. Stoch. Proc. Appl. 71 (1997), 75–90.
- [3] Bitseki Penda, V., Djellout, H., and Proïa, F. Moderate deviations for the Durbin-Watson statistic related to the first-order autoregressive process. ESAIM Probab. Stat. 18 (2014), 308–331.
- [4] Brockwell, P. J., and Davis, R. A. Time Series: Theory and Methods (Second Edition). Springer Series in Statistics. Springer, New York, 1991.
- [5] Buchmann, B., and Chan, N. H. Unified asymptotic theory for nearly unstable AR(p) processes. Stoch. Proc. Appl. 123 (2013), 952–985.
- [6] Chan, N. H., and Wei, C. Z. Asymptotic inference for nearly nonstationary AR(1) processes. Ann. Statist. 15 (1987), 1050–1063.
- [7] Chan, N. H., and Wei, C. Z. Limiting distributions of least squares estimates of unstable autoregressive processes. Ann. Statist. 16 (1988), 367–401.
- [8] Dembo, A., and Zeitouni, O. Large Deviations Techniques and Applications (Second Edition), vol. 38 of Applications of Mathematics. Springer, 1998.
