Necessary and sufficient conditions for the identifiability of observation-driven models
François Roueff (IDS, S2A), Randal Douc (TIPIC-SAMOVAR, CITI), Tepmony Sim (ITC)

TL;DR
This paper establishes necessary and sufficient conditions for the identifiability of observation-driven models, including GARCH and integer-valued time series models, ensuring the consistency of estimators.
Contribution
It extends the identifiability conditions from GARCH models to a broader class called linearly observation-driven models, covering various standard time series models.
Findings
Identifiability conditions are established for a broad class of models.
Conditions ensure the consistency of quasi-maximum likelihood estimators.
Includes standard models like Poisson GARCH and NBIN-GARCH.
Abstract
In this contribution we are interested in proving that a given observation-driven model is identifiable. In the case of a GARCH(p, q) model, a simple sufficient condition has been established in [1] for showing the consistency of the quasi-maximum likelihood estimator. It turns out that this condition applies for a much larger class of observation-driven models, that we call the class of linearly observation-driven models. This class includes standard integer valued observation-driven time series, such as the log-linear Poisson GARCH or the NBIN-GARCH models.
Necessary and sufficient conditions for the identifiability of
observation-driven models
Randal Douc, François Roueff and Tepmony Sim
Département CITI
CNRS UMR 5157
Télécom SudParis
91000 Évry
France
LTCI
Télécom Paris
Institut Polytechnique de Paris
19 place Marguerite Perey,
91120 Palaiseau
France
Department of Foundation Year
Institute of Technology of Cambodia
12156 Phnom Penh
Cambodia
Abstract.
In this contribution we are interested in proving that a given observation-driven model is identifiable. In the case of a GARCH model, a simple sufficient condition has been established in [2] for showing the consistency of the quasi-maximum likelihood estimator. It turns out that this condition applies for a much larger class of observation-driven models, that we call the class of linearly observation-driven models. This class includes standard integer valued observation-driven time series such as the Poisson autoregression model and its numerous extensions. Our results also apply to vector-valued time series such as the bivariate integer valued GARCH model, to non-linear models such as the threshold Poisson autoregression or to observation-driven models with exogenous covariates such as the PARX model.
Key words and phrases:
identifiability, observation-driven models, time series of counts
2000 Mathematics Subject Classification:
Primary: 60J05, 62F12; Secondary: 62M05, 62M10.
1. Introduction
Observation-driven models (ODM) were introduced in [8] and have received considerable attention since. They are commonly used for modeling various non-linear time series in applications ranging from economics (see [23]), environmental studies (see [3]), epidemiology and public health (see [29, 10, 12]) and finance (see [20, 24, 13, 16]) to population dynamics (see [19]). Additional covariates have been added to some of these models, leading to GARCHX-type models; see [1] and the references therein for recent examples in the context of count data. We include such a case in our setting, leading to the general observation-driven models with exogenous variables (ODMX).
As is often the case for non-linear time series, the identifiability of observation-driven models is a delicate question, and it often appears as an assumption used for proving the consistency (say) of the maximum likelihood estimator. A notable exception is the GARCH model, for which an explicit sufficient condition appears in [2], see their condition (2.27). We will in fact prove that this condition is not only sufficient but also necessary for identifiability, and that this result extends to a much larger class of observation-driven models than the GARCH(p, q) model. See Theorem 2 below and the comments following this result.
We provide general conditions to ensure that an ODM or an ODMX defined through a collection of parameterized iterative schemes uniquely describes the law of the observations. In other words, our conditions ensure that two different iterative schemes within the same model cannot produce the same law for the observations. Then a given parameter is identifiable if two different values of the parameter are not compatible with the same iterative scheme. Let us stress, however, that we do not consider the misspecified case here, that is, we always assume that the observations indeed follow the (unique) stationary distribution corresponding to (at least) one given parameter of the model. Our setting is nevertheless of interest for the misspecified setting, since a parameter that is non-identifiable in the well-specified case cannot be identified in the misspecified case. Hence the necessity of our conditions remains true in the misspecified setting.
A special class of ODMs, that we call linearly observation-driven models (LODMs) below, arises when the hidden variable is obtained linearly from hidden or observed variables of the past, and when all these variables are univariate, as for the GARCH(p, q) model. This latter model has been extensively studied, see for example [5, 14, 15, 21, 16] and the references therein. Many other examples, linear or non-linear, univariate or multivariate, have been derived from this class, see [4] for a long list of them, although this list has been lengthened quite significantly since, in particular because of the recent addition of various integer-valued ODMs to deal with count time series (see [7, 25] and the references therein). Our goal is to derive necessary and sufficient conditions potentially applying to a wide variety of ergodic observation-driven models. To illustrate the generality of our results, we apply them to a list of various examples which includes, in addition to the standard GARCH model, the nonlinear GARCH model of [17], the INGARCH model of [12], the log-linear Poisson GARCH of [13], the MPINGARCH model of [25], the PARX model of [1], the bivariate integer-valued GARCH model of [9] and the self-excited threshold Poisson autoregression of [28]. We are able to derive necessary and sufficient conditions for identifiability for all the considered examples.
The rest of the paper is organized as follows. Section 2 contains additional notation and definitions that will be used throughout the paper. Section 3 contains a list of examples already considered in the literature. Our main results can be found in Section 4, some proofs of which are postponed to Section 6. Before that, in Section 5, we show how our results apply to the examples of Section 3 or can be extended to larger classes of models.
2. Preliminaries
2.1. Formal definitions of observation driven models
Let us now formally introduce the class of observation-driven models and important sub-classes. Throughout the paper we use the notation for , with the convention that is the empty sequence if , so that, for instance . The observation-driven time series model can formally be defined as follows.
Definition \thedefinition (ODM, ODMX).
Let , and be measurable spaces, respectively called the latent space, the observation space and the admissible observation space. Let be a compact metric space, called the parameter space. Let be a measurable function from to . Let be a family of measurable functions from to , called the reduced link functions and let be a family of probability kernels on , called the observation kernels. A time series valued in is said to be distributed according to an observation-driven model of order (hereafter, ODM) with reduced link function , admissible mapping and observation kernel if there exists a process on such that for all ,
[TABLE]
where and for all .
In the presence of exogenous variables defined as an -Markov chain valued in the space with kernel , the admissible mapping is defined from to and the iterative equation (2.1) is replaced by, for all ,
[TABLE]
where, in this case, and for all . We then say that the time series valued in is distributed according to an observation-driven model of order with -order Markov exogenous variables (hereafter, ODMX) with reduced link function , admissible mapping , observation kernel , and exogenous Markov kernel .
The variables are called the observed variables, the variables the hidden variables and the variables the admissible variables. In addition, we define the augmented variables
[TABLE]
which take values in the augmented space
[TABLE]
Remark 1**.**
Let us briefly comment on the unusual notion of admissible mapping which allows us to define the admissible variables of an ODM() in Section 2.1:
- (1)
For all , the conditional distribution of given only depends on defined by (2.3).
- (2)
The time series is also an ODM with admissible mapping being the identity, link function and observation kernel on the observation space .
- (3)
On the other hand, we can also set the admissible mapping to be the identity for the ODM , in which case and the reduced link function should be replaced by the link function defined for all by
[TABLE]
In fact the advantage of using an admissible mapping is precisely to obtain a reduced link function , more convenient than the (non-reduced) link function . We will focus on the particular case where is linear, into which many, though not all, observation-driven models can be cast, see Section 3 hereafter.
- (4)
An ODMX() can be cast into an ODM() by defining and and observing that the obtained time series is an ODM() with hidden variables . However, for treating identifiability, which is our purpose here, it is more convenient to keep distinguishing between the ODM and the ODMX settings.
- (5)
In the following the variables and will be used extensively as they greatly simplify the presentation and the reasoning. It is important to note that the definitions of , and are not the same in the ODM and the ODMX settings as they involve in the latter case. In particular the conditional distribution of given takes two very different forms in the ODM and ODMX cases. They can be respectively expressed by and where is a probability kernel on and , resp. For conciseness we use the same notation for the two cases. They are resp. defined by setting, for all , and ,
[TABLE]
When the reduced link function is linear we specify Section 2.1 into the following.
Definition \thedefinition ((V)LODM(X)).
We say that an ODM() (resp. ODMX()) is a vector linearly observation-driven model of order , shortened as VLODM, (resp. VLODMX) if for some , and are closed subsets of and , respectively, and, for all , , and ,
[TABLE]
for some mappings , and defined on and valued in , and . In the case where , the VLODM (resp. VLODMX) is simply called a linearly observation-driven model of order , shortened as LODM (resp. LODMX).
2.2. Iterations of the link function
We now introduce iterated versions of the reduced link function . Let be defined by (2.4). We define for any and , the mapping through a set of recursive equations of order . Namely, for all , and , we define
[TABLE]
where the sequence is defined by
[TABLE]
In this set of equations the last line is applied recursively so that in fact, for all , only depends on and .
The equations in (2.10) define a system with input sequence , initial condition and output sequence . Because the recursion given by the last line of (2.10) involves successive entries of the output and successive entries of the input, it is useful to define blocks, valued in and consider the same recursion applying to such blocks, hence computing from and . Formally, for all , we define by
[TABLE]
Remark 2**.**
Note in particular that with this notation at hand, and using the admissible variables for the ODM case or for the ODMX case, and defined by (2.3), the second line of (2.1) and the third line of (2.2) are equivalent to
[TABLE]
We further denote the successive composition of , , …, and by
[TABLE]
This recursion is the same as the one for defining , except that it is valued in , whereas is valued in . More precisely, denoting, throughout the paper, for all , by the -th entry of , we have the following relations between and , for all and ,
[TABLE]
where, in the second line, we set for and use the convention for .
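To make the recursion above concrete, here is a minimal Python sketch for the special case of a scalar linear reduced link function of order (p, q). The function name `iterate_link`, the parameter names `omega`, `a`, `b`, and the convention that `a` multiplies lagged hidden values while `b` multiplies lagged admissible values are all assumptions made for illustration; they are not notation taken from the paper.

```python
from typing import List, Sequence


def iterate_link(omega: float,
                 a: Sequence[float],
                 b: Sequence[float],
                 x_init: Sequence[float],
                 u: Sequence[float]) -> List[float]:
    """Iterate x_k = omega + sum_i a[i-1] * x_{k-i} + sum_j b[j-1] * u_{k-j}.

    x_init: the p initial hidden values, oldest first (the initial condition).
    u:      the admissible values, oldest first (the input sequence);
            the first q entries play the role of pre-sample inputs.
    Returns the successive hidden values produced by the recursion (the output).
    """
    p, q = len(a), len(b)
    if len(x_init) != p:
        raise ValueError("x_init must contain exactly p values")
    if len(u) < q:
        raise ValueError("u must contain at least q values")
    xs = list(x_init)
    outputs = []
    for k in range(q, len(u) + 1):
        # Lagged values, most recent first, so that they align with lags 1, 2, ...
        lag_x = [xs[len(xs) - i] for i in range(1, p + 1)]
        lag_u = [u[k - j] for j in range(1, q + 1)]
        x_new = (omega
                 + sum(ai * xi for ai, xi in zip(a, lag_x))
                 + sum(bj * uj for bj, uj in zip(b, lag_u)))
        xs.append(x_new)
        outputs.append(x_new)
    return outputs


# Order (1, 1): each step uses one lagged hidden value and one lagged input.
path = iterate_link(omega=1.0, a=[0.5], b=[0.25], x_init=[2.0], u=[4.0, 8.0])
```

Each output depends only on the initial condition and on the inputs seen so far, which is the property used repeatedly in the sequel.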
2.3. Ergodic assumption and some interesting class of parameters
In this contribution, we only consider the case where all processes in the model are ergodic. Namely, we use the following assumption.
- (A-1)
For all , there exists a unique stationary solution satisfying (2.1).
In the case of exogenous covariates this assumption is replaced by the following.
- (A’-1)
For all , there exists a unique stationary solution satisfying (2.2).
This ergodic property is the cornerstone for making statistical inference theory work and we provide simple general conditions in [11] for and in [26, 27, Chapter 5] for the case of general order .
We now introduce the notation that will allow us to refer to the stationary distribution of the model throughout the paper.
Definition \thedefinition (Stationary distributions and ).
We define the distributions and as follows.
- a)
Under (A-1), denotes the distribution on of the stationary solution of (2.1); under (A’-1), denotes the distribution on of the stationary solution of (2.2).
- b)
Under (A-1), denotes the projection of on the component ; under (A’-1), denotes the projection of on the component .
We also use the symbols and to denote the expectations corresponding to and , respectively.
To study the identifiability of ergodic ODMs, we introduce equivalence classes that partition the parameter set into subsets of parameters which share the same distribution of observations. Formally, it reads as follows.
Definition \thedefinition (Equivalence classes for ).
Suppose that (A-1) or (A’-1) holds and define as in Section 2.3. For all , we write if and only if . This defines an equivalence relation on the parameter set and, for any , the equivalence class of is denoted by .
Remark 3**.**
In the context of exogenous variables, that is, under (A’-1), since the distribution of under does not depend on , is equivalent to saying that the conditional distribution of given is the same under and under .
Determining the equivalence classes for all amounts to solving the identifiability of a parameter under the assumption of a well-specified model. Namely, assuming that the distribution of the observations is given by for some (unknown) parameter , a parameter is identifiable if and only if the given mapping is constant over the equivalence class . Without identifiability, the consistency of any estimator of is not possible. A special case is when reduces to the singleton , so that every parameter is identifiable, in which case the model is said to be identifiable. Obviously, if and share the same iterative equation (2.1) (or (2.2) with exogenous covariates), that is, if and for all , then, by uniqueness of the stationary distribution, they must share the same one and in particular we get . Thus, using the more convenient notation introduced in (2.11), we have
[TABLE]
We will provide general conditions ensuring that this inclusion becomes an equality, see Section 4.1 below. However it may happen in standard situations that this inclusion is strict, as will be seen in Remark 8(5). Nevertheless, in all the considered examples, it will be possible to recover an equality by replacing by a more appropriate subset in the left-hand side of (2.16).
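As a hedged illustration of how two distinct iterative schemes can nonetheless produce the same law, consider a toy linear scheme of order (1, 1) in which the coefficient of the observation is zero, so that the hidden variable follows x_k = omega + b * x_{k-1} and its stationary solution is the constant omega / (1 - b). The sketch below assumes that the observation kernel is the same for both parameters; the names `stationary_hidden`, `omega` and `b` are ours, and this toy case is not claimed to be the example treated later in the paper.

```python
import math


def stationary_hidden(omega: float, b: float, n_iter: int = 200, x0: float = 0.0) -> float:
    """Approximate the stationary hidden value of x_k = omega + b * x_{k-1} (0 <= b < 1)."""
    x = x0
    for _ in range(n_iter):
        x = omega + b * x
    return x


def link(omega: float, b: float, x: float) -> float:
    """One step of the (degenerate) link function, which ignores the observation."""
    return omega + b * x


theta_1 = (1.0, 0.5)    # omega / (1 - b) = 2
theta_2 = (0.5, 0.75)   # omega / (1 - b) = 2 as well

x_1 = stationary_hidden(*theta_1)
x_2 = stationary_hidden(*theta_2)

# Same stationary hidden value, hence the same law for the observations
# (under the assumption of a common observation kernel) ...
same_law = math.isclose(x_1, x_2, rel_tol=1e-9)
# ... although the two link functions differ away from the stationary value.
different_links = link(*theta_1, 0.0) != link(*theta_2, 0.0)
```

Here the two parameters are equivalent in the sense of the definition above, yet their link functions do not coincide everywhere; they only coincide at the value actually visited by the stationary hidden process.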
As often for ODMs, our results rely on the assumption that, under , the hidden variables are measurable with respect to the admissible variables from the past. This is not completely surprising since, using the notation introduced in Section 2.2, iterating the link function, we have that, for all and all in ,
[TABLE]
In particular, taking and letting decrease backward towards , we get that is measurable with respect to , where and respectively denote the natural filtrations of and . To our knowledge, all ODMs of interest in fact satisfy the stronger property that is measurable with respect to , which is sometimes called the invertibility condition. This condition is now introduced with some notation for expressing as a measurable function of .
- (A-2)
For all , the measurable function satisfies
[TABLE]
Since is stationary, (2.17) also implies that, for all , For an ODM (resp. an ODMX), we have (resp. ). Thus Assumption (A-2) allows us to derive the ’s from the ’s (resp. from the ’s and ’s) and therefore to rewrite the relationship given through the link function in the second line of (2.1) (resp. in the third line of (2.2)) between these variables in terms of a recursive relationship involving only the ’s (resp. the ’s and the ’s). It turns out that Condition (2.17) in (A-2) can be verified using only, that is, we do not need but only its marginal onto the variable ’s (resp. the variables ’s and the ’s), as shown by the following result.
Lemma \thelemma.
Consider an ODM satisfying (A-1) with or an ODMX satisfying (A’-1) with . Let and consider a measurable function . Then (2.17) is satisfied if and only if the two following equations hold.
[TABLE]
Proof.
Suppose that (2.17) holds true. Since is shift invariant, it can be extended to all time instants , namely,
[TABLE]
But then (2.18) and (2.19) follow from the model equations (2.1) in the case of an ODM or (2.2) in the case of an ODMX.
Suppose now that (2.18) and (2.19) hold true. Since is shift invariant, they are extended to all time instants in the form
[TABLE]
Defining for all , we see that is a stationary sequence satisfying the model equations (2.1) in the ODM case and is a stationary sequence satisfying the model equations (2.2) in the ODMX case. By uniqueness of assumed in (A-1) and (A’-1), respectively, we get that (2.17) holds. ∎
Now, given , we introduce the set of all parameters whose recursive relationship (2.18) applies to almost all trajectories of under the distribution of .
Definition \thedefinition (Subset ).
Suppose that we are given a measurable function . Then, for all , we denote by the set of all parameters satisfying the two following equations
[TABLE]
It is important to note that of Section 2.3 depends on the choice of the class of functions and that Assumption (A-2) alone is not sufficient to define each on the whole set of trajectories, since Relation (2.17) is only required to hold . We now provide some Lipschitz condition on the iterates of the link function and a moment condition on that allow us to build a natural class of functions that satisfies (A-2). Whenever we need some metric on the space , we assume the following.
- (A-3)
The -fields and are Borel ones, respectively associated to and , both assumed to be complete and separable metric spaces.
Recall that, for any finite -valued sequence , the mapping is defined by (2.9) following the recursion in (2.10). Define, for all , the Lipschitz constant for , uniform over ,
[TABLE]
where we set, for all and ,
[TABLE]
We use the following assumptions to define the class of functions .
- (A-4)
For all , we have and as .
- (A-5)
There exists and, if , such that the constant vectors and satisfy, for all ,
[TABLE]
where we defined, for all ,
[TABLE]
with the convention if .
- (A-6)
For all and , the reduced link function is continuous on .
Obviously, under (A-4), for all and , the asymptotic behavior of as does not depend on . We can thus denote
[TABLE]
and keep in mind that the initial point has no influence on these two definitions.
By (2.15), we further have the following result using the definitions in (2.25).
[TABLE]
and and are related for all through the formulas
[TABLE]
Based on these definitions, we now introduce subsets of of particular interest.
Definition \thedefinition (Set ).
If Assumption (A-4) holds, we set, for any ,
[TABLE]
where and are defined by (2.26).
Remark 4**.**
Suppose that, for all , we have , , and suppose that (A-2) holds for as in (2.25). Then, by (2.3) and (2.28), we have , . Since is shift-invariant, we get that takes its values in , . This is why the set will be of interest in the following.
The following result is proved in Section 6.1.
Lemma \thelemma.
Consider an ODM satisfying (A-1) with or an ODMX satisfying (A’-1) with . Suppose that (A-3), (A-4) and (A-5) hold. Then, for all , we have , , (A-2) holds and, setting as in Section 2.3, we have
[TABLE]
If moreover (A-6) is assumed, then (2.21) holds for all . Consequently, the set in Section 2.3 can be expressed as
[TABLE]
Remark 5**.**
The invertibility Assumption (A-2) is essential for deriving the identifiability class using the set . Section 2.3 can be used to prove it in all the examples that are considered hereafter. Indeed, as will be checked in Section 5, all the considered examples satisfy the following facts:
- (1)
The sets and are closed subsets of finite dimensional normed spaces and (A-3) follows.
- (2)
Assumption (A-4) is weaker than what is needed for proving the ergodicity assumption (A-1). Consider for instance the classical GARCH(1,1) model defined by setting , and where is centered with variance 1. Then it is easily seen that (A-4) is equivalent to . On the other hand, the Lyapunov condition to get (A-1) reads , which implies .
- (3)
The moment condition (A-5) is implied by , where is some norm, and this condition holds as a byproduct of the proof of (A-1) (which often implies for some ).
- (4)
One can readily check (A-6).
Note also that the set in the left-hand side of (2.30) contains the set in the left-hand side of (2.16). In all our examples, the assumptions of 4.1 below will be shown to hold, implying that the inclusion in (2.30) is in fact an equality. In some of these examples, however, the inclusion in (2.16) is strict, showing that the sets in the left-hand sides of (2.16) and (2.30) may happen to be different.
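The role of the Lipschitz condition on the iterates can be checked numerically in the GARCH(1,1) case. The sketch below assumes the recursion is written as x_k = omega + a * y_{k-1}^2 + b * x_{k-1}; the coefficient names and the function `garch11_hidden` are ours. Under this assumed form, two runs fed with the same observations but started from different initial values differ by exactly b^n times the initial gap after n steps, so the influence of the initial condition vanishes when b < 1. The inputs used here are arbitrary Gaussian draws rather than a genuine GARCH path, since the contraction holds for any input sequence.

```python
import random


def garch11_hidden(omega: float, a: float, b: float, x0: float, ys) -> float:
    """Run x_k = omega + a * y_{k-1}**2 + b * x_{k-1} over ys and return the last value."""
    x = x0
    for y in ys:
        x = omega + a * y * y + b * x
    return x


random.seed(0)
n_steps = 50
ys = [random.gauss(0.0, 1.0) for _ in range(n_steps)]

omega, a, b = 0.1, 0.2, 0.7
x0_high, x0_low = 5.0, 1.0

# Gap between the two runs after n_steps iterations on the same inputs.
gap_n = (garch11_hidden(omega, a, b, x0_high, ys)
         - garch11_hidden(omega, a, b, x0_low, ys))
# Gap predicted by the Lipschitz constant b**n_steps of the iterated link function.
predicted_gap = (x0_high - x0_low) * b ** n_steps
```

Up to floating-point rounding, `gap_n` and `predicted_gap` agree, and both are negligible compared with the initial gap.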
3. Examples
We give a non-exhaustive list of possible examples related to the previous definitions and for which our results apply, as will be shown in Section 5.
3.1. Standard LODMs
Many models can be considered as an LODM by choosing an appropriate admissible mapping .
GARCH. The standard GARCH model is a special case of LODM, in which case , , , and is a centered distribution with variance , most commonly the normal distribution.
INGARCH. The standard Poisson integer-valued GARCH (INGARCH, see e.g. [12]) obviously is an LODM with , and is the Poisson distribution with mean .
Extensions of INGARCH. Many extensions of the INGARCH model simply consist in extending the Poisson distribution to more general ones: the NBIN-GARCH model of [30], the COM-Poisson INGARCH model of [31], the zero-inflated Poisson GARCH of [32], or the mixed-Poisson integer GARCH (MPINGARCH) of [25], among others. Often for these extensions, an extra parameter is used to define the distribution , in which case this extra parameter can be taken either as known, in which case does not depend on , or as unknown, in which case only depends on a subparameter of . Some integer-valued observation-driven models require using a non-identity admissible mapping in order to be seen as an LODM. For instance, the log-linear Poisson GARCH model of [13] is an LODM by taking , and as the Poisson distribution with mean .
All the above examples are LODMs with a similar parametrization of the linear link function. In fact they only differ through the admissible mapping or the observation kernel . We assemble them using the following definition.
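For readers who want to experiment with these examples, the following is a minimal simulation sketch of a Poisson INGARCH(1,1) path, using only the Python standard library. It assumes the recursion x_{k+1} = omega + a * x_k + b * y_k with y_k drawn from a Poisson distribution with mean x_k; the names `simulate_ingarch11` and `poisson_draw`, the parameter names, and the choice of Knuth's multiplication method for Poisson sampling are ours and are not taken from the cited references.

```python
import math
import random
from typing import List, Tuple


def poisson_draw(rng: random.Random, mean: float) -> int:
    """Draw from a Poisson distribution using Knuth's method (suited to small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate_ingarch11(omega: float, a: float, b: float,
                       n: int, x0: float, seed: int = 0) -> Tuple[List[float], List[int]]:
    """Simulate hidden intensities xs and counts ys of a Poisson INGARCH(1,1).

    Assumed recursion: y_k ~ Poisson(x_k), then x_{k+1} = omega + a * x_k + b * y_k.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    x = x0
    for _ in range(n):
        y = poisson_draw(rng, x)
        xs.append(x)
        ys.append(y)
        # Hidden variable updated linearly from past hidden and observed values.
        x = omega + a * x + b * y
    return xs, ys


xs, ys = simulate_ingarch11(omega=1.0, a=0.3, b=0.4, n=500, x0=1.0)
```

Swapping `poisson_draw` for another count distribution with the same mean parametrization would give a sketch of the extensions listed above, with the linear update left unchanged.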
Definition \thedefinition (Standard LODM (with unknown observation kernel)).
An LODM() of Section 2.1 is said to be standard if with , for all and for all , and does not depend on , in which case we denote it by . It is said to be standard with unknown observation kernel if the same holds with where and is some parameter set, and only depends on , in which case we denote it by .
In this definition the parameter is used in the case where the observation kernel depends on an unknown extra parameter, as considered in [25] for the class of MPINGARCH() models which include the NBIN GARCH model. A necessary and sufficient condition for standard LODMs with known or unknown observation kernel is provided in Theorem 2 below and applies to all the examples listed in this section.
3.2. A bivariate example
Let us extend Section 3.1 to the vector case as follows.
Definition \thedefinition (Standard VLODM (with unknown observation kernel)).
A VLODM() is said to be standard if with , for all and for all , and does not depend on , in which case we denote it by . It is said to be standard with unknown observation kernel if the same holds with where and is some parameter set, and only depends on , in which case we denote it by .
Then the bivariate integer valued GARCH model of [9] is a standard VLODM() with unknown observation kernel defined for all (where is some constant), and , by
[TABLE]
where . Since we have in this example, we simply denote .
3.3. Non-linear GARCH
The non-linear GARCH model of [17] is an ODM() with
[TABLE]
where is a real valued random variable. Two cases are considered in [17] :
- Case 1)
If the exponent is known, we set with for , and , in which case we have a standard VLODM of Section 3.2 with known observation kernel and with the parameters denoted by for and the parameters denoted by for .
- Case 2)
If the exponent is unknown, we set and , in which case must be included in the definition of .
To the best of our knowledge, this kind of model has not been extended to the case of (signed) integer-valued time series.
3.4. The SETPAR model
Other non-linear ODM’s that cannot be cast into an LODM or a VLODM can be found in [6]. We consider here the self-excited threshold Poisson autoregression (SETPAR) model originally studied in [28], which is an ODM(1,1), integer valued (), with link function defined for all by
[TABLE]
with being the usual Poisson distribution with mean .
3.5. The PARX model
Our last example is the Poisson autoregression with exogenous covariates (PARX) model of [1]. The PARX model is similar to the standard INGARCH() model above but with additional exogenous variables entering into the link function for generating the hidden variables. The exogenous variables are assumed to satisfy some Markov dynamic of order 1 (see [1, Assumption 1]). Thus it is an ODMX(). Eq. (1) in [1] corresponds to setting our as the Poisson distribution with mean . Eq. (2) in [1] corresponds to setting for all and ,
[TABLE]
where is a known function and is the unknown parameter of the model. Note that our , , , and correspond to their , , , and , respectively. Identifiability is considered in [1] by specifying as for some positive integer (which corresponds to in [1]) and as being of the form
[TABLE]
for some known functions . It is in fact imposed in [1] that and actually is a function of for each , but this constraint can be dropped to achieve wider generality without additional theoretical difficulties. The specific form of in (3.4) amounts in our setting to specializing the previous ODMX() with reduced link function as in (3.3) to a VLODMX() with , for , and for . Then with and it follows that is a subset of .
4. Main results
4.1. General setting
To investigate the identifiability of the model, we first introduce an assumption which says how much can be identified from a single observation of the conditional distribution .
- (B-1)
For all there exists such that, for all and ,
[TABLE]
It can be convenient to write the parameters as so that only depends on , hence can be denoted by , and the link function only depends on , hence can be denoted by . In this case, the “if” in (B-1) holds by setting for , and the “only if” in (B-1) says that is one-to-one. In many examples does not depend on at all, in which case . See (SL’-3) below in Section 5.1 for such a case.
Our approach to establish identifiability is given by the following general result.
Proposition \theproposition.
Consider an ODM satisfying (A-1) with or an ODMX satisfying (A’-1) with . Let be a class of -measurable functions satisfying (A-2). Suppose moreover that (B-1) holds. Then, for all , we have
[TABLE]
where , and are respectively defined in Section 2.3, Assumption (B-1) and Section 2.3.
The proof is postponed to Section 6.2 for convenience. We now derive the main result of this section, which provides sufficient conditions in order to fully describe the set . To this end, we introduce the following assumption, in which, by saying that a probability measure on is non-degenerate with respect to the class , we mean that, for any , can only be true if .
- (A-7)
For all and , the measure defined by (2.6) on is non-degenerate with respect to the class ,
where denotes the class containing all sets for which there exist , , , and such that
[TABLE]
Remark 6**.**
The non-degenerate assumption (A-7) is easy to check in the two following cases.
- (1)
If for all , and , we have , then for any set we have if and only if . Thus (A-7) is immediately satisfied.
- (2)
In the VLODM case, that is, with reduced link function given by (2.8), we immediately see that only contains affine subsets (being the null space of an affine function). Hence we only need to require that does not have full measure on affine hyperplanes to ensure that it is non-degenerate with respect to the class .
In the case of an ODMX, since the definition of is different (see Remark 1(5)), Assumption (A-7) has to be adapted into the following.
- (A’-7)
For all and , the measure defined by (2.7) on is non-degenerate with respect to the class .
We have the following result.
Theorem 1**.**
Consider an ODM satisfying (A-1) and (A-7) with or an ODMX satisfying (A’-1) and (A’-7) with . Assume that (A-3)–(A-6) hold. For all , define and by (2.25). Then (A-2) holds and we have, for all ,
[TABLE]
where and are as in Section 2.3 and Section 2.3.
Proof.
The fact that (A-2) holds for the given choice of follows from Section 2.3.
Let us now take and and show that belongs to the right-hand side of (4.1). We prove this in the case of an ODMX satisfying (A’-1). (The case of an ODM is readily obtained by removing the variables ’s in the reasoning.) By (2.17), (2.20) and (2.21), and since is stationary, we have, for all ,
[TABLE]
Using (2.3) and the notation introduced in Section 2.2, we get that, for any ,
[TABLE]
We now show that this implies
- (Hk)
For all , we have
by iterative reasoning on . First observe that (4.2) corresponds to H0 (since is an empty sequence). Now assume that Hk holds for some . Then, for any , the set
[TABLE]
has probability 1 under the -conditional probability of given . This conditional probability is defined by (2.7). By (A’-7) and since, given , we have , , we obtain that Hk+1 is true. Reasoning by induction, this leads to Hn+1, and finally, we get that
- (H)
there exists such that for all and all , .
Now let . By definition of , there exists , such that . Now, for all , we have
[TABLE]
where we chose in order to apply Assertion (H) in the second equality. On the other hand, by (2.26), we have
[TABLE]
where we again used that were chosen as in (H) in the last equality. With (A-6) and the previous display we obtain . This is true for an arbitrary ; hence, we have obtained that the left-hand side of (4.1) is included in its right-hand side.
We now prove the opposite inclusion. Let such that
[TABLE]
Let . Take an arbitrary . Then there exists such that . For all and , we get that
[TABLE]
Applying (4.3) recursively in , we get that, for any ,
[TABLE]
Hence, and by (2.26) and (2.27), we get that for all , . By Section 2.3 we have , , and using (2.31), we get that , which concludes the proof. ∎
Note that in (4.1), as in the left-hand side of (2.30), the functions and are only required to coincide on whereas in the left-hand side of (2.16), they coincide on the whole set . In some cases, we can prove that the two conditions are the same, so that Section 4.1 and Theorem 1 allow us to conclude that the inclusion in (2.16) is in fact an equality, as in the following result.
Corollary.
Suppose that the assumptions of Theorem 1 and (B-1) hold. Let . Then the inclusion in (2.30) is an equality. Suppose moreover that satisfies the following additional assumption.
- (A-8)
For all , and , if and coincide on the set , then they also coincide on .
Then the inclusion in (2.16) is an equality.
Proof.
Applying Section 4.1 and Theorem 1, we get that
[TABLE]
Observing that, by (B-1), is equivalent to have , we get that the inclusion in (2.30) is an equality. To prove the second assertion of the corollary, we only need to check that, under (A-8), for all and , if and coincide on , they must also coincide on . It suffices to show that there exists such that . This inclusion is true if and we conclude by observing that is not empty since it contains , , as a consequence of Remark 4. ∎
Remark 7.
A simple case where (A-8) in Section 4.1 is easy to check is when , so that and for all and . See the proof of Theorem 4 for a specific example. However it may happen that (A-8) is not satisfied as will be seen in Remark 8(5). In the linear case, we will characterize in Section 4.2 without relying on (A-8).
4.2. Vector linear setting
We now consider a VLODM() or a VLODMX(), that is, we assume the reduced link function to be of the form (2.8). We set in this case (resp. ) where denotes an arbitrary norm in (resp. ). The general conditions reduce to the following set of conditions.
- (L-1)
For all , we have that is invertible for all with ,
where denotes the identity matrix of order .
- (L-2)
For all and , the measure defined on by (2.6) is non-degenerate in the following sense: there is no affine hyperplane such that .
Note that, if , affine hyperplanes are singletons, hence (L-2) simply means that, for all and , does not reduce to a unit mass concentrated on a single point. In the case of a VLODMX, we replace (L-2) by the following.
- (L’-2)
For all and , the measure defined on by (2.7) is non-degenerate in the following sense: there is no affine hyperplane such that .
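To make the non-degeneracy requirement concrete, here is a minimal sketch for the special case of a finitely supported distribution: such a distribution gives probability one to an affine hyperplane exactly when all of its support points lie in a common affine hyperplane. The function names `affine_rank` and `lies_on_affine_hyperplane` are hypothetical helpers introduced only for this illustration, and the restriction to finite support is an assumption of the sketch, not of the paper.

```python
from fractions import Fraction


def affine_rank(points):
    """Rank of the differences x_i - x_0, computed exactly with fractions."""
    base = points[0]
    rows = [[Fraction(x) - Fraction(b) for x, b in zip(p, base)]
            for p in points[1:]]
    rank = 0
    for col in range(len(base)):
        # Look for a pivot in this column among the rows not yet used.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0),
                     None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Eliminate the column below the pivot row.
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def lies_on_affine_hyperplane(points):
    """True if all points fit in one affine hyperplane of the ambient space."""
    return affine_rank(points) < len(points[0])


# Three collinear points in the plane: degenerate support.
print(lies_on_affine_hyperplane([(0, 0), (1, 1), (2, 2)]))
# Three points in general position in the plane: non-degenerate support.
print(lies_on_affine_hyperplane([(0, 0), (1, 0), (0, 1)]))
```

In dimension one the test reduces to checking whether all support points are equal, in line with the observation that affine hyperplanes are then singletons.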
Finally the moment condition (A-5) simplifies in the vector linear case to
- (L-3)
The invariant probability measure of Section 2.3 satisfies, for all ,
[TABLE]
We have the following result, whose proof is postponed to Section 6.3 for convenience, that relates this set of assumptions to the general ones.
Lemma.
Consider the vector linear setting where (2.8) holds, and and are closed subsets of and , respectively, with and being the metrics induced by norms on these spaces. The following assertions hold.
- (i)
Assumption (A-3) holds.
- (ii)
Assumption (A-4) is equivalent to (L-1).
- (iii)
Assumption (L-3) implies (A-5) for any , and any .
- (iv)
Assumption (A-6) holds.
- (v)
Assumption (L-2) implies (A-7).
- (vi)
Assumption (L’-2) implies (A’-7).
As a consequence, the assumptions of Theorem 1 are implied by (A-1), (L-1), (L-2) and (L-3) in the VLODM case and by (A’-1), (L-1), (L’-2) and (L-3) in the VLODMX case.
We now provide a simple characterization of in (4.1) in the linear case. The proof of the following result can be found in Section 6.4.
Lemma.
Suppose that and that (L-1) holds, and let be as in Section 2.3. For all , define as the rational matrix
[TABLE]
which is well defined on except for at most finitely many ’s. Then, for all , the two following assertions are equivalent.
- (i)
We have for all and .
- (ii)
The two following identities hold
[TABLE]
The identification of a parameter based on the equation (4.7) is similar to the identifiability of a vector auto-regressive moving average of order (VARMA()) model with AR matrices and MA matrices . Indeed, in such a model the spectral density matrix takes the form
[TABLE]
where is the covariance matrix of the noise. We refer to [18] where identifiable parametrizations of ARMA models are discussed. Below we provide an important result related to this general issue. Let be positive integers. For all and , let us define the polynomial matrices respectively valued in and
[TABLE]
Note that, for all , in (4.8) must be invertible for large enough (since then dominates). When a polynomial matrix is invertible for at least one , then it is invertible for all , except at most a finite number of them. It is then said to be non-singular. Thus, for all and , we can define the rational matrix , which is well defined for all , except at most a finite number of them.
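The dichotomy above rests on the fact that the determinant of a polynomial matrix is itself a polynomial, which either vanishes identically or has at most finitely many roots. The following minimal sketch illustrates this for 2x2 polynomial matrices only. Polynomials are stored as coefficient lists in increasing degree, and the names `poly_mul`, `poly_sub`, `trim`, `det2` and `is_non_singular`, as well as the example matrices, are hypothetical choices made for this illustration.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (low degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += Fraction(a) * Fraction(b)
    return out


def poly_sub(p, q):
    """Difference p - q of two polynomials."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [Fraction(a) - Fraction(b) for a, b in zip(p, q)]


def trim(p):
    """Drop trailing zero coefficients; the zero polynomial becomes []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def det2(m):
    """Determinant polynomial of a 2x2 polynomial matrix."""
    (a, b), (c, d) = m
    return trim(poly_sub(poly_mul(a, d), poly_mul(b, c)))


def is_non_singular(m):
    """Non-singular means the determinant is not the zero polynomial."""
    return len(det2(m)) > 0


# Example matrix [[1 - z/2, z], [0, 1 - z/3]] (an arbitrary illustration).
example = [[[1, Fraction(-1, 2)], [0, 1]],
           [[0], [1, Fraction(-1, 3)]]]
det = det2(example)
print(det)                   # coefficients of the determinant polynomial
print(len(det) - 1)          # its degree bounds the number of singular points
print(is_non_singular([[[1], [1]], [[1], [1]]]))  # identically singular case
```

For the example matrix the determinant is (1 - z/2)(1 - z/3), so the matrix fails to be invertible only at z = 2 and z = 3.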
Lemma.
Let be positive integers. Then, for any and , the two following assertions hold.
- (i)
Suppose that and are left coprime. Then, for all and , we have
[TABLE]
if and only if and .
- (ii)
Suppose that and are not left coprime. Then, there exist and such that, for all , setting and , we have
[TABLE]
Two polynomial matrices with the same number of rows are said to be left coprime if they admit the identity matrix as a greatest common left divisor (g.c.l.d.), that is, every common left divisor of them is also a left divisor of . The set of polynomial matrices of order is a non-commutative ring for . This is why for a notion of left (or right) divisor is necessary. Note, however, that if , saying that and are left coprime is equivalent to saying that and the row entries of (which is a -dimensional row vector of polynomials of degree at most ) are coprime, that is, they have 1 as greatest common divisor. In particular if , this boils down to saying that and have no common roots. The case is significantly more complicated and we refer to [22, Chapter III] for an excellent introduction on polynomials on Euclidean rings that applies to matrices of polynomials.
Proof.
Let and . In this proof section, for convenience we denote and by and .
Proof of Assertion (i). Suppose that and are left coprime. The Bezout theorem for matrices of polynomials (see e.g. [22, Theorem 3.1]) gives that there exist two polynomial matrices and of order and respectively, such that
[TABLE]
Let and , and denote and . The “if” in Assertion (i) is obvious, so we only need to assume that
[TABLE]
and prove that (in which case we also get that ). Define the rational matrix . Multiplying both sides of (4.9) by from the left, we have
[TABLE]
where we used (4.10) in the second equality. Hence we get that is a polynomial matrix and since and both and are of the form a polynomial of degree at most , we get that and so , which concludes the proof of (i).
Proof of Assertion (ii). Suppose that and are not left coprime. Let be a g.c.l.d. of . Then is a polynomial matrix that left-divides and and changing this polynomial matrix does not change the rational matrix . The difficulty is to show that we can modify a left divisor of and in such a way that the resulting and are still of the form (4.8) for some well chosen and . To this end we must first choose in a special form. Indeed, for any unimodular polynomial matrix , is also a g.c.l.d. of . The polynomial matrix is called a right-associate of , and by [22, Theorem 22.1], we can choose so that is in Hermite normal form, that is, such that is lower triangular,
[TABLE]
with the polynomial is unitary with degree strictly larger than those of for all (that is, the degree on the diagonal dominates those of the same column). Let
[TABLE]
From what precedes, this min exists (the set is not empty), otherwise we would have and and would be left coprime. We can thus write, for some ,
[TABLE]
where is the zero matrix of size . By convention, if , reduces to , that is back to its previous form. The important point is that with this definition of , we know that is unitary with . Now, since is a left divisor of and we may write
[TABLE]
for some matrices and of respective sizes and . We write and in a block matrix form compatible with that of , that is
[TABLE]
with and of respective sizes and (again with the convention that these matrices vanish if ). Then we have
[TABLE]
In particular the first row of is times the first row of , and since is unitary with , the form of implies that the first row of is made of polynomials of degrees at most and cannot be zero (since the degree of the entry row of is exactly and all the other entries are zero). Similarly, the first row of is made of polynomials of degrees at most (since is of degree ). Let denote the diagonal matrix of order with zeros on its diagonal except on the -th entry where it is 1. Since and only keep the first rows of the block matrices and , respectively, and set all other entries to zero, we get from what precedes that and are of degree at most and , respectively, and that is not zero. Hence we may find and (with ) such that
[TABLE]
Then, for all , setting and , we have
[TABLE]
and, similarly,
[TABLE]
Then we get
[TABLE]
which concludes the proof of Assertion (ii). ∎
5. Applications
We now use our results to derive necessary and sufficient conditions for having identifiability in the examples of Section 3. Many other examples can be obtained by combining various observation kernels and link functions with or without exogenous covariates.
5.1. Standard LODMs
Let us apply the results of Section 4.2 in the case of standard LODMs as defined in Section 3.1. The moment assumption (L-3) can be readily used for standard LODMs with known or unknown observation kernel, with in (4.4) denoting the usual absolute value. The other assumptions of the general VLODM listed in Section 4.2 can be simplified as follows.
For a standard LODM, Assumption (L-1) becomes
- (SL-1)
For all , we have for all such that .
In the case of a standard LODM with unknown observation kernel it becomes
- (SL’-1)
For all with , we have for all such that .
As for (L-2), it becomes
- (SL-2)
For all , does not degenerate to a single point for , that is, for all , we have ;
and, in case of an unknown observation kernel,
- (SL’-2)
For all , and , we have .
Finally (B-1) becomes
- (SL-3)
For all , if and only if ;
and, in the case of an unknown observation kernel, it reads as
- (SL’-3)
For all and in , for all , we have
[TABLE]
This says that the class in (B-1) is given by .
Remarkably, all LODMs share the same necessary and sufficient condition for identifiability, which can be expressed as follows, using and to denote the true linear coefficients of the linear link function.
- (SL-4)
The polynomials and have no common complex roots.
We can now state the following result, which says that the true parameter in the interior of is identifiable if and only if (SL-4) holds.
Theorem 2.
Consider a standard LODM satisfying (A-1) and (L-3) for some , and suppose that . In the case of a known observation kernel, suppose that (SL-1)–(SL-3) hold. In the case of an unknown observation kernel, suppose that (SL’-1)–(SL’-3) hold. Then the inclusion in (2.30) is an equality for all . In the case of a known observation kernel, Assertions (i) and (ii) below hold for any . In the case of an unknown observation kernel, Assertions (i) and (iii) below hold for any .
- (i)
Condition (SL-4) implies that reduces to the singleton .
- (ii)
If Condition (SL-4) does not hold, then there exists an open segment of positive length and containing such that .
- (iii)
If (SL-4) does not hold, then there exists an open segment of positive length and containing such that .
Remark 8.
Let us briefly comment on this result.
- (1)
The ergodicity of all the examples of Section 3.1 has been studied in the provided references and the parameter set is always chosen to satisfy the assumptions of Theorem 2 in these references.
- (2)
If , condition (SL-4) is reduced to . Let us see what would imply about the identifiability of the model in this simple case. Taking as in (2.8) with , if , and , then is a deterministic sequence which, under the stationary distribution, has to be constantly equal to . But since the distribution of is then uniquely defined by this constant, if one can find a parameter with corresponding coefficients such that , yielding the same constant , we see that the model is not identifiable.
- (3)
Condition (SL-4) holds for “many” parameters , e.g. for Lebesgue almost all ones in .
- (4)
The identifiability condition (SL-4) is a well known sufficient condition in the standard GARCH models, see [14, (A4)] or [2, Condition (2.27)]. Assertion (ii) in Theorem 2 shows that it is also necessary at least for all parameters in the interior set of .
- (5)
Suppose that and both contain at least two different points and take and to be all non-zero. Then it is easy to show for a standard LODM with known observation kernel that for all and implies and thus, we get that the left-hand side of (2.16) reduces to the singleton . Since, as explained previously, (SL-4) is necessary to have for all in the interior set of , we easily get examples for which the inclusion in (2.16) is strict.
- (6)
Theorem 2 can be applied to all the models mentioned in Section 3.1. Let us examine the case of the MPINGARCH() model of [25], which constitutes a rich class of integer valued models. An MPINGARCH() model is an LODM() model with unknown observation kernel defined as a mixed Poisson distribution with mean and variance proportional to , and . In [25, Theorem 1], the sufficient conditions for having (A-1) imply (L-3) (since ) and (SL-1) (since they imply , with ). Conditions (SL’-1) and (SL’-2) also hold by definition of . Hence Theorem 2 applies and we get that (SL-4) is a sufficient condition for identifiability. It is also necessary by Assertion (iii) of the theorem, at least for parameters in the interior of . This condition seems to be missing in [25, Theorem 2].
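Since (SL-4) only asks whether two polynomials share a complex root, it can be checked with a Euclidean algorithm: two polynomials that are not both zero have a common complex root exactly when their greatest common divisor has degree at least one. The sketch below is a hedged illustration of that check with exact rational arithmetic. It does not build the two polynomials of (SL-4) from the model coefficients; the caller is assumed to supply their coefficient lists (in increasing degree), and the names `poly_rem`, `poly_gcd` and `have_common_root` are hypothetical.

```python
from fractions import Fraction


def trim(p):
    """Drop trailing zero coefficients; the zero polynomial becomes []."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def poly_rem(a, b):
    """Remainder of the Euclidean division of a by b (b must be non-zero)."""
    a, b = trim(a), trim(b)
    while len(a) >= len(b):
        factor = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] -= factor * c
        a = trim(a)  # the leading term cancels exactly with fractions
    return a


def poly_gcd(a, b):
    """A greatest common divisor of a and b, defined up to a non-zero constant."""
    a = trim([Fraction(c) for c in a])
    b = trim([Fraction(c) for c in b])
    while b:
        a, b = b, poly_rem(a, b)
    return a


def have_common_root(a, b):
    """True if a and b share at least one complex root."""
    return len(poly_gcd(a, b)) > 1


# (1 - z/2)(1 - z/3) and z(1 - z/2) share the root z = 2.
print(have_common_root([1, Fraction(-5, 6), Fraction(1, 6)],
                       [0, 1, Fraction(-1, 2)]))
# 1 - z/2 and z have no common root.
print(have_common_root([1, Fraction(-1, 2)], [0, 1]))
```

Exact fractions are used because the presence of a common root is not stable under rounding; with floating-point coefficients a tolerance-based criterion would be needed instead.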
Proof of Theorem 2.
We only consider the case with unknown observation kernel (the case with known observation kernel is obtained by removing the additional parameter ).
Let with . As explained previously, the assumptions of Theorem 2 are adapted from those derived in Section 4.2. In particular, we have that (L-1)–(L-3) hold. Hence Section 4.2 implies that (A-3)–(A-7) hold in the general setting with . Applying Theorem 1, we get that (A-2) holds and that
[TABLE]
where and are as in Section 2.3 and Section 2.3. Remember that (SL’-3) says that (B-1) holds with
[TABLE]
By Section 4.1, we get that the inclusion in (2.30) is an equality and
[TABLE]
Note that we assumed that and that (SL’-2) implies , hence we can apply Section 4.2 which gives that is the set of all such that
[TABLE]
Applying Section 4.2, we easily get Assertions (i) and (iii). ∎
5.2. The bivariate example of Section 3.2
Theorem 2 can be extended to the standard VLODM case of Section 3.2. Here, for brevity, we do not re-express the general VLODM assumptions (L-1)-(L-3) in the standard setting as we did previously for standard LODMs. We only need to introduce the condition
- (SL’-4)
The polynomials and are left-coprime,
which extends (SL-4) to the case . The proof of the following result mimics the one of Theorem 2 and is thus omitted.
Theorem 3.
Consider a standard VLODM satisfying (A-1) for some . Suppose that and that (L-1)-(L-3) and (SL’-3) hold. Then, for all , the inclusion in (2.30) is an equality and the two following assertions hold.
- (i)
Condition (SL’-4) implies that reduces to the singleton .
- (ii)
If (SL’-4) does not hold, then there exists an open segment of positive length and containing such that .
Remark 9.
In Theorem 3, for brevity, we only stated the case with unknown observation kernel. The case with known observation kernel follows by removing the parameters and in the statement and by replacing (SL’-3) by (SL-3).
Since the bivariate integer valued GARCH model of Section 3.2 is a standard VLODM() with unknown observation kernel, we just need to check the assumptions of Theorem 3. Ergodicity (hence our Assumption (A-1)) is stated in [9, Theorem 1] under their set of conditions (a) on the parameter . Their conditions for ergodicity imply some operator norm of to be strictly less than 1, which implies to be invertible for and thus (L-1) holds. Since for all , the bivariate distribution defined by (3.1) has positive probability on all points , it cannot have probability one on a line of , hence (L-2). Also it is claimed following [9, Theorem 1] that is well defined and thus (L-3) holds. Applying Theorem 3, we get that, for any interior point of the parameter space, Condition (SL’-4) is a necessary and sufficient condition to have identifiability of . In the bivariate integer valued GARCH model (for which ), this condition reads for as and to be left coprime. If does not satisfy this condition, consistent estimation of is not possible. Hence we believe that this assumption is missing in [9, Theorem 2]. A precise counter-example is for instance obtained by setting
[TABLE]
where are arbitrary (and chosen in order to make in the interior of ). One can show that and are not left coprime since they both admit the same non-unimodular left divisor as shown by the following identities:
[TABLE]
5.3. The non-linear GARCH of Section 3.3
Consider Case 1) of Section 3.3, for which is not included in the set of parameters. Then the non-linear GARCH model is a standard VLODM() model and identifiability can be treated using Theorem 3. Using that , for all , in (2.6) (we omit as does not depend on here) has support included in , where it is defined, for all Borel set by
[TABLE]
It follows that our Assumption (L-2) is equivalent to having that and that there is no pair , , such that , which is exactly the condition appearing in the second part of [17, A3]. Our conditions (L-1) (which here, since for all simply reads ) and (L-3) are usual byproducts of showing the ergodicity condition (A-1), see [17, Appendix A]. One can thus apply our Theorem 3 (in its known observation kernel version, see Remark 9) and obtain the necessary and sufficient condition (SL’-4) which in the case where the parameters are denoted by for and the parameters denoted by for , becomes, for any with for ,
- (NLG-1)
The polynomial has no common complex roots with either the polynomial or the polynomial .
This condition is similar to that appearing in the identifiability condition [17, A4] used in a mis-specified context. Our result shows that this condition is necessary in the interior of the parameter set in the well-specified case, and remains valid for much larger choices of observation kernels.
5.4. The SETPAR model of Section 3.4
We have the following result for the self-excited threshold Poisson autoregression model.
Theorem 4.
Consider the SETPAR model introduced in Section 3.4. Let
[TABLE]
Then (A-1) holds. Let satisfy at least one of the two following conditions.
- (i)
and ;
- (ii)
.
Then we have
[TABLE]
where is defined by (3.2).
Remark 10.
Let us briefly comment on this result.
- (1)
The case where neither (i) nor (ii) hold ( or ) is somewhat degenerate, similarly to the non-threshold case mentioned in Remark 8(2). We think it should be treated separately but we omit this very special case here for brevity.
- (2)
As explained after (2.16), the identity (5.1) is the best we could hope for this model since the distribution of the observations is entirely determined by the mapping on .
- (3)
The identity (5.1) shows in particular that is not identifiable if (since changing will have no effect on the mapping on ). Another case of non-identifiability is when and . In such a case, we have for all ,
[TABLE]
Then, setting , we immediately have that for all and . In particular consistent estimation of as claimed in [28, Theorem 2] is not possible for such a parameter .
Proof of Theorem 4.
A natural choice for is but in order to meet Assumption (A-3) with we take with arbitrarily set to be Bernoulli with mean for convenience (it actually has no influence on since in the condition on ). We set so that for all and the reduced link function is the same as the non-reduced one. Moreover since , we are in the case where for all , and for all . By [28, Theorem 1], with satisfying the given condition, Assumption (A-1) holds, and, moreover, for any and , (in fact, one can prove that, for any , there exists such that ). This moment condition implies that the log moment condition (A-5) holds for any . Clearly, we have and, since , for all . Thus the above condition on also implies (A-4). As for (A-6), it trivially holds (since is fixed in this condition). Hence with Section 2.3, we get that (A-1)–(A-6) hold, with definitions (2.26) for checking (A-2). Assumption (B-1) is also immediate with and Section 4.1 gives that, for any ,
[TABLE]
where can be defined by (2.31). Assumption (A-7) holds by Remark 6(1). Hence all the assumptions of Theorem 1 hold. Take now satisfying (i) or (ii). Let , and . To conclude the proof, it is now sufficient to check that if and coincide on , then they must coincide on , so that we can apply Section 4.1. Observe that by definition of the link function in (3.2), if (i) or (ii) holds, then takes at least two different values on . Since for all , these two different values belong to . Now, since and are affine functions, if they coincide at two points they must coincide everywhere, and the proof is concluded. ∎
5.5. The PARX model of Section 3.5
We have the following result for the Poisson autoregression model with exogenous covariates.
Theorem 5.
Consider the PARX model defined in Section 3.5, which is a VLODMX(). Suppose that (A’-1), (L-1) and (L-3) hold, and that the exogenous kernel satisfies the following.
- (P-1)
We have for all and affine hyperplanes .
Then, for all , the equivalent class reduces to the singleton .
Remark 11.
Let us briefly comment on this result.
- (1)
(P-1) is a natural assumption as it basically says that the covariates are not linearly related conditionally to . If they were, it would suggest using a smaller set of covariates.
- (2)
The ergodicity of PARX models (our assumption (A’-1)) is treated in [1, Theorem 1] under some assumption on the covariate kernel (in their assumption 2). Their assumption 3 used for proving (A’-1) implies which implies our assumption (L-1) (since for all ). Note also that [1, Theorem 1] implies that and thus our assumption (L-3). On the other hand, their identifiability condition [1, Assumption 5] includes a condition on parameters similar to our condition (SL-4) used for standard LODMs in Theorem 2 above. Theorem 5 shows that such a condition can in fact be dropped in case of exogenous covariates provided that the mild condition (P-1) holds.
- (3)
This result is similar to Theorem 2 for the standard LODM. It is of interest to note that there is no additional condition on for identifiability.
- (4)
Theorem 5 easily extends to more general observation kernels (known or unknown) , provided that, as in Theorem 2, assumptions (SL-2)-(SL-3) hold if the observation kernel is known, or (SL’-2)-(SL’-3) if it is unknown. However, the ergodicity would require a specific treatment in these cases, as only the Poisson case has been considered to our knowledge.
Proof of Theorem 5.
For the PARX model we set and (with arbitrarily set, say, to be Bernoulli with mean , as in the proof of Theorem 4) and (A-3) holds with and where here is an arbitrary norm on . In this case, we have that, for all , if with valued in , valued in and defined by (2.7), we have that and are independent and follows a Poisson distribution. Thus (L’-2) is equivalent to (P-1). We can thus apply Section 4.2 and get that Assumptions (A-3), (A-4), (A-5), (A-6) and (A’-7) hold with . Then, having assumed (A’-1), we can apply Theorem 1 and get that (A-2) holds as well as the identity (4.1). Assumption (B-1) is immediate with and, applying Section 4.1 and the previous display, we get that
[TABLE]
where and are as in Section 2.3 and Section 2.3. Section 4.2 and the definitions of and in Section 3.5 now give that is the set of all such that
[TABLE]
Note that the last line in this set of equations is equivalent to
[TABLE]
But the first two lines of the previous display give that and , thus and the proof is concluded. ∎
6. Postponed Proofs
6.1. Proof of Section 2.3
We first derive the following result.
Lemma.
(A-4) implies that for all , there exist and such that for all .
Proof.
By (2.22), (2.23) and (2.15), we have, for all , using the convention for ,
[TABLE]
Hence (A-4) implies that there exists and such that, for all , is -Lipschitz. Now observe that, by (2.14), for all with and , for all , we can write as
[TABLE]
and in this composition, the first functions are Lipschitz and the last one is -Lipschitz. Hence, for all ,
[TABLE]
Hence the result by setting . ∎
We can now prove Section 2.3. Let and let . Denote for all ,
[TABLE]
Then, by (2.13) we have, for all ,
[TABLE]
and, by (2.22), we get
[TABLE]
Using (2.23) with and and we get that
[TABLE]
Hence, for all , Condition (2.24) implies as ,
[TABLE]
The last display with (6.2) and Section 6.1 gives that, , is a Cauchy sequence, hence converges in . Therefore, , By (2.1) or (2.2) depending whether an ODM or an ODMX is considered, we also have, under , for all , . Thus, (2.22) also implies
[TABLE]
By stationarity, is bounded in probability under , hence converges to in probability if (A-4) holds. We thus obtain (2.17), and since this holds for all , Assumption (A-2) holds.
Let us now check (2.30). Take and suppose that belongs to the set in left-hand side of the inclusion (2.30), that is, and for all . By Remark 4 we have , Thus we get that and with Remark 2, we obtain that, , (resp. ) satisfy the iterative equations (2.1) (resp. (2.2)), and by (A-1) (resp. (A’-1)), we conclude that . Hence and (2.30) is proved.
Finally, we check that (2.21) holds when is continuous for all . Since we have shown that , and is shift invariant, we have, for all ,
[TABLE]
Observe that, for all , we have, for all ,
[TABLE]
By continuity of and using the previous display, we can take the limit as under and obtain (2.21).
6.2. Proof of Section 4.1
First observe that (2.17) implies for all ,
[TABLE]
Let us now show that any belongs to , that is, , and (2.20) and (2.21) hold true. Since , (6.3), which also holds with replaced by , yields
[TABLE]
By (B-1), we obtain that and (2.20) holds. By Section 2.3, (2.17) implies (2.18), and using , we obtain (2.21). Thus .
It remains to show that . We prove this inclusion in the case of an ODMX satisfying (A’-1). (The case of an ODM is readily obtained by removing the variables ’s in the reasoning). Let such that (2.20) and (2.21) hold true. Since (2.17) holds with replaced by , (2.20) gives that Since is shift invariant, we get, for all , With (2.21), we obtain Since is shift invariant, we thus have, for all ,
[TABLE]
On the other hand, by definition of and using (B-1) with , we have that
[TABLE]
And using again that is shift-invariant, for all ,
[TABLE]
This, with (6.4), shows that is a shift-invariant solution of (2.2). By (A’-1), we conclude that , and thus .
6.3. Proof of Section 4.2
Assertion (i) is obvious.
Proof of Assertion (ii). In the vector linear setting we have, for all and all ,
[TABLE]
where is a norm on and, for all , is a linear mapping from to recursively defined by setting, for all , with
[TABLE]
In particular, we have, for all ,
[TABLE]
and this equation is also true for if the last , -valued, components of are equal to zero. It is well known that the Lipschitz norm of such iterative linear functions goes to zero if and only if (L-1) holds.
Proof of Assertion (iii). Take an arbitrary . If , take also an arbitrary and set . Then, since is of the form (2.8), there exist constants only depending on , and such that, for all ,
[TABLE]
Assertion (iii) follows.
Assertion (iv) is obvious.
Proof of Assertions (v) and (vi). See Remark 6 (2) in the case of a VLODM. The case of a VLODMX is similar.
6.4. Proof of Section 4.2
Note that, by Section 4.2 (ii), in the vector linear case, the set of Section 2.3 is well defined under (L-1). We need the following result whose proof is straightforward, and thus omitted.
Lemma.
Suppose that (L-1) holds and let . Let denote the set of sequences in that are absolutely summable. For any , there is a unique (the set of bounded sequences valued in ) such that
[TABLE]
This unique solution is given by
[TABLE]
where is defined by (4.5) and denotes the Fourier series of defined by
[TABLE]
Let and be as in (2.25) and Section 2.3. Then contains and, for any , defining as the unique solution of (6.6) in , we have, for all ,
[TABLE]
We can now prove Section 4.2.
Proof of Section 4.2.
Step 1: Assertion (i) implies Assertion (ii). Let satisfy Assertion (i) and let us show that (4.6) and (4.7) hold. Take any and . By Section 6.4, and we have
[TABLE]
where the second equality follows from applying successively Assertion (i) with and for . On the other hand by definition of in (2.22), we have, setting and ,
[TABLE]
Since , we have that and are summable sequences (as a consequence of Section 6.4) and so is bounded as . Using Section 4.2 (ii) and we conclude that the upper bound in the last display converges to 0 as . By definition of and the previous display, this gives that . Shifting the sequence , we also have that for all and by Section 6.4, this implies that and share the same unique solution to the equation (6.6). Using the explicit form of this solution in the same lemma, we get that, for all and ,
[TABLE]
where, for all , and , we set . Since we assumed , we can successively take as the zero sequence ( for all , implying ) or proportional to the impulse sequence (, for all , implying ), the previous display successively leads to (4.6) and
[TABLE]
which implies (4.7).
Step 2: Assertion (ii) implies Assertion (i). Let satisfy (4.6) and (4.7), and let us show that Assertion (i) holds. First take . By Section 6.4, (4.6) and (4.7) imply that the recursive equation (6.6) and the one with replaced by share the same bounded solution. Moreover, we have and since are given by the same solution they are equal. Hence we obtain that
[TABLE]
where . To get Assertion (i), since is continuous for and for any , it is now sufficient to prove that is dense in . To this end, pick . Then there exists such that
[TABLE]
For any , we introduce the truncated sequence
[TABLE]
Then and we have
[TABLE]
Moreover, we can write, denoting by the null sequence in ,
[TABLE]
By (2.26) and (6.7) we thus have , and since is arbitrary in we have shown that is dense in and the proof is concluded. ∎
Acknowledgments
In the first version of this contribution, we neither investigated the VLODM case nor the case with exogenous covariates. We are grateful to the two referees that reviewed the first submission, whose fruitful and constructive comments motivated these demanding extensions.
- [1] Agosto et al. [2016] Arianna Agosto, Giuseppe Cavaliere, Dennis Kristensen, and Anders Rahbek. Modeling corporate defaults: Poisson autoregressions with exogenous covariates (PARX). Journal of Empirical Finance, 38:640–663, 2016. ISSN 0927-5398. doi: https://doi.org/10.1016/j.jempfin.2016.02.007. URL http://www.sciencedirect.com/science/article/pii/S0927539816300214. Recent developments in financial econometrics and empirical finance.
- [2] Berkes et al. [2003] István Berkes, Lajos Horváth, and Piotr Kokoszka. GARCH processes: structure and estimation. Bernoulli, 9(2):201–227, 2003. ISSN 1350-7265. doi: 10.3150/bj/1068128975. URL https://doi.org/10.3150/bj/1068128975.
- [3] Bhaskaran et al. [2013] Krishnan Bhaskaran, Antonio Gasparrini, Shakoor Hajat, Liam Smeeth, and Ben Armstrong. Time series regression studies in environmental epidemiology. International Journal of Epidemiology, page dyt092, 2013.
- [4] Bollerslev [2008] Tim Bollerslev. Glossary to ARCH (GARCH). Technical report, CREATES Research Paper, September 2008.
- [5] Bougerol and Picard [1992] P. Bougerol and N. Picard. Stationarity of GARCH processes and of some nonnegative time series. J. Econometrics, 52(1992):115–127, 1992. ISSN 0304-4076. doi: 10.1016/0304-4076(92)90067-2.
- [6] Christou and Fokianos [2015a] Vasiliki Christou and Konstantinos Fokianos. Estimation and testing linearity for non-linear mixed Poisson autoregressions. Electron. J. Stat., 9(1):1357–1377, 2015a. doi: 10.1214/15-EJS1044. URL https://doi.org/10.1214/15-EJS1044.
- [7] Christou and Fokianos [2015b] Vasiliki Christou and Konstantinos Fokianos. On count time series prediction. Journal of Statistical Computation and Simulation, 85(2):357–373, 2015b.
- [8] Cox [1981] D. R. Cox. Statistical analysis of time-series: some recent developments. Scand. J. Statist., 8(2):93–115, 1981. ISSN 0303-6898.
