Necessary and sufficient conditions for the identifiability of observation-driven models

François Roueff (IDS, S2A); Randal Douc (TIPIC-SAMOVAR, CITI); Tepmony Sim (ITC)

arXiv:1904.02893·math.ST·May 13, 2020


TL;DR

This paper establishes necessary and sufficient conditions for the identifiability of observation-driven models, including GARCH and integer-valued time series models, ensuring the consistency of estimators.

Contribution

It extends the identifiability conditions from GARCH models to a broader class called linearly observation-driven models, covering various standard time series models.

Findings

01

Identifiability conditions are established for a broad class of models.

02

Conditions ensure the consistency of quasi-maximum likelihood estimators.

03

Includes standard models like Poisson GARCH and NBIN-GARCH.

Abstract

In this contribution we are interested in proving that a given observation-driven model is identifiable. In the case of a GARCH(p, q) model, a simple sufficient condition has been established in [1] for showing the consistency of the quasi-maximum likelihood estimator. It turns out that this condition applies for a much larger class of observation-driven models, that we call the class of linearly observation-driven models. This class includes standard integer valued observation-driven time series, such as the log-linear Poisson GARCH or the NBIN-GARCH models.

Equations

$$Y_{k}\mid\mathcal{F}_{k}\sim G^{\theta}(X_{k};\cdot),\qquad X_{k+1}=\tilde{\psi}^{\theta}_{U_{(k-q+1):k}}(X_{(k-p+1):k}),$$

$$V_{k}\mid\mathcal{F}_{k}\sim H(V_{(k-r):(k-1)};\cdot),\qquad Y_{k}\mid\mathcal{F}_{k}\sim G^{\theta}(X_{k};\cdot),\qquad X_{k+1}=\tilde{\psi}^{\theta}_{U_{(k-q+1):k}}(X_{(k-p+1):k}),$$

$$Z_{k}=\left(X_{(k-p+1):k},U_{(k-q+1):(k-1)}\right)\in\mathsf{Z},$$

$$\mathsf{Z}=\mathsf{X}^{p}\times\mathrm{U}^{q-1}\quad\text{endowed with the $\sigma$-field}\quad\mathcal{Z}=\mathcal{X}^{\otimes p}\otimes\mathcal{U}^{\otimes(q-1)}.$$

$$\psi^{\theta}_{y_{1:q}}(x)=\tilde{\psi}^{\theta}_{u_{1:q}}(x)\quad\text{with $u_{k}=\Upsilon(y_{k})$ for $1\leq k\leq q$.}$$

$\tilde{G}^{\theta}(x,A)$

$\tilde{G}^{\theta}((x,v),A)$

$$\tilde{\psi}^{\theta}_{u}(x)=\boldsymbol{\omega}(\theta)+\sum_{i=1}^{p}A_{i}(\theta)\,x_{p-i}+\sum_{i=1}^{q}B_{i}(\theta)\,u_{q-i},$$

$$\tilde{\psi}^{\theta}\langle u_{0:(k-1)}\rangle(z):=x_{k},$$

$$\begin{cases}u_{j}=z_{p+q+j},&-q<j\leq-1,\\ x_{j}=z_{p+j},&-p<j\leq 0,\\ x_{j}=\tilde{\psi}^{\theta}_{u_{(j-q):(j-1)}}\left(x_{(j-p):(j-1)}\right),&1\leq j\leq k.\end{cases}$$

$$\tilde{\Psi}^{\theta}_{u}:(x_{1:p},u_{1:(q-1)})\mapsto\begin{cases}\left(x_{2:p},\tilde{\psi}^{\theta}_{(u_{1:(q-1)},u)}(x_{1:p}),u_{2:(q-1)},u\right)&\text{if $q>1$,}\\ \left(x_{2:p},\tilde{\psi}^{\theta}_{u}(x_{1:p})\right)&\text{if $q=1$,}\end{cases}$$

$$Z_{k+1}=\tilde{\Psi}^{\theta}_{U_{k}}(Z_{k}).$$

$$\tilde{\Psi}^{\theta}\langle u_{0:(k-1)}\rangle=\tilde{\Psi}^{\theta}_{u_{k-1}}\circ\tilde{\Psi}^{\theta}_{u_{k-2}}\circ\dots\circ\tilde{\Psi}^{\theta}_{u_{0}}.$$

$\tilde{\psi}^{\theta}\langle u\rangle$

$\tilde{\Psi}^{\theta}\langle u_{0:(k-1)}\rangle(z)$

$$\left\{\theta\in\Theta:G^{\theta}=G^{\theta_{\star}}\text{ and }\tilde{\Psi}^{\theta}_{u}(z)=\tilde{\Psi}^{\theta_{\star}}_{u}(z)\text{ for all }(z,u)\in\mathsf{Z}\times\mathrm{U}\right\}\subseteq[\theta_{\star}].$$

$$X_{t}=\tilde{\psi}^{\theta}\langle U_{s:(t-1)}\rangle(Z_{s})\quad\mathbb{P}^{\theta}\text{-a.s.}$$

$$X_{1}=\tilde{\psi}^{\theta}\langle U_{(-\infty):0}\rangle\quad\mathbb{P}^{\theta}\text{-a.s.}$$

$$\tilde{\psi}^{\theta}\langle U_{(-\infty):0}\rangle=\tilde{\psi}^{\theta}_{U_{(-q+1):0}}\left(\left(\tilde{\psi}^{\theta}\langle U_{(-\infty):j}\rangle\right)_{-p\leq j\leq-1}\right)\quad\tilde{\mathbb{P}}^{\theta}\text{-a.s.}$$

$$\tilde{\mathbb{P}}^{\theta}\left[Y_{1}\in\cdot\mid Y_{(-\infty):0}\right]=G^{\theta}\left(\tilde{\psi}^{\theta}\langle U_{(-\infty):0}\rangle,\cdot\right)\quad\tilde{\mathbb{P}}^{\theta}\text{-a.s.}$$

$$X_{k}=\tilde{\psi}^{\theta}\langle U_{(-\infty):(k-1)}\rangle\quad\mathbb{P}^{\theta}\text{-a.s.}$$

$$\tilde{\psi}^{\theta}\langle U_{(-\infty):(k-1)}\rangle=\tilde{\psi}^{\theta}_{U_{(k-q):(k-1)}}\left(\left(\tilde{\psi}^{\theta}\langle U_{(-\infty):j}\rangle\right)_{k-p-1\leq j\leq k-2}\right)\quad\tilde{\mathbb{P}}^{\theta}\text{-a.s.}$$

$$\tilde{\mathbb{P}}^{\theta}\left[Y_{k}\in\cdot\mid Y_{(-\infty):(k-1)}\right]=G^{\theta}\left(\tilde{\psi}^{\theta}\langle U_{(-\infty):(k-1)}\rangle,\cdot\right)\quad\tilde{\mathbb{P}}^{\theta}\text{-a.s.}$$

$$\tilde{\psi}^{\theta}\langle U_{(-\infty):0}\rangle=\tilde{\psi}^{\theta_{\star}}\langle U_{(-\infty):0}\rangle$$

$$\tilde{\psi}^{\theta}\langle U_{(-\infty):0}\rangle=\tilde{\psi}^{\theta}_{U_{(-q+1):0}}\left(\left(\tilde{\psi}^{\theta}\langle U_{(-\infty):j}\rangle\right)_{-p\leq j\leq-1}\right)$$

$$\mathrm{Lip}_{n}^{\theta}=\sup\left\{\frac{\boldsymbol{\delta}_{\mathsf{X}}\left(\tilde{\psi}^{\theta}\langle u\rangle(z),\tilde{\psi}^{\theta}\langle u\rangle(z^{\prime})\right)}{\boldsymbol{\delta}_{\mathsf{Z}}(z,z^{\prime})}:(z,z^{\prime},u)\in\mathsf{Z}^{2}\times\mathrm{U}^{n}\right\},$$

$$\boldsymbol{\delta}_{\mathsf{Z}}(v,v^{\prime})=\left(\max_{1\leq k\leq p}\boldsymbol{\delta}_{\mathsf{X}}(v_{k},v^{\prime}_{k})\right)\vee\left(\max_{p<k<p+q}\boldsymbol{\delta}_{\mathrm{U}}(v_{k},v^{\prime}_{k})\right).$$

$$\mathbb{E}^{\theta_{\star}}\left[\phi^{\theta}(U_{0})\right]<\infty,$$

$$\phi^{\theta}(u)=\ln^{+}\left(\boldsymbol{\delta}_{\mathsf{X}}\left(x_{1}^{(i)},\tilde{\psi}^{\theta}_{(u^{(i)},u)}(x^{(i)})\right)\vee\boldsymbol{\delta}_{\mathrm{U}}\left(u_{1}^{(i)},u\right)\right)$$

$$\begin{cases}D^{\theta}:=\left\{u\in\mathrm{U}^{\mathbb{Z}_{\leq 0}}:\tilde{\psi}^{\theta}\langle u_{(-n):0}\rangle(z)\text{ converges in }\mathsf{X}\text{ as }n\to\infty\right\}\\ \tilde{\psi}^{\theta}\langle u\rangle:=\lim_{n\to\infty}\tilde{\psi}^{\theta}\langle u_{(-n):0}\rangle(z)\quad\text{for all }u\in D^{\theta},\end{cases}$$

$$\begin{cases}D^{\theta}:=\left\{u\in\mathrm{U}^{\mathbb{Z}_{\leq 0}}:\tilde{\Psi}^{\theta}\langle u_{(-n):0}\rangle(z)\text{ converges in }\mathsf{Z}\text{ as }n\to\infty\right\}\\ \tilde{\Psi}^{\theta}\langle u\rangle:=\lim_{n\to\infty}\tilde{\Psi}^{\theta}\langle u_{(-n):0}\rangle(z)\quad\text{for all }u\in D^{\theta},\end{cases}$$

$\tilde{\psi}^{\theta}\langle u\rangle$

$\tilde{\Psi}^{\theta}\langle u\rangle$


Taxonomy

Topics: Financial Risk and Volatility Modeling · Statistical Methods and Inference · Monetary Policy and Economic Impact

Full text


Necessary and sufficient conditions for the identifiability of observation-driven models

Randal Douc, François Roueff and Tepmony Sim

Randal Douc: Département CITI, CNRS UMR 5157, Télécom SudParis, 91000 Évry, France

François Roueff: LTCI, Télécom Paris, Institut Polytechnique de Paris, 19 place Marguerite Perey, 91120 Palaiseau, France

Tepmony Sim: Department of Foundation Year, Institute of Technology of Cambodia, 12156 Phnom Penh, Cambodia

Abstract.

In this contribution we are interested in proving that a given observation-driven model is identifiable. In the case of a GARCH$(p,q)$ model, a simple sufficient condition has been established in [2] for showing the consistency of the quasi-maximum likelihood estimator. It turns out that this condition applies to a much larger class of observation-driven models, which we call the class of linearly observation-driven models. This class includes standard integer-valued observation-driven time series such as the Poisson autoregression model and its numerous extensions. Our results also apply to vector-valued time series such as the bivariate integer-valued GARCH model, to non-linear models such as the threshold Poisson autoregression, and to observation-driven models with exogenous covariates such as the PARX model.

Key words and phrases:

identifiability, observation-driven models, time series of counts

2000 Mathematics Subject Classification:

Primary: 60J05, 62F12; Secondary: 62M05, 62M10.

1. Introduction

Observation-driven models (ODM) were introduced in [8] and have received considerable attention since. They are commonly used for modeling various non-linear time series in applications including economics (see [23]), environmental studies (see [3]), epidemiology and public health (see [29, 10, 12]), finance (see [20, 24, 13, 16]) and population dynamics (see [19]). Additional covariates have been added to some of these models, leading to GARCHX-type models; see [1] and the references therein for recent examples in the context of count data. We include such a case in our setting, leading to the general observation-driven models with exogenous variables (ODMX).

As is often the case for non-linear time series, the identifiability of observation-driven models is a delicate question, and it is frequently imposed as an assumption when proving the consistency (say) of the maximum likelihood estimator. A notable exception is the GARCH$(p,q)$ model, for which an explicit sufficient condition appears in [2], see their condition (2.27). We will in fact prove that this condition is not only sufficient but also necessary for identifiability, and that this result extends to a much larger class of observation-driven models than the GARCH$(p,q)$ model. See Theorem 2 below and the comments following this result.

We provide general conditions to ensure that an ODM or an ODMX defined through a collection of parameterized iterative schemes uniquely describes the law of the observations. In other words, our conditions ensure that two different iterative schemes within the same model cannot produce the same law for the observations. Then a given parameter is identifiable if two different values of the parameter are not compatible with the same iterative scheme. Let us stress, however, that we do not consider the misspecified case here, that is, we always assume that the observations indeed follow the (unique) stationary distribution corresponding to (at least) one given parameter of the model. Our setting is nevertheless of interest for the misspecified setting, since a parameter that is non-identifiable in the well-specified case cannot be identified in the misspecified case. Hence the necessity of our conditions remains true for the misspecified setting.

A special class of ODMs, which we call linearly observation-driven models (LODMs) below, arises when the hidden variable is obtained linearly from hidden or observed variables of the past, and when all these variables are univariate, as for the GARCH$(p,q)$ model. This latter model has been extensively studied, see for example [5, 14, 15, 21, 16] and the references therein. Many other examples, linear or non-linear, univariate or multivariate, have been derived from this class, see [4] for a long list of them. This list has been lengthened quite significantly since, in particular because of the recent addition of various integer-valued ODMs to deal with count time series (see [7, 25] and the references therein). Our goal is to derive necessary and sufficient conditions potentially applying to a wide variety of ergodic observation-driven models. To illustrate the generality of our results, we apply them to a list of examples which includes, in addition to the standard GARCH model, the nonlinear GARCH model of [17], the INGARCH model of [12], the log-linear Poisson GARCH of [13], the MPINGARCH model of [25], the PARX model of [1], the bivariate integer GARCH model of [9] and the self-excited threshold Poisson autoregression of [28]. We are able to derive necessary and sufficient conditions for identifiability for all the considered examples.

The rest of the paper is organized as follows. Section 2 contains additional notation and definitions that will be used throughout the paper. Section 3 contains a list of examples already considered in the literature. Our main results can be found in Section 4, some proofs of which are postponed to Section 6. Before that, in Section 5, we show how our results apply to the examples of Section 3 or can be extended to larger classes of models.

2. Preliminaries

2.1. Formal definitions of observation driven models

Let us now formally introduce the class of observation-driven models and important sub-classes. Throughout the paper we use the notation $u_{\ell:m}:=(u_{\ell},\ldots,u_{m})$ for $\ell\leq m$ , with the convention that $u_{\ell:m}$ is the empty sequence if $\ell>m$ , so that, for instance $(x_{0:(-1)},y)=y$ . The observation-driven time series model can formally be defined as follows.

Definition (ODM, ODMX).

Let $(\mathsf{X},\mathcal{X})$ , $(\mathsf{Y},\mathcal{Y})$ and $(\mathrm{U},\mathcal{U})$ be measurable spaces, respectively called the latent space, the observation space and the admissible observation space. Let $(\Theta,\Delta)$ be a compact metric space, called the parameter space. Let $\Upsilon$ be a measurable function from $(\mathsf{Y},\mathcal{Y})$ to $(\mathrm{U},\mathcal{U})$ . Let $\left\{(x_{1:p},u_{1:q})\mapsto\tilde{\psi}^{\theta}_{u_{1:q}}(x_{1:p}):\theta\in\Theta\right\}$ be a family of measurable functions from $(\mathsf{X}^{p}\times\mathrm{U}^{q},\mathcal{X}^{\otimes p}\otimes\mathcal{U}^{\otimes q})$ to $(\mathsf{X},\mathcal{X})$ , called the reduced link functions and let $\left\{G^{\theta}:\theta\in\Theta\right\}$ be a family of probability kernels on $\mathsf{X}\times\mathcal{Y}$ , called the observation kernels. A time series $\{Y_{k}\,:\,k\geq-q+1\}$ valued in $\mathsf{Y}$ is said to be distributed according to an observation-driven model of order $(p,q)$ (hereafter, ODM $(p,q)$ ) with reduced link function $\tilde{\psi}^{\theta}$ , admissible mapping $\Upsilon$ and observation kernel $G^{\theta}$ if there exists a process $\{X_{k}\,:\,k\geq-p+1\}$ on $(\mathsf{X},\mathcal{X})$ such that for all $k\in\mathbb{Z}_{\geq 0}$ ,

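$$
\begin{aligned}
&Y_{k}\mid\mathcal{F}_{k}\sim G^{\theta}(X_{k};\cdot),\\
&X_{k+1}=\tilde{\psi}^{\theta}_{U_{(k-q+1):k}}(X_{(k-p+1):k}),
\end{aligned}
\tag{2.1}
$$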

where $\mathcal{F}_{k}=\sigma\left(X_{(-p+1):k},Y_{(-q+1):(k-1)}\right)$ and $U_{j}=\Upsilon(Y_{j})$ for all $j>-q$ .

In the presence of exogenous variables defined as an $r$ -Markov chain valued in the space $(\mathsf{V},\mathcal{V})$ with kernel $H$ , the admissible mapping $\Upsilon$ is defined from $\mathsf{Y}\times\mathsf{V}$ to $\mathrm{U}$ and the iterative equation (2.1) is replaced by, for all $k\in\mathbb{Z}_{\geq 0}$ ,

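$$
\begin{aligned}
&V_{k}\mid\mathcal{F}_{k}\sim H(V_{(k-r):(k-1)};\cdot),\\
&Y_{k}\mid\mathcal{F}_{k}\sim G^{\theta}(X_{k};\cdot),\\
&X_{k+1}=\tilde{\psi}^{\theta}_{U_{(k-q+1):k}}(X_{(k-p+1):k}),
\end{aligned}
\tag{2.2}
$$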

where, in this case, $\mathcal{F}_{k}=\sigma\left(X_{(-p+1):k},Y_{(-q+1):(k-1)},V_{((-r)\wedge(-q+1)):(k-1)}\right)$ and $U_{j}=\Upsilon(Y_{j},V_{j})$ for all $j>-q$ . We then say that the time series $\{Y_{k}\,:\,k\geq-q+1\}$ valued in $\mathsf{Y}$ is distributed according to an observation-driven model of order $(p,q)$ with $r$ -order Markov exogenous variables $\{V_{k}\,:\,k\geq((-r)\wedge(-q+1))\}$ (hereafter, ODMX $(p,q,r)$ ) with reduced link function $\tilde{\psi}^{\theta}$ , admissible mapping $\Upsilon$ , observation kernel $G^{\theta}$ , and exogenous Markov kernel $H$ .

The variables $Y_{k}$ are called the observed variables, the variables $X_{k}$ the hidden variables and the variables $U_{k}$ the admissible variables. In addition, we define the augmented variables

$$Z_{k}=\left(X_{(k-p+1):k},U_{(k-q+1):(k-1)}\right)\in\mathsf{Z},\tag{2.3}$$

which take values in the augmented space

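$$\mathsf{Z}=\mathsf{X}^{p}\times\mathrm{U}^{q-1}\quad\text{endowed with the $\sigma$-field}\quad\mathcal{Z}=\mathcal{X}^{\otimes p}\otimes\mathcal{U}^{\otimes(q-1)}.\tag{2.4}$$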
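As a minimal sketch of the iterative scheme (2.1), the snippet below simulates a toy ODM$(1,1)$. Everything specific in it is an illustrative assumption rather than part of the definition: the Poisson observation kernel, the identity admissible mapping, the linear reduced link function, the parameter names `omega`, `a`, `b`, and the helper `simulate_odm11`.

```python
import numpy as np

def simulate_odm11(omega, a, b, n, x0=1.0, seed=0):
    """Simulate n steps of a toy ODM(1,1).

    Assumptions (ours, for illustration only):
      * G(x; .) is the Poisson distribution with mean x,
      * the admissible mapping is the identity, so U_k = Y_k,
      * the reduced link function is psi_u(x) = omega + a * x + b * u.
    """
    rng = np.random.default_rng(seed)
    x = np.empty(n + 1)
    y = np.empty(n, dtype=np.int64)
    x[0] = x0
    for k in range(n):
        y[k] = rng.poisson(x[k])                 # Y_k | F_k ~ G(X_k; .)
        x[k + 1] = omega + a * x[k] + b * y[k]   # X_{k+1} = psi_{U_k}(X_k)
    return x, y

x, y = simulate_odm11(omega=1.0, a=0.3, b=0.4, n=500)
print(x[:5], y[:5])
```

The hidden path `x` is a deterministic function of the initial value and of the observed path `y`, which is the feature that makes the model observation-driven.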

Remark 1.

Let us briefly comment on the unusual notion of admissible mapping which allows us to define the admissible variables $U_{j}=\Upsilon(Y_{j})$ of an ODM( $p,q$ ) in Section 2.1:

(1) For all $k\geq 0$, the conditional distribution of $(Y_{k},X_{k+1})$ given $\mathcal{F}_{k}$ only depends on $Z_{k}$ defined by (2.3).

(2) The time series $\left\{U_{k}:k>-q\right\}$ is also an ODM$(p,q)$ with admissible mapping being the identity, link function $\tilde{\psi}^{\theta}$ and observation kernel $\tilde{G}^{\theta}(x,\cdot)=G^{\theta}(x,\Upsilon^{-1}(\cdot))$ on the observation space $(\mathrm{U},\mathcal{U})$.

(3) On the other hand, we can also set the admissible mapping to be the identity for the ODM $\left\{Y_{k}:k>-q\right\}$, in which case $\mathsf{Y}\subseteq\mathrm{U}$ and the reduced link function should be replaced by the link function defined for all $(x,y_{1:q})\in\mathsf{X}^{p}\times\mathsf{Y}^{q}$ by

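$$\psi^{\theta}_{y_{1:q}}(x)=\tilde{\psi}^{\theta}_{u_{1:q}}(x)\quad\text{with $u_{k}=\Upsilon(y_{k})$ for $1\leq k\leq q$.}\tag{2.5}$$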

In fact the advantage of using an admissible mapping is precisely to obtain a reduced link function $\tilde{\psi}$ that is more convenient than the (non-reduced) link function $\psi$. We will focus on the particular case where $\tilde{\psi}$ is linear, into which we can cast not all but many observation-driven models, see Section 3 hereafter.

(4) An ODMX$(p,q,r)$ can be cast into an ODM$(p,q)$ by defining $\tilde{Y}_{k}=(Y_{k},V_{(k-r+1):k})$ and $\tilde{X}_{k}=(X_{k},V_{(k-r):(k-1)})$ and observing that the resulting time series $\{\tilde{Y}_{k}\,:\,k\geq-q+1\}$ is an ODM$(p,q)$ with hidden variables $\{\tilde{X}_{k}\,:\,k\geq-p+1\}$. However, for treating identifiability, as is the purpose here, it is more convenient to keep distinguishing between the ODM and the ODMX settings.

(5) In the following the variables $U_{k}$ and $Z_{k}$ will be used extensively, as they greatly simplify the presentation and the reasoning. It is important to note that the definitions of $U_{k}$, $Z_{k}$ and $\mathcal{F}_{k}$ are not the same in the ODM and the ODMX settings, as they involve $V_{k}$ in the latter case. In particular the conditional distribution of $U_{k}$ given $\mathcal{F}_{k}$ takes two very different forms in the ODM and ODMX cases. They can be respectively expressed by $\tilde{G}^{\theta}(X_{k};\cdot)$ and $\tilde{G}^{\theta}((X_{k},V_{(k-r):(k-1)});\cdot)$, where $\tilde{G}^{\theta}$ is a probability kernel on $\mathsf{X}\times\mathcal{U}$ and on $(\mathsf{X}\times\mathsf{V}^{r})\times\mathcal{U}$, respectively. For conciseness we use the same notation $\tilde{G}^{\theta}$ for the two cases. They are respectively defined by setting, for all $x\in\mathsf{X}$, $A\in\mathcal{U}$ and $v\in\mathsf{V}^{r}$,

[TABLE]

When the reduced link function is linear, the definition of ODM and ODMX above specializes as follows.

Definition ((V)LODM(X)).

We say that an ODM( $p,q$ ) (resp. ODMX( $p,q,r$ )) is a vector linearly observation-driven model of order $(p,q,p^{\prime},q^{\prime})$ , shortened as VLODM $(p,q,p^{\prime},q^{\prime})$ , (resp. VLODMX $(p,q,r,p^{\prime},q^{\prime})$ ) if for some $p^{\prime},q^{\prime}\in\mathbb{Z}_{>0}$ , $\mathsf{X}$ and $\mathrm{U}$ are closed subsets of $\mathbb{R}^{p^{\prime}}$ and $\mathbb{R}^{q^{\prime}}$ , respectively, and, for all $x=x_{0:(p-1)}\in\mathsf{X}^{p}$ , $u=u_{0:(q-1)}\in\mathrm{U}^{q}$ , and $\theta\in\Theta$ ,

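$$\tilde{\psi}^{\theta}_{u}(x)=\boldsymbol{\omega}(\theta)+\sum_{i=1}^{p}A_{i}(\theta)\,x_{p-i}+\sum_{i=1}^{q}B_{i}(\theta)\,u_{q-i},\tag{2.8}$$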

for some mappings $\boldsymbol{\omega}$ , $A_{1:p}$ and $B_{1:q}$ defined on $\Theta$ and valued in $\mathbb{R}^{p^{\prime}}$ , $\left(\mathbb{R}^{p^{\prime}\times p^{\prime}}\right)^{p}$ and $\left(\mathbb{R}^{p^{\prime}\times q^{\prime}}\right)^{q}$ . In the case where $p^{\prime}=q^{\prime}=1$ , the VLODM $(p,q,p^{\prime},q^{\prime})$ (resp. VLODMX $(p,q,r,p^{\prime},q^{\prime})$ ) is simply called a linearly observation-driven model of order $(p,q)$ , shortened as LODM $(p,q)$ (resp. LODMX $(p,q,r)$ ).
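The index convention of the linear reduced link function is easy to get wrong: with $x=x_{0:(p-1)}$ and $u=u_{0:(q-1)}$, the matrix $A_{1}$ multiplies the most recent hidden value $x_{p-1}$ and $B_{1}$ the most recent admissible value $u_{q-1}$. The following sketch spells this out; the function name `linear_link` and the numerical values are hypothetical and only serve as an illustration.

```python
import numpy as np

def linear_link(omega, A, B, x, u):
    """Evaluate omega + sum_i A_i x_{p-i} + sum_i B_i u_{q-i}.

    omega : array of shape (p',)
    A     : list of p arrays of shape (p', p'), A[i-1] standing for A_i
    B     : list of q arrays of shape (p', q'), B[i-1] standing for B_i
    x     : list of p arrays x_0, ..., x_{p-1} (oldest first)
    u     : list of q arrays u_0, ..., u_{q-1} (oldest first)
    """
    p, q = len(A), len(B)
    out = np.array(omega, dtype=float)
    for i in range(1, p + 1):
        out = out + A[i - 1] @ x[p - i]   # A_i acts on x_{p-i}
    for i in range(1, q + 1):
        out = out + B[i - 1] @ u[q - i]   # B_i acts on u_{q-i}
    return out

# Scalar case p' = q' = 1 with p = 2, q = 1 (illustrative values).
omega = np.array([0.5])
A = [np.array([[0.2]]), np.array([[0.1]])]
B = [np.array([[0.3]])]
x = [np.array([1.0]), np.array([2.0])]   # x_0, x_1
u = [np.array([4.0])]                    # u_0
value = linear_link(omega, A, B, x, u)
print(value)
```

Here the result is $0.5+0.2\cdot 2+0.1\cdot 1+0.3\cdot 4$, up to floating-point rounding.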

2.2. Iterations of the link function

We now introduce iterated versions of the reduced link function $\tilde{\psi}^{\theta}$. Let $\mathsf{Z}$ be defined by (2.4). We define for any $k\in\mathbb{Z}_{>0}$ and $u_{0:(k-1)}\in\mathrm{U}^{k}$, the mapping $\tilde{\psi}^{\theta}\langle u_{0:(k-1)}\rangle:\mathsf{Z}\to\mathsf{X}$ through a set of recursive equations of order $(p,q)$. Namely, for all $k\in\mathbb{Z}_{>0}$, $u_{0:(k-1)}\in\mathrm{U}^{k}$ and $z=z_{1:(p+q-1)}\in\mathsf{Z}$, we define

$$\tilde{\psi}^{\theta}\langle u_{0:(k-1)}\rangle(z):=x_{k},\tag{2.9}$$

where the sequence $x_{(-p+1):k}$ is defined by

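$$
\begin{cases}
u_{j}=z_{p+q+j},&-q<j\leq-1,\\
x_{j}=z_{p+j},&-p<j\leq 0,\\
x_{j}=\tilde{\psi}^{\theta}_{u_{(j-q):(j-1)}}\left(x_{(j-p):(j-1)}\right),&1\leq j\leq k.
\end{cases}
\tag{2.10}
$$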

In this set of equations the last line is applied recursively so that in fact, for all $j\geq 1$ , $x_{j}$ only depends on $z$ and $u_{0:(j-1)}$ .
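The recursion (2.10) can be sketched in a few lines. The helper name `iterate_link`, the calling convention `psi(x_block, u_block)` with blocks ordered oldest first, and the numerical example are assumptions made for illustration.

```python
def iterate_link(psi, p, q, u, z):
    """Return x_k, the value of the iterated link function at z.

    psi(x_block, u_block) plays the role of the reduced link function: it
    receives the p most recent hidden values and the q most recent
    admissible values, both ordered from oldest to most recent.
    z = (x_{-p+1}, ..., x_0, u_{-q+1}, ..., u_{-1}) has length p + q - 1,
    and u = (u_0, ..., u_{k-1}) is the input sequence.
    """
    assert len(z) == p + q - 1 and len(u) >= 1
    xs = list(z[:p])             # x_{-p+1}, ..., x_0
    us = list(z[p:]) + list(u)   # u_{-q+1}, ..., u_{-1}, u_0, ..., u_{k-1}
    for j in range(1, len(u) + 1):
        # In `us`, u_m sits at index m + q - 1, so the block
        # u_{(j-q):(j-1)} occupies indices j - 1, ..., j + q - 2.
        u_block = us[j - 1 : j - 1 + q]
        x_block = xs[-p:]        # x_{(j-p):(j-1)}
        xs.append(psi(x_block, u_block))
    return xs[-1]

# Toy scalar link of order (1, 1): x -> 1 + 0.5 * x + 0.25 * u.
psi = lambda xb, ub: 1.0 + 0.5 * xb[-1] + 0.25 * ub[-1]
x2 = iterate_link(psi, p=1, q=1, u=(4.0, 0.0), z=(2.0,))
print(x2)  # x_1 = 3.0, then x_2 = 2.5
```

Since each step only reads the last $p$ hidden values and the last $q$ inputs, the loop makes visible why $x_{j}$ depends on $z$ and $u_{0:(j-1)}$ only.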

The equations in (2.10) define a system with input sequence $u_{(-q+1):(k-1)}$ , initial condition $x_{(-p+1):0}$ and output sequence $x_{1:k}$ . Because the recursion given by the last line of (2.10) involves $p+1$ successive entries of the output and $q$ successive entries of the input, it is useful to define blocks, valued in $\mathsf{Z}=\mathsf{X}^{p}\times\mathrm{U}^{q-1}$ and consider the same recursion applying to such blocks, hence computing $z_{j}$ from $z_{j-1}$ and $u_{j-1}$ . Formally, for all $u\in\mathrm{U}$ , we define $\tilde{\Psi}^{\theta}_{u}\,:\,\mathsf{Z}\to\mathsf{Z}$ by

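$$
\tilde{\Psi}^{\theta}_{u}:(x_{1:p},u_{1:(q-1)})\mapsto
\begin{cases}
\left(x_{2:p},\tilde{\psi}^{\theta}_{(u_{1:(q-1)},u)}(x_{1:p}),u_{2:(q-1)},u\right)&\text{if $q>1$,}\\
\left(x_{2:p},\tilde{\psi}^{\theta}_{u}(x_{1:p})\right)&\text{if $q=1$.}
\end{cases}
\tag{2.11}
$$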

Remark 2.

Note in particular that with this notation at hand, and using the admissible variables $U_{k}=\Upsilon(Y_{k})$ for the ODM case or $U_{k}=\Upsilon(Y_{k},V_{k})$ for the ODMX case, and $Z_{k}$ defined by (2.3), the second line of (2.1) and the third line of (2.2) are equivalent to

$$Z_{k+1}=\tilde{\Psi}^{\theta}_{U_{k}}(Z_{k}).\tag{2.12}$$

We further denote the successive composition of $\tilde{\Psi}^{\theta}_{u_{0}}$ , $\tilde{\Psi}^{\theta}_{u_{1}}$ , …, and $\tilde{\Psi}^{\theta}_{u_{k-1}}$ by

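$$\tilde{\Psi}^{\theta}\langle u_{0:(k-1)}\rangle=\tilde{\Psi}^{\theta}_{u_{k-1}}\circ\tilde{\Psi}^{\theta}_{u_{k-2}}\circ\dots\circ\tilde{\Psi}^{\theta}_{u_{0}}.\tag{2.13}$$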

This recursion is the same as the one for defining $\tilde{\psi}^{\theta}\langle u\rangle$, except that it is valued in $\mathsf{Z}$, whereas $\tilde{\psi}^{\theta}\langle u\rangle$ is valued in $\mathsf{X}$. More precisely, denoting, throughout the paper, for all $j\in\{1,\ldots,p+q-1\}$, by $\Pi_{j}\left(z\right)$ the $j$-th entry of $z\in\mathsf{Z}$, we have the following relations between $\tilde{\psi}^{\theta}\langle u\rangle$ and $\tilde{\Psi}^{\theta}\langle u\rangle$, for all $k\in\mathbb{Z}_{\geq 0}$ and $u\in\mathrm{U}^{k}$,

[TABLE]

where, in the second line, we set $u_{j}=\Pi_{p+q+j}\left(z\right)$ for $-q<j\leq-1$ and use the convention $\tilde{\psi}^{\theta}\langle u_{0:j}\rangle(z)=\Pi_{p-j}\left(z\right)$ for $-p<j\leq 0$ .
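The block map $\tilde{\Psi}^{\theta}_{u}$ and its successive composition can be sketched as follows. The names `block_map` and `compose_blocks`, the calling convention of `psi` and the numbers are hypothetical. In this toy run the $p$-th entry of the final block is the hidden value obtained after two steps of the scalar recursion, and the last $q-1$ entries are the most recent inputs.

```python
def block_map(psi, p, q, u, z):
    """One block step: map z = (x_{1:p}, u_{1:(q-1)}) and a new input u to the next block."""
    xs, us = list(z[:p]), list(z[p:])
    new_x = psi(xs, us + [u])          # link applied to (x_{1:p}, (u_{1:(q-1)}, u))
    if q > 1:
        return tuple(xs[1:] + [new_x] + us[1:] + [u])
    return tuple(xs[1:] + [new_x])     # q = 1: the block carries no input entries

def compose_blocks(psi, p, q, u_seq, z):
    """Successive composition: apply the block map with u_0 first, then u_1, and so on."""
    for u in u_seq:
        z = block_map(psi, p, q, u, z)
    return z

# Toy linear link of order (2, 2); blocks are ordered oldest first.
psi = lambda xb, ub: 0.5 + 0.25 * xb[-1] + 0.125 * xb[-2] + 0.5 * ub[-1] + 0.25 * ub[-2]
z0 = (1.0, 2.0, 3.0)                   # (x_{-1}, x_0, u_{-1})
z2 = compose_blocks(psi, p=2, q=2, u_seq=(4.0, 5.0), z=z0)
print(z2)  # (3.875, 5.21875, 5.0), i.e. (x_1, x_2, u_1)
```

Working with blocks turns the order-$(p,q)$ recursion into a first-order one, which is why the composition is a plain loop.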

2.3. Ergodic assumption and some interesting class of parameters

In this contribution, we only consider the case where all processes in the model are ergodic. Namely, we use the following assumption.

(A-1)

For all $\theta\in\Theta$ , there exists a unique stationary solution $\{(X_{k},Y_{k})\,:\,k\in\mathbb{Z}\}$ satisfying (2.1).

In the case of exogenous covariates this assumption is replaced by the following.

(A’-1)

For all $\theta\in\Theta$ , there exists a unique stationary solution $\{(X_{k},Y_{k},V_{k})\,:\,k\in\mathbb{Z}\}$ satisfying (2.2).

This ergodic property is the cornerstone for making statistical inference theory work and we provide simple general conditions in [11] for $p=q=1$ and in [26, 27, Chapter 5] for the case of general order $(p,q)$ .

We now introduce the notation that will allow us to refer to the stationary distribution of the model throughout the paper.

Definition (Stationary distributions $\mathbb{P}^{\theta}$ and $\tilde{\mathbb{P}}^{\theta}$).

We define the distributions $\mathbb{P}^{\theta}$ and $\tilde{\mathbb{P}}^{\theta}$ as follows.

a) Under (A-1), $\mathbb{P}^{\theta}$ denotes the distribution on $((\mathsf{X}\times\mathsf{Y})^{\mathbb{Z}},(\mathcal{X}\otimes\mathcal{Y})^{\otimes\mathbb{Z}})$ of the stationary solution of (2.1); under (A’-1), $\mathbb{P}^{\theta}$ denotes the distribution on $((\mathsf{X}\times\mathsf{Y}\times\mathsf{V})^{\mathbb{Z}},(\mathcal{X}\otimes\mathcal{Y}\otimes\mathcal{V})^{\otimes\mathbb{Z}})$ of the stationary solution of (2.2).

b) Under (A-1), $\tilde{\mathbb{P}}^{\theta}$ denotes the projection of $\mathbb{P}^{\theta}$ on the component $\mathsf{Y}^{\mathbb{Z}}$; under (A’-1), $\tilde{\mathbb{P}}^{\theta}$ denotes the projection of $\mathbb{P}^{\theta}$ on the component $(\mathsf{Y}\times\mathsf{V})^{\mathbb{Z}}$.

We also use the symbols $\mathbb{E}^{\theta}$ and $\tilde{\mathbb{E}}^{\theta}$ to denote the expectations corresponding to $\mathbb{P}^{\theta}$ and $\tilde{\mathbb{P}}^{\theta}$ , respectively.

To study the identifiability of ergodic ODMs, we introduce equivalence classes that partition the parameter set into subsets of parameters sharing the same distribution of observations. Formally, this reads as follows.

Definition (Equivalence classes for $\tilde{\mathbb{P}}^{\theta}$).

Suppose that (A-1) or (A’-1) holds and define $\tilde{\mathbb{P}}^{\theta}$ as in Section 2.3. For all $\theta,\theta^{\prime}\in\Theta$ , we write $\theta\sim\theta^{\prime}$ if and only if $\tilde{\mathbb{P}}^{\theta}=\tilde{\mathbb{P}}^{\theta^{\prime}}$ . This defines an equivalence relation on the parameter set $\Theta$ and, for any $\theta\in\Theta$ , the equivalence class of $\theta$ is denoted by $[\theta]:=\{\theta^{\prime}\in\Theta:\;\theta^{\prime}\sim\theta\}$ .

Remark 3.

In the context of exogenous variables, that is, under (A’-1), since the distribution of $\{V_{k}\,:\,k\in\mathbb{Z}\}$ under $\tilde{\mathbb{P}}^{\theta}$ does not depend on $\theta$, saying that $\tilde{\mathbb{P}}^{\theta}=\tilde{\mathbb{P}}^{\theta^{\prime}}$ is equivalent to saying that the conditional distribution of $\{Y_{k}\,:\,k\in\mathbb{Z}\}$ given $\{V_{k}\,:\,k\in\mathbb{Z}\}$ is the same under $\tilde{\mathbb{P}}^{\theta}$ and under $\tilde{\mathbb{P}}^{\theta^{\prime}}$.

Determining the equivalence classes $[\theta]$ for all $\theta\in\Theta$ amounts to solving the identifiability of a parameter under the assumption of a well-specified model. Namely, assuming that the distribution of the observations is given by $\tilde{\mathbb{P}}^{\theta_{\star}}$ for some (unknown) parameter ${\theta_{\star}}\in\Theta$, a parameter $\xi({\theta_{\star}})$ is identifiable if and only if the given mapping $\xi$ is constant over the equivalence class $[{\theta_{\star}}]$. Without identifiability, no estimator of $\xi({\theta_{\star}})$ can be consistent. A special case is when $[{\theta_{\star}}]$ reduces to the singleton $\{{\theta_{\star}}\}$, so that every parameter $\xi({\theta_{\star}})$ is identifiable, in which case the model is said to be identifiable. Obviously, if $\theta$ and ${\theta_{\star}}$ share the same iterative equation (2.1) (or (2.2) with exogenous covariates), that is, if $G^{\theta}=G^{{\theta_{\star}}}$ and $\tilde{\psi}_{u}^{\theta}(x)=\tilde{\psi}_{u}^{\theta_{\star}}(x)$ for all $(u,x)\in\mathrm{U}^{q}\times\mathsf{X}^{p}$, then, by uniqueness of the stationary distribution, they must share the same one, and in particular we get $\tilde{\mathbb{P}}^{\theta}=\tilde{\mathbb{P}}^{\theta_{\star}}$. Thus, using the more convenient notation $\tilde{\Psi}$ introduced in (2.11), we have

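$$\left\{\theta\in\Theta:G^{\theta}=G^{\theta_{\star}}\text{ and }\tilde{\Psi}^{\theta}_{u}(z)=\tilde{\Psi}^{\theta_{\star}}_{u}(z)\text{ for all }(z,u)\in\mathsf{Z}\times\mathrm{U}\right\}\subseteq[\theta_{\star}].\tag{2.16}$$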

We will provide general conditions ensuring that this inclusion becomes an equality, see Section 4.1 below. However it may happen in standard situations that this inclusion is strict, as will be seen in Remark 8 (5). Nevertheless, in all the considered examples, it will be possible to recover an equality by replacing $\mathsf{Z}$ by a more appropriate subset in the left-hand side of (2.16).
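To see how the inclusion (2.16) can be strict, here is a toy case of our own, which is not claimed to be the example discussed in the paper. Assume a scalar linear model of order $(1,1)$ with link $x\mapsto\omega+ax+bu$, $|a|<1$ and $b=0$, and an observation kernel $G$ that does not depend on $\theta=(\omega,a,b)$. The stationary hidden process is then constant, equal to $\omega/(1-a)$, so the law of the observations depends on $\theta$ only through this ratio. The names `stationary_hidden_value` and `link` are hypothetical.

```python
def stationary_hidden_value(omega, a):
    """Fixed point of x -> omega + a * x, assuming |a| < 1 and b = 0."""
    return omega / (1.0 - a)

def link(theta, x, u):
    """Toy scalar linear link with theta = (omega, a, b)."""
    omega, a, b = theta
    return omega + a * x + b * u

theta_1 = (1.0, 0.5, 0.0)
theta_2 = (1.5, 0.25, 0.0)

# Same stationary hidden value, hence same law of the observations
# (under the stated assumption that G does not depend on theta) ...
same_stationary = (
    stationary_hidden_value(*theta_1[:2]) == stationary_hidden_value(*theta_2[:2])
)
# ... although the two link functions differ as functions of (x, u).
same_link = all(
    link(theta_1, x, u) == link(theta_2, x, u)
    for x in (0.0, 1.0, 2.0)
    for u in (0.0, 1.0)
)
print(same_stationary, same_link)  # True False
```

In this toy case the two parameters lie in the same equivalence class without sharing the same iterative scheme on the whole of $\mathsf{Z}\times\mathrm{U}$; the two links do agree at the single point $x=2$ visited by the stationary hidden process.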

As often for ODMs, our results rely on the assumption that, under $\mathbb{P}^{\theta}$ , the hidden variables are measurable with respect to the admissible variables from the past. This is not completely surprising since, using the notation introduced in Section 2.2, iterating the link function, we have that, for all $\theta\in\Theta$ and all $s<t$ in $\mathbb{Z}$ ,

$$X_{t}=\tilde{\psi}^{\theta}\langle U_{s:(t-1)}\rangle(Z_{s})\quad\mathbb{P}^{\theta}\text{-a.s.}$$

In particular, taking $t=1$ and letting $s$ decrease towards $-\infty$, we get that $X_{1}$ is measurable with respect to $\cap_{t\in\mathbb{Z}}\left(\mathcal{F}^{Z}_{t}\vee\mathcal{F}^{U}_{0}\right)$, where $(\mathcal{F}^{Z}_{t})$ and $(\mathcal{F}^{U}_{t})$ respectively denote the natural filtrations of $\{Z_{n}\,:\,n\in\mathbb{Z}\}$ and $\{U_{n}\,:\,n\in\mathbb{Z}\}$. To our knowledge, all ODMs of interest in fact satisfy the stronger property that $X_{1}$ is measurable with respect to $\mathcal{F}^{U}_{0}$, which is sometimes called the invertibility condition. This condition is now introduced with some notation for expressing $X_{1}$ as a measurable function of $U_{(-\infty):0}$.

(A-2)

For all $\theta\in\Theta$ , the measurable function $\tilde{\psi}^{\theta}\langle\cdot\rangle:\mathrm{U}^{\mathbb{Z}_{\leq 0}}\to\mathsf{X}$ satisfies

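$$X_{1}=\tilde{\psi}^{\theta}\langle U_{(-\infty):0}\rangle\quad\mathbb{P}^{\theta}\text{-a.s.}\tag{2.17}$$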

Since $\mathbb{P}^{\theta}$ is stationary, (2.17) also implies that, for all $t\in\mathbb{Z}$, $X_{t+1}=\tilde{\psi}^{\theta}\langle U_{(-\infty):t}\rangle$ $\mathbb{P}^{\theta}$-a.s. For an ODM (resp. an ODMX), we have $U_{k}=\Upsilon(Y_{k})$ (resp. $U_{k}=\Upsilon(Y_{k},V_{k})$). Thus Assumption (A-2) allows us to derive the $X_{t}$’s from the $Y_{t}$’s (resp. from the $Y_{t}$’s and $V_{t}$’s) and therefore to rewrite the relationship given through the link function in the second line of (2.1) (resp. in the third line of (2.2)) as a recursive relationship involving only the $Y_{t}$’s (resp. the $Y_{t}$’s and the $V_{t}$’s). It turns out that Condition (2.17) in (A-2) can be verified using $\tilde{\mathbb{P}}^{\theta}$ only, that is, we do not need $\mathbb{P}^{\theta}$ but only its marginal on the variables $Y_{t}$’s (resp. the variables $Y_{t}$’s and $V_{t}$’s), as shown by the following result.
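The heuristic behind the invertibility condition can be checked numerically in a toy scalar linear model of order $(1,1)$ with link $x\mapsto\omega+ax+bu$ and $|a|<1$ (our assumption, as are the helper names and the Poisson input path below). Two reconstructions of the hidden value that use the same past inputs but start from different initial values differ by a term proportional to $a^{n+1}$ after $n+1$ steps, so the influence of the initial value vanishes as more of the past is used.

```python
import numpy as np

def hidden_from_past(omega, a, b, u_past, x_init):
    """Iterate x -> omega + a * x + b * u over u_past (oldest first)."""
    x = x_init
    for u in u_past:
        x = omega + a * x + b * u
    return x

def init_gap(n, omega=1.0, a=0.3, b=0.4, seed=1):
    """Gap between two reconstructions of the hidden value that use the same
    n + 1 past inputs but start from the initial values 0 and 100."""
    rng = np.random.default_rng(seed)
    u = rng.poisson(3.0, size=n + 1).astype(float)  # hypothetical input path
    lo = hidden_from_past(omega, a, b, u, x_init=0.0)
    hi = hidden_from_past(omega, a, b, u, x_init=100.0)
    return abs(hi - lo)

for n in (4, 19, 79):
    # Decays like 100 * a ** (n + 1) until it reaches floating-point precision.
    print(n, init_gap(n))
```

This only illustrates the contraction mechanism in one simple case; it is not a verification of (A-2) for a general model.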

Lemma.

Consider an ODM $(p,q)$ satisfying (A-1) with $p,q\in\mathbb{Z}_{>0}$ or an ODMX $(p,q,r)$ satisfying (A’-1) with $p,q,r\in\mathbb{Z}_{>0}$ . Let $\theta\in\Theta$ and consider a measurable function $\tilde{\psi}^{\theta}\langle\cdot\rangle:\mathrm{U}^{\mathbb{Z}_{-}}\to\mathsf{X}$ . Then (2.17) is satisfied if and only if the two following equations hold.

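$$\tilde{\psi}^{\theta}\langle U_{(-\infty):0}\rangle=\tilde{\psi}^{\theta}_{U_{(-q+1):0}}\left(\left(\tilde{\psi}^{\theta}\langle U_{(-\infty):j}\rangle\right)_{-p\leq j\leq-1}\right)\quad\tilde{\mathbb{P}}^{\theta}\text{-a.s.}\tag{2.18}$$

$$\tilde{\mathbb{P}}^{\theta}\left[Y_{1}\in\cdot\mid Y_{(-\infty):0}\right]=G^{\theta}\left(\tilde{\psi}^{\theta}\langle U_{(-\infty):0}\rangle,\cdot\right)\quad\tilde{\mathbb{P}}^{\theta}\text{-a.s.}\tag{2.19}$$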

Proof.

Suppose that (2.17) holds true. Since $\mathbb{P}^{\theta}$ is shift invariant, it can be extended to all time instants $k\in\mathbb{Z}$ , namely,

$$X_{k}=\tilde{\psi}^{\theta}\langle U_{(-\infty):(k-1)}\rangle\quad\mathbb{P}^{\theta}\text{-a.s.}$$

But then (2.18) and (2.19) follow from the model equations (2.1) in the case of an ODM or (2.2) in the case of an ODMX.

Suppose now that (2.18) and (2.19) hold true. Since $\mathbb{P}^{\theta}$ is shift invariant, they are extended to all time instants $k\in\mathbb{Z}$ in the form

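$$\tilde{\psi}^{\theta}\langle U_{(-\infty):(k-1)}\rangle=\tilde{\psi}^{\theta}_{U_{(k-q):(k-1)}}\left(\left(\tilde{\psi}^{\theta}\langle U_{(-\infty):j}\rangle\right)_{k-p-1\leq j\leq k-2}\right)\quad\tilde{\mathbb{P}}^{\theta}\text{-a.s.}$$

$$\tilde{\mathbb{P}}^{\theta}\left[Y_{k}\in\cdot\mid Y_{(-\infty):(k-1)}\right]=G^{\theta}\left(\tilde{\psi}^{\theta}\langle U_{(-\infty):(k-1)}\rangle,\cdot\right)\quad\tilde{\mathbb{P}}^{\theta}\text{-a.s.}$$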

Defining $X^{\prime}_{k}=\tilde{\psi}^{\theta}\langle U_{(-\infty):(k-1)}\rangle$ for all $k\in\mathbb{Z}$ , we see that $\{(X^{\prime}_{k},Y_{k})\,:\,k\in\mathbb{Z}\}$ is a stationary sequence satisfying the model equations (2.1) in the ODM case and $\{(X^{\prime}_{k},Y_{k},V_{k})\,:\,k\in\mathbb{Z}\}$ is a stationary sequence satisfying the model equations (2.2) in the ODMX case. By uniqueness of $\mathbb{P}^{\theta}$ assumed in (A-1) and (A’-1), respectively, we get that (2.17) holds. ∎

Now, given ${\theta_{\star}}\in\Theta$, we introduce the set $\langle{\theta_{\star}}\rangle$ of all parameters $\theta\in\Theta$ whose recursive relationship (2.18) applies to almost all trajectories of $\{U_{n}\,:\,n\in\mathbb{Z}\}$ under the distribution of ${\theta_{\star}}$.

Definition (Subset $\langle{\theta_{\star}}\rangle$).

Suppose that we are given a measurable function $\tilde{\psi}^{\theta}\langle\cdot\rangle:\mathrm{U}^{\mathbb{Z}_{\leq 0}}\to\mathsf{X}$ . Then, for all ${\theta_{\star}}\in\Theta$ , we denote by $\langle{\theta_{\star}}\rangle$ the set of all parameters $\theta\in\Theta$ satisfying the two following equations

[TABLE]

It is important to note that $\langle{\theta_{\star}}\rangle$ of Section 2.3 depends on the choice of the class of functions $\left\{\tilde{\psi}^{\theta}\langle\cdot\rangle:\theta\in\Theta\right\}$ and that Assumption (A-2) alone is not sufficient to define each $\tilde{\psi}^{\theta}\langle\cdot\rangle$ on the whole set $\mathrm{U}^{\mathbb{Z}_{\leq 0}}$ of trajectories, since Relation (2.17) is only required to hold $\mathbb{P}^{\theta}\mbox{-a.s.}$ We now provide a Lipschitz condition on the iterates of the link function $\tilde{\psi}^{\theta}$ and a moment condition on $U_{0}$ that allow us to build a natural class of functions $\left\{\tilde{\psi}^{\theta}\langle\cdot\rangle:\theta\in\Theta\right\}$ that satisfies (A-2). Whenever we need some metric on the space $\mathsf{Z}$, we assume the following.

(A-3)

The $\sigma$ -fields $\mathcal{X}$ and $\mathcal{U}$ are Borel ones, respectively associated to $(\mathsf{X},\boldsymbol{\delta}_{\mathsf{X}})$ and $(\mathrm{U},\boldsymbol{\delta}_{\mathrm{U}})$ , both assumed to be complete and separable metric spaces.

Recall that, for any finite $\mathrm{U}$ -valued sequence $u$ , the mapping $\tilde{\psi}^{\theta}\langle u\rangle$ is defined by (2.9) following the recursion in (2.10). Define, for all $n\in\mathbb{Z}_{>0}$ , the Lipschitz constant for $\tilde{\psi}^{\theta}\langle u\rangle$ , uniform over $u\in\mathrm{U}^{n}$ ,

[TABLE]

where we set, for all $v=v_{1:(p+q-1)}\in\mathsf{Z}$ and $v^{\prime}=v^{\prime}_{1:(p+q-1)}\in\mathsf{Z}$ ,

[TABLE]

We use the following assumptions to define the class of functions $\left\{\tilde{\psi}^{\theta}\langle\cdot\rangle:\theta\in\Theta\right\}$ .

(A-4)

For all $\theta\in\Theta$, we have $\mathrm{Lip}_{1}^{\theta}<\infty$ and $\mathrm{Lip}_{n}^{\theta}\to 0$ as $n\to\infty$.

(A-5)

There exist $x^{(\text{\tiny{i}})}_{1}\in\mathsf{X}$ and, if $q>1$, $u^{(\text{\tiny{i}})}_{1}\in\mathrm{U}$ such that the constant vectors $x^{(\text{\tiny{i}})}=(x^{(\text{\tiny{i}})}_{1},\dots,x^{(\text{\tiny{i}})}_{1})\in\mathsf{X}^{p}$ and $u^{(\text{\tiny{i}})}=(u^{(\text{\tiny{i}})}_{1},\dots,u^{(\text{\tiny{i}})}_{1})\in\mathrm{U}^{q-1}$ satisfy, for all ${\theta_{\star}},\theta\in\Theta$,

[TABLE]

where we defined, for all $u\in\mathsf{Y}$ ,

[TABLE]

with the convention $\boldsymbol{\delta}_{\mathrm{U}}(u^{(\text{\tiny{i}})}_{1},u)=0$ if $q=1$.

(A-6)

For all $\theta\in\Theta$ and $u\in\mathrm{U}^{q}$ , the reduced link function $\tilde{\psi}^{\theta}_{u}$ is continuous on $\mathsf{X}^{p}$ .

Obviously, under (A-4), for all $\theta\in\Theta$ and $u\in\mathrm{U}^{\mathbb{Z}_{\leq 0}}$ , the asymptotic behavior of $\tilde{\psi}^{\theta}\langle u_{(-n):0}\rangle(z)$ as $n\to\infty$ does not depend on $z\in\mathsf{Z}$ . We can thus denote

[TABLE]

and keep in mind that the initial point $z$ has no influence on these two definitions.
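To make this forgetting property concrete, here is a minimal Python sketch for a scalar linear link $x\mapsto\omega+ax+bu$ with $p=q=1$. The numerical values and the function name `iterate_link` are illustrative assumptions, not taken from the paper. The link is iterated along one fixed input sequence from two different initial points; when $0\leq a<1$, the gap after $n$ steps equals $a^{n}$ times the initial gap, which is the behavior required by (A-4) in this simple case.

```python
def iterate_link(omega, a, b, inputs, x0):
    """Apply x -> omega + a*x + b*u successively along `inputs`, starting from x0."""
    x = x0
    for u in inputs:
        x = omega + a * x + b * u
    return x


# Illustrative (hypothetical) parameter values with 0 <= a < 1.
omega, a, b = 0.1, 0.5, 0.3
# Arbitrary fixed input sequence of length n = 20.
inputs = [1.0, 0.2, 2.5, 0.0, 1.3] * 4

x_from_0 = iterate_link(omega, a, b, inputs, x0=0.0)
x_from_10 = iterate_link(omega, a, b, inputs, x0=10.0)

# The two trajectories share the same inputs, so their gap is a**n * |10 - 0|.
gap = abs(x_from_10 - x_from_0)
predicted_gap = a ** len(inputs) * 10.0
print(gap, predicted_gap)
```

With $a\geq 1$ the same experiment shows a gap that does not vanish, so the limit would depend on the initial point.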

By (2.15), we further have the following result using the definitions in (2.25).

[TABLE]

and $\tilde{\Psi}^{\theta}\langle u\rangle$ and $\tilde{\psi}^{\theta}\langle u\rangle$ are related for all $u\in\mathrm{D}^{\theta}$ through the formulas

[TABLE]

Based on these definitions, we now introduce subsets of $\mathsf{Z}$ of particular interest.

Definition (Set $\mathrm{E}^{\theta}$).

If Assumption (A-4) holds, we set, for any $\theta\in\Theta$ ,

[TABLE]

where $\tilde{\Psi}^{\theta}\langle\cdot\rangle$ and $\mathrm{D}^{\theta}$ are defined by (2.26).

Remark 4.

Suppose that, for all $\theta\in\Theta$ , we have $U_{(-\infty):0}\in\mathrm{D}^{\theta}$ , $\tilde{\mathbb{P}}^{\theta}\mbox{-a.s.}$ , and suppose that (A-2) holds for $\tilde{\psi}^{\theta}\langle\cdot\rangle$ as in (2.25). Then, by (2.3) and (2.28), we have $Z_{1}\in\mathrm{E}^{\theta}$ , $\mathbb{P}^{\theta}\mbox{-a.s.}$ Since $\mathbb{P}^{\theta}$ is shift-invariant, we get that $\left\{Z_{k}:k\in\mathbb{Z}\right\}$ takes its values in $\mathrm{E}^{\theta}$ , $\mathbb{P}^{\theta}\mbox{-a.s.}$ This is why the set $\mathrm{E}^{\theta}$ will be of interest in the following.

The following result is proved in Section 6.1.

Lemma.

Consider an ODM $(p,q)$ satisfying (A-1) with $p,q\in\mathbb{Z}_{>0}$ or an ODMX $(p,q,r)$ satisfying (A’-1) with $p,q,r\in\mathbb{Z}_{>0}$ . Suppose that (A-3), (A-4) and (A-5) hold. Then, for all $\theta,{\theta_{\star}}\in\Theta$ , we have $U_{(-\infty):0}\in\mathrm{D}^{\theta}$ , $\tilde{\mathbb{P}}^{{\theta_{\star}}}\mbox{-a.s.}$ , (A-2) holds and, setting $\mathrm{E}^{\theta_{\star}}$ as in Section 2.3, we have

[TABLE]

If moreover (A-6) is assumed, then (2.21) holds for all $\theta\in\Theta$ . Consequently, the set $\langle{\theta_{\star}}\rangle$ in Section 2.3 can be expressed as

[TABLE]

Remark 5.

The invertibility Assumption (A-2) is essential for deriving the identifiability class $[{\theta_{\star}}]$ using the set $\langle{\theta_{\star}}\rangle$. Section 2.3 can be used to prove it in all the examples that are considered hereafter. Indeed, as will be checked in Section 5, all the considered examples satisfy the following facts:

(1)

The sets $\mathsf{X}$ and $\mathrm{U}$ are closed subsets of finite dimensional normed spaces and (A-3) follows.

(2)

Assumption (A-4) is weaker than what is needed for proving the ergodicity assumption (A-1). Consider for instance the classical GARCH(1,1) model defined by setting $\Upsilon(y)=y^{2}$, $\tilde{\psi}^{\theta}_{u}(x)=\omega+ax+bu$ and $G^{\theta}(x,\cdot)=\mathbb{P}(\sqrt{x}\,\varepsilon\in\cdot)$ where $\varepsilon$ is centered with variance 1. Then it is easily seen that (A-4) is equivalent to $a<1$. On the other hand, the Lyapunov condition to get (A-1) reads $\mathbb{E}\log(b\varepsilon^{2}+a)<0$, which implies $a<1$.

(3)

The moment condition (A-5) is implied by $\mathbb{E}^{\theta_{\star}}\left[\log^{+}(|Y_{0}|)\right]<\infty$, where $|\cdot|$ is some norm, and this condition holds as a byproduct of the proof of (A-1) (which often implies $\mathbb{E}^{\theta_{\star}}\left[|Y_{0}|^{s}\right]<\infty$ for some $s>0$).

(4)

One can readily check (A-6).

Note also that the set in the left-hand side of (2.30) contains the set in the left-hand side of (2.16). In all our examples, the assumptions of 4.1 below will be shown to hold, implying that the inclusion in (2.30) is in fact an equality. In some of these examples, however, the inclusion in (2.16) is strict, showing that the sets in the left-hand sides of (2.16) and (2.30) may happen to be different.
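The gap between the two conditions of item (2) for the GARCH(1,1) model can be seen numerically. The sketch below is only an illustration under an added assumption: the innovation $\varepsilon$ is taken standard Gaussian (the text only requires it to be centered with variance 1), and the function name and parameter values are hypothetical. It estimates $\mathbb{E}\log(b\varepsilon^{2}+a)$ by Monte Carlo for two parameter pairs that both satisfy $a<1$.

```python
import math
import random


def lyapunov_exponent(a, b, n_samples=20000, seed=0):
    """Monte Carlo estimate of E[log(b*eps**2 + a)] for standard Gaussian eps.

    Assumes a > 0 and b >= 0 so that the logarithm is well defined.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)
        total += math.log(b * eps * eps + a)
    return total / n_samples


# a < 1 and a negative estimate: both (A-4) and the Lyapunov condition hold.
stable = lyapunov_exponent(a=0.5, b=0.3)
# a < 1 still holds, but the estimate is positive: the Lyapunov condition fails.
unstable = lyapunov_exponent(a=0.9, b=5.0)
print(stable, unstable)
```

The second pair shows that $a<1$ alone does not give the Lyapunov condition, in line with the statement that (A-4) is the weaker requirement.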

3. Examples

We give a non-exhaustive list of possible examples related to the previous definitions and for which our results apply, as will be shown in Section 5.

3.1. Standard LODMs

Many models can be considered as an LODM by choosing an appropriate admissible mapping $\Upsilon$ .

GARCH. The standard GARCH $(p,q)$ model is a special case of LODM $(p,q)$ , in which case $\mathsf{X}=\mathbb{R}_{\geq 0}$ , $\mathsf{Y}=\mathbb{R}$ , $\Upsilon(y)=y^{2}$ , and $G^{\theta}(x,\cdot)$ is a centered distribution with variance $x$ , most commonly the normal distribution.

INGARCH. The standard Poisson integer-valued GARCH (INGARCH, see e.g. [12]) obviously is an LODM $(p,q)$ with $\mathsf{X}=\mathbb{R}_{\geq 0}$ , $\mathsf{Y}=\mathbb{Z}_{\geq 0}$ and $G^{\theta}(x,\cdot)$ is the Poisson distribution with mean $x$ .

Extensions of INGARCH. Many extensions of the INGARCH model simply consist in replacing the Poisson distribution by a more general one: the NBIN-GARCH model of [30], the COM-Poisson INGARCH model of [31], the zero-inflated Poisson GARCH of [32], or the mixed-Poisson integer GARCH (MPINGARCH) of [25], among others. These extensions often use an extra parameter to define the distribution $G^{\theta}(x,\cdot)$. This extra parameter can be taken either as known, in which case $G^{\theta}$ does not depend on $\theta$, or as unknown, in which case $G^{\theta}$ only depends on a subparameter of $\theta$. Some integer-valued observation-driven models require a non-identity admissible mapping in order to be seen as an LODM. For instance, the log-linear Poisson GARCH model of [13] is an LODM $(p,q)$ by taking $\Upsilon(y)=\ln(1+y)$ and $G^{\theta}(x,\cdot)$ as the Poisson distribution with mean $\mathrm{e}^{x}$.
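The role played by the admissible mapping $\Upsilon$ and the observation kernel $G^{\theta}$ can be sketched in a few lines of Python. The simulator below is a hypothetical illustration for $p=q=1$, assuming the linear recursion $X_{k+1}=\omega+aX_{k}+b\,\Upsilon(Y_{k})$ with $Y_{k}$ drawn from $G^{\theta}(X_{k};\cdot)$. The function names, parameter values and the Poisson sampler are assumptions made for the example and do not come from the paper.

```python
import math
import random


def sample_poisson(rng, mean):
    """Knuth's multiplication method; adequate for moderate means."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_lodm11(omega, a, b, upsilon, sample_obs, n, x0, seed=0):
    """Simulate X_{k+1} = omega + a*X_k + b*upsilon(Y_k), with Y_k ~ G(X_k; .)."""
    rng = random.Random(seed)
    x, xs, ys = x0, [], []
    for _ in range(n):
        y = sample_obs(rng, x)
        xs.append(x)
        ys.append(y)
        x = omega + a * x + b * upsilon(y)
    return xs, ys


# Poisson INGARCH(1,1): identity mapping, Poisson kernel with mean x.
xs_lin, ys_lin = simulate_lodm11(
    0.5, 0.4, 0.3,
    upsilon=lambda y: y,
    sample_obs=lambda rng, x: sample_poisson(rng, x),
    n=200, x0=1.0)

# Log-linear Poisson GARCH(1,1): upsilon(y) = ln(1 + y), Poisson mean exp(x).
xs_log, ys_log = simulate_lodm11(
    0.2, 0.3, 0.2,
    upsilon=lambda y: math.log1p(y),
    sample_obs=lambda rng, x: sample_poisson(rng, math.exp(x)),
    n=200, x0=0.0)
```

Only the two callables change between the models; the linear recursion itself is shared.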

All the above examples are LODMs with a similar parametrization of the linear link function. In fact they only differ through the admissible mapping $\Upsilon$ or the observation kernel $G^{\theta}$ . We assemble them using the following definition.

Definition (Standard LODM (with unknown observation kernel)).

An LODM( $p,q$ ) of Section 2.1 is said to be standard if $\theta=(\omega,a_{1:p},b_{1:q})\in\Theta\subset\mathbb{R}^{1+p+q}$ with $\boldsymbol{\omega}(\theta)=\omega$ , $A_{k}(\theta)=a_{k}$ for all $1\leq k\leq p$ and $B_{k}(\theta)=b_{k}$ for all $1\leq k\leq q$ , and $G^{\theta}$ does not depend on $\theta$ , in which case we denote it by $G$ . It is said to be standard with unknown observation kernel if the same holds with $\theta=(\vartheta,\varphi)\in\Theta\subset\mathbb{R}^{1+p+q}\times\Phi$ where $\vartheta=(\omega,a_{1:p},b_{1:q})$ and $\Phi$ is some parameter set, and $G^{\theta}$ only depends on $\varphi$ , in which case we denote it by $G^{\varphi}$ .

In this definition the parameter $\varphi$ is used in the case where the observation kernel depends on an unknown extra parameter, as considered in [25] for the class of MPINGARCH($p,q$) models, which includes the NBIN-GARCH model. A necessary and sufficient condition for the identifiability of standard LODMs with known or unknown observation kernel is provided in Theorem 2 below and applies to all the examples listed in this section.

3.2. A bivariate example

Let us extend Section 3.1 to the vector case as follows.

Definition (Standard VLODM (with unknown observation kernel)).

A VLODM( $p,q,p^{\prime},q^{\prime}$ ) is said to be standard if $\theta=(\boldsymbol{\omega},A_{1:p},B_{1:q})\in\Theta\subset\mathbb{R}^{p^{\prime}}\times\left(\mathbb{R}^{p^{\prime}\times p^{\prime}}\right)^{p}\times\left(\mathbb{R}^{p^{\prime}\times q^{\prime}}\right)^{q}$ with $\boldsymbol{\omega}(\theta)=\omega$ , $A_{k}(\theta)=A_{k}$ for all $1\leq k\leq p$ and $B_{k}(\theta)=B_{k}$ for all $1\leq k\leq q$ , and $G^{\theta}$ does not depend on $\theta$ , in which case we denote it by $G$ . It is said to be standard with unknown observation kernel if the same holds with $\theta=(\vartheta,\varphi)\in\Theta\subset\mathbb{R}^{p^{\prime}}\times\left(\mathbb{R}^{p^{\prime}\times p^{\prime}}\right)^{p}\times\left(\mathbb{R}^{p^{\prime}\times q^{\prime}}\right)^{q}\times\Phi$ where $\vartheta=(\boldsymbol{\omega},A_{1:p},B_{1:q})$ and $\Phi$ is some parameter set, and $G^{\theta}$ only depends on $\varphi$ , in which case we denote it by $G^{\varphi}$ .

Then the bivariate integer-valued GARCH model of [9] is a standard VLODM($1,1,2,2$) with unknown observation kernel defined, for all $\varphi\in\Phi=[-\overline{\varphi},\overline{\varphi}]$ (where $\overline{\varphi}>0$ is some constant), $(x_{1},x_{2})\in\mathbb{R}_{>0}^{2}$ and $(y_{1},y_{2})\in\mathbb{Z}_{\geq 0}^{2}$, by

[TABLE]

where $c=1-1/\mathrm{e}$. Since we have $p=q=1$ in this example, we simply denote $\theta=(\boldsymbol{\omega},A,B,\varphi)\in\mathbb{R}^{2}\times\mathbb{R}^{2\times 2}\times\mathbb{R}^{2\times 2}\times[-\overline{\varphi},\overline{\varphi}]$.

3.3. Non-linear GARCH

The non-linear GARCH model of [17] is an ODM( $p,q$ ) with

[TABLE]

where $\eta$ is a real valued random variable. Two cases are considered in [17] :

Case 1)

If the exponent $\delta$ is known, we set $\theta=(\omega,a_{1:p},\mathbf{b}_{1:q})$ with $\mathbf{b}_{k}=\begin{bmatrix}b_{k}(1)&b_{k}(2)\end{bmatrix}$ for $k=1,\dots,q$, and $\Upsilon(y)=((y^{+})^{\delta},(y^{-})^{\delta})$, in which case we have a standard VLODM $(p,q,1,2)$ of Section 3.2 with known observation kernel and with the parameters $A_{k}$ denoted by $a_{k}$ for $k=1,\dots,p$ and the parameters $B_{k}$ denoted by $\mathbf{b}_{k}$ for $k=1,\dots,q$.

Case 2)

If the exponent $\delta$ is unknown, we set $\theta=(\omega,a_{1:p},\mathbf{b}_{1:q},\delta)$ and $\Upsilon(y)=(y^{+},y^{-})$ , in which case $\delta$ must be included in the definition of $\tilde{\psi}^{\theta}$ .

To the best of our knowledge, this kind of model has not been extended to the case of (signed) integer-valued time series.

3.4. The SETPAR model

Other non-linear ODM’s that cannot be cast into an LODM or a VLODM can be found in [6]. We consider here the self-excited threshold Poisson autoregression (SETPAR) model originally studied in [28], which is an ODM(1,1), integer valued ( $\mathsf{Y}=\mathbb{Z}_{\geq 0}$ ), with link function defined for all $\theta=(\omega_{1},\omega_{2},a_{1},a_{2},b_{1},b_{2},r)\in\Theta\subset\mathbb{R}_{\geq 0}^{6}\times\mathbb{Z}_{\geq 0}$ by

[TABLE]

with $G^{\theta}(x,\cdot)$ being the usual Poisson distribution with mean $x$ .

3.5. The PARX model

Our last example is the Poisson autoregression with exogenous covariates (PARX) model of [1]. The PARX model is similar to the standard INGARCH( $p,q$ ) model above but with additional exogenous variables entering into the link function for generating the hidden variables. The exogenous variables are assumed to satisfy some Markov dynamic of order 1 (see [1, Assumption 1]). Thus it is an ODMX( $p,q,1$ ). Eq. (1) in [1] corresponds to setting our $G^{\theta}(x,\cdot)$ as the Poisson distribution with mean $x$ . Eq. (2) in [1] corresponds to setting for all $x=x_{0:(p-1)}\in\mathbb{R}^{p}$ and $u=(y_{0:(q-1)},v)\in\mathbb{R}^{q}\times\mathsf{V}$ ,

[TABLE]

where $f(\cdot,\gamma):\mathsf{V}\to\mathbb{R}_{\geq 0}$ is a known function and $\theta=(\omega,a_{1},\dots,a_{p},b_{1},\dots,b_{q},\gamma)$ is the unknown parameter of the model. Note that our $Y_{k}$ , $X_{k}$ , $V_{k}$ , $a_{1:p}$ and $b_{1:q}$ correspond to their $y_{k}$ , $\lambda_{k}$ , $x_{k}$ , $\beta_{1:q}$ and $\alpha_{1:p}$ , respectively. Identifiability is considered in [1] by specifying $\gamma$ as $\gamma=\gamma_{1:d}\in\mathbb{R}_{\geq 0}^{d}$ for some positive integer $d$ (which corresponds to $d_{x}$ in [1]) and $f(v,\gamma)$ as being of the form

[TABLE]

for some known functions $f_{1},\dots,f_{d}:\mathsf{V}\to\mathbb{R}_{\geq 0}$. It is in fact imposed in [1] that $v=v_{1:d}\in\mathbb{R}^{d}=\mathsf{V}$ and that $f_{i}(v)$ is a function of $v_{i}$ only for each $i\in\{1,\dots,d\}$, but this constraint can be dropped to achieve wider generality without additional theoretical difficulties. In our setting, the specific form of $f(v,\gamma)$ in (3.4) amounts to specializing the previous ODMX($p,q,1$) with reduced link function as in (3.3) to a VLODMX($p,q,1,d+1,1$) with $\Upsilon(y,v)=(y,f_{1}(v),\dots,f_{d}(v))\in\mathrm{U}=\mathbb{R}^{1+d}$, $A_{k}(\theta)=a_{k}$ for $k=1,\dots,p$, $B_{1}(\theta)=\begin{bmatrix}b_{1}&\gamma_{1}&\dots&\gamma_{d}\end{bmatrix}$ and $B_{k}(\theta)=\begin{bmatrix}b_{k}&0&\dots&0\end{bmatrix}$ for $k=2,\dots,q$. Then $\theta=(\omega,a_{1},\dots,a_{p},b_{1},\dots,b_{q},\gamma)$ with $\gamma=\gamma_{1:d}\in\mathbb{R}_{\geq 0}^{d}$ and it follows that $\Theta$ is a subset of $\mathbb{R}_{\geq 0}^{1+p+q+d}$.

4. Main results

4.1. General setting

To investigate the identifiability of the model, we first introduce an assumption which says how much can be identified from a single observation of the conditional distribution $G^{\theta}(x,\cdot)$ .

(B-1)

For all ${\theta_{\star}}\in\Theta$ there exists $[{\theta_{\star}}]_{G}\subset\Theta$ such that, for all $\theta\in\Theta$ and $x,x^{\prime}\in\mathsf{X}$ ,

[TABLE]

It can be convenient to write the parameters as $\theta=(\vartheta,\varphi)$ so that $G^{\theta}$ only depends on $\varphi$ , hence can be denoted by $G^{\varphi}$ , and the link function $\psi^{\theta}$ only depends on $\vartheta$ , hence can be denoted by $\psi^{\vartheta}$ . In this case, the “if” in (B-1) holds by setting $[{\theta_{\star}}]_{G}=\left\{(\vartheta,\varphi)\in\Theta:\varphi=\varphi_{\star}\right\}$ for ${\theta_{\star}}=(\vartheta_{\star},\varphi_{\star})$ , and the “only if” in (B-1) says that $(\varphi,x)\mapsto G^{\varphi}(x,\cdot)$ is one-to-one. In many examples $G^{\theta}$ does not depend on $\theta$ at all, in which case $[{\theta_{\star}}]_{G}=\Theta$ . See (SL’-3) below in Section 5.1 for such a case.

Our approach to establish identifiability is given by the following general result.

Proposition.

Consider an ODM $(p,q)$ satisfying (A-1) with $p,q\in\mathbb{Z}_{>0}$ or an ODMX $(p,q,r)$ satisfying (A’-1) with $p,q,r\in\mathbb{Z}_{>0}$. Let $\left\{\tilde{\psi}^{\theta}\langle\cdot\rangle:\theta\in\Theta\right\}$ be a class of measurable functions from $\mathrm{U}^{\mathbb{Z}_{\leq 0}}$ to $\mathsf{X}$ satisfying (A-2). Suppose moreover that (B-1) holds. Then, for all ${\theta_{\star}}\in\Theta$, we have

[TABLE]

where $\langle{\theta_{\star}}\rangle$ , $[{\theta_{\star}}]_{G}$ and $[{\theta_{\star}}]$ are respectively defined in Section 2.3, Assumption (B-1) and Section 2.3.

The proof is postponed to Section 6.2 for convenience. We now derive the main result of this section, which provides sufficient conditions in order to fully describe the set $\langle{\theta_{\star}}\rangle$ . To this end, we introduce the following assumption, in which, by saying that a probability measure $\mu$ on $(\mathrm{U},\mathcal{U})$ is non-degenerate with respect to the class $\mathcal{C}\subset\mathcal{U}$ , we mean that, for any $A\in\mathcal{C}$ , $\mu(A)=1$ can only be true if $A=\mathrm{U}$ .

(A-7)

For all $\theta\in\Theta$ and $x\in\mathsf{X}$ , the measure $\tilde{G}^{\theta}(x;\cdot)$ defined by (2.6) on $(\mathrm{U},\mathcal{U})$ is non-degenerate with respect to the class $\mathcal{C}^{\theta}$ ,

where $\mathcal{C}^{\theta}$ denotes the class containing all sets $A\in\mathcal{U}$ for which there exist $\theta,\theta^{\prime}\in\Theta$ , $z\in\mathsf{Z}$ , $k,l\in\mathbb{Z}_{\geq 0}$ , and $v,w\in\mathrm{U}^{k}\times\mathrm{U}^{l}$ such that

[TABLE]

Remark 6.

The non-degeneracy assumption (A-7) is easy to check in the following two cases.

(1)

If for all $\theta\in\Theta$, $x\in\mathsf{X}$ and $u\in\mathrm{U}$, we have $\tilde{G}^{\theta}(x,\{u\})>0$, then for any set $A\in\mathcal{U}$ we have $\tilde{G}^{\theta}(x,A)=1$ if and only if $A=\mathrm{U}$. Thus (A-7) is immediately satisfied.

(2)

In the VLODM case, that is, with reduced link function given by (2.8), we immediately see that $\mathcal{C}^{\theta}$ only contains affine subsets (being the null space of an affine function). Hence we only need to require that $\tilde{G}^{\theta}(x;\cdot)$ does not have full measure on affine hyperplanes to ensure that it is non-degenerate with respect to the class $\mathcal{C}^{\theta}$ .

In the case of an ODMX, the definition of $\tilde{G}^{\theta}$ is different (see Remark 1 (5)), and Assumption (A-7) has to be adapted as follows.

(A’-7)

For all $\theta\in\Theta$ and $w\in\mathsf{X}\times\mathsf{V}^{r}$ , the measure $\tilde{G}^{\theta}(w;\cdot)$ defined by (2.7) on $(\mathrm{U},\mathcal{U})$ is non-degenerate with respect to the class $\mathcal{C}^{\theta}$ .

We have the following result.

Theorem 1.

Consider an ODM $(p,q)$ satisfying (A-1) and (A-7) with $p,q\in\mathbb{Z}_{>0}$ or an ODMX $(p,q,r)$ satisfying (A’-1) and (A’-7) with $p,q,r\in\mathbb{Z}_{>0}$ . Assume that (A-3)–(A-6) hold. For all $\theta\in\Theta$ , define $\mathrm{D}^{\theta}$ and $\tilde{\psi}^{\theta}\langle\cdot\rangle$ by (2.25). Then (A-2) holds and we have, for all ${\theta_{\star}}\in\Theta$ ,

[TABLE]

where $\langle{\theta_{\star}}\rangle$ and $\mathrm{E}^{\theta_{\star}}$ are as in Section 2.3 and Section 2.3.

Proof.

The fact that (A-2) holds for the given choice of $\tilde{\psi}^{\theta}\langle\cdot\rangle$ follows from Section 2.3.

Let us now take ${\theta_{\star}}\in\Theta$ and $\theta\in\langle{\theta_{\star}}\rangle$ and show that $\theta$ belongs to the right-hand side of (4.1). We prove this in the case of an ODMX satisfying (A’-1). (The case of an ODM is readily obtained by removing the variables $V_{k}$’s from the reasoning.) By (2.17), (2.20) and (2.21), and since $\mathbb{P}^{\theta_{\star}}$ is stationary, we have, for all $t\in\mathbb{Z}$,

[TABLE]

Using (2.3) and the notation introduced in Section 2.2, we get that, for any $n\in\mathbb{Z}_{\geq 0}$ ,

[TABLE]

We now show that this implies

(Hk)

For all $u\in\mathrm{U}^{k}$ , we have $\displaystyle\tilde{\psi}^{{\theta_{\star}}}\langle(U_{0:(n-k)},u)\rangle(Z_{0})=\tilde{\psi}^{\theta}\langle(U_{0:(n-k)},u)\rangle(Z_{0})$ $\mathbb{P}^{\theta_{\star}}\mbox{-a.s.}$

by iterative reasoning on $k=0,\dots,n+1$ . First observe that (4.2) corresponds to H0 (since $u$ is an empty sequence). Now assume that Hk holds for some $k=0,\dots,n$ . Then, for any $v\in\mathrm{U}^{k}$ , the set

[TABLE]

has probability 1 under the $\mathbb{P}^{\theta_{\star}}$ -conditional probability of $U_{n-k}$ given $Z_{0},U_{0:(n-k-1)},V_{(-\infty):(n-k-1)}$ . This conditional probability is $\tilde{G}^{{\theta_{\star}}}((X_{n-k},V_{(n-k-r):(n-k-1)});\cdot)$ defined by (2.7). By (A’-7) and since, given $Z_{0},U_{0:(n-k-1)},V_{(-\infty):(n-k-1)}$ , we have $A\in\mathcal{C}^{\theta}$ , $\mathbb{P}^{\theta_{\star}}\mbox{-a.s.}$ , we obtain that Hk+1 is true. Reasoning by induction, this leads to Hn+1, and finally, we get that

(H)

there exists $z\in\mathsf{Z}$ such that for all $n\in\mathbb{Z}_{\geq 0}$ and all $u\in\mathrm{U}^{n+1}$ , $\tilde{\psi}^{{\theta_{\star}}}\langle u\rangle(z)=\tilde{\psi}^{\theta}\langle u\rangle(z)$ .

Now let $(z^{(\text{\tiny{i}})},u_{1})\in\mathrm{E}^{\theta_{\star}}\times\mathrm{U}$ . By definition of $\mathrm{E}^{\theta_{\star}}$ , there exists $u\in\mathrm{U}^{\mathbb{Z}_{\leq 0}}$ , such that $z^{(\text{\tiny{i}})}=\tilde{\Psi}^{{\theta_{\star}}}\langle u\rangle$ . Now, for all $n\geq p$ , we have

[TABLE]

where we chose $z\in\mathsf{Z}$ in order to apply Assertion (H) in the second equality. On the other hand, by (2.26), we have

[TABLE]

where we again used that $z$ was chosen as in (H) in the last equality. With (A-6) and the previous display we obtain $\tilde{\Psi}^{\theta}_{u_{1}}(z^{(\text{\tiny{i}})})=\tilde{\Psi}^{\theta_{\star}}_{u_{1}}(z^{(\text{\tiny{i}})})$. This is true for an arbitrary $(z^{(\text{\tiny{i}})},u_{1})\in\mathrm{E}^{\theta_{\star}}\times\mathrm{U}$; hence, we have obtained that the left-hand side of (4.1) is included in its right-hand side.

We now prove the opposite inclusion. Let ${\theta_{\star}},\theta\in\Theta$ be such that

[TABLE]

Let $v\in\mathrm{U}^{\mathbb{Z}_{\leq 0}}$. Take an arbitrary $z^{(\text{\tiny{i}})}\in\mathrm{E}^{\theta_{\star}}$. Then, by definition of $\mathrm{E}^{\theta_{\star}}$, there exists $w\in\mathrm{D}^{\theta_{\star}}$ such that $z^{(\text{\tiny{i}})}=\tilde{\Psi}^{\theta_{\star}}\langle w\rangle$. For all $n\in\mathbb{Z}_{>0}$ and $k=1,\dots,n$, we get that

[TABLE]

Applying (4.3) recursively in $k$ , we get that, for any $n\in\mathbb{Z}_{\geq 0}$ ,

[TABLE]

Hence, $\mathrm{D}^{\theta}=\mathrm{D}^{\theta_{\star}}$ and by (2.26) and (2.27), we get that for all $v\in\mathrm{D}^{\theta}=\mathrm{D}^{\theta_{\star}}$ , $\tilde{\psi}^{\theta}\langle v\rangle=\tilde{\psi}^{{\theta_{\star}}}\langle v\rangle$ . By Section 2.3 we have $U_{(-\infty):0}\in\mathrm{D}^{\theta_{\star}}$ , $\mathbb{P}^{\theta_{\star}}\mbox{-a.s.}$ , and using (2.31), we get that $\theta\in\langle{\theta_{\star}}\rangle$ , which concludes the proof. ∎

Note that in (4.1), as in the left-hand side of (2.30), the functions $\tilde{\Psi}^{\theta}_{u}$ and $\tilde{\Psi}^{{\theta_{\star}}}_{u}$ are only required to coincide on $\mathrm{E}^{\theta_{\star}}$ whereas in the left-hand side of (2.16), they coincide on the whole set $\mathsf{Z}$ . In some cases, we can prove that the two conditions are the same, so that Section 4.1 and Theorem 1 allow us to conclude that the inclusion in (2.16) is in fact an equality, as in the following result.

Corollary.

Suppose that the assumptions of Theorem 1 and (B-1) hold. Let ${\theta_{\star}}\in\Theta$ . Then the inclusion in (2.30) is an equality. Suppose moreover that ${\theta_{\star}}$ satisfies the following additional assumption.

(A-8)

For all $\theta\in\Theta$ , $u\in\mathrm{U}$ and $z\in\mathsf{Z}$ , if $\tilde{\Psi}^{\theta}_{u}$ and $\tilde{\Psi}^{\theta_{\star}}_{u}$ coincide on the set $\mathrm{E}^{\theta_{\star}}(z):=\left\{\tilde{\Psi}^{{\theta_{\star}}}\langle v\rangle(z):n\in\mathbb{Z}_{\geq 0},\,v\in\mathrm{U}^{n}\right\}$ , then they also coincide on $\mathsf{Z}$ .

Then the inclusion in (2.16) is an equality.

Proof.

Applying Section 4.1 and Theorem 1, we get that

[TABLE]

Observing that, by (B-1), $\theta\in[{\theta_{\star}}]_{G}$ is equivalent to having $G^{\theta}=G^{\theta_{\star}}$, we get that the inclusion in (2.30) is an equality. To prove the second assertion of the corollary, we only need to check that, under (A-8), for all $\theta\in\Theta$ and $u\in\mathrm{U}$, if $\tilde{\Psi}^{\theta}_{u}$ and $\tilde{\Psi}^{{\theta_{\star}}}_{u}$ coincide on $\mathrm{E}^{\theta_{\star}}$, they must also coincide on $\mathsf{Z}$. It suffices to show that there exists $z\in\mathsf{Z}$ such that $\mathrm{E}^{\theta_{\star}}(z)\subset\mathrm{E}^{\theta_{\star}}$. This inclusion is true if $z\in\mathrm{E}^{\theta_{\star}}$ and we conclude by observing that $\mathrm{E}^{\theta_{\star}}$ is not empty since it contains $Z_{1}$, $\mathbb{P}^{\theta_{\star}}\mbox{-a.s.}$, as a consequence of Remark 4. ∎

Remark 7.

A simple case where (A-8) in Section 4.1 is easy to check is when $p=q=1$ , so that $\mathsf{Z}=\mathsf{X}$ and $\tilde{\Psi}^{{\theta_{\star}}}\langle v\rangle(z)=\tilde{\psi}^{\theta_{\star}}_{v}(z)$ for all $v\in\mathrm{U}$ and $z\in\mathsf{X}$ . See the proof of Theorem 4 for a specific example. However it may happen that (A-8) is not satisfied as will be seen in Remark 8 (5). In the linear case, we will characterize $\langle{\theta_{\star}}\rangle$ in Section 4.2 without relying on (A-8).

4.2. Vector linear setting

We now consider a VLODM( $p,q,p^{\prime},q^{\prime}$ ) or a VLODMX( $p,q,p^{\prime},q^{\prime},r$ ), that is, we assume the reduced link function to be of the form (2.8). We set in this case $\boldsymbol{\delta}_{\mathsf{X}}(x,x^{\prime})=|x-x^{\prime}|$ (resp. $\boldsymbol{\delta}_{\mathrm{U}}(u,u^{\prime})=|u-u^{\prime}|$ ) where $|\cdot|$ denotes an arbitrary norm in $\mathbb{R}^{p^{\prime}}$ (resp. $\mathbb{R}^{q^{\prime}}$ ). The general conditions reduce to the following set of conditions.

(L-1)

For all $\theta\in\Theta$ , we have that $\mathrm{I}_{p^{\prime}}-\sum_{k=1}^{p}A_{k}(\theta)z^{k}$ is invertible for all $z\in\mathbb{C}$ with $|z|\leq 1$ ,

where $\mathrm{I}_{p^{\prime}}$ denotes the identity matrix of order $p^{\prime}$ .

(L-2)

For all $\theta\in\Theta$ and $x\in\mathsf{X}$ , the measure $\tilde{G}^{\theta}(x;\cdot)$ defined on $(\mathrm{U},\mathcal{U})$ by (2.6) is non-degenerate in the following sense : there is no affine hyperplane $A\subset\mathbb{R}^{q^{\prime}}$ such that $\tilde{G}^{\theta}(x;A)=1$ .

Note that, if $q^{\prime}=1$ , affine hyperplanes are singletons, hence (L-2) simply means that, for all $x$ and $\theta$ , $\tilde{G}^{\theta}(x;\cdot)$ does not reduce to a unit mass concentrated on a single point. In the case of a VLODMX, we replace (L-2) by the following.

(L’-2)

For all $\theta\in\Theta$ and $w\in\mathsf{X}\times\mathsf{V}^{r}$ , the measure $\tilde{G}^{\theta}(w;\cdot)$ defined on $(\mathrm{U},\mathcal{U})$ by (2.7) is non-degenerate in the following sense : there is no affine hyperplane $A\subset\mathbb{R}^{q^{\prime}}$ such that $\tilde{G}^{\theta}(w;A)=1$ .
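For $q^{\prime}=2$, condition (L-2) can be checked by hand when the distribution has finite support, since a finitely supported measure gives full mass to an affine line if and only if all its support points lie on that line. The Python sketch below illustrates this with the mapping $\Upsilon(y)=((y^{+})^{\delta},(y^{-})^{\delta})$ of Section 3.3, Case 1. The two finite supports, the value of $\delta$ and the helper names are hypothetical choices made for the example, and $\tilde{G}^{\theta}(x;\cdot)$ is assumed here to be the law of $\Upsilon(Y)$ for $Y\sim G^{\theta}(x;\cdot)$.

```python
def upsilon(y, delta):
    """Admissible mapping y -> ((y^+)^delta, (y^-)^delta)."""
    return (max(y, 0.0) ** delta, max(-y, 0.0) ** delta)


def on_common_line(points, tol=1e-12):
    """True if all 2-D points lie on one affine line (a hyperplane of R^2)."""
    base = points[0]
    direction = None
    for p in points[1:]:
        dx, dy = p[0] - base[0], p[1] - base[1]
        if direction is None:
            # Look for a first point distinct from the base point.
            if abs(dx) > tol or abs(dy) > tol:
                direction = (dx, dy)
            continue
        # A nonzero cross product means p is off the line through base.
        if abs(direction[0] * dy - direction[1] * dx) > tol:
            return False
    return True


delta = 2.0
# Y supported on {-1, 1}: the image lies on the line u1 + u2 = 1.
symmetric_support = [upsilon(y, delta) for y in (-1.0, 1.0)]
# Y supported on {-1, 1, 2}: the three image points are not collinear.
richer_support = [upsilon(y, delta) for y in (-1.0, 1.0, 2.0)]

degenerate = on_common_line(symmetric_support)
non_degenerate = not on_common_line(richer_support)
print(degenerate, non_degenerate)
```

In this toy setting a two-point symmetric distribution violates the hyperplane condition, while adding a third support point restores it.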

Finally the moment condition (A-5) simplifies in the vector linear case to

(L-3)

The invariant probability measure of Section 2.3 satisfies, for all $\theta\in\Theta$ ,

[TABLE]

We have the following result, whose proof is postponed to Section 6.3 for convenience, that relates this set of assumptions to the general ones.

Lemma.

Consider the vector linear setting where (2.8) holds, and $\mathsf{X}$ and $\mathsf{Y}$ are closed subsets of $\mathbb{R}^{p^{\prime}}$ and $\mathbb{R}^{q^{\prime}}$, respectively, with $\boldsymbol{\delta}_{\mathsf{X}}$ and $\boldsymbol{\delta}_{\mathrm{U}}$ being the metrics induced by norms on these spaces. The following assertions hold.

(i)

Assumption (A-3) holds.

(ii)

Assumption (A-4) is equivalent to (L-1).

(iii)

Assumption (L-3) implies (A-5) for any $x^{(\text{\tiny{i}})}_{1}\in\mathsf{X}$ and any $u^{(\text{\tiny{i}})}_{1}\in\mathrm{U}$.

(iv)

Assumption (A-6) holds.

(v)

Assumption (L-2) implies (A-7).

(vi)

Assumption (L’-2) implies (A’-7).

As a consequence, the assumptions of Theorem 1 are implied by (A-1), (L-1), (L-2) and (L-3) in the VLODM case and by (A’-1), (L-1), (L’-2) and (L-3) in the VLODMX case.

We now provide a simple characterization of $\langle{\theta_{\star}}\rangle$ in (4.1) in the linear case. The proof of the following result can be found in Section 6.4.

Lemma.

Suppose that $\{0\}\subsetneq\mathrm{U}$ and that (L-1) holds, and let $\mathrm{E}^{\theta}$ be as in Section 2.3. For all $\theta\in\Theta$ , define $\mathbf{R}(\cdot;\theta)$ as the rational matrix

[TABLE]

which is well defined on $z\in\mathbb{C}$ except for at most finitely many $z$ ’s. Then, for all $\theta,{\theta_{\star}}\in\Theta$ , the two following assertions are equivalent.

(i)

We have $\displaystyle\tilde{\Psi}^{\theta}_{u}(z)=\tilde{\Psi}^{{\theta_{\star}}}_{u}(z)$ for all $z\in\mathrm{E}^{\theta_{\star}}$ and $u\in\mathrm{U}$.

(ii)

The two following identities hold

[TABLE]

The identification of a parameter $\theta$ based on the equation (4.7) is similar to the identifiability of a vector auto-regressive moving average of order $p,q$ (VARMA($p,q$)) model with AR matrices $A_{1:p}(\theta)$ and MA matrices $B_{1:q}(\theta)$. Indeed, in such a model the spectral density matrix takes the form

[TABLE]

where $\Sigma$ is the covariance matrix of the noise. We refer to [18], where identifiable parametrizations of ARMA models are discussed. Below we provide an important result related to this general issue. Let $p,q,p^{\prime},q^{\prime}$ be positive integers. For all $A_{1:p}\in(\mathbb{R}^{p^{\prime}\times p^{\prime}})^{p}$ and $B_{1:q}\in(\mathbb{R}^{p^{\prime}\times q^{\prime}})^{q}$, let us define the polynomial matrices respectively valued in $\mathbb{R}^{p^{\prime}\times p^{\prime}}$ and $\mathbb{R}^{p^{\prime}\times q^{\prime}}$

[TABLE]

Note that, for all $A_{1:p}\in(\mathbb{R}^{p^{\prime}\times p^{\prime}})^{p}$ , $\mathrm{P}_{p}(z;A_{1:p})$ in (4.8) must be invertible for $|z|$ large enough (since then $\mathrm{I}_{p^{\prime}}z^{p}$ dominates). When a polynomial matrix is invertible for at least one $z\in\mathbb{C}$ , then it is invertible for all $z\in\mathbb{C}$ , except at most a finite number of them. It is then said to be non-singular. Thus, for all $A_{1:p}\in(\mathbb{R}^{p^{\prime}\times p^{\prime}})^{p}$ and $B_{1:q}\in(\mathbb{R}^{p^{\prime}\times q^{\prime}})^{q}$ , we can define the rational matrix $\mathrm{P}_{p}(z;A_{1:p})^{-1}\mathrm{Q}_{q}(z;B_{1:q})$ , which is well defined for all $z\in\mathbb{C}$ , except at most a finite number of them.

Lemma.

Let $p,q,p^{\prime},q^{\prime}$ be positive integers. Then, for any $A^{\star}_{1:p}\in(\mathbb{R}^{p^{\prime}\times p^{\prime}})^{p}$ and $B^{\star}_{1:q}\in(\mathbb{R}^{p^{\prime}\times q^{\prime}})^{q}$ , the two following assertions hold.

(i)

Suppose that $\mathrm{P}_{p}(\cdot;A^{\star}_{1:p})$ and $\mathrm{Q}_{q}(\cdot;B^{\star}_{1:q})$ are left coprime. Then, for all $A_{1:p}\in(\mathbb{R}^{p^{\prime}\times p^{\prime}})^{p}$ and $B_{1:q}\in(\mathbb{R}^{p^{\prime}\times q^{\prime}})^{q}$ , we have

[TABLE]

if and only if $A_{1:p}=A^{\star}_{1:p}$ and $B_{1:q}=B^{\star}_{1:q}$ .

(ii)

Suppose that $\mathrm{P}_{p}(\cdot;A^{\star}_{1:p})$ and $\mathrm{Q}_{q}(\cdot;B^{\star}_{1:q})$ are not left coprime. Then, there exist $\tilde{A}_{1:p}\in(\mathbb{R}^{p^{\prime}\times p^{\prime}})^{p}\setminus\{0\}$ and $\tilde{B}_{1:q}\in(\mathbb{R}^{p^{\prime}\times q^{\prime}})^{q}$ such that, for all $\alpha\in\mathbb{R}$ , setting $A_{1:p}=A^{\star}_{1:p}+\alpha\tilde{A}_{1:p}$ and $B_{1:q}=B^{\star}_{1:q}+\alpha\tilde{B}_{1:q}$ , we have

[TABLE]

Two polynomial matrices with the same number $p^{\prime}$ of rows are said to be left coprime if they admit the identity matrix $\mathrm{I}_{p^{\prime}}$ as a greatest common $p^{\prime}\times p^{\prime}$ left divisor (g.c.l.d.), that is, every common left divisor of them is also a left divisor of $\mathrm{I}_{p^{\prime}}$ . The set of polynomial matrices of order $p^{\prime}$ is a non-commutative ring for $p^{\prime}>1$ . This is why for $p^{\prime}>1$ a notion of left (or right) divisor is necessary. Note, however, that if $p^{\prime}=1$ , saying that $\mathrm{P}_{p}(\cdot;A^{\star}_{1:p})$ and $\mathrm{Q}_{q}(\cdot;B^{\star}_{1:q})$ are left coprime is equivalent to saying that $\mathrm{P}_{p}(\cdot;A^{\star}_{1:p})$ and the $q^{\prime}$ row entries of $\mathrm{Q}_{q}(\cdot;B^{\star}_{1:q})$ (which is a $q^{\prime}$ -dimensional row vector of polynomials of degree at most $q$ ) are coprime, that is, they have 1 as greatest common divisor. In particular, if $p^{\prime}=q^{\prime}=1$ , this boils down to saying that $\mathrm{P}_{p}(\cdot;A^{\star}_{1:p})$ and $\mathrm{Q}_{q}(\cdot;B^{\star}_{1:q})$ have no common roots. The case $p^{\prime}>1$ is significantly more complicated and we refer to [22, Chapter III] for an excellent introduction to polynomials on Euclidean rings that applies to matrices of polynomials.

Proof.

Let $A^{\star}_{1:p}\in(\mathbb{R}^{p^{\prime}\times p^{\prime}})^{p}$ and $B^{\star}_{1:q}\in(\mathbb{R}^{p^{\prime}\times q^{\prime}})^{q}$ . In this proof section, for convenience we denote $\mathrm{P}_{p}(\cdot;A^{\star}_{1:p})$ and $\mathrm{Q}_{q}(\cdot;B^{\star}_{1:q})$ by $P^{\star}$ and $Q^{\star}$ .

Proof of Assertion (i). Suppose that $P^{\star}$ and $Q^{\star}$ are left coprime. The Bezout theorem for matrices of polynomials (see e.g. [22, Theorem 3.1]) gives that there exist two polynomial matrices $R$ and $S$ of order $p^{\prime}\times p^{\prime}$ and $q^{\prime}\times p^{\prime}$ respectively, such that

[TABLE]

Let $A_{1:p}\in(\mathbb{R}^{p^{\prime}\times p^{\prime}})^{p}$ and $B_{1:q}\in(\mathbb{R}^{p^{\prime}\times q^{\prime}})^{q}$ , and denote $P=\mathrm{P}_{p}(\cdot;A_{1:p})$ and $Q=\mathrm{Q}_{q}(\cdot;B_{1:q})$ . The “if” in Assertion (i) is obvious, so we only need to assume that

[TABLE]

and prove that $P=P^{\star}$ (in which case we also get that $Q=Q^{\star}$ ). Define the rational matrix $U=P(P^{\star})^{-1}$ . Multiplying both sides of (4.9) by $U$ from the left, we have

[TABLE]

where we used (4.10) in the second equality. Hence we get that $U$ is a polynomial matrix and since $UP^{\star}=P$ and both $P^{\star}$ and $P$ are of the form $\mathrm{I}_{p^{\prime}}z^{p}+$ a polynomial of degree at most $p-1$ , we get that $U=\mathrm{I}_{p^{\prime}}$ and so $P^{\star}=P$ , which concludes the proof of (i).

Proof of Assertion (ii). Suppose that $P^{\star}$ and $Q^{\star}$ are not left coprime. Let $D$ be a g.c.l.d. of $(P^{\star},Q^{\star})$ . Then $D$ is a polynomial matrix that left-divides $P^{\star}$ and $Q^{\star}$ , and changing this polynomial matrix does not change the rational matrix $(P^{\star})^{-1}Q^{\star}$ . The difficulty is to show that we can modify a left divisor of $P^{\star}$ and $Q^{\star}$ in such a way that the resulting $P$ and $Q$ are still of the form (4.8) for some well chosen $A_{1:p}\in(\mathbb{R}^{p^{\prime}\times p^{\prime}})^{p}$ and $B_{1:q}\in(\mathbb{R}^{p^{\prime}\times q^{\prime}})^{q}$ . To this end we must first choose $D$ in a special form. Indeed, for any unimodular polynomial matrix $U$ , $DU$ is also a g.c.l.d. of $(P^{\star},Q^{\star})$ . The polynomial matrix $DU$ is called a right-associate of $D$ , and by [22, Theorem 22.1], we can choose $U$ so that $DU$ is in Hermite normal form, that is, such that $DU$ is lower triangular,

[TABLE]

where each polynomial $h_{i,i}$ is unitary with degree $\mathrm{deg}(h_{i,i})$ strictly larger than those of $h_{j,i}$ for all $j>i$ (that is, the degree on the diagonal dominates those of the same column). Let

[TABLE]

From what precedes, this min exists (the set is not empty), otherwise we would have $DU=\mathrm{I}_{p^{\prime}}$ and $P^{\star}$ and $Q^{\star}$ would be left coprime. We can thus write, for some $k\in\{1,\dots,p^{\prime}\}$ ,

[TABLE]

where $\mathrm{0}_{k,\ell}$ is the zero matrix of size $k\times\ell$ . By convention, if $k=1$ , $DU$ reduces to $T$ , that is back to its previous form. The important point is that with this definition of $k$ , we know that $h_{k,k}$ is unitary with $\mathrm{deg}(h_{k,k})\geq 1$ . Now, since $DU$ is a left divisor of $P^{\star}$ and $Q^{\star}$ we may write

[TABLE]

for some matrices $R$ and $S$ of respective sizes $p^{\prime}\times p^{\prime}$ and $p^{\prime}\times q^{\prime}$ . We write $R$ and $S$ in a block matrix form compatible with that of $DU$ , that is

[TABLE]

with $R^{(1)}$ and $S^{(1,1)}$ of respective sizes $(k-1)\times(k-1)$ and $(k-1)\times q^{\prime}$ (again with the convention that these matrices vanish if $k=1$ ). Then we have

[TABLE]

In particular the first row of $TR^{(2)}$ is $h_{k,k}$ times the first row of $R^{(2)}$ , and since $h_{k,k}$ is unitary with $\mathrm{deg}(h_{k,k})\geq 1$ , the form of $P^{\star}$ implies that the first row of $R^{(2)}$ is made of polynomials of degrees at most $p-1$ and cannot be zero (since the degree of the $(k,k)$ entry row of $P^{\star}$ is exactly $p$ and all the other entries are zero). Similarly, the first row of $S^{(2)}$ is made of polynomials of degrees at most $q-2$ (since $Q^{\star}$ is of degree $q-1$ ). Let $\Delta_{k}$ denote the diagonal matrix of order $p^{\prime}$ with zeros on its diagonal except on the $k$ -th entry where it is 1. Since $\Delta_{k}R$ and $\Delta_{k}S$ only keep the first rows of the block matrices $R^{(2)}$ and $S^{(2)}$ , respectively, and set all other entries to zero, we get from what precedes that $\Delta_{k}R$ and $\Delta_{k}S$ are of degree at most $p-1$ and $q-1$ , respectively, and that $\Delta_{k}R$ is not zero. Hence we may find $\tilde{A}_{1:p}\in(\mathbb{R}^{p^{\prime}\times p^{\prime}})^{p}\setminus\{0\}$ and $\tilde{B}_{1:q}\in(\mathbb{R}^{p^{\prime}\times q^{\prime}})^{q}$ (with $\tilde{B}_{1}=0$ ) such that

[TABLE]

Then, for all $\alpha\in\mathbb{R}$ , setting $A_{1:p}=A^{\star}_{1:p}+\alpha\tilde{A}_{1:p}$ and $B_{1:q}=B^{\star}_{1:q}+\alpha\tilde{B}_{1:q}$ , we have

[TABLE]

and, similarly,

[TABLE]

Then we get

[TABLE]

which concludes the proof of Assertion (ii). ∎
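Assertion (ii) can be illustrated numerically in the scalar case $p^{\prime}=q^{\prime}=1$ . The sketch below is not taken from the paper: it assumes that (4.8) reads $\mathrm{P}_{p}(z;a_{1:p})=z^{p}-\sum_{k}a_{k}z^{p-k}$ and $\mathrm{Q}_{q}(z;b_{1:q})=\sum_{k}b_{k+1}z^{q-1-k}$ (the form used in (SL-4)), and it uses a hypothetical pair $P^{\star}(z)=(z-r)(z-s)$ , $Q^{\star}(z)=c(z-r)$ sharing the root $r$ . The names `transfer`, `perturbed`, `A_TILDE` and `B_TILDE` are ours. Along the line $\alpha\mapsto(a^{\star}+\alpha\tilde{a},b^{\star}+\alpha\tilde{b})$ the ratio $Q/P$ should stay equal to $c/(z-s)$ .

```python
from fractions import Fraction as F

def P(z, a):
    """Assumed form of (4.8): z^p - sum_{k=1}^p a_k z^{p-k}, a = (a_1, ..., a_p)."""
    p = len(a)
    return z**p - sum(a[k] * z**(p - 1 - k) for k in range(p))

def Q(z, b):
    """Assumed form of (4.8): sum_{k=0}^{q-1} b_{k+1} z^{q-1-k}, b = (b_1, ..., b_q)."""
    q = len(b)
    return sum(b[k] * z**(q - 1 - k) for k in range(q))

def transfer(z, a, b):
    """Rational function P^{-1} Q evaluated at z (scalar case)."""
    return Q(z, b) / P(z, a)

# Hypothetical non-coprime pair: P*(z) = (z - r)(z - s), Q*(z) = c (z - r).
r, s, c = F(1, 2), F(1, 4), F(1)
A_STAR = (r + s, -r * s)   # z^2 - (r + s) z + r s
B_STAR = (c, -c * r)       # c z - c r

# Direction obtained by replacing the common factor (z - r) with (z - r + alpha):
# P = P* + alpha (z - s) and Q = Q* + alpha c, so the first b-coefficient is unchanged.
A_TILDE = (F(-1), s)
B_TILDE = (F(0), c)

def perturbed(alpha):
    """Parameters a* + alpha a~ and b* + alpha b~."""
    a = tuple(x + alpha * t for x, t in zip(A_STAR, A_TILDE))
    b = tuple(x + alpha * t for x, t in zip(B_STAR, B_TILDE))
    return a, b

if __name__ == "__main__":
    for alpha in (F(0), F(1, 3), F(-2)):
        a, b = perturbed(alpha)
        print(alpha, [transfer(z, a, b) for z in (F(2), F(3), F(-1))])
```

Exact rational arithmetic is used so that equality of the rational functions at the chosen evaluation points can be checked without rounding error; the evaluation points are chosen away from the roots of the perturbed denominators.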

5. Applications

We now use our results to derive necessary and sufficient conditions for identifiability in the examples of Section 3. Many other examples can be obtained by combining various observation kernels and link functions, with or without exogenous covariates.

5.1. Standard LODMs

Let us apply the results of Section 4.2 in the case of standard LODMs as defined in Section 3.1. The moment assumption (L-3) can be readily used for standard LODMs with known or unknown observation kernel, with $|\cdot|$ in (4.4) denoting the usual absolute value. The other assumptions of the general VLODM listed in Section 4.2 can be simplified as follows.

For a standard LODM, Assumption (L-1) becomes

(SL-1)

For all $\theta=(\omega,a_{1:p},b_{1:q})$ , we have $1-\sum_{k=1}^{p}a_{k}z^{k}\neq 0$ for all $z\in\mathbb{C}$ such that $|z|\leq 1$ .

In the case of a standard LODM with unknown observation kernel it becomes

(SL’-1)

For all $\theta=(\vartheta,\varphi)\in\Theta$ with $\vartheta=(\omega,a_{1:p},b_{1:q})$ , we have $1-\sum_{k=1}^{p}a_{k}z^{k}\neq 0$ for all $z\in\mathbb{C}$ such that $|z|\leq 1$ .

As for (L-2), it becomes

(SL-2)

For all $x\in\mathsf{X}$ , $\Upsilon(Y)$ does not degenerate to a single point for $Y\sim G(x,\cdot)$ , that is, for all $u\in\mathbb{R}$ , we have $G(x,\{\Upsilon(\cdot)=u\})<1$ ;

and, in case of an unknown observation kernel,

(SL’-2)

For all $\varphi\in\Phi$ , $x\in\mathsf{X}$ and $u\in\mathbb{R}$ , we have $G^{\varphi}(x,\{\Upsilon(\cdot)=u\})<1$ .

Finally (B-1) becomes

(SL-3)

For all $x,x^{\prime}\in\mathsf{X}$ , $G(x;\cdot)=G(x^{\prime};\cdot)$ if and only if $x=x^{\prime}$ ;

and, in the case of an unknown observation kernel, it reads as

(SL’-3)

For all $\theta=(\vartheta,\varphi)$ and ${\theta_{\star}}=(\vartheta_{\star},\varphi_{\star})$ in $\Theta$ , for all $x,x^{\prime}\in\mathsf{X}$ , we have

[TABLE]

This says that the class $[{\theta_{\star}}]_{G}$ in (B-1) is given by $[{\theta_{\star}}]_{G}=\left\{\theta=(\vartheta,\varphi)\in\Theta:\varphi=\varphi_{\star}\right\}$ .

Remarkably, all LODMs share the same necessary and sufficient condition for identifiability, which can be expressed as follows, using $a^{\star}_{1:p}$ and $b^{\star}_{1:q}$ to denote the true linear coefficients of the linear link function.

(SL-4)

The polynomials $z^{p}-\sum_{k=1}^{p}a^{\star}_{k}z^{p-k}$ and $\sum_{k=0}^{q-1}b^{\star}_{k+1}\ z^{q-1-k}$ have no common complex roots.

We can now state the following result, which says that the true parameter ${\theta_{\star}}$ in the interior of $\Theta$ is identifiable if and only if (SL-4) holds.
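Condition (SL-4) can be checked mechanically, since two polynomials have no common complex root exactly when their greatest common divisor is a non-zero constant. The following is a minimal sketch using exact rational arithmetic and the Euclidean algorithm; the function names (`poly_trim`, `poly_mod`, `poly_gcd`, `sl4_holds`) are ours and the coefficient values in the usage lines are hypothetical.

```python
from fractions import Fraction

def poly_trim(p):
    """Drop leading zero coefficients (coefficients are listed highest degree first)."""
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return p[i:]

def poly_mod(num, den):
    """Remainder of the division of num by den (den must be non-zero)."""
    num = poly_trim([Fraction(c) for c in num])
    den = poly_trim([Fraction(c) for c in den])
    while num and len(num) >= len(den):
        factor = num[0] / den[0]
        for i in range(len(den)):
            num[i] -= factor * den[i]
        num = poly_trim(num)
    return num

def poly_gcd(p, q):
    """A greatest common divisor of p and q (Euclidean algorithm, not normalised)."""
    p = poly_trim([Fraction(c) for c in p])
    q = poly_trim([Fraction(c) for c in q])
    while q:
        p, q = q, poly_mod(p, q)
    return p

def sl4_holds(a, b):
    """Check (SL-4) for a = a_{1:p} and b = b_{1:q}."""
    P = [1] + [-c for c in a]   # z^p - sum_k a_k z^{p-k}
    Q = list(b)                 # sum_k b_{k+1} z^{q-1-k}
    return len(poly_gcd(P, Q)) == 1   # gcd is a non-zero constant

if __name__ == "__main__":
    print(sl4_holds([0.5], [0.3]))                 # p = q = 1 with b_1 != 0
    print(sl4_holds([0.75, -0.125], [0.5, -0.25])) # common root z = 1/2
```

Floating-point inputs are converted to exact fractions, so the test is exact for the numbers as stored; with coefficients that are only approximately known, a tolerance-based root comparison would be a more natural design choice.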

Theorem 2.

Consider a standard LODM $(p,q)$ satisfying (A-1) and (L-3) for some $p,q\in\mathbb{Z}_{>0}$ , and suppose that $0\in\mathrm{U}$ . In the case of a known observation kernel, suppose that (SL-1)–(SL-3) hold. In the case of an unknown observation kernel, suppose that (SL’-1)–(SL’-3) hold. Then the inclusion in (2.30) is an equality for all ${\theta_{\star}}\in\Theta$ . In the case of a known observation kernel, Assertions (i) and (ii) below hold for any ${\theta_{\star}}=(\omega^{\star},a^{\star}_{1:p},b^{\star}_{1:q})\in\Theta$ . In the case of an unknown observation kernel, Assertions (i) and (iii) below hold for any ${\theta_{\star}}=(\omega^{\star},a^{\star}_{1:p},b^{\star}_{1:q},\varphi_{\star})\in\Theta$ .

(i)

Condition (SL-4) implies that $[{\theta_{\star}}]$ reduces to the singleton $\{{\theta_{\star}}\}$ .

(ii)

If Condition (SL-4) does not hold, then there exists an open segment $I^{\star}\subset\mathbb{R}^{1+p+q}$ of positive length and containing ${\theta_{\star}}$ such that $I^{\star}\cap\Theta\subset[{\theta_{\star}}]$ .

(iii)

If (SL-4) does not hold, then there exists an open segment $I^{\star}\subset\mathbb{R}^{1+p+q}$ of positive length and containing $(\omega^{\star},a^{\star}_{1:p},b^{\star}_{1:q})$ such that $\left\{(\vartheta,\varphi_{\star})\in\Theta:\vartheta\in I^{\star}\right\}\subset[{\theta_{\star}}]$ .

Remark 8.

Let us briefly comment on this result.

(1)

The ergodicity of all the examples of Section 3.1 has been studied in the provided references and the parameter set $\Theta$ is always chosen to satisfy the assumptions of Theorem 2 in these references.

(2)

If $p=q=1$ , condition (SL-4) reduces to $b_{1}^{\star}\neq 0$ . Let us see what $b_{1}^{\star}=0$ would imply about the identifiability of the model in this simple case. Taking $\tilde{\psi}^{\theta}$ as in (2.8) with $p=q=p^{\prime}=q^{\prime}=1$ , if $\boldsymbol{\omega}(\theta)=\omega^{\star}$ , $A_{1}(\theta)=a^{\star}_{1}$ and $B_{1}(\theta)=b_{1}^{\star}=0$ , then $\{X_{k}\,:\,k\in\mathbb{Z}_{\geq 0}\}$ is a deterministic sequence which, under the stationary distribution, has to be constantly equal to $x^{\star}=\frac{\omega^{\star}}{1-a_{1}^{\star}}$ . But since the distribution of $\{Y_{n}\,:\,n\in\mathbb{Z}\}$ is then uniquely defined by this constant, if one can find a parameter $\theta$ with corresponding coefficients $\omega,a_{1},b_{1}$ such that $b_{1}=0$ , $(\omega,a_{1})\neq(\omega^{\star},a_{1}^{\star})$ yielding the same constant $\omega/(1-a_{1})=\omega^{\star}/(1-a_{1}^{\star})$ , we see that the model is not identifiable.

(3)

Condition (SL-4) holds for “many” parameters $a^{\star}_{1:p},b^{\star}_{1:q}$ , e.g. for Lebesgue almost all ones in $\mathbb{R}^{p+q}$ .

(4)

The identifiability condition (SL-4) is a well known sufficient condition in the standard GARCH $(p,q)$ models, see [14, (A4)] or [2, Condition (2.27)]. Assertion (ii) in Theorem 2 shows that it is also necessary, at least for all parameters in the interior set of $\Theta$ .

(5)

Suppose that $\mathrm{U}$ and $\mathsf{X}$ both contain at least two different points and take $a^{\star}_{1:p}$ and $b^{\star}_{1:q}$ to be all non-zero. Then it is easy to show for a standard LODM with known observation kernel that $\tilde{\Psi}^{\theta}_{u}(z)=\tilde{\Psi}^{\theta_{\star}}_{u}(z)$ for all $u\in\mathrm{U}$ and $z\in\mathsf{Z}$ implies $\theta={\theta_{\star}}$ and thus, we get that the left-hand side of (2.16) reduces to the singleton $\{{\theta_{\star}}\}$ . Since, as explained previously, (SL-4) is necessary to have $[{\theta_{\star}}]=\{{\theta_{\star}}\}$ for all ${\theta_{\star}}$ in the interior set of $\Theta$ , we easily get examples for which the inclusion in (2.16) is strict.

(6)

Theorem 2 can be applied to all the models mentioned in Section 3.1. Let us examine the case of the MPINGARCH( $p,q$ ) model of [25], which constitutes a rich class of integer valued models. An MPINGARCH( $p,q$ ) model is an LODM( $p,q$ ) model with unknown observation kernel $G^{\varphi}(x,\cdot)$ defined as a mixed Poisson distribution with mean $x$ and variance proportional to $\varphi^{-1}$ , and $\Upsilon(y)=y$ . In [25, Theorem 1], the sufficient conditions for having (A-1) imply (L-3) (since $\mathbb{E}^{\theta}[|U_{0}|]<\infty$ ) and (SL’-1) (since they imply $\sum_{i}a_{i}<1$ , with $a_{i}\geq 0$ ). Conditions (SL’-2) and (SL’-3) also hold by definition of $G^{\varphi}(x,\cdot)$ . Hence Theorem 2 applies and we get that (SL-4) is a sufficient condition for identifiability. It is also necessary by Assertion (iii) of the theorem, at least for parameters in the interior of $\Theta$ . This condition seems to be missing in [25, Theorem 2].
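The degenerate case $b_{1}^{\star}=0$ discussed in item (2) can be made concrete with a small sketch. It assumes, as suggested by the notation of that item, that the link function for $p=q=1$ is the affine recursion $x_{k+1}=\omega+a_{1}x_{k}+b_{1}u_{k}$ ; the function names and the two parameter values are hypothetical. With $b_{1}=0$ the hidden sequence ignores the observations, and two different pairs $(\omega,a_{1})$ with the same ratio $\omega/(1-a_{1})$ produce the same stationary level.

```python
def stationary_level(omega, a1):
    """Fixed point of x -> omega + a1 * x (requires a1 != 1)."""
    return omega / (1 - a1)

def iterate_link(omega, a1, b1, x0, u_seq):
    """Iterate the assumed affine link x_{k+1} = omega + a1 * x_k + b1 * u_k."""
    x = x0
    for u in u_seq:
        x = omega + a1 * x + b1 * u
    return x

# Two hypothetical parameters with b1 = 0 and the same stationary level 2.0.
theta_star = (1.0, 0.5, 0.0)    # (omega, a1, b1)
theta_alt = (1.5, 0.25, 0.0)

if __name__ == "__main__":
    observations = [3, 0, 7, 1, 4]   # arbitrary values of u_k; they play no role when b1 = 0
    for omega, a1, b1 in (theta_star, theta_alt):
        x_inf = stationary_level(omega, a1)
        print(x_inf, iterate_link(omega, a1, b1, x_inf, observations))
```

Since the law of the observations depends on the parameter only through this constant level in the degenerate case, both parameters lead to the same distribution of the observed process.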

Proof of Theorem 2.

We only consider the case with unknown observation kernel (the case with known observation kernel is obtained by removing the additional parameter $\varphi$ ).

Let ${\theta_{\star}}=(\vartheta_{\star},\varphi_{\star})\in\Theta$ with $\vartheta_{\star}=(\omega^{\star},a^{\star}_{1:p},b^{\star}_{1:q})$ . As explained previously, the assumptions of Theorem 2 are adapted from those derived in Section 4.2. In particular, we have that (L-1)–(L-3) hold. Hence Section 4.2 implies that (A-3)–(A-7) hold in the general setting with $\mathrm{U}=\mathbb{R}$ . Applying Theorem 1, we get that (A-2) holds and that

[TABLE]

where $\langle{\theta_{\star}}\rangle$ and $\mathrm{E}^{\theta_{\star}}$ are as defined in Section 2.3. Remember that (SL’-3) says that (B-1) holds with

[TABLE]

By Section 4.1, we get that the inclusion in (2.30) is an equality and

[TABLE]

Note that we assumed that $0\in\mathrm{U}$ and that (SL’-2) implies $\mathrm{U}\neq\{0\}$ , hence we can apply Section 4.2 which gives that $[{\theta_{\star}}]$ is the set of all $\theta=(\omega,a_{1:p},b_{1:q},\varphi_{\star})\in\Theta$ such that

[TABLE]

Applying Section 4.2, we easily get Assertions (i) and (iii). ∎

5.2. The bivariate example of Section 3.2

Theorem 2 can be extended to the standard VLODM case of Section 3.2. Here, for brevity, we do not re-express the general VLODM assumptions (L-1)-(L-3) in the standard setting as we did previously for standard LODMs. We only need to introduce the condition

(SL’-4)

The polynomials $z^{p}-\sum_{k=1}^{p}A^{\star}_{k}z^{p-k}$ and $\sum_{k=0}^{q-1}B^{\star}_{k+1}\ z^{q-1-k}$ are left-coprime,

which extends (SL-4) to the case $p^{\prime},q^{\prime}\geq 1$ . The proof of the following result mimics the one of Theorem 2 and is thus omitted.

Theorem 3.

Consider a standard VLODM $(p,q,p^{\prime},q^{\prime})$ satisfying (A-1) for some $p,q,p^{\prime},q^{\prime}\in\mathbb{Z}_{>0}$ . Suppose that $0\in\mathrm{U}$ and that (L-1)-(L-3) and (SL’-3) hold. Then, for all ${\theta_{\star}}=(\boldsymbol{\omega}^{\star},A^{\star}_{1:p},B^{\star}_{1:q},\varphi_{\star})\in\Theta$ , the inclusion in (2.30) is an equality and the two following assertions hold.

(i)

Condition (SL’-4) implies that $[{\theta_{\star}}]$ reduces to the singleton $\{{\theta_{\star}}\}$ .

(ii)

If (SL’-4) does not hold, then there exists an open segment $I^{\star}\subset\mathbb{R}^{1+p+q}$ of positive length and containing $(\omega^{\star},a^{\star}_{1:p},b^{\star}_{2:q})$ such that $\left\{(\vartheta,\varphi_{\star})\in\Theta:\vartheta\in I^{\star}\right\}\subset[{\theta_{\star}}]$ .

Remark 9.

In Theorem 3, for brevity, we only stated the case with unknown observation kernel. The case with known observation kernel follows by removing the parameters $\varphi$ and $\varphi_{\star}$ in the statement and by replacing (SL’-3) by (SL-3).

Since the bivariate integer valued GARCH model of Section 3.2 is a standard VLODM( $1,1,2,2$ ) with unknown observation kernel, we just need to check the assumptions of Theorem 3. Ergodicity (hence our Assumption (A-1)) is stated in [9, Theorem 1] under their set of conditions (a) on the parameter $\theta=(\boldsymbol{\omega},A,B,\varphi)$ . Their conditions for ergodicity imply that some operator norm of $A$ is strictly less than 1, which implies that $\mathrm{I}_{2}-A\;z$ is invertible for $|z|\leq 1$ and thus (L-1) holds. Since for all $x_{1},x_{2}>0$ , the bivariate distribution defined by (3.1) has positive probability on all points $(y_{1},y_{2})\in\mathbb{Z}_{\geq 0}^{2}$ , it cannot have probability one on a line of $\mathbb{R}^{2}$ , hence (L-2). Also it is claimed following [9, Theorem 1] that $\mathbb{E}^{\theta}[U_{0}]$ is well defined and thus (L-3) holds. Applying Theorem 3, we get that, for any interior point ${\theta_{\star}}$ of the parameter space, Condition (SL’-4) is a necessary and sufficient condition to have identifiability of ${\theta_{\star}}$ . In the bivariate integer valued GARCH model (for which $p=q=1$ ), this condition requires, for ${\theta_{\star}}=(\boldsymbol{\omega}^{\star},A^{\star},B^{\star},\varphi^{\star})$ , that $\mathrm{I}_{2}\,z-A^{\star}$ and $B^{\star}$ be left coprime. If ${\theta_{\star}}$ does not satisfy this condition, consistent estimation of ${\theta_{\star}}$ is not possible. Hence we believe that this assumption is missing in [9, Theorem 2]. A precise counter-example is for instance obtained by setting

[TABLE]

where $\alpha,\beta>0$ are arbitrary (and chosen in order to make ${\theta_{\star}}=(\boldsymbol{\omega}^{\star},A^{\star},B^{\star},\varphi^{\star})$ in the interior of $\Theta$ ). One can show that $\mathrm{I}_{2}\,z-A^{\star}$ and $B^{\star}$ are not left coprime since they both admit the same non-unimodular left divisor $L=\begin{bmatrix}(z-\alpha)&1\\ (-\alpha)&1\end{bmatrix}$ as shown by the following identities:

[TABLE]

5.3. The non-linear GARCH of Section 3.3

Consider Case 1) of Section 3.3, for which $\delta$ is not included in the set of parameters. Then the non-linear GARCH model is a standard VLODM( $p,q,1,2$ ) model and identifiability can be treated using Theorem 3. Using that $\Upsilon(y)=((y^{+})^{\delta},(y^{-})^{\delta})$ , for all $x>0$ , $\tilde{G}(x,\cdot)$ in (2.6) (we omit $\theta$ as $G$ does not depend on $\theta$ here) has support included in $\mathbb{R}_{\geq 0}\times\{0\}\cup\{0\}\times\mathbb{R}_{\geq 0}$ , where it is defined, for all Borel sets $A\subset\mathbb{R}_{\geq 0}$ , by

[TABLE]

It follows that our Assumption (L-2) is equivalent to having that $0<\mathbb{P}(\eta>0)<1$ and that there is no pair $\{u,v\}$ , $u\neq v\in\mathbb{R}$ , such that $\mathbb{P}(\eta\in\{u,v\})=1$ , which is exactly the condition appearing in the second part of [17, A3]. Our conditions (L-1) (which here, since $A_{k}=a_{k}\geq 0$ for all $k=1,\dots,p$ , simply reads $\sum_{k}a_{k}<1$ ) and (L-3) are usual byproducts of showing the ergodicity condition (A-1), see [17, Appendix A]. One can thus apply our Theorem 3 (in its known observation kernel version, see Remark 9) and obtain the necessary and sufficient condition (SL’-4) which, in the case where the parameters $A_{k}$ are denoted by $a_{k}\in\mathbb{R}$ for $k=1,\dots,p$ and the parameters $B_{k}$ are denoted by $\mathbf{b}_{k}\in\mathbb{R}^{2}$ for $k=1,\dots,q$ , becomes, for any ${\theta_{\star}}=(\omega^{\star},a^{\star}_{1:p},\mathbf{b}^{\star}_{1:q})$ with $\mathbf{b}^{\star}_{k}=\begin{bmatrix}b^{\star}_{k}(1)&b^{\star}_{k}(2)\end{bmatrix}$ for $k=1,\dots,q$ ,

(NLG-1)

The polynomial $z^{p}-\sum_{k=1}^{p}a^{\star}_{k}z^{p-k}$ has no common complex root with the polynomial $\sum_{k=0}^{q-1}b^{\star}_{k+1}(1)\ z^{q-1-k}$ , nor with the polynomial $\sum_{k=0}^{q-1}b^{\star}_{k+1}(2)\ z^{q-1-k}$ .

This condition is similar to that appearing in the identifiability condition [17, A4] used in a mis-specified context. Our result shows that this condition is necessary in the interior of the parameter set in the well-specified case, and remains valid for much larger choices of observation kernels.

5.4. The SETPAR model of Section 3.4

We have the following result for the self-excited threshold Poisson autoregression model.

Theorem 4.

Consider the SETPAR model introduced in Section 3.4. Let

[TABLE]

Then (A-1) holds. Let ${\theta_{\star}}=(\omega_{1}^{\star},\omega_{2}^{\star},a_{1}^{\star},a_{2}^{\star},b_{1}^{\star},b_{2}^{\star},r^{\star})\in\Theta$ satisfy at least one of the two following conditions.

(i)

$b_{1}^{\star}>0$ and $r^{\star}\geq 1$ ;

(ii)

$b_{2}^{\star}>0$ .

Then we have

[TABLE]

where $\psi^{\theta}_{y}(x)$ is defined by (3.2).

Remark 10.

Let us briefly comment on this result.

(1)

The case where neither (i) nor (ii) holds ( $b_{1}^{\star}=b_{2}^{\star}=0$ or $r^{\star}=b_{2}^{\star}=0$ ) is somewhat degenerate, similarly to the non-threshold case mentioned in Remark 8 (2). We think it should be treated separately but we omit this very special case here for brevity.

(2)

As explained after (2.16), the identity (5.1) is the best we could hope for this model since the distribution $\tilde{\mathbb{P}}^{\theta}$ of the observations is entirely determined by the mapping $(u,x)\mapsto\psi^{\theta}_{u}(x)$ on $\mathbb{Z}_{\geq 0}\times\mathbb{R}_{\geq 0}$ .

(3)

The identity (5.1) shows in particular that ${\theta_{\star}}$ is not identifiable if $r^{\star}=0$ (since changing $b^{\star}_{1}$ will have no effect on the mapping $(u,x)\mapsto\psi^{\theta}_{u}(x)$ on $\mathbb{Z}_{\geq 0}\times\mathbb{R}_{\geq 0}$ ). Another case of non-identifiability is when $a_{1}^{\star}=a_{2}^{\star}$ and $\omega_{1}^{\star}+b_{1}^{\star}(r^{\star}+1)=\omega_{2}^{\star}+b_{2}^{\star}(r^{\star}+1)$ . In such a case, we have for all $x\in\mathbb{R}$ ,

[TABLE]

Then, setting $\theta=(\omega_{1}^{\star},\omega_{2}^{\star},a_{1}^{\star},a_{2}^{\star},b_{1}^{\star},b_{2}^{\star},r^{\star}+1)$ , we immediately have that $\psi^{\theta}_{y}(x)=\psi^{\theta_{\star}}_{y}(x)$ for all $x\in\mathbb{R}$ and $y\in\mathbb{Z}_{\geq 0}$ . In particular consistent estimation of ${\theta_{\star}}$ as claimed in [28, Theorem 2] is not possible for such a parameter ${\theta_{\star}}$ .
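The two non-identifiable configurations of item (3) can be checked numerically. The sketch below assumes that the link function (3.2) has the threshold form $\psi^{\theta}_{y}(x)=\omega_{1}+a_{1}x+b_{1}y$ if $y\leq r$ and $\omega_{2}+a_{2}x+b_{2}y$ otherwise, which is our reading of the remark and is not copied from (3.2). The name `setpar_link` and all parameter values are hypothetical, and they are not checked against the condition defining $\Theta$ .

```python
from fractions import Fraction as F

def setpar_link(theta, y, x):
    """Assumed threshold link: regime 1 when y <= r, regime 2 when y > r."""
    w1, w2, a1, a2, b1, b2, r = theta
    if y <= r:
        return w1 + a1 * x + b1 * y
    return w2 + a2 * x + b2 * y

def same_link(theta, theta_alt, ys, xs):
    """True if both parameters give the same link value on the whole grid."""
    return all(setpar_link(theta, y, x) == setpar_link(theta_alt, y, x)
               for y in ys for x in xs)

# Case a_1 = a_2 and w_1 + b_1 (r + 1) = w_2 + b_2 (r + 1): shifting r to r + 1 changes nothing.
theta_star = (F(1), F(2), F(1, 4), F(1, 4), F(1, 2), F(1, 4), 3)
theta_shift = theta_star[:6] + (4,)

# Case r = 0: regime 1 is only visited at y = 0, so b_1 has no effect.
theta_r0 = (F(1), F(2), F(1, 4), F(1, 5), F(1, 2), F(1, 4), 0)
theta_r0_alt = theta_r0[:4] + (F(1, 3),) + theta_r0[5:]

if __name__ == "__main__":
    ys = range(0, 20)
    xs = [F(0), F(1, 2), F(3)]
    print(same_link(theta_star, theta_shift, ys, xs))
    print(same_link(theta_r0, theta_r0_alt, ys, xs))
```

The grid comparison is only an illustration on finitely many points; the remark itself gives the argument for all $x$ and all $y\in\mathbb{Z}_{\geq 0}$ .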

Proof of Theorem 4.

A natural choice for $\mathsf{X}$ is $\mathbb{R}_{>0}$ but in order to meet Assumption (A-3) with $\boldsymbol{\delta}_{\mathsf{X}}(x,x^{\prime})=|x-x^{\prime}|$ we take $\mathsf{X}=\mathbb{R}_{\geq 0}$ with $G^{\theta}(0,\cdot)$ arbitrarily set to be Bernoulli with mean $1/2$ for convenience (it actually has no influence on $\mathbb{P}^{\theta}$ since $\omega_{1},\omega_{2}>0$ in the condition on $\Theta$ ). We set $\Upsilon(y)=y$ so that $U_{k}=Y_{k}$ for all $k$ and the reduced link function $\tilde{\psi}$ is the same as the non-reduced one. Moreover since $p=q=1$ , we are in the case where $Z_{k}=X_{k}$ for all $k$ , $\mathsf{Z}=\mathsf{X}=\mathbb{R}_{\geq 0}$ and $\tilde{\Psi}_{u}^{\theta}=\tilde{\psi}_{u}^{\theta}$ for all $u$ . By [28, Theorem 1], with $\Theta$ satisfying the given condition, Assumption (A-1) holds, and, moreover, for any $\ell>0$ and $\theta\in\Theta$ , $\mathbb{E}^{\theta}[U_{0}^{\ell}]<\infty$ (in fact, one can prove that, for any $\theta\in\Theta$ , there exists $\ell>0$ such that $\mathbb{E}^{\theta}[\exp(\ell U_{0})]<\infty$ ). This moment condition implies that the log moment condition (A-5) holds for any $x^{(\text{\tiny{i}})}_{1}\in\mathsf{X}$ . Clearly, we have $\mathrm{Lip}^{\theta}_{1}=a_{1}\vee a_{2}$ and, since $p=q=1$ , $\mathrm{Lip}^{\theta}_{n}\leq(\mathrm{Lip}^{\theta}_{1})^{n}$ for all $n\geq 1$ . Thus the above condition on $\Theta$ also implies (A-4). As for (A-6), it trivially holds (since $y=u$ is fixed in this condition). Hence with Section 2.3, we get that (A-1)–(A-6) hold, with definitions (2.26) for checking (A-2). Assumption (B-1) is also immediate with $[\theta]_{G}=\Theta$ and Section 4.1 gives that, for any ${\theta_{\star}}\in\Theta$ ,

[TABLE]

where $\langle{\theta_{\star}}\rangle$ can be defined by (2.31). Assumption (A-7) holds by Remark 6 (1). Hence all the assumptions of Theorem 1 hold. Take now ${\theta_{\star}}=(\omega_{1}^{\star},\omega_{2}^{\star},a_{1}^{\star},a_{2}^{\star},b_{1}^{\star},b_{2}^{\star},r^{\star})\in\Theta$ satisfying (i) or (ii). Let $\theta\in\Theta$ , $u\in\mathrm{U}$ and $z\in\mathsf{Z}$ . To conclude the proof, it is now sufficient to check that if $\tilde{\Psi}^{\theta}_{u}$ and $\tilde{\Psi}^{\theta_{\star}}_{u}$ coincide on $\left\{\tilde{\Psi}^{{\theta_{\star}}}\langle v\rangle(z):n\in\mathbb{Z}_{\geq 0},\,v\in\mathrm{U}^{n}\right\}$ , then they must coincide on $\mathsf{Z}$ , so that we can apply Section 4.1. Observe that by definition of the link function in (3.2), if (i) or (ii) holds, then $v\mapsto\psi_{v}^{\theta_{\star}}(z)$ takes at least two different values on $\mathrm{U}=\mathbb{Z}_{\geq 0}$ . Since $\tilde{\Psi}^{{\theta_{\star}}}\langle v\rangle(z)=\psi_{v}^{\theta_{\star}}(z)$ for all $v\in\mathrm{U}$ , these two different values belong to $\left\{\tilde{\Psi}^{{\theta_{\star}}}\langle v\rangle(z):n\in\mathbb{Z}_{\geq 0},\,v\in\mathrm{U}^{n}\right\}$ . Now, since $\tilde{\Psi}^{\theta_{\star}}_{u}=\psi_{u}^{\theta_{\star}}$ and $\tilde{\Psi}^{\theta}_{u}=\psi_{u}^{\theta}$ are affine functions, if they coincide at two points they must coincide everywhere, and the proof is concluded. ∎

5.5. The PARX model of Section 3.5

We have the following result for the Poisson autoregression model with exogenous covariates.

Theorem 5.

Consider the PARX model defined in Section 3.5, which is a VLODMX( $p,q,1,1+d$ ). Suppose that (A’-1), (L-1) and (L-3) hold, and that the exogenous kernel $H$ satisfies the following.

(P-1)

We have $H(v;\{f_{1:d}(\cdot)\in A\})<1$ for all $v\in\mathsf{V}$ and affine hyperplanes $A\subset\mathbb{R}^{d}$ .

Then, for all ${\theta_{\star}}\in\Theta$ , the equivalence class $[{\theta_{\star}}]$ reduces to the singleton $\{{\theta_{\star}}\}$ .

Remark 11.

Let us briefly comment on this result.

(1)

(P-1) is a natural assumption as it basically says that the covariates $f_{1}(V_{k}),\dots,f_{d}(V_{k})$ are not linearly related conditionally on $V_{k-1}$ . If they were, it would suggest using a smaller set of covariates.

(2)

The ergodicity of PARX models (our assumption (A’-1)) is treated in [1, Theorem 1] under some assumption on the covariate kernel $H$ (in their assumption 2). Their assumption 3 used for proving (A’-1) implies $\sum_{k}a_{k}<1$ which implies our assumption (L-1) (since $a_{k}\geq 0$ for all $k=1,\dots,p$ ). Note also that [1, Theorem 1] implies that $\mathbb{E}^{\theta}[|U_{0}|]<\infty$ and thus our assumption (L-3). On the other hand, their identifiability condition [1, Assumption 5] includes a condition on parameters $a^{\star}_{1:p},b^{\star}_{1:q}$ similar to our condition (SL-4) used for standard LODMs in Theorem 2 above. Theorem 5 shows that such a condition can in fact be dropped in case of exogenous covariates provided that the mild condition (P-1) holds.

(3)

This result is similar to Theorem 2 for the standard LODM. It is of interest to note that there is no additional condition on $\gamma^{\star}_{1:d}$ for identifiability.

(4)

Theorem 5 easily extends to more general observation kernels (known or unknown) $G^{\theta}$ , provided that, as in Theorem 2, assumptions (SL-2)-(SL-3) hold if the observation kernel is known, or (SL’-2)-(SL’-3) if it is unknown. However, the ergodicity would require a specific treatment in these cases, as only the Poisson case has been considered to our knowledge.

Proof of Theorem 5.

For the PARX model we set $\Upsilon(y,v)=(y,f_{1}(v),\dots,f_{d}(v))\in\mathrm{U}=\mathbb{R}^{1+d}$ and $\mathsf{X}=\mathbb{R}_{\geq 0}$ (with $G(0,\cdot)$ arbitrarily set, say, to be Bernoulli with mean $1/2$ , as in the proof of Theorem 4) and (A-3) holds with $\boldsymbol{\delta}_{\mathsf{X}}(x,x^{\prime})=|x-x^{\prime}|$ and $\boldsymbol{\delta}_{\mathrm{U}}(u,u^{\prime})=|u-u^{\prime}|$ where $|\cdot|$ here is an arbitrary norm on $\mathbb{R}^{1+d}$ . In this case, for all $(x,v)\in\mathsf{X}\times\mathsf{V}$ , if $(Y,W)\sim\tilde{G}((x,v),\cdot)$ with $Y$ valued in $\mathbb{R}$ , $W$ valued in $\mathbb{R}^{d}$ and $\tilde{G}$ defined by (2.7), we have that $Y$ and $W$ are independent and $Y$ follows a Poisson distribution. Thus (L’-2) is equivalent to (P-1). We can thus apply Section 4.2 and get that Assumptions (A-3), (A-4), (A-5), (A-6) and (A’-7) hold with $\mathrm{U}=\mathbb{R}^{1+d}$ . Then, having assumed (A’-1), we can apply Theorem 1 and get that (A-2) holds as well as the identity (4.1). Assumption (B-1) is immediate with $[{\theta_{\star}}]_{G}=\Theta$ and, applying Section 4.1 and the previous display, we get that

[TABLE]

where $\langle{\theta_{\star}}\rangle$ and $\mathrm{E}^{\theta_{\star}}$ are as defined in Section 2.3. Section 4.2 and the definitions of $A_{1:p}$ and $B_{1:q}$ in Section 3.5 now give that $[{\theta_{\star}}]$ is the set of all $\theta=(\omega,a_{1:p},b_{1:q},\gamma_{1:d})\in\Theta$ such that

[TABLE]

Note that the last line in this set of equations is equivalent to

[TABLE]

But the first two lines of the previous display give that $\omega=\omega^{\star}$ and $b^{\star}_{1:q}=b_{1:q}$ , thus $\theta={\theta_{\star}}$ and the proof is concluded. ∎

6. Postponed Proofs

6.1. Proof of Section 2.3

We first derive the following result.

Lemma.

(A-4) implies that for all $\theta\in\Theta$ , there exist $C>0$ and $\rho\in(0,1)$ such that $\mathrm{Lip}_{n}^{\theta}\leq C\ \rho^{n}$ for all $n\in\mathbb{Z}_{>0}$ .

Proof.

By (2.22), (2.23) and (2.15), we have, for all $n\in\mathbb{Z}_{>0}$ , using the convention $\mathrm{Lip}^{\theta}_{m}=1$ for $m\leq 0$ ,

[TABLE]

Hence (A-4) implies that there exist $m\geq 1$ and $L\in(0,1)$ such that, for all $u\in\mathrm{U}^{m+1}$ , $\tilde{\Psi}^{\theta}\langle u\rangle$ is $L$ -Lipschitz. Now observe that, by (2.14), for all $n=km+r$ with $k\geq 0$ and $0\leq r<m$ , for all $u=u_{-n:0}\in\mathrm{U}^{n+1}$ , we can write $\tilde{\Psi}^{\theta}\langle u\rangle$ as

[TABLE]

and in this composition, the first $k$ functions are $L$ -Lipschitz and the last one is $L^{\prime}$ -Lipschitz with $L^{\prime}=1\vee\max\left\{\mathrm{Lip}^{\theta}_{j}:0<j\leq m\right\}$ . Hence, for all $z,z^{\prime}\in\mathsf{Z}$ ,

[TABLE]

Hence the result by setting $\rho=L^{1/m}\in(0,1)$ . ∎
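The choice of the constant can be made explicit. The following is a sketch in the notation of the proof above (with $n=km+r$, $0\leq r<m$ and $\rho=L^{1/m}$); the particular value of $C$ is one admissible choice and is not claimed to be the one used by the authors. The last inequality uses $\rho\in(0,1)$ and $r\leq m-1$.

```latex
\mathrm{Lip}_{n}^{\theta}
\;\leq\; L^{\prime}\,L^{k}
\;=\; L^{\prime}\,\rho^{km}
\;=\; L^{\prime}\,\rho^{\,n-r}
\;\leq\; L^{\prime}\,\rho^{-(m-1)}\,\rho^{n},
\qquad \text{so that one may take } C=L^{\prime}\,\rho^{-(m-1)} .
```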

We can now prove Section 2.3. Let $\theta,{\theta_{\star}}\in\Theta$ and let $z^{(\text{\tiny{i}})}\in\mathsf{Z}$ . Denote for all $n\in\mathbb{Z}_{\geq 0}$ ,

[TABLE]

Then, by (2.13) we have, for all $n\in\mathbb{Z}_{>0}$ ,

[TABLE]

and, by (2.22), we get

[TABLE]

Using (2.23) with $z^{(\text{\tiny{i}})}=(x^{(\text{\tiny{i}})},u^{(\text{\tiny{i}})})$ and $x^{(\text{\tiny{i}})}_{1}=\dots=x^{(\text{\tiny{i}})}_{p}$ and $u^{(\text{\tiny{i}})}_{1}=\dots=u^{(\text{\tiny{i}})}_{q-1}$ we get that

[TABLE]

Hence, for all $\alpha>0$ , Condition (2.24) implies as $n\to\infty$ ,

[TABLE]

The last display with (6.2) and Section 6.1 gives that, $\tilde{\mathbb{P}}^{\theta_{\star}}\mbox{-a.s.}$, $\{X^{(n)}\,:\,n\in\mathbb{Z}_{\geq 0}\}$ is a Cauchy sequence, hence converges in $\mathsf{X}$. Therefore, $U_{(-\infty):0}\in\mathrm{D}^{\theta}$, $\tilde{\mathbb{P}}^{\theta_{\star}}\mbox{-a.s.}$ By (2.1) or (2.2), depending on whether an ODM or an ODMX is considered, we also have, under $\mathbb{P}^{\theta}$, for all $n\in\mathbb{Z}_{\geq 0}$, $X_{1}=\tilde{\psi}^{\theta}\langle U_{-n:0}\rangle(Z_{-n})$. Thus, (2.22) also implies

[TABLE]

By stationarity, $\boldsymbol{\delta}_{\mathsf{Z}}(Z_{-n},z^{(\text{\tiny{i}})})$ is bounded in probability under $\mathbb{P}^{\theta}$ , hence $X^{(n)}$ converges to $X_{1}$ in probability if (A-4) holds. We thus obtain (2.17), and since this holds for all $\theta\in\Theta$ , Assumption (A-2) holds.
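The forgetting of the initial condition behind this Cauchy argument can be checked numerically in the simplest scalar linear case. The sketch below assumes a hypothetical link function $\tilde{\psi}^{\theta}_{u}(x)=\omega+a\,x+b\,u$ with $p=q=1$ and $|a|<1$; the parameter values, the helper names and the simulated inputs are illustrative and not taken from the paper. It computes $X^{(n)}$ by starting from a fixed $x^{(\text{\tiny{i}})}$ at time $-n$ and iterating up to time $0$.

```python
import numpy as np

# Hypothetical scalar linear link function (p = q = 1); values are illustrative.
OMEGA, A, B = 0.2, 0.6, 0.3

def psi(u, x):
    """One step of the assumed recursion x -> omega + a*x + b*u."""
    return OMEGA + A * x + B * u

def x_n(u_past, n, x_init):
    """X^(n): start from x_init at time -n and feed u_{-n}, ..., u_0.

    u_past[j] stores u_{-j}, so u_past[0] is the most recent input u_0.
    """
    x = x_init
    for j in range(n, -1, -1):
        x = psi(u_past[j], x)
    return x

rng = np.random.default_rng(0)
u_past = rng.poisson(1.0, size=60).astype(float)  # stand-in for observed counts
x_init = 5.0

# Successive approximations X^(0), X^(1), ... and the gaps between them.
approx = [x_n(u_past, n, x_init) for n in range(50)]
gaps = [abs(approx[n + 1] - approx[n]) for n in range(49)]
```

In this linear setting the gap between $X^{(n+1)}$ and $X^{(n)}$ equals $|a|^{n+1}$ times the size of a single step taken from $x^{(\text{\tiny{i}})}$, which is the geometric decay that makes the sequence Cauchy.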

Let us now check (2.30). Take ${\theta_{\star}},\theta\in\Theta$ and suppose that $\theta$ belongs to the set on the left-hand side of the inclusion (2.30), that is, $G^{\theta}=G^{{\theta_{\star}}}$ and $\tilde{\Psi}^{\theta}_{u}(z)=\tilde{\Psi}^{{\theta_{\star}}}_{u}(z)$ for all $(z,u)\in\mathrm{E}^{\theta_{\star}}\times\mathrm{U}$. By Remark 4 we have $Z_{k}\in\mathrm{E}^{\theta_{\star}}$, $\mathbb{P}^{{\theta_{\star}}}\mbox{-a.s.}$ Thus we get that $\tilde{\Psi}^{\theta}_{U_{k}}(Z_{k})=\tilde{\Psi}^{{\theta_{\star}}}_{U_{k}}(Z_{k})$ $\mathbb{P}^{{\theta_{\star}}}\mbox{-a.s.}$ and with Remark 2, we obtain that, $\mathbb{P}^{{\theta_{\star}}}\mbox{-a.s.}$, $(Y_{k},X_{k})$ (resp. $(Y_{k},X_{k},V_{k})$) satisfy the iterative equations (2.1) (resp. (2.2)), and by (A-1) (resp. (A’-1)), we conclude that $\mathbb{P}^{\theta}=\mathbb{P}^{\theta_{\star}}$. Hence $\theta\in[{\theta_{\star}}]$ and (2.30) is proved.

Finally, we check that (2.21) holds when $\tilde{\psi}^{\theta}_{u}$ is continuous for all $u\in\mathrm{U}^{q}$ . Since we have shown that $U_{(-\infty):0}\in\mathrm{D}^{\theta}$ , $\tilde{\mathbb{P}}^{{\theta_{\star}}}\mbox{-a.s.}$ and $\tilde{\mathbb{P}}^{{\theta_{\star}}}$ is shift invariant, we have, for all $k\in\mathbb{Z}$ ,

[TABLE]

Observe that, for all $n\geq p\vee q$ , we have, for all $u_{(-n):0}\in\mathrm{U}^{n+1}$ ,

[TABLE]

By continuity of $\tilde{\psi}^{\theta}_{u}$ and using the previous display, we can take the limit as $n\to\infty$ under $\tilde{\mathbb{P}}^{{\theta_{\star}}}$ and obtain (2.21).

6.2. Proof of Section 4.1

First observe that (2.17) implies for all $\theta\in\Theta$ ,

[TABLE]

Let us now show that any $\theta\in[{\theta_{\star}}]$ belongs to $[{\theta_{\star}}]_{G}\cap\langle{\theta_{\star}}\rangle$ , that is, $\theta\in[{\theta_{\star}}]_{G}$ , and (2.20) and (2.21) hold true. Since $\tilde{\mathbb{P}}^{\theta}=\tilde{\mathbb{P}}^{\theta_{\star}}$ , (6.3), which also holds with $\theta$ replaced by ${\theta_{\star}}$ , yields

[TABLE]

By (B-1), we obtain that $\theta\in[{\theta_{\star}}]_{G}$ and (2.20) holds. By Section 2.3, (2.17) implies (2.18), and using $\tilde{\mathbb{P}}^{\theta}=\tilde{\mathbb{P}}^{\theta_{\star}}$ , we obtain (2.21). Thus $\theta\in\langle{\theta_{\star}}\rangle$ .

It remains to show that $[{\theta_{\star}}]_{G}\cap\langle{\theta_{\star}}\rangle\subseteq[{\theta_{\star}}]$. We prove this inclusion in the case of an ODMX satisfying (A’-1). (The case of an ODM is readily obtained by removing the variables $V_{k}$ from the reasoning.) Let $\theta\in[{\theta_{\star}}]_{G}$ be such that (2.20) and (2.21) hold true. Since (2.17) holds with $\theta$ replaced by ${\theta_{\star}}$, (2.20) gives that $X_{1}=\tilde{\psi}^{\theta}\langle U_{(-\infty):0}\rangle$ $\mathbb{P}^{{\theta_{\star}}}\mbox{-a.s.}$ Since $\mathbb{P}^{{\theta_{\star}}}$ is shift invariant, we get, for all $k\in\mathbb{Z}$, $X_{k+1}=\tilde{\psi}^{\theta}\langle U_{(-\infty):k}\rangle$ $\mathbb{P}^{{\theta_{\star}}}\mbox{-a.s.}$ With (2.21), we obtain $X_{1}=\tilde{\psi}^{\theta}_{U_{(-q+1):0}}\left(X_{(-p+1):0}\right)$ $\mathbb{P}^{\theta_{\star}}\mbox{-a.s.}$ Since $\mathbb{P}^{\theta_{\star}}$ is shift invariant, we thus have, for all $k\in\mathbb{Z}$,

[TABLE]

On the other hand, by definition of $\mathbb{P}^{\theta_{\star}}$ and using (B-1) with $\theta\in[{\theta_{\star}}]_{G}$ , we have that

[TABLE]

And using again that $\mathbb{P}^{\theta_{\star}}$ is shift-invariant, for all $k\in\mathbb{Z}$ ,

[TABLE]

This, with (6.4), shows that $\mathbb{P}^{\theta_{\star}}$ is a shift-invariant solution of (2.2). By (A’-1), we conclude that $\mathbb{P}^{\theta_{\star}}=\mathbb{P}^{\theta}$ , and thus $\theta\in[{\theta_{\star}}]$ .

6.3. Proof of Section 4.2

Assertion (i) is obvious.

Proof of Assertion (ii). In the vector linear setting we have, for all $n\in\mathbb{Z}_{>0}$ and all $(z,z^{\prime},u)\in\mathsf{Z}^{2}\times\mathrm{U}^{n}$ ,

[TABLE]

where $|\cdot|$ is a norm on $\mathbb{R}^{p^{\prime}}$ and, for all $n\in\mathbb{Z}_{>0}$, $\check{\psi}_{n}^{\theta}$ is a linear mapping from $\mathsf{Z}=\mathbb{R}^{p^{\prime}*p+q^{\prime}*(q-1)}$ to $\mathsf{X}=\mathbb{R}^{p^{\prime}}$ recursively defined by setting, for all $w=w_{1:(p+q-1)}\in\mathsf{Z}=(\mathbb{R}^{p^{\prime}})^{p}\times(\mathbb{R}^{q^{\prime}})^{q-1}$, $\check{\psi}_{n}^{\theta}(w)=x_{n}$ with

[TABLE]

In particular, we have, for all $j\geq q$ ,

[TABLE]

and this equation is also true for $j=1,\dots,q-1$ if the last $q-1$, $\mathbb{R}^{q^{\prime}}$-valued, components of $w$ are equal to zero. It is well known that the Lipschitz norm of such iterative linear functions goes to zero if and only if (L-1) holds.
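The dichotomy in Assertion (ii) can be observed numerically in the scalar case ($p^{\prime}=1$). The sketch below assumes that, in that case, (L-1) amounts to the companion matrix of the autoregressive coefficients having spectral radius strictly less than one. This reading of (L-1), the helper names and the coefficient values are assumptions made for illustration only. The Lipschitz constant of the $n$-step map (with respect to the Euclidean norm on the initial values) is computed as the norm of the first row of the $n$-th power of the companion matrix.

```python
import numpy as np

def companion(a):
    """Companion matrix of x_n = a[0]*x_{n-1} + ... + a[p-1]*x_{n-p}."""
    p = len(a)
    c = np.zeros((p, p))
    c[0, :] = a
    c[1:, :-1] = np.eye(p - 1)
    return c

def spectral_radius(a):
    """Largest modulus among the eigenvalues of the companion matrix."""
    return float(max(abs(np.linalg.eigvals(companion(a)))))

def lipschitz_profile(a, n_max):
    """Euclidean Lipschitz constant of the n-step map, for n = 1..n_max.

    x_n is the first coordinate of C^n applied to the initial state, so its
    Lipschitz constant is the Euclidean norm of the first row of C^n.
    """
    c = companion(a)
    power = np.eye(len(a))
    norms = []
    for _ in range(n_max):
        power = c @ power
        norms.append(float(np.linalg.norm(power[0])))
    return norms

stable = [0.5, 0.3]    # illustrative coefficients, spectral radius < 1
unstable = [0.8, 0.3]  # illustrative coefficients, spectral radius > 1

stable_tail = lipschitz_profile(stable, 200)[-1]
unstable_tail = lipschitz_profile(unstable, 200)[-1]
```

With these values the profile of the first coefficient vector decays geometrically while that of the second grows, in line with the "if and only if" statement above.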

Proof of Assertion (iii). Take an arbitrary $x^{(\text{\tiny{i}})}_{1}\in\mathsf{X}$. If $q>1$, take also an arbitrary $u^{(\text{\tiny{i}})}_{1}\in\mathrm{U}$ and set $u^{(\text{\tiny{i}})}=(u^{(\text{\tiny{i}})}_{1},\dots,u^{(\text{\tiny{i}})}_{1})\in\mathrm{U}^{q-1}$. Then, since $\tilde{\psi}^{\theta}_{u}(x)$ is of the form (2.8), there exist constants $C_{1},C_{2}>0$ depending only on $\theta$, $x^{(\text{\tiny{i}})}_{1}$ and $u^{(\text{\tiny{i}})}_{1}$ such that, for all $u\in\mathrm{U}$,

[TABLE]

Assertion (iii) follows.

Assertion (iv) is obvious.

Proof of Assertions (v) and (vi). See Remark 6 (2) in the case of a VLODM. The case of a VLODMX is similar.

6.4. Proof of Section 4.2

Note that, by Section 4.2 (ii), in the vector linear case, the set $\mathrm{E}^{\theta}$ of Section 2.3 is well defined under (L-1). We need the following result whose proof is straightforward, and thus omitted.

Lemma.

Suppose that (L-1) holds and let $\theta\in\Theta$ . Let $\ell^{1}(\mathbb{Z},\mathrm{U})$ denote the set of sequences in $\mathrm{U}^{\mathbb{Z}}$ that are absolutely summable. For any $u\in\ell^{1}(\mathbb{Z},\mathrm{U})$ , there is a unique $x\in\ell^{\infty}(\mathbb{Z},\mathsf{X})$ (the set of bounded sequences valued in $\mathsf{X}$ ) such that

[TABLE]

This unique solution is given by

[TABLE]

where $\mathbf{R}$ is defined by (4.5) and $\hat{u}$ denotes the Fourier series of $u$ defined by

[TABLE]

Let $\mathrm{D}^{\theta}$ and $\mathrm{E}^{\theta}$ be as in (2.25) and Section 2.3. Then $\mathrm{D}^{\theta}$ contains $\ell^{1}(\mathbb{Z},\mathrm{U})$ and, for any $u\in\ell^{1}(\mathbb{Z},\mathrm{U})$ , defining $x$ as the unique solution of (6.6) in $\ell^{\infty}(\mathbb{Z},\mathsf{X})$ , we have, for all $t\in\mathbb{Z}$ ,

[TABLE]
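A time-domain counterpart of this lemma can be sketched in the scalar case. The code below assumes that, for $p=q=1$, the recursion (6.6) reduces to the hypothetical form $x_{t}=\omega+a\,x_{t-1}+b\,u_{t-1}$ with $|a|<1$; this form, the parameter values and the function name are assumptions for illustration. Under that assumption the bounded solution is the constant level $\omega/(1-a)$ plus a convolution of the past inputs with the geometric weights $b\,a^{j}$, and the code checks that this candidate does satisfy the recursion for a finitely supported (hence absolutely summable) input.

```python
import numpy as np

# Hypothetical scalar parameters with |a| < 1 (illustrative values only).
OMEGA, A, B = 0.5, 0.7, 1.2

def bounded_solution(u):
    """Candidate bounded solution of x_t = OMEGA + A*x_{t-1} + B*u_{t-1}.

    The input u is given on the window t = 0..T-1 and taken to be zero
    outside it, so that
        x_t = OMEGA/(1-A) + B * sum_{j>=0} A**j * u_{t-1-j}
    is a finite sum. Returns x_t for t = 0..T.
    """
    T = len(u)
    x = np.full(T + 1, OMEGA / (1.0 - A))
    for t in range(1, T + 1):
        past = u[:t][::-1]  # u_{t-1}, u_{t-2}, ..., u_0
        x[t] += B * np.sum(A ** np.arange(t) * past)
    return x

u = np.array([0.0, 2.0, -1.0, 0.0, 3.0, 0.0, 0.0, 0.5])
x = bounded_solution(u)

# Residual of the recursion; it should vanish up to rounding error.
residual = x[1:] - (OMEGA + A * x[:-1] + B * u)
```

The geometric weights are the Fourier coefficients of the corresponding scalar transfer function, which is why the lemma can state the same solution in the frequency domain.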

We can now prove Section 4.2.

Proof of Section 4.2.

Step 1: Assertion (i) implies Assertion (ii). Let $\theta,{\theta_{\star}}\in\Theta$ satisfy Assertion (i) and let us show that (4.6) and (4.7) hold. Take any $u\in\ell^{1}(\mathbb{Z},\mathrm{U})$ and $n\in\mathbb{Z}_{\geq 0}$. By Section 6.4, $u\in\mathrm{D}^{\theta_{\star}}$ and we have

[TABLE]

where the second equality follows from applying successively Assertion (i) with $u=u_{k}$ and $z=\tilde{\Psi}^{{\theta_{\star}}}\langle u_{(-\infty):(k-1)}\rangle$ for $k=-n,-n+1,\dots,0$ . On the other hand by definition of $\mathrm{Lip}_{n}^{\theta}$ in (2.22), we have, setting $z^{\star}_{n}:=\tilde{\Psi}^{{\theta_{\star}}}\langle u_{(-\infty):(-n-1)}\rangle$ and $z_{n}=\tilde{\Psi}^{\theta}\langle u_{(-\infty):(-n-1)}\rangle$ ,

[TABLE]

Since $u\in\ell^{1}(\mathbb{Z},\mathrm{U})$, we have that $(z_{n})$ and $(z^{\star}_{n})$ are summable sequences (as a consequence of Section 6.4) and so $\boldsymbol{\delta}_{\mathsf{Z}}\left(z_{n},z^{\star}_{n}\right)$ is bounded as $n\to\infty$. Using Section 4.2 (ii), we conclude that the upper bound in the last display converges to 0 as $n\to\infty$. By definition of $z^{\star}_{n}$ and the previous display, this gives that $\tilde{\Psi}^{\theta}\langle u_{(-\infty):0}\rangle=\tilde{\Psi}^{{\theta_{\star}}}\langle u_{(-\infty):0}\rangle$. Shifting the sequence $u$, we also have that $\tilde{\Psi}^{\theta}\langle u_{(-\infty):t}\rangle=\tilde{\Psi}^{{\theta_{\star}}}\langle u_{(-\infty):t}\rangle$ for all $t\in\mathbb{Z}$ and by Section 6.4, this implies that $\theta$ and ${\theta_{\star}}$ share the same unique solution $x\in\ell^{\infty}(\mathbb{Z},\mathsf{X})$ to the equation (6.6). Using the explicit form of this solution in the same lemma, we get that, for all $u\in\ell^{1}(\mathbb{Z},\mathrm{U})$ and $\tau\in\mathbb{Z}$,

[TABLE]

where, for all $\theta\in\Theta$, $u\in\ell^{1}(\mathbb{Z},\mathrm{U})$ and $\tau\in\mathbb{Z}$, we set $\displaystyle\alpha_{\tau}(u;\theta)=\int_{-\pi}^{\pi}\mathrm{e}^{\mathrm{i}\lambda\,\tau}\mathbf{R}(\mathrm{e}^{\mathrm{i}\lambda};\theta)\hat{u}(\lambda)\;\mathrm{d}\lambda$. Since we assumed $\{0\}\subsetneq\mathrm{U}$, we can successively take $u$ as the zero sequence ($u_{k}=0$ for all $k$, implying $\hat{u}\equiv 0$) or proportional to the impulse sequence ($u_{0}\neq 0$, $u_{k}=0$ for all $k\neq 0$, implying $\hat{u}\equiv u_{0}/(2\pi)$). The previous display then successively leads to (4.6) and

[TABLE]

which implies (4.7).
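The role of the two test inputs in Step 1 can be illustrated in a scalar setting: the zero sequence isolates the constant level of the solution, and the impulse isolates the coefficients of the transfer function. The sketch below assumes the hypothetical scalar recursion $x_{t}=\omega+\sum_{i}a_{i}x_{t-i}+\sum_{j}b_{j}u_{t-j}$ and ignores any sign constraints that $\Theta$ may impose. The function name and both parameter vectors are made up for illustration; the second vector is obtained from the first by multiplying both lag polynomials by the common factor $(1-0.4z)$.

```python
import numpy as np

def response(omega, a, b, u):
    """Bounded response of x_t = omega + sum_i a[i]*x_{t-1-i} + sum_j b[j]*u_{t-1-j}.

    The input is zero before u[0], and the state starts at the steady level
    omega / (1 - sum(a)), so the output follows the bounded solution.
    Returns the state before u[0] is fed, then after each input.
    """
    a, b = np.asarray(a, float), np.asarray(b, float)
    pad = max(len(a), len(b))
    full_u = np.concatenate([np.zeros(pad), np.asarray(u, float)])
    x = np.full(len(full_u) + 1, omega / (1.0 - a.sum()))
    for t in range(pad, len(full_u) + 1):
        x[t] = omega + a @ x[t - len(a):t][::-1] + b @ full_u[t - len(b):t][::-1]
    return x[pad:]

zero = np.zeros(30)
impulse = np.zeros(30)
impulse[0] = 1.0

theta_1 = (1.0, [0.5], [1.0])             # (omega, a, b), illustrative
theta_2 = (0.6, [0.9, -0.2], [1.0, -0.4])  # same maps after a common factor

level_1 = response(*theta_1, zero)
level_2 = response(*theta_2, zero)
impulse_1 = response(*theta_1, impulse)
impulse_2 = response(*theta_2, impulse)
```

Both parameter vectors give the same level under the zero input and the same impulse response, so they cannot be told apart from the input-output map alone. This is the kind of equality that the pair (4.6) and (4.7) encodes in the paper's setting.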

Step 2: Assertion (ii) implies Assertion (i). Let $\theta,{\theta_{\star}}\in\Theta$ satisfy (4.6) and (4.7), and let us show that Assertion (i) holds. First take $u\in\ell^{1}(\mathbb{Z},\mathrm{U})$. By Section 6.4, (4.6) and (4.7) imply that the recursive equation (6.6) and the one with $\theta$ replaced by ${\theta_{\star}}$ share the same bounded solution. Moreover, we have $u\in\mathrm{D}^{\theta_{\star}}\cap\mathrm{D}^{\theta}$ and, since $\tilde{\Psi}^{\theta}\langle u_{(-\infty):0}\rangle$ and $\tilde{\Psi}^{{\theta_{\star}}}\langle u_{(-\infty):0}\rangle$ are given by the same solution, they are equal. Hence we obtain that

[TABLE]

where $\mathrm{E}^{\theta_{\star}}_{1}=\left\{\tilde{\Psi}^{{\theta_{\star}}}\langle v\rangle:v\in\ell^{1}(\mathbb{Z}_{\leq 0},\mathrm{U})\right\}$ . To get Assertion (i), since $z\mapsto\tilde{\Psi}^{\theta^{\prime}}_{u}(z)$ is continuous for $\theta^{\prime}=\theta,{\theta_{\star}}$ and for any $u\in\mathbb{R}^{q^{\prime}}$ , it is now sufficient to prove that $\mathrm{E}^{\theta_{\star}}_{1}$ is dense in $\mathrm{E}^{\theta_{\star}}$ . To this end, pick $z\in\mathrm{E}^{\theta_{\star}}$ . Then there exists $u\in\mathrm{D}^{\theta_{\star}}$ such that

[TABLE]

For any $n\in\mathbb{Z}_{\geq 0}$, we introduce the truncated sequence

[TABLE]

Then $v^{(n)}\in\ell^{1}(\mathbb{Z},\mathrm{U})$ and we have

[TABLE]

Moreover, we can write, denoting by $0_{(-\infty):0}$ the null sequence in $\mathrm{U}^{\mathbb{Z}_{\leq 0}}$ ,

[TABLE]

By (2.26) and (6.7) we thus have $z=\lim_{n\to\infty}z_{n}$ , and since $z$ is arbitrary in $\mathrm{E}^{\theta_{\star}}$ we have shown that $\mathrm{E}^{\theta_{\star}}_{1}$ is dense in $\mathrm{E}^{\theta_{\star}}$ and the proof is concluded. ∎

Acknowledgments

In the first version of this contribution, we investigated neither the VLODM case nor the case with exogenous covariates. We are grateful to the two referees who reviewed the first submission, whose fruitful and constructive comments motivated these demanding extensions.

