Quasi-Stationary Distributions and Resilience: What to get from a sample?
J.-R. Chazottes, P. Collet, S. Mart\'inez, S. M\'el\'eard

TL;DR
This paper investigates how to estimate the resilience of multi-species birth-and-death processes by linking macroscopic dynamical stability to microscopic fluctuations, especially in large but finite populations.
Contribution
It establishes relations between resilience and process fluctuations, providing estimators to assess resilience from observed data on large stochastic systems.
Findings
Relations between resilience and microscopic fluctuations are derived.
Estimators for resilience are developed for different time scales.
The approach connects stochastic process behavior with dynamical system stability.
Abstract
We study a class of multi-species birth-and-death processes going almost surely to extinction and admitting a unique quasi-stationary distribution (qsd for short). When rescaled by and in the limit , the realizations of such processes get close, in any fixed finite-time window, to the trajectories of a dynamical system whose vector field is defined by the birth and death rates. Assuming that this dynamical has a unique attracting fixed point, we analyzed in a previous work what happens for large but finite , especially the different time scales showing up. In the present work, we are mainly interested in the following question: Observing a realization of the process, can we determine the so-called engineering resilience? To answer this question, we establish two relations which intermingle the resilience, which is a macroscopic quantity defined for the dynamicalâŠ
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Thermodynamics and Statistical Mechanics · Stochastic processes and statistical mechanics · Markov Chains and Monte Carlo Methods
Quasi-Stationary Distributions and Resilience: What to get from a sample?
J.-R. Chazottes Email: [email protected] CPHT, CNRS, Ecole Polytechnique, Institut Polytechnique de Paris, F-91128 Palaiseau, France
P. Collet Email: [email protected] CPHT, CNRS, Ecole Polytechnique, Institut Polytechnique de Paris, F-91128 Palaiseau, France
S. Méléard Email: [email protected] CMAP, CNRS, Ecole Polytechnique, Institut Polytechnique de Paris, F-91128 Palaiseau, France
S. MartĂnez Email: [email protected] DIM-CMM, Universidad de Chile, UMI 2807 UChile-CNRS, Beauchef 851, Santiago, Chile
Abstract
We study a class of multi-species birth-and-death processes going almost surely to extinction and admitting a unique quasi-stationary distribution (qsd for short). When rescaled by and in the limit , the realizations of such processes get close, in any fixed finite-time window, to the trajectories of a dynamical system whose vector field is defined by the birth and death rates. Assuming this dynamical system has a unique attracting fixed point, we analyzed the behavior of these processes for finite and finite times, âinterpolatingâ between the two limiting regimes just mentioned. In the present work, we are mainly interested in the following question: Observing a realization of the process, can we determine the so-called engineering resilience? To answer this question, we establish two relations which intermingle the resilience, which is a macroscopic quantity defined for the dynamical system, and the fluctuations of the process, which are microscopic quantities. Analogous relations are well known in nonequilibrium statistical mechanics. To exploit these relations, we need to introduce several estimators which we control for times between (time scale to converge to the qsd) and (time scale of mean time to extinction).
Keywords: birth-and-death process, dynamical system, engineering resilience, quasi-stationary distribution, fluctuation-dissipation relation, empirical estimators.
Contents
Titre en français :
Distributions quasi-stationnaires et résilience : que peut-on obtenir des données ?
Résumé en français :
Nous Ă©tudions une classe de processus de naissance et mort avec plusieurs espĂšces dans la situation oĂč lâextinction est certaine et la distribution quasi-stationnaire est unique. Si on fixe un intervalle de temps fini et quâon normalise les rĂ©alisations dâun tel processus par un paramĂštre dâĂ©chelle , elles deviennent arbitrairement proches, dans la limite , des trajectoires dâun certain systĂšme dynamique dont le champ de vecteurs est dĂ©fini Ă partir des taux de naissance et mort. Quand le systĂšme dynamique admet un seul point fixe attractif, nous avons prĂ©cĂ©demment analysĂ© le comportement du processus pour des valeurs de finies et pour des temps finis, câest-Ă -dire, le comportement intermĂ©diaire entre les deux comportements limites Ă©voquĂ©s ci-dessus. La question principale qui nous intĂ©resse dans le prĂ©sent article est la suivante : si on observe une rĂ©alisation du processus, pouvons-nous estimer la rĂ©silience au sens de lâingĂ©nieur (engineering resilience) ? Pour rĂ©pondre Ă cette question, nous dĂ©montrons deux relations entremĂȘlant la rĂ©silience, qui est une quantitĂ© macroscopique dĂ©finie pour le systĂšme dynamique sous-jacent, et les fluctuations du processus, qui sont elles des quantitĂ©s microscopiques. De tels genres de relations sont bien connues en mĂ©canique statistique hors dâĂ©quilibre. Afin dâexploiter ces relations nous introduisons plusieurs estimateurs empiriques que nous parvenons Ă contrĂŽler pour des temps entre , qui est lâĂ©chelle de temps pour observer la convergence vers la distribution quasi-stationnaire, et , qui est lâĂ©chelle du temps moyen dâextinction.
Mots clés :
processus de naissance et mort, systÚmes dynamiques, résilience, distribution quasi-stationnaire, relation de fluctuation-dissipation, estimateurs empiriques.
1 Introduction and main results
1.1 Context and setting
The ability of an ecosystem to return to its reference state after a perturbation stress is given by its resilience, a concept pioneered by Holling. Resilience has several faces and multiple definitions [5]. In the traditional theoretical setting of dynamical systems, that is, differential equations, one of them is the so-called engineering resilience. It is concerned with what happens in the vicinity of a fixed point (equilibrium state) of the system, and is given by minus the real part of the dominant eigenvalue of the Jacobian matrix evaluated at the fixed point. It can also be defined as the reciprocal of the characteristic return time to the fixed point after a (small) perturbation. In this paper, we are interested in how to determine the engineering resilience from the data. But which data? The drawback of the notion of engineering resilience is that we do not observe population densities governed by differential equations. Instead, we count individuals which are subject to stochastic fluctuations. Can we nevertheless infer the resilience? The subject of this paper is to show that this is possible in the framework of birth-and-death processes which are, in a sense made precise below, close to the solutions of a corresponding differential equation, at certain time and population size scales.
Let us now describe our framework. We consider a population made of species interacting with one another. Suppose that the state of the process at time , which we denote by , is , where is the number of individuals of the th species. Then the rate at which the population increases (respectively decreases) by one individual of the th species is (respectively ), where is a scaling parameter. Under the assumptions we will make, the process goes extinct, i.e., is an absorbing state, with probability one. There are two limiting regimes for the behavior of this process. The first one is to fix and let tend to infinity, which leads inevitably to extinction. The second one consists in fixing a time horizon and letting tend to , after having rescaled the process by . In this limit, the behavior of the rescaled process is governed by a certain differential equation. More precisely, given any and any and , we have
[TABLE]
where is the Euclidean distance in , and is the solution of the differential equation in
[TABLE]
with initial condition . We refer to [4, Chapter 11] for a proof. We use the notations , , and so on and so forth. We will make further assumptions (see Subsection 1.4) on the birth and death rates to be in the following situation. The vector field
[TABLE]
has a unique attracting fixed point (lying in the interior of ). We denote by its differential evaluated at , namely
[TABLE]
We then define the (engineering) resilience as
[TABLE]
where \mathrm{Sp}\big{(}M^{*}\big{)} denotes the spectrum (set of eigenvalues) of the matrix . Under our assumptions, we have .
We can now formulate more precisely the goal of this paper. Given a finite-length realization of the process , with large, but finite , we want to build an estimator for . To this end, we need a good understanding of the behavior of the Markov process in an intermediate regime between the two limiting regimes described above. This was done in a previous work of ours [3], and this can be roughly summarized as follows. All states are transient and is absorbing, hence the only stationary distribution is the Dirac measure sitting at . The mean time to extinction behaves like . (We recall Bachmann-Landau notations below.) If we start in the vicinity of the state , that is, if the initial state has its coordinates of size of order , then either the process wanders around or it gets absorbed at . More precisely, there is a unique quasi-stationary distribution (qsd, for short) which describes the statistics of the process conditioned to be non-extinct before time . Without this conditionning, the law of the process at time is well approximated by a mixture of the Dirac measure at and the qsd , for times , where , in the sense that the total variation distance between them is exponentially small in , provided that is large enough. We will rely on these results that will be recalled precisely later on. We will also need to prove further properties.
1.2 Main results
To estimate the engineering resilience , we will establish a matrix relation involving . Let be the vector of species sizes averaged with respect to , that is,
[TABLE]
For each , define the matrix
[TABLE]
In Section 4.1, we will prove the following result.
Theorem 1.1**.**
For all we have
[TABLE]
Some comments are in order. If is equal to, say, then the estimate becomes useless. More generally, if is too small then is too close to the identity matrix. Moreover, we will show later on that and are of order . Hence the estimate becomes irrelevant if becomes proportional to . Indeed, without knowing the constant of proportionality, can be of the same order than the error term.
Before proceeding further, we recall the following classical Bachmann-Landau notations.
Notations**.**
Given , the symbol stands for any real-valued function such that there exists and such that, for any , . Note in particular that will always mean a strictly positiveconstant that we donât want to specify. Sometimes, we will also use the symbol stands for any real-valued function such that there exist and such that, for any , . One can naturally generalize to vector-valued functions. For instance, for we write if for .
Relation (1.2) allows to determine . Indeed, we have
[TABLE]
This formula suggests that in order to estimate , we need estimators for and . Given a finite-length realization of \big{(}\ushort{N}^{{\scriptscriptstyle K}}(t),0\leq t\leq T\big{)} up to some time , we define estimators for and , for , , by
[TABLE]
and
[TABLE]
Under suitable conditions on , and , well approximates . More precisely, we will prove an estimate of the following form (see Theorem 3.4 for a precise statement)
[TABLE]
for every , , where are positive constants. We use the notation . Let us comment on this bound. Roughly speaking, it can only be useful if is much smaller than if is, say, of order . For instance, suppose that, for large enough, we want the bound to be , for some . One can check that this is possible if and . (Note in particular that, in this situation, we have a consistent estimator when .) However, when becomes or larger, we know that {\mathds{E}}_{\ushort{n}}\big{[}S^{{\scriptscriptstyle\ushort{\mu}}}_{p}(T,K)\big{]}\approx 0, because with high probability, at this time scale the process is absorbed at . This is the manifestation of the fact that the only stationary distribution is the Dirac measure at . Consistently, our bound becomes very bad in that regime.
An estimate of the same kind holds for which well approximates in the appropriate ranges.
Remark 1.1**.**
It is possible to use discrete time instead of continuous time in the above averages. Indeed the key results (in particular Proposition 3.3) are obtained for discrete times.
We can now define the empirical matrix by
[TABLE]
We will see later on that, in appropriate regimes, is near and is near (see Propositions 5.4 and 5.6). The matrix is invertible as a covariance matrix of a non-constant vector and is (see Proposition 2.9). Then (1.2) implies that is invertible and the same holds for . These remarks imply that the matrix is well defined.
We define the empirical resilience by
[TABLE]
Our main result (Theorem 5.7) is then the following.
Theorem**.**
For , (initial state) and , and large enough, we have
[TABLE]
with a probability larger than . In particular, if , we have
[TABLE]
Several comments are in order. The dependence on the initial state is somewhat hidden and involved in the fact that the estimates hold âwith a probability larger than â. Indeed, the estimate of this probability results from Chebychev inequality and variance estimates in which the process is started in . What the symbol precisely means is not mathematically defined. It means that we need to consider âmuch smaller than something exponentially big in â. Indeed, since we do not control explicitly the various constants appearing in exponential terms in , we have to consider which varies on a scale smaller than , for instance \exp\big{(}\Theta(1)\sqrt{K}\,\big{)}. The reader is invited to step through the proof of Theorem 5.7 for the more precise, but cumbersome bound we obtain.
1.3 A âfluctuation-dissipationâ approach
The above estimator for the engineering resilience, based on (1.3), is valid for any . In the case (only one species), we have another, simpler, estimator based on a âfluctuation-dissipation relationâ. This relation is in fact true for any and of independent interest. Let be the diagonal matrix given by
[TABLE]
We have the following result. We write instead of , and the transpose of a matrix is denoted by .
Theorem 1.2**.**
We have
[TABLE]
This relation is proved in Section 4.2. For background on fluctuation-dissipation relations in Statistical Physics, we refer to [7, sections 2-3]. Note that the matrix is symmetrical, but in general the matrix is not (see [3]). Note also that each term in the left-hand side of (1.7) is of order , as we will see below.
If and are known, the matrix is not uniquely defined, except for (see for example [8]). For , (1.7) easily gives the resilience since it becomes a scalar equation:
[TABLE]
Remark 1.2**.**
The quantity is the average total jump rate up to terms. This follows from a Taylor expansion of around , Theorem 2.6 and Proposition 2.7.
In the case , an estimator for is
[TABLE]
In Section 5, we establish a bound for
[TABLE]
which depends on and , and is small in the relevant regimes. The estimator we use for is
[TABLE]
Again, we can control how well this estimator approximates . This provides another estimator for , with a controlled error.
1.4 Standing assumptions
The two (regular) vector fields and are given in . We assume that their components have second partial derivatives which are polynomially bounded. Obviously, we suppose that and for all and . A dynamical system in is defined by the vector field , namely
[TABLE]
For , we use the following standard norms:
[TABLE]
We now state our hypotheses.
- H.1
The vectors fields and vanish only at . 2. H.2
There exists belonging to the interior of (fixed point of ) such that
[TABLE] 3. H.3
Attracting fixed point: there exist and such that , and for all with ,
[TABLE] 4. H.4
The fixed point of the vector field is repelling (locally unstable). Moreover, on the boundary of , the vector field points toward the interior (except at ). 5. H.5
Define
[TABLE]
and for , let
[TABLE]
We assume that there exists such that and 6. H.6
There exists such that and is increasing on . 7. H.7
There exists such that
[TABLE] 8. H.8
Finally, we assume that
[TABLE]
(By we mean the partial derivative with respect to .)
Assumptions H.5 and H.6 ensure that the time for âcoming down from infinityâ for the dynamical system is finite. Together with H.3, this also implies that is a globally attracting stable fixed point. More comments on these assumptions can be found in [3].
1.5 A numerical example
We consider the two-dimensional vector fields
[TABLE]
and
[TABLE]
where all the coefficients are positive. This is a model of competition between two species of Lotka-Volterra type. We have taken
[TABLE]
Assumptions H.1 and H.4 are easily verified numerically. Assumptions H.5 and H.6 are satisfied because and . Concerning H.2, we checked numerically that there is a unique fixed point inside the positive quadrant, namely . It remains to check H.3, namely that
[TABLE]
where
[TABLE]
We first checked that the numerator is negative and vanishes only at and . It is easy to check that for large enough. We have verified numerically that the only solutions of the equations in the closed positive quadrant are and , with , thus this is a negative local minimum. This implies that in the closed positive quadrant, except at and where it vanishes. This implies that in the closed positive quadrant. It is easy to check that
[TABLE]
This implies that except perhaps at and . Near we have by Taylor expansion
[TABLE]
and, since the vector has positive components, there exists such that for all
[TABLE]
If is small, we have by Taylor expansion (since )
[TABLE]
One can check numerically that the two real eigenvalues of the symmetric matrix
[TABLE]
are strictly negative, the largest being numerically equal to . This completes the verification of hypothesis H.3.
Illustrating standard experiments on populations of cells or bacteria, we have chosen and simulated a unique realization of the process with which contains about jumps (cell divisions or deaths). The resilience computed from the vector field is numerically equal to . We have computed . The relative error, that is , is equal to .
Note that the situation we are interested in is completely different from standard statistical approach where one can repeat the experiments.
1.6 Organization of the paper
In Section 2, we will study the time evolution of the moments of the process and we will prove moment estimates for the qsd. In Section 3, we will obtain control on the large time behavior of averages for the process. In Section 4, we will prove the relations (1.2) and (1.7). In Section 5, we will apply these relations to obtain approximate expressions of the engineering resilience in terms of the covariance matrices for the qsd. From the results of Section 3, we will deduce variance bounds for the estimators (1.4), (1.5) and (1.8), starting either in the qsd or from an initial condition of order .
2 Time evolution of moments of the process and moments of the QSD
2.1 Time evolution of moments starting from anywhere
The generator of the birth and death process is defined by
[TABLE]
where , the being at the -th position, and is a function with bounded support. We denote by the semigroup of the process acting on bounded functions, that is, for , we have
[TABLE]
For , let
[TABLE]
Notice that we will use either or . They are of course equivalent but one can be more convenient than the other, depending on the context. We have the following result.
Theorem 2.1**.**
There exists a constant such that for large enough, the operator group extends to exponentially bounded functions and
[TABLE]
Proof.
Introduce the function defined on by
[TABLE]
Assumption H.6 implies that is well defined and decreasing on . We can define its inverse function on for small enough (independent of ). Take . Then there is a unique positive function defined by
[TABLE]
Note that and . Let
[TABLE]
Note that
[TABLE]
Using the Lipschitz continuity of (and then its differentiability almost everywhere) and (2.3), we obtain
[TABLE]
We now consider the function
[TABLE]
to which we apply ItĂŽâs formula at time . We get
[TABLE]
We have
[TABLE]
Note that
[TABLE]
It follows from H.5 that there exists a number such that if , then .
If we get
[TABLE]
For we have
[TABLE]
since and .
Finally, for we get
[TABLE]
We deduce that
[TABLE]
The result follows by letting tend to infinity and by monotonicity. â
We deduce moment estimates for the process which are uniform in the starting state, and in time, for times larger than .
Corollary 2.2**.**
For all , the semi-group maps functions of polynomially bounded modulus in bounded functions. In particular, for all , we have
[TABLE]
Proof.
We have
[TABLE]
since for all , . Inequality (2.4) follows from Hölderâs inequality and Theorem 2.1. Let us now consider . From the Markov property and by using the previous inequality, we deduce that
[TABLE]
The proof is finished. â
For times less than , the moment estimates depends on the initial state.
Proposition 2.3**.**
For each integer , there exists a constant such that for all , and
[TABLE]
Proof.
We have only to study the case , the other case being given in (2.4). We prove the result for even, namely . The result for odd follows from Cauchy-Schwarz inequality. Letting
[TABLE]
we have
[TABLE]
Using H.5 and the equivalence of the norms, we see that there exists a constant such that if
[TABLE]
Moreover, we can take large enough such that for all
[TABLE]
Applying ItĂŽâs formula to we get as in the proof of Theorem 2.1
[TABLE]
(Recall that is defined in (2.2).) The result follows by letting tend to infinity. â
2.2 Moments estimates for the qsd
Let us first recall (cf. [3]) that, under the assumptions of Section 1.4, there exists a unique qsd with support . Further, starting from the qsd, the extinction time is distributed according to an exponential law with parameter satisfying (Theorem 3.2 in [3])
[TABLE]
where are constants independent of . Recall also that for all ,
[TABLE]
where
[TABLE]
Finally, for all in the domain of the generator
[TABLE]
with the notation
[TABLE]
We use several notations from [3] that we now recall. Let
[TABLE]
For and , is the ball of center and radius . We consider the sets
[TABLE]
where is a constant defined in [3, Corollary 4.2]. Note that since is of order , we have for large enough. The first entrance time in (resp. ) will be denoted by (resp. ).
We first prove that the support of the qsd is, for large , almost included in . (This will be important to control moments later on.)
Proposition 2.4**.**
There exists a constant such that for all large enough
[TABLE]
Proof.
We first recall two results from [3]. From Lemma 5.1 in [3], there exist and such that for all large enough
[TABLE]
By Sublemma 5.8 in [3], there exist two constants and such that for all large enough, and for all
[TABLE]
Now, for define
[TABLE]
We will first estimate \sup_{\ushort{n}}{\mathds{P}}_{\!\ushort{n}}\big{(}\ushort{N}^{{\scriptscriptstyle K}}(t_{q})\in\mathcal{D}^{c},T_{\ushort{0}}>t_{q}\big{)}. Note that implies . We distinguish two cases according to whether or .
Let . It follows from (2.10) that
[TABLE]
Now let . We have
[TABLE]
Using the strong Markov property at time and (2.10) we obtain
[TABLE]
We bound the second term recursively in .
[TABLE]
where we used the strong Markov property at time and (2.9). This implies
[TABLE]
Therefore
[TABLE]
Taking we conclude that there exists a constant such that for large enough
[TABLE]
This implies
[TABLE]
but by (2.6)
[TABLE]
and the result follows from (2.5). â
Corollary 2.5**.**
For each , there exists such that for all large enough
[TABLE]
Proof.
It follows at once from (2.6) (at time ) and Theorem 2.1 that
[TABLE]
for large enough. We have
[TABLE]
We use Hölder inequality to get
[TABLE]
The first result follows from (2.11) and Proposition 2.4. The second estimate follows from the first one, and the bound . â
We now estimate centered moments.
Theorem 2.6**.**
For each , there exists such that for all large enough
[TABLE]
Proof.
The proof consists in a recursion over . The bound is trivial for . For define the function
[TABLE]
where
[TABLE]
Recall that is the vector with at the th coordinate and [math] elsewhere. From the trivial identity
[TABLE]
it follows that
[TABLE]
Indeed, applying the trinomial expansion to (2.12), we obtain
[TABLE]
Observe that if and then , since . This implies that
[TABLE]
It follows that
[TABLE]
where
[TABLE]
To get this bound, we used the fact that
[TABLE]
Using (1.10) we get
[TABLE]
where
[TABLE]
Integrating the equation (2.13) with respect to and using (2.7), (2.14), (2.15) and Proposition 2.4, we obtain
[TABLE]
Observing that , it follows by recursion over that, for each integer , there exists such that, for all large enough, . Finally we have
[TABLE]
since . The result follows using the previous estimate and Corollary 2.5. â
The next result gives a more precise estimate for the average of (instead of an error of order ).
Proposition 2.7**.**
We have
[TABLE]
where is defined in (1.1). Moreover, since , we have
[TABLE]
Proof.
Define the functions
[TABLE]
By Taylor expansion and the polynomial bounds on and we get
[TABLE]
for some positive integer independent of . Using Cauchy-Schwarz inequality, identity (2.7), Corollary 2.5 and Proposition 2.4 we get
[TABLE]
From Proposition 2.4, Theorem 2.6 and (2.5) we get
[TABLE]
The result follows from the invertibility of the matrix which follows from H.3. The other inequalities follow immediately. â
Corollary 2.8**.**
For all , we have
[TABLE]
Proof.
Combine Proposition 2.7 and Theorem 2.6. â
We now show that is indeed of order .
Proposition 2.9**.**
There exist two strictly positive constants and such that for all large enough, the matrix satisfies
[TABLE]
for the order among positive definite matrices, being the identity matrix, and, in particular,
[TABLE]
Proof.
We denote by the positive definite matrix
[TABLE]
By (2.16) we have
[TABLE]
Let be a unit vector in d. We have
[TABLE]
From Lemma 5.3 in [3] there exists a constant such that for all large enough and all ,
[TABLE]
where is the uniform distribution on . Therefore
[TABLE]
and we get
[TABLE]
The result follows. â
3 Controlling time averages of the estimators
For , we define the time average of a function by
[TABLE]
The goal of this section is to obtain a control of for a suitable class of functions.
We recall the following result from [3, Theorem 3.1].
Theorem 3.1** ([3]).**
There exist , such that, for all and for all , we have
[TABLE]
It is also proved in [3] that, for a time much larger than and much smaller than the extinction time (which is of order ), the law of the process at time is close to the qsd. The accuracy of the approximation depends on the initial condition. This suggests to study the distance between the law of the process at time and the qsd as a function of the initial condition, and . This will result from (3.2) if {\mathds{P}}_{\!\ushort{n}}\big{(}T_{\ushort{0}}\leq t\big{)} can be estimated. In fact we prove a more general result.
Lemma 3.2**.**
For , define \tau_{\gamma}=\inf\big{\{}t\geq 0:\|\ushort{N}^{{\scriptscriptstyle K}}(t)\|_{\scaleto{1}{4pt}}\leq\gamma K\big{\}}. There exist , and such that for all , , and , we have
[TABLE]
where
[TABLE]
Taking in (3.3), we get
[TABLE]
Proof.
It follows from H.1 and H.3 (using Taylorâs expansion of near ) that there exists (where was introduced in Assumption H.3) such that for all satisfying we have
[TABLE]
For and to be chosen later on, we define
[TABLE]
It is easy to verify that if we have
[TABLE]
If we have
[TABLE]
For , we have , where is defined in (3.4), and
[TABLE]
where the function is defined by
[TABLE]
We have
[TABLE]
From the differentiability of the vector fields and and using (3.6), it follows that there exists a constant such that, for all and we have
[TABLE]
Therefore we can choose and such that
[TABLE]
Therefore, for all , we have
[TABLE]
For (independent of ), we define
[TABLE]
We apply Itoâs formula to to get
[TABLE]
We have
[TABLE]
hence
[TABLE]
Then
[TABLE]
Therefore
[TABLE]
To conclude, observe that
[TABLE]
for because for all ,
[TABLE]
and . â
We have the following result.
Proposition 3.3**.**
For all bounded functions , , , and , we have
[TABLE]
where and are defined in Lemma 3.2, and and are defined in Theorem 3.1.
Proof.
From the bound (3.2) we get
[TABLE]
This implies
[TABLE]
using (3.5). â
We now extend Proposition 3.3 to more general functions. For , we define the Banach space by
[TABLE]
We have the following result for time-averages of functions in .
Theorem 3.4**.**
For all , , , and , we have
[TABLE]
where and are defined in Lemma 3.2, and is defined in (2.5).
Remark 3.1**.**
One can check that if one modifies slightly the definition of the time average (3.1) by integrating from to , then one can remove the term from the previous estimate.
Proof.
For , Corollary 2.5 gives
[TABLE]
By Proposition 2.3 we have
[TABLE]
Hence for we get
[TABLE]
For , we have by the Markov property that
[TABLE]
where we set
[TABLE]
By Corollary 2.2, the function is bounded and
[TABLE]
Applying Proposition 3.3 to thus gives
[TABLE]
Integrating over yields
[TABLE]
Using Lemma 3.5 (stated and proved right after this proof), we finally obtain
[TABLE]
This finishes the proof of the theorem. â
We used the following lemma in the previous proof.
Lemma 3.5**.**
For and defined in (3.8) we have
[TABLE]
Proof.
We write
[TABLE]
Since is a qsd, it follows by Cauchy-Schwarz inequality that
[TABLE]
where we used Corollaries 2.2 and 2.5 and the fact that under the law of is exponential with parameter . The lemma is proved. â
4 Fluctuation and correlation relations
4.1 Proof of Theorem 1.1
Let
[TABLE]
For , let . We have, since , and ,
[TABLE]
As in the previous proof, we split the integrals according to whether or . Using Cauchy-Schwarz inequality, Corollary 2.5, and the fact that is a qsd, the second contribution is exponentially small in . In the first contribution, we use Taylor expansion around . The error terms are bounded by
[TABLE]
Now we use Cauchy-Schwarz inequality, Theorem 2.6 and that is a qsd to obtain
[TABLE]
Since has a spectrum contained in the open left half-plane by H.3, we integrate the equation
[TABLE]
from [math] to using the method of constant variation and obtain
[TABLE]
We arrive at the desired relation by using (2.16).
4.2 Proof of Theorem 1.2
Recall that
[TABLE]
We will first do the proof with the following matrix instead of :
[TABLE]
On the one hand we have by (2.7)
[TABLE]
By Theorem 2.6 and (2.5) the right-hand side of this equation is exponentially small in . On the other hand, using formula (2.1) we have
[TABLE]
We split each integral by separating integration over (defined in (2.8)) and . Inside , we apply Corollary 2.5 and use the assumption that and are polynomially bounded. Inside , we use Taylorâs formula around for the functions , and . We also use that , and \ushort{n}^{\scaleto{*}{3pt}}/K-\ushort{x}^{\scaleto{*}{3pt}}=\mathcal{O}\big{(}\frac{1}{K}\big{)}. The error terms are then bounded by
[TABLE]
respectively. Using Theorem 2.6, both bounds are of order . We obtain
[TABLE]
which can be written in the more compact form
[TABLE]
where is the diagonal matrix of averages birth (or death) rates. To finish the proof, it remains to replace by . This is done by using (2.17).
Remark 4.1**.**
Note that each term on the left hand side is of order , see Corollary 2.8.
Remark 4.2**.**
We will see in Appendix C that the qsd around is well approximated at scale by a Gaussian distribution. Dividing out (4.1) by and taking the limit , we recover Relation (C.1), as expected from Theorem C.1.
5 Variance estimates for the estimators
It is straightforward to apply Theorem 3.4 to , , , and , which are defined respectively in (1.4), (1.5), (1.8), and (1.9). This gives the bound (1.6) on anounced in Section 1. The bounds for the other estimators all have the same structure. We will not state them.
In this section we prove two variance estimates for any time average with . In the first one, one starts from anywhere in , while in the second one the starting distribution is the qsd. Recall that . We will only give the proofs of these estimates for , since manipulating is cumbersome but otherwise the proofs are the same.
Proposition 5.1**.**
There exist strictly positive constants and such that, for all , (see Definition 3.7), , and , we have
[TABLE]
where was defined in Proposition 2.3.
One can use Chebyshev inequality to bound {\mathds{P}}_{\!\ushort{n}}\big{(}\big{|}S_{f}(T,K)-\nu\!_{\scaleto{K}{3.5pt}}(f)\big{|}>\delta\big{)}, for any .
The proof of Proposition 5.1 is postponed to Appendix A. The previous estimate, as well as all the estimates we will give below, have the same behaviour in their dependence in , and . They display the qualitative behaviour that we met several times:
The bounds are not useful for too small. 2. 2.
If is large, the bounds are not useful if is small (order one) because the process can be absorbed at in a time of order one with a sizeable probability. 3. 3.
Finally, for large and of order , the time must be large enough (polynomial in in our bounds) but not too large (less than an exponential in because the process can reach the origin with high probability in such large times).
Integrating the previous estimate with respect to the qsd, we get the following control.
Corollary 5.2**.**
There exist two positive constants and such that for all , for all and for all , we have
[TABLE]
where is as in the previous proposition, is defined in Proposition 2.3, and is defined in Corollary 2.5.
Observe that the previous inequality is only useful in the range . The proofs of the two previous estimates are postponed to Appendix A.
We now apply the previous results to our estimators.
Proposition 5.3**.**
For all , we have
[TABLE]
and
[TABLE]
Proof.
The proof follows by applying Proposition 5.1 and Corollary 5.2 to the functions , , which belong to . â
Proposition 5.4**.**
For and for all , we have
[TABLE]
and
[TABLE]
Proof.
The proof follows by applying Proposition 5.1 and Corollary 5.2 to the functions , , which belong to . â
Proposition 5.5**.**
There exist positive constants , , and such that for all , and ,
[TABLE]
where
[TABLE]
and , , are such that, for all ,
[TABLE]
*The existence of and follows from the assumptions on . The constants and are defined in Corollary 2.5 and Lemma 2.3, respectively.
We also have*
[TABLE]
[TABLE]
where
[TABLE]
Proof.
First observe that
[TABLE]
where is defined in Appendix B. By assumption, the function . Let be any probability measure on having all its moments finite. We apply Theorem 3.4 to the function , and then using integration against we get
[TABLE]
[TABLE]
We now apply the identity in Proposition B.1 and divide by . We obtain
[TABLE]
We now estimate
[TABLE]
The second integral is bounded from above by using the polynomial bound on and the first estimate in Corollary 2.5. For the first integral we use Taylor expansion around to first order, then Cauchy-Schwarz inequality, and finally Theorem 2.6 for . Therefore we obtain
[TABLE]
Now we apply the estimate in Proposition B.1 to obtain
[TABLE]
For the first term we use either Corollary 5.2 or Proposition 5.1. For the second term we use (5.1). For the third and last term we apply Theorem 3.4, integrate with respect to and use (5.2). To finish the proof, we replace by either or . â
Recall that , .
Proposition 5.6**.**
Under the assumptions of Proposition 5.1 and Corollary 5.2, we have, for all , and ,
[TABLE]
and
[TABLE]
Proof.
The proof requires some simple modifications of the proofs of Propostions 5.1 and 5.2. This is left to the reader. â
Remark 5.1**.**
If one modifies slightly the definition of the estimator by integrating from time , then, in the four previous propositions, one can replace the factor by , and the factor by .
Recall that we defined in Section 1 an empirical matrix by
[TABLE]
and an empirical resilience by
[TABLE]
From the above results one can derive various statistical estimates for the difference between and . We have the following result which was stated at the end of Section 1.2. As already mentioned, we use the symbol which is not rigorously defined to formulate a more transparent bound. The reader can easily step through the proof to get a more precise, but rather cumbersome bound. Let us also note that the dependence on the initial state is related to the part âwith a probability larger than â of the statement. Indeed, the estimate of this probability results from Chebychev inequality and variance estimates in which the process is started in .
Theorem 5.7**.**
For , (initial state) and , and large enough, we have
[TABLE]
with a probability higher than . In particular, if , we have
[TABLE]
Proof.
It follows from Propositions 5.4 and 5.6 and the standing assumptions that, with a probability higher that , we have
[TABLE]
and
[TABLE]
( stands for any matrix norm on since they are all equivalent.) We now use Theorem 1.1 and Proposition 2.9 to obtain
[TABLE]
The result follows since is of order one. â
Appendix A Proof of the two variance estimates
A.1 Starting from anywhere: proof of Proposition 5.1
It is enough to prove the result for . We have
[TABLE]
Step 1 is to estimate the contribution of the range . Using Cauchy-Schwarz inequality and Proposition 2.3 we get
[TABLE]
Step 2 is to estimate the contribution in the range . This implies that . We have using again Proposition 2.3
[TABLE]
**Step 3
**(1) Using the Markov property and the definition of (see (3.8)) we have
[TABLE]
Let us first write
[TABLE]
as the sum of and where
[TABLE]
and
[TABLE]
We further decompose as where
[TABLE]
and
[TABLE]
Since is an absorbing state, we have for all that
[TABLE]
(2) We start by estimating . Since is an absorbing state, we have
[TABLE]
Note that . Since we are going to use Lemma 3.2, we write where
[TABLE]
and
[TABLE]
We first estimate . Using (3.9), Lemma 3.2 with , and since belongs to (see (3.7)), we have
[TABLE]
where we used Proposition 2.3 for the second inequality.
We now estimate by splitting it as where
[TABLE]
and
[TABLE]
where
[TABLE]
Proceeding as before we get
[TABLE]
We used Lemma 3.2 with .
We now handle .
Note that . We proceed as before with and , and we use Lemma 3.2 with
[TABLE]
to get
[TABLE]
(3) Let us now estimate for all . We have
[TABLE]
(3)-(i) By Theorem 3.1 and since , we have
[TABLE]
Hence, using Proposition 2.3, we get for all
[TABLE]
(3)-(ii) We have
[TABLE]
[TABLE]
Define by
[TABLE]
We split the right hand side in two terms:
[TABLE]
The first term is estimated using the growth property of , Lemma 3.2, and Cauchy-Schwarz inequality, namely
[TABLE]
To deal with the second term, we observe using Lemma 3.2 and Proposition 2.3 that, if , then
[TABLE]
Now
[TABLE]
(3)-(iii) Let us now prove that for all ,
[TABLE]
For , using Proposition 2.3 we obtain
[TABLE]
We now deal with . The Markov property gives
[TABLE]
where
[TABLE]
is a function bounded by . For , we use Theorem 3.1 and Corollary 2.2 to get
[TABLE]
Since is the qsd, we have
[TABLE]
Using Corollary 2.5, Lemma 3.2 and the properties of we obtain
[TABLE]
and (A.1) is proved.
(3)-(iv) Let us note that
[TABLE]
Proposition 2.3 and Lemma 3.5 give
[TABLE]
(3)-(v) Collecting the informations given in the four previous estimates, we obtain a precise estimation of for all .
(3)-(vi) We have
[TABLE]
Collecting the above relevant estimates we obtain that there exist (all being positive and independent of ) such that
[TABLE]
Now we have
[TABLE]
The final result for follows by collecting all estimates. For the bound follows directly from Proposition 2.3.
A.2 Starting from the qsd: proof of Corollary 5.2
The result follows from Proposition 5.1 by integrating over with respect to the qsd. More precisely, we have
[TABLE]
The integrals of the first and third terms with respect to the q.s.d are estimated using Corollary 2.5. We deal with second term:
[TABLE]
The third integral is estimated using the fact that the integrand is exponentially small in . The second integral is estimated using the first estimate in Corollary 2.5. We finally deal with the first integral. If then . If , on this set we have \operatorname{e}^{-\delta^{\prime}\big{(}\zeta^{\prime}\frac{\|\ushort{n}\|_{\scaleto{1}{3pt}}}{K}\wedge\,\beta^{\prime}\big{)}K}\leq\operatorname{e}^{-\delta^{\prime}\zeta^{\prime}\frac{\|\ushort{n}^{\scaleto{*}{2.2pt}}\|_{\scaleto{2}{3pt}}}{2}} (exponentially small in ). The estimate follows.
Appendix B Counting the number of births
Denote by the number of births of species of type between the times and (, ).
Proposition B.1**.**
For any probability measure on , we have
[TABLE]
and
[TABLE]
Proof.
Recall that the generator of the process is given in (2.1). Let us now give a pathwise representation of the process. We introduce independent point Poisson measures on with intensity . We define the -dimensionnal cĂ d-lĂ g process
[TABLE]
Then the number of births of species of type occuring between the times and is given by
[TABLE]
Using the Markov property we get at once the first identity.
We now establish the estimate. Indeed
[TABLE]
By the -isometry for jump processes (see [6, Formula (3.9) p.62]), we have
[TABLE]
This finishes the proof. â
Appendix C Gaussian limit for the rescaled qsd
We have the following theorem of independent interest. A part of this theorem partially generalizes a result obtained in [2] for models involving a single species (). Recall that .
Theorem C.1**.**
For all , define the measure on the Borel -algebra of d by
[TABLE]
Then converges weakly to the centered Gaussian measure with covariance matrix
[TABLE]
where is the diagonal matrix with entries . The matrix is also the unique symmetric solution of the (Lyapunov) equation (fluctuation-dissipation relation)
[TABLE]
Remark C.1**.**
We have
[TABLE]
This follows by dividing out equation (1.7) by , letting tend to infinity, and using the uniqueness of the (symmetric) solution of (C.1).
Proof.
By Theorem 2.6, the family of measures is tight. For define
[TABLE]
It follows also from Theorem 2.6 that the family of functions is uniformly bounded in . We will prove that
[TABLE]
This will entail that there is only one weak accumulation point for . The proof will be the consequence of Prokhorov Theorem [1]. Using (2.7) and (2.5), we have
[TABLE]
We also have
[TABLE]
Using Taylor expansion, and the moments estimates and the polynomial bounds on and (and ) we obtain
[TABLE]
We conclude that every accumulation point of is bounded in , satisfies , and is a solution of the equation
[TABLE]
Then (C.2) follows from Lemma C.2 (stated and proved right after this proof) with . â
Lemma C.2**.**
Let be strictly positive numbers and a real matrix such that . Then there exists a unique function satisfying and
[TABLE]
This function is given by
[TABLE]
where
[TABLE]
where is the diagonal matrix with entries . The matrix is also the unique symmetric solution of the equation
[TABLE]
Proof.
We use the method of characteristics. For all , we define the function as the solution of
[TABLE]
Let
[TABLE]
Let be a solution of (C.3). It is easy to check that for all and
[TABLE]
Integrating from [math] to yields
[TABLE]
From the spectral properties of we get
[TABLE]
Therefore
[TABLE]
and
[TABLE]
Finally we get from the spectral properties of
[TABLE]
This finishes the proof of the lemma. â
Acknowledgements: We thank the two anonymous referees for fruitful comments and suggestions.
Funding: S. M. has been supported by the Chair âModĂ©lisation MathĂ©matique et BiodiversitĂ©â of Veolia Environnement-Ecole Polytechnique-Museum national dâHistoire naturelle-Fondation X. P. C. and S. M. warmly thank the Basal Conicyt CMM AFD170001 project. J.-R. C. and P. C. also acknowledge the hospitality of the Instituto de FĂsica de San Luis PotosĂ.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] P. Billingsley. Convergence of probability measures. Second edition. Wiley Series in Probability and Statistics: Probability and Statistics. John Wiley & Sons, 1999.
- 2[2] J.-R. Chazottes, P. Collet, S. MĂ©lĂ©ard. Sharp asymptotics for the quasi-stationary distribution of birth-and-death processes. Probab. Theory Related Fields, Vol. 164 (2016), no. 1-2, 285â332.
- 3[3] J.-R. Chazottes, P. Collet, S. MĂ©lĂ©ard. On time scales and quasi-stationary distributions for multitype birth-and-death processes. Annales de lâInstitut Henri PoincarĂ© - ProbabilitĂ©s et statistiques, Vol. 55 (2019), no 4, 2249â2294.
- 4[4] S. N. Ethier, T. G. Kurtz. Markov processes, Characterization and convergence. Wiley & Sons, 1986.
- 5[5] L.H. Gunderson, C.R. Allen, C.S. Holling (Eds). Foundations of Ecological Resilience . Island Press, 2012.
- 6[6] N. Ikeda, S. Watanabe. Stochastic differential equations and diffusion processes. North Holland 2nd edition, 1989.
- 7[7] R. Kubo. The fluctuation-dissipation theorem. Rep. Prog. Phys. 29 , 255â284 (1966).
- 8[8] V. Simoncini. Computational methods for linear matrix equations. SIAM Review 58 (2016), no. 3, 377â441.
