Moderate deviations and extinction of an epidemic
Etienne Pardoux

TL;DR
This paper investigates how stochastic fluctuations influence the extinction time of an epidemic in large populations, using probabilistic theories like the Central Limit Theorem and Moderate Deviations to estimate extinction times.
Contribution
It introduces a novel approach applying Moderate and Large Deviations principles to estimate epidemic extinction times in stochastic models near deterministic equilibria.
Findings
Estimates of epidemic extinction times depend on population size.
Moderate deviations provide precise asymptotic estimates.
Large deviations help understand rare extinction events.
Abstract
Consider an epidemic model with a constant flux of susceptibles, in a situation where the corresponding deterministic epidemic model has a unique stable endemic equilibrium. For the associated stochastic model, whose law of large numbers limit is the deterministic model, the disease free equilibrium is an absorbing state, which is reached sooner or later by the process. However, for a large population size, i.e. when the stochastic model is close to its deterministic limit, the time needed for the stochastic perturbations to stop the epidemic may be enormous. In this paper, we discuss how the Central Limit Theorem, Moderate and Large Deviations allow us to give estimates of the extinction time of the epidemic, depending upon the size of the population.
1 Introduction
We consider epidemic models where there is a constant flux of susceptible individuals, either because the infected individuals become susceptible immediately after healing, or after some time during which the individual is immune to the illness, or because there is a constant flux of newborn or immigrant susceptibles.
In the above three cases, for certain values of the parameters, there is an endemic equilibrium, which is a stable equilibrium of the associated deterministic epidemic model. The deterministic model can be considered as the Law of Large Numbers limit (as the size of the population tends to infinity) of a stochastic model, where infections, healings, births and deaths happen according to Poisson processes whose rates depend upon the numbers of individuals in each compartment.
Since the disease free states are absorbing, it follows from an irreducibility property which is clearly valid in our models, that the epidemic will stop sooner or later in the more realistic stochastic model. However, the time which the stochastic perturbations will need to stop the epidemic may be enormous when the size of the population is large. The aim of this paper is to describe, based upon the Central Limit Theorem, Large and Moderate Deviations, the time it takes for the epidemic to stop in the stochastic model.
The law of large numbers and central limit theorems are rather old. They can be found e.g. in chapter 11 of Ethier and Kurtz [3]. They are also presented, in the framework of epidemic models, in Britton and Pardoux [1]. The Large Deviations results are close to those presented in Shwartz and Weiss [9], [10], although their assumptions are not quite satisfied in our models. Derivations adapted to our setup can be found in Kratz and Pardoux [5], Pardoux and Samegni–Kepgnou [6], and Britton and Pardoux [1]. The results concerning moderate deviations are new and constitute the core of this paper. Our derivation is essentially based upon an infinite dimensional generalization of the Gärtner–Ellis Theorem, Corollary 4.6.14 from Dembo and Zeitouni [2]. Our main results are Theorem 4.10 and Theorem 4.13. We also give expressions for the rate function in our three models of interest, and in the case of the simplest model we give an explicit formula for the quasi–potential. We also compare in that case the upper bound of fluctuations given respectively by the central limit theorem, moderate deviations, and large deviations.
The paper is organized as follows. In section 2, we describe the three deterministic and stochastic models which we have in mind, namely the SIS, SIRS and SIR model with demography. In section 3, we give the general formulation of the stochastic models, and recall the Law of Large Numbers, the Central Limit Theorem and the Large Deviations, and their application to the time of extinction of an epidemic. In section 4, we establish the moderate deviations result and explain how it can be used to predict the time taken for an epidemic to cease, depending upon the size of the population. Finally an Appendix establishes an estimate of exponential moments of the integral with respect to a compensated Poisson random measure. This estimate is used several times in our proofs.
In this paper, the same letter denotes an arbitrary constant, whose value may change from line to line.
2 The three models
2.1 The SIS model
The deterministic SIS model is the following. Let (resp. ) denote the proportion of susceptible (resp. infectious) individuals in the population. Given an infection parameter , and a recovery parameter , the deterministic SIS model reads
[TABLE]
Since clearly , the system can be reduced to a one dimensional ODE. If we let , we have , and we obtain the ODE
[TABLE]
It is easy to verify that this ODE has a so–called “disease free equilibrium”, which is . If , this equilibrium is unstable, and there is an endemic stable equilibrium .
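The stability statement above can be checked numerically. The following is a minimal sketch, not the paper's own computation: since the displayed formulas are not reproduced in this text, it assumes the standard form of the one dimensional SIS equation, di/dt = lam * i * (1 - i) - gamma * i. The names `lam` (infection parameter) and `gamma` (recovery parameter), the explicit Euler scheme and the step size are our own illustrative choices.

```python
def sis_rhs(i, lam, gamma):
    """Right hand side of the (assumed) SIS ODE di/dt = lam*i*(1-i) - gamma*i."""
    return lam * i * (1.0 - i) - gamma * i


def endemic_equilibrium(lam, gamma):
    """Non-zero root of the right hand side when lam > gamma, else 0."""
    return 1.0 - gamma / lam if lam > gamma else 0.0


def integrate_sis(i0, lam, gamma, t_end, dt=1e-3):
    """Explicit Euler integration of the SIS ODE from i(0) = i0 up to t_end."""
    i = i0
    for _ in range(int(round(t_end / dt))):
        i += dt * sis_rhs(i, lam, gamma)
    return i


# Supercritical case: a small initial proportion moves away from the disease
# free equilibrium 0 and settles at the endemic equilibrium.
print(endemic_equilibrium(2.0, 1.0), integrate_sis(0.01, 2.0, 1.0, 50.0))
# Subcritical case: the proportion of infectious decays to 0.
print(integrate_sis(0.3, 1.0, 2.0, 50.0))
```

Under these assumptions, a trajectory started close to the disease free equilibrium converges to the endemic equilibrium when `lam > gamma`, and to 0 otherwise.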
The corresponding stochastic model is as follows. Let (resp. ) denote the proportion of susceptible (resp. infectious) individuals in a population of total size .
[TABLE]
Here and are two mutually independent standard (i.e. rate ) Poisson processes. Let us give some explanations, first concerning the modeling, then concerning the mathematical formulation.
Let (resp. ) denote the number of susceptible (resp. infectious) individuals in the population. The equations for those quantities are the above equations, multiplied by . The argument of reads
[TABLE]
The formulation of such a rate of infections can be explained as follows. Each infectious individual meets other individuals in the population at some rate . The encounter results in a new infection with probability if the partner of the encounter is susceptible, which happens with probability , since we assume that each individual in the population has the same probability of being that partner, and with probability 0 if the partner is an infectious individual. Letting and summing over the infectious individuals at time gives the above rate. Concerning recovery, it is assumed that each infectious individual recovers at rate , independently of the others.
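The mechanism just described (infections at a rate proportional to the number of infectious individuals times the proportion of susceptibles, independent recoveries) can be simulated event by event. The sketch below uses the classical Gillespie algorithm, which is a standard way to simulate such a jump Markov process and not a construction taken from the paper. The function name `simulate_sis`, the parameter names `lam` and `gamma`, and the explicit rates `lam * I * (N - I) / N` and `gamma * I` are assumptions made for illustration.

```python
import random


def simulate_sis(N, lam, gamma, i0, t_max, seed=0):
    """Simulate the number I of infectious individuals in a population of size N.

    Assumed jump rates: I -> I + 1 at rate lam * I * (N - I) / N,
                        I -> I - 1 at rate gamma * I.
    Returns (t, I): either the extinction time and 0, or (t_max, I) if the
    epidemic is still going on at time t_max.
    """
    rng = random.Random(seed)
    t, I = 0.0, i0
    while I > 0:
        infection_rate = lam * I * (N - I) / N
        recovery_rate = gamma * I
        total_rate = infection_rate + recovery_rate
        # Exponential waiting time until the next event.
        t_next = t + rng.expovariate(total_rate)
        if t_next > t_max:
            return t_max, I
        t = t_next
        # Choose the type of event proportionally to its rate.
        if rng.random() * total_rate < infection_rate:
            I += 1
        else:
            I -= 1
    return t, I


# Subcritical example: the epidemic dies out quickly.
print(simulate_sis(5, 0.5, 1.0, 1, 1e6, seed=1))
# Supercritical example started at the endemic level: no extinction by t_max.
print(simulate_sis(1000, 2.0, 1.0, 500, 5.0, seed=1))
```

Since the state 0 is absorbing, the loop stops as soon as no infectious individual remains. For large `N` and `lam > gamma`, the returned time is typically `t_max` unless `t_max` is taken extremely large.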
2.2 The SIRS model
In the SIRS model, contrary to the SIS model, an infectious individual who heals is first immune to the illness (he is “recovered”), and only after some time does he lose his immunity and become susceptible again. The deterministic SIRS model reads
[TABLE]
while the stochastic SIRS model reads
[TABLE]
These two models could be reduced to two–dimensional models for (resp. for ).
2.3 The SIR model with demography
In this model, recovered individuals remain immune forever, but there is a flux of susceptibles by births at a given rate multiplied by , while individuals from each of the three compartments die at rate . Thus the deterministic model reads
[TABLE]
whose stochastic variant reads
[TABLE]
Remark 2.1.
One may think that it would be more natural to let births happen at rate times the total population. However, the total population process would then be a critical branching process, which would go extinct in finite time a.s., which we do not want. Next, it might seem more natural to replace in the infection rate the ratio by , which is the actual ratio of susceptibles in the population at time . It is easy to show that is close to , so we choose the simplest formulation.
Again, we can reduce these models to two–dimensional models for (resp. for ), by deleting the (resp. ) component.
3 The stochastic model, LLN, CLT and LD
3.1 The stochastic model
The three above stochastic models are of the following form.
[TABLE]
where are mutually independent standard Poisson processes, , and . takes its values in .
In the case of the SIS model, , , , , and .
In the case of the SIRS model, , , , , , and , .
In the case of the SIR model with demography, we can restrict ourselves to , while , , , , , , , , .
While the above expression has the advantage of being concise, we shall rather use the following equivalent formulation of (3.1). Let be mutually independent Poisson random measures on with mean measure the Lebesgue measure, and let , . We can rewrite (3.1) in the form
[TABLE]
The joint law of , viewed as a sequence of random elements of the Skorohod space , is the same whether we use (3.1) or (3.2) for its definition.
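To illustrate how a counting process can be read off a Poisson random measure whose mean measure is the Lebesgue measure, here is a simplified sketch. It departs from the setting of the paper in one important respect, which we state plainly: the rate is a deterministic bounded function of time, whereas in the models above it depends on the current state of the process. The names `prm_points` and `count_under_curve`, and the bound `B` on the rate, are hypothetical.

```python
import random


def prm_points(T, B, rng):
    """Atoms (s, u) of a Poisson random measure on [0, T] x [0, B]
    with Lebesgue intensity: the times s form a rate-B Poisson process,
    and the marks u are independent and uniform on [0, B]."""
    points = []
    s = rng.expovariate(B)
    while s <= T:
        points.append((s, rng.uniform(0.0, B)))
        s += rng.expovariate(B)
    return points


def count_under_curve(points, rate, t):
    """Number of atoms (s, u) with s <= t and u <= rate(s).

    For a deterministic rate bounded by B, this count is a Poisson
    process with intensity rate(s)."""
    return sum(1 for s, u in points if s <= t and u <= rate(s))


rng = random.Random(1)
pts = prm_points(10.0, 3.0, rng)
# Only the atoms lying below the curve u = rate(s) are counted.
print(len(pts), count_under_curve(pts, lambda s: 1.0 + 0.1 * s, 10.0))
```

The same collection of atoms can be reused for different rates, which is what makes this representation convenient when comparing processes driven by the same noise.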
Let us state the assumptions which we will need in section 4 below. Those are more than necessary for the results of the present section to hold, see [1] for the proofs.
[TABLE]
Remark 3.1.
In practice, in our models, either the process takes its values in a compact subset of (this is the case for all models with a constant population size), or else we restrict ourselves to such a situation, by stopping the process when the total population exceeds a given large value, see section 4.2.7 in [1].
Concerning the initial condition, we assume that for some , , where is the vector whose –th component is the integer part of the real number .
3.2 Law of Large Numbers
We have a Law of Large Numbers.
Theorem 3.2.
Let denote the solution of the SDE (3.1). Then a.s. locally uniformly in , where is the unique solution of the ODE
[TABLE]
The main argument in the proof of the above theorem is the fact that, locally uniformly in ,
[TABLE]
3.3 Central Limit Theorem
We also have a Central Limit Theorem. Let .
Theorem 3.3.
As , for the topology of locally uniform convergence, where is a Gaussian process of the form
[TABLE]
where are mutually independent standard Brownian motions.
3.4 Large Deviations, and extinction of an epidemic
We denote by the set of absolutely continuous functions from into . For any , let denote the (possibly empty) set of functions such that a.e. on the set and
[TABLE]
We define the rate function
[TABLE]
where as usual the infimum over an empty set is , and
[TABLE]
with . We assume in the definition of that for all , and . The collection obeys a Large Deviations Principle, in the sense that
Theorem 3.4.
For any open subset ,
[TABLE]
For any closed subset ,
[TABLE]
A slight reinforcement of this theorem allows us to conclude a Wentzell–Freidlin type of result. In what follows, we assume that the first component of (resp. of ) is (resp. ). Assume that the deterministic ODE which appears in Theorem 3.2 has a unique stable equilibrium whose first component satisfies . We define
[TABLE]
Let now
[TABLE]
We have the
Theorem 3.5.
Given any , for any with ,
[TABLE]
Moreover, for all and large enough,
[TABLE]
We refer for the proof of this Theorem to [5] and [1].
It is important to evaluate the quantity . Note that it is the value function of an optimal control problem. In case of the SIS model, which is one dimensional, one can solve this control problem explicitly with the help of Pontryagin’s maximum principle, see [8], and deduce in that case that . For other models, one can compute numerically a good approximation of the value of for each given value of the parameters.
3.5 CLT and extinction of an epidemic
The discussion of this subsection, which motivates the moderate deviations approach of this paper, is taken from section 4.1 in [1]. Consider the SIR model with demography.
[TABLE]
We assume that , in which case there is a unique stable endemic equilibrium, namely . We can study the extinction of an epidemic in the above model using the CLT. We note that the basic reproduction number and the expected fraction of a lifetime during which an individual is infected, , are given by
[TABLE]
The rate of recovery is much larger than the death rate (52 compared to 1/75 per year, for a one-week infectious period and a 75-year life length), so we use the approximations and . Denote again by the fraction of the population which is infectious in a population of size . The law of large numbers tells us that for and large, is close to . The central limit theorem tells us that converges to a Gaussian process, whose asymptotic variance can be shown to be well approximated by . This suggests that for large , the number of infectious individuals in the population is approximately Gaussian, with mean and standard deviation . If and are of the same order, i.e. is of the same order as , it is likely that the fluctuations described by the central limit theorem explain that the epidemic might cease in time of order one. This gives a critical population size roughly of the order of
[TABLE]
in fact probably a bit larger than that.
Consider measles prior to vaccination. In that case it is known that , and we arrive at , which is almost . So, if the population is at most a million (or perhaps a couple of million), we expect that the disease will go extinct quickly, whereas the disease will become endemic (for a rather long time) in a significantly larger population. This confirms the empirical observation that measles was continuously endemic in the UK whereas it died out quickly in Iceland (and was later reintroduced by infectious people visiting the country).
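The heuristic of this subsection reduces to a short computation. If the number of infectious individuals is approximately Gaussian with mean N times the endemic fraction and standard deviation the square root of N times the asymptotic variance, then the two are of the same order when N is of the order of the variance divided by the square of the endemic fraction. The sketch below encodes only this arithmetic. The function names, the parameter `n_sigma` and the numerical values are hypothetical, and are not the measles values used above.

```python
import math


def infectious_mean_and_std(N, endemic_fraction, asymptotic_variance):
    """Gaussian approximation of the number of infectious individuals:
    mean N * i_star and standard deviation sqrt(N * v)."""
    return N * endemic_fraction, math.sqrt(N * asymptotic_variance)


def clt_critical_size(endemic_fraction, asymptotic_variance, n_sigma=1.0):
    """Population size at which the mean equals n_sigma standard deviations,
    i.e. the solution N of N * i_star = n_sigma * sqrt(N * v)."""
    return asymptotic_variance * (n_sigma / endemic_fraction) ** 2


# Hypothetical values, chosen only to illustrate the orders of magnitude.
i_star, v = 1e-3, 1.0
N_c = clt_critical_size(i_star, v)
print(N_c, infectious_mean_and_std(N_c, i_star, v))
```

Taking `n_sigma` larger than 1 gives a more conservative threshold, in line with the remark that the critical size is probably a bit larger than the crude estimate.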
4 Moderate deviations
Since the CLT allows us to predict extinction of an endemic disease for population sizes under a given threshold , and Large Deviations give predictions for arbitrarily large population sizes, it is natural to look at Moderate Deviations, which describe ranges of fluctuations between those of the CLT and those of the LD.
The assumptions and are assumed to hold throughout this section.
4.1 The set–up and preliminary estimates
We shall use the general model written in the form (3.2). We assume that the limiting law of large numbers ODE
[TABLE]
has a unique stable equilibrium point such that , called the endemic equilibrium, which is such that, provided , as .
For the sake of simplifying many formulas below, we change our coordinates, and let . The reader should be aware of the fact that there is a price to pay for that translation of the origin. Indeed, since in the original coordinate system, the process was living on the set of vectors whose coordinates are integer multiples of (this is essential for the process to remain in the set where it makes sense, i.e. for proportions to remain between 0 and 1), the new origin generically does not belong to the set of points which our process may visit. The grid on which lives is translated by the vector , where here and below , denoting the vector whose –th component is the integer part of the –th component of . However, this minor complexity will appear only in the formula for the initial condition of the SDE. Once the SDE starts on the correct grid, the solution remains there.
From now on [math] will be the endemic equilibrium (of course in the translated coordinate system), while will denote that endemic equilibrium in the original coordinates (we shall need it for the formula of the initial condition of the SDE).
We want to study the moderate deviations at scale of , where . Note that would correspond to the large deviations, and to the central limit theorem. We shall need below to consider the ODE starting from a point close to , namely we shall consider the function , solution of the ODE
[TABLE]
where is arbitrary. In fact, we shall be more interested in , which solves (below we exploit the fact that )
[TABLE]
It is not hard to prove that, under our standing assumption that is of class and is bounded, as , uniformly for , where solves the linearized ODE near the endemic equilibrium [math] :
[TABLE]
We want to study the moderate deviations of the process solution of the SDE (3.1) with the initial condition . This amounts to studying the large deviations of at speed . We define
[TABLE]
With these notations, the SDE for reads
[TABLE]
If we let , we have
[TABLE]
This combined with Gronwall’s Lemma yields
[TABLE]
From the boundedness and Lipschitz property of , and the formula for , we deduce that
[TABLE]
We deduce from the last three inequalities
[TABLE]
We now define
[TABLE]
so that
[TABLE]
We will see below that the large deviations of will follow from those of by a variant of the contraction principle. We first consider the simpler processes
[TABLE]
which are similar to and , but with replaced by [math].
4.2 The limiting logarithmic moment generating function of
We note that writing the integral over as the sum from to of integrals over , we can rewrite as follows.
[TABLE]
The processes are i.i.d., and their law is that of
[TABLE]
Now let be a vector of signed measures on .
Lemma 4.1.
As , (recall that )
[TABLE]
Proof We use in an essential way the above decomposition of .
[TABLE]
provided
[TABLE]
which we will check below. From this it follows that the argument of the logarithm on the second-to-last line is greater than or equal to , at least for large enough, and the final conclusion follows easily from the fact that for any , . Let us now check (4.4). It follows from an exact Taylor formula that
[TABLE]
But is an affine combination of mutually independent Poisson random variables, so that (4.4) follows easily by an explicit computation.
4.3 The limiting logarithmic moment generating function of
We want to study the large deviations of . The main step will be to prove that Lemma 4.1 remains valid if we replace by , which will follow from the next Proposition.
Proposition 4.2.
For any , a vector of signed measures, as ,
[TABLE]
Before we establish that Proposition, let us first prove that it yields the desired result.
Proposition 4.3.
Given Lemma 4.1, if Proposition 4.2 holds true, then for any signed measure on , as ,
[TABLE]
Proof For any , we deduce from Hölder’s inequality
[TABLE]
so that, if we combine Lemma 4.1 and Proposition 4.2, we deduce that
[TABLE]
and letting , we conclude that
[TABLE]
For the inequality in the other direction, we note that, by similar arguments,
[TABLE]
with , which implies that
[TABLE]
hence, letting we conclude that
[TABLE]
The remainder of this subsection will be devoted to the proof of Proposition 4.2.
We note that Proposition 4.2 is a consequence of the following two Propositions.
Proposition 4.4.
For any , as ,
[TABLE]
Proposition 4.5.
For any , as ,
[TABLE]
We start with the
Proof of Proposition 4.4 The exponents in the expressions entering (4.5) are sums over the indices and . Using Schwarz’s inequality repeatedly, it is sufficient to prove the results with the sum replaced by each of the summands. Therefore in this proof we proceed as if , we fix and, for the sake of simplifying the notations, we drop the index . We note that
[TABLE]
It is not hard to see that one can treat each of the two terms on the right separately, and we treat only the first term, the treatment of the second one being quite similar. We note that there exists a compensated standard Poisson process on such that the factor of in this first term can be rewritten as
[TABLE]
We need to estimate . If we decompose the signed measure as the difference of two measures as follows , we again have two terms, and it suffices to treat one of them, say . Of course it suffices to treat the case where . Since the positive constant is arbitrary, we can w.l.o.g. assume that is a probability measure on . It is then clear that
[TABLE]
We choose a new parameter , and we write the expression whose expectation needs to be estimated as a sum of two terms as follows.
[TABLE]
We now estimate the first term on the right hand side of (4.6). For that sake, we define the stopping time
[TABLE]
and note that
[TABLE]
Consequently the expectation of the first term on the right of (4.6) is bounded from above by
[TABLE]
where the first inequality follows from Proposition 5.1 in the Appendix below, and the second one exploits the Lipschitz property of . Consider now the second term on the right hand side of (4.6).
[TABLE]
for some , where the second inequality follows from Proposition 5.1 and the boundedness of . Estimating the second factor in the last expression amounts to estimating the two probabilities (with another )
[TABLE]
We estimate the first probability. For any ,
[TABLE]
where the second inequality follows from Proposition 5.1 and the last inequality by optimizing over . One can easily convince oneself that a similar result holds for the second line of (4.7), making use of Proposition 5.1 with a negative . Note also for further use that the same result also holds in case . In that case, the probability on the second line of (4.7) is zero for large enough , so that the announced estimate is of course true.
The expectation of the second term of the right hand side of (4.6) is thus dominated by (with and two positive constants)
[TABLE]
Finally
[TABLE]
It follows readily from the inequality that for large enough
[TABLE]
which establishes (4.5).
We now turn to the second proof.
Proof of Proposition 4.5 Recalling assumption , we now define, with ,
[TABLE]
the event
[TABLE]
and the stopping time
[TABLE]
where the constant will be chosen below, and the constant is arbitrary. From the estimate (4.1),
[TABLE]
We take the limit successively in the two terms of the above right hand side. Step 1 : Estimate of (4.9) We have
[TABLE]
We first note that the arguments used in the proof of (4.8), in the particular case , yield
[TABLE]
for some constant . We next estimate the product
[TABLE]
For the same reason as in the previous proof, we need only consider the case . It follows from Proposition 5.1 that the first factor satisfies
[TABLE]
Finally there exist two positive constants and such that
[TABLE]
for large enough. So of the above tends to [math], as .
Step 2 : Estimate of (4.10) We first note that
[TABLE]
The first term on the right tends to [math] as . It remains to take care of the second term. Since is a martingale, it is clear that the process
[TABLE]
is a submartingale. Consequently, from Doob’s submartingale inequality,
[TABLE]
Next
[TABLE]
Consider first the first factor on the right hand side of (4.3). We deduce from the definition of that
[TABLE]
with . Consequently the square of the first factor on the right of (4.3) is bounded from above by
[TABLE]
where we have used Doob’s optional sampling theorem for submartingales. By the same argument as above, we proceed as if , note that
[TABLE]
and exploit Proposition 4.4 in order to conclude concerning of the first factor on the right of (4.3).
We next note that
[TABLE]
Hence the square of the second term on the right of (4.3) satisfies
[TABLE]
Consider first the second factor on the right of (4.13). We have
[TABLE]
Using the Cauchy–Schwarz inequality several times, it is clear that it is sufficient to proceed as if we had (dropping the index )
[TABLE]
with and , where . We now choose . We have
[TABLE]
We have proved that the second factor on the right of (4.13) remains bounded, as . We next consider the first factor on the right of (4.13). We first note that
[TABLE]
But from (4.3), .
It follows that the left hand side of (4.13) is bounded from above by a constant times
[TABLE]
where and are two positive constants. This last expression is bounded by , as soon as is large enough. Finally of the left-hand side of (4.13) tends to [math], as .
Remark 4.6.
We note that the full strength of (4.1) is necessary for the proof of Proposition 4.5. Indeed, while certainly does not converge to [math] as , clearly with high probability is smaller than , but .
4.4 Large deviations of
We first define the Fenchel–Legendre transform of
[TABLE]
where has been defined by (4.3), is a vector of signed measures and , being the –th coordinate of the vector . We have exploited the fact that is the sum over of zero mean mutually independent random variables. For each , we define
[TABLE]
The next step will consist in proving that the sequence of processes satisfies a Large Deviation Principle.
Theorem 4.7.
The sequence satisfies the Large Deviation Principle in equipped with the supnorm topology, with the convex, good rate function and with speed , in the sense that for any Borel subset ,
[TABLE]
Since there is a difficulty with having a topology on which makes it a topological vector space, and allows for a simple characterization of the class of compact sets, we shall use a small detour for the proof of the above Theorem. Recall that
[TABLE]
where is piecewise constant, with jumps of size . Let denote the continuous piecewise linear approximation of , which is defined as follows. Let denote the successive jump times of the process . For , on the interval ,
[TABLE]
Next we define
[TABLE]
We note that
[TABLE]
hence for any , for large enough,
[TABLE]
This implies clearly
Lemma 4.8.
The two sequences and $\{\widetilde{\widetilde{Y}}^{N,\alpha}\}_{N\geq 1}$ are exponentially equivalent in , equipped with the supnorm topology, in the sense that for each ,
[TABLE]
We shall prove below the following.
Proposition 4.9.
The sequence $\{\widetilde{\widetilde{Y}}^{N,\alpha}\}_{N\geq 1}$ is exponentially tight in , the space of continuous functions from into , which start from [math] at , in the sense that for any , there exists a compact subset such that
[TABLE]
Let us now turn to the proof of the above Theorem.
Proof of Theorem 4.7 From (4.14), we deduce that
[TABLE]
as . Consequently, again by the argument of Proposition 4.3, we deduce from that same Proposition that for any signed measure on , as ,
[TABLE]
This, together with Proposition 4.9, allows us to apply Corollary 4.6.14 from [2], to conclude that the sequence $\{\widetilde{\widetilde{Y}}^{N,\alpha}\}_{N\geq 1}$ satisfies an LDP in with the good rate function , and speed . Since is closed in equipped with the supnorm topology, it follows from Lemma 4.1.5 in [2] that the same LDP holds in the latter space, with the same rate function , extended to that space by for . The result now follows from Lemma 4.8, in view of Theorem 4.2.13 from [2].
We now turn to the
Proof of Proposition 4.9 Clearly it suffices to prove both that
[TABLE]
and that the sequence is exponentially tight in . Let us first establish (4.15). It follows from (4.1) that
[TABLE]
Consequently, if , with ,
[TABLE]
It follows from Doob’s submartingale inequality and a combination of Lemma 4.1 and Proposition 4.4 that the as of the second term of the last right hand side is finite. (4.15) clearly follows.
It remains to consider . Define the modulus of continuity of an element as . It follows from Ascoli’s theorem that for any sequence of positive numbers, the following is a compact subset of :
[TABLE]
Suppose that for each , , we can find such that for all ,
[TABLE]
From this we deduce that
[TABLE]
so that
[TABLE]
from which the result follows. A sufficient condition for (4.16) to be true is that for any ,
[TABLE]
In turn a sufficient condition for this is that
[TABLE]
which we now prove. It is not hard to see that
[TABLE]
where we have used Doob’s submartingale inequality at the last step. Clearly
[TABLE]
Using the Cauchy–Schwarz inequality repeatedly, we see that it suffices to estimate for each
[TABLE]
where , we have used Proposition 5.1 and the inequality , valid for , which we have applied with and (recall that we will first let ). Putting together the last estimates yields
[TABLE]
(4.17) follows, and the Proposition is proved.
4.5 Computation of the rate function
Let us compute in the three examples which we discussed above in section 2. Here we do not translate to the origin.
4.5.1 Computation of for the SIS model
Recall that in this case , , , , , . If , there is a unique stable endemic equilibrium . We first compute
[TABLE]
where
[TABLE]
It is easy to check that , where
[TABLE]
Consequently
[TABLE]
We now need to compute in case . We should take the supremum over the signed measures on of the quantity
[TABLE]
The supremum is achieved at the signed measure which makes the gradient with respect to of the above zero, if any. We first note that for such a to exist, we need that , unless . Now the optimal must satisfy
[TABLE]
So necessarily
[TABLE]
Substituting this signed measure in the above formula, we obtain that
[TABLE]
Consequently
[TABLE]
4.5.2 Computation of for the SIRS model
In this model, and . We have , , , , and , . In the case , there is a unique stable endemic equilibrium, namely . In order to simplify the notations, we shall write , , and . We have
[TABLE]
The functional to be maximized with respect to is
[TABLE]
Writing that the gradient w.r.t. and of this functional is zero leads to the identities
[TABLE]
This implies the identities
[TABLE]
Finally we deduce that is unless is absolutely continuous and , in which case
[TABLE]
4.5.3 Computation of for the SIR model with demography
In this case, , , , , , , , and , . In the case , there is a unique stable endemic equilibrium, namely . We shall use the notations , , and . We have
[TABLE]
Formally the functional has exactly the same form as in the case of the SIRS model, only the constants have different values. The same computations as in the previous subsection lead to the same result, namely that is unless is absolutely continuous and , in which case
[TABLE]
4.6 Moderate deviations of
We again equip with the supnorm topology. Let for be the continuous map which to associates solution of the ODE
[TABLE]
and for each be the continuous map which to associates solution of the ODE
[TABLE]
We have
[TABLE]
which converges to [math] as , uniformly in and . We want to study the moderate deviations of , or in other words the large deviations of . In what follows, we shall denote by the process starting from . From (4.2), , hence the following statement is a consequence of Theorem 4.7, (4.18) and Corollary 4.2.21 from [2].
Theorem 4.10.
Assume that and hold. The collection of processes satisfies a large deviations principle with speed and the good rate function
[TABLE]
More precisely, for any Borel subset ,
[TABLE]
Since the mapping has the nice property that , it follows readily again from Corollary 4.2.21 in [2] that the above result can be extended to the following statement.
Theorem 4.11.
Assume that and hold. For any closed set , for any sequence ,
[TABLE]
For any open set , for any sequence ,
[TABLE]
From this last Theorem, we can deduce, with the same proof as that of Corollary 5.6.15 in [2], the following Corollary.
Corollary 4.12.
Assume that and hold. Let denote an arbitrary compact subset of .
For any closed set ,
[TABLE]
For any open set ,
[TABLE]
4.7 Wentzell–Freidlin theory and extinction of an epidemic
We now define
[TABLE]
where , and we recall that we have translated the endemic equilibrium at the origin.
We can now state our main result.
Theorem 4.13.
Assume that and hold. For some , let , where denotes the first coordinate of the process . The following hold.
For any such that , and any ,
[TABLE]
and
[TABLE]
Given Corollary 4.12, the proof of the above result follows the same steps as that of Theorem 5.7.11 in [2], with minor modifications to account for the fact that our processes have discontinuous trajectories; see the proof of Theorem 7.14 in [5], or of Theorem 4.2.17 in [1].
Recall that . In the CLT regime, , , while in the LD regime, , .
4.7.1 Interpretation. The critical population size
Going back to the original coordinates, i.e. , we should interpret as . So (dropping the index for the starting point in order to simplify our notations), is the first time when . For to be finite, we need to have , since cannot become negative. This is of course no problem for the limit theorem, since as , while is fixed. However, a deviation of the order of is enough for to hit zero, if is of the order of , which means that is of the order of . is the order of magnitude of the time needed for to make a deviation of size . This is sufficient to extinguish an epidemic, provided is of the same order, so that the corresponding critical size is , which is roughly the CLT critical population size raised to the power .
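To make the dependence of the extinction time on the population size concrete, the following is a minimal simulation sketch. It assumes the simplest SIS jump process in a closed population of size `n`, with infections at rate `lam*I*(n-I)/n` and recoveries at rate `gamma*I`; these rates, the parameter values and the function names are illustrative assumptions and are not taken from the text above. The process is started near the endemic level and run until no infective is left.

```python
import random

def sis_extinction_time(n, lam, gamma, rng, t_max=1e6):
    """Simulate an SIS jump process with n individuals until extinction.

    Assumed rates (illustrative): infection lam*I*(n-I)/n, recovery gamma*I.
    Returns the time at which the number of infectives first hits zero,
    or a value >= t_max if extinction has not occurred by then.
    """
    # Start near the endemic level n*(1 - gamma/lam), with at least one infective.
    infected = max(1, round(n * (1.0 - gamma / lam)))
    t = 0.0
    while infected > 0 and t < t_max:
        rate_inf = lam * infected * (n - infected) / n
        rate_rec = gamma * infected
        total = rate_inf + rate_rec
        t += rng.expovariate(total)           # waiting time to the next event
        if rng.random() * total < rate_inf:   # choose which event occurs
            infected += 1
        else:
            infected -= 1
    return t

def mean_extinction_time(n, lam, gamma, runs, seed):
    """Monte Carlo average of the extinction time over independent runs."""
    rng = random.Random(seed)
    return sum(sis_extinction_time(n, lam, gamma, rng) for _ in range(runs)) / runs

if __name__ == "__main__":
    for n in (5, 10, 20, 40):
        print(n, mean_extinction_time(n, lam=1.5, gamma=1.0, runs=200, seed=1))
```

With `lam > gamma` fixed, the empirical mean grows quickly with `n`, which is the qualitative behaviour discussed in this subsection; the sketch is not meant to reproduce the precise asymptotics of Theorem 4.13.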
4.7.2 The value of in the SIS model
In the particular case of the SIS model, we can compute explicitly the value of the quasi–potential . In this case, , the linearized ODE around the endemic equilibrium translated at [math] reads
[TABLE]
and the cost functional to minimize is
[TABLE]
We are looking for the minimal cost for driving from [math] to . We now exploit the Pontryagin maximum principle, see [8]. The Hamiltonian reads
[TABLE]
The optimal control must maximize the Hamiltonian, so it satisfies . Since the final time is free and the system is autonomous, the Hamiltonian vanishes along the optimal trajectory, so that along such a trajectory, either , in which case , or else , hence . Finally the pieces of optimal trajectory which move towards the origin correspond to , those which move away from the origin (this is the case we are interested in) satisfy the time reversed ODE . There is no optimal trajectory from to . However, if we start from , the optimal trajectory is , so , the final state is reached at time , and the optimal cost is . A possible sub–optimal control starting from [math] is as follows. Choose for a time of order , until reaches , whose cost is of the order of , and then choose the optimal feedback, until is reached. Letting , the total cost converges to
[TABLE]
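The Pontryagin argument above can be checked numerically on a generic scalar problem. The sketch below assumes a linearized controlled ODE of the form x' = -a*x + u with cost (1/(2*sigma2)) times the integral of u(t)^2; the constants `a` and `sigma2` and all function names are illustrative assumptions, not the notation of the paper. Under these assumptions the optimal feedback away from the origin is u = 2*a*x, the state follows the time-reversed ODE x' = a*x, and the cost from `eps` to `target` is a*(target^2 - eps^2)/sigma2.

```python
import math

def quasi_potential(a, sigma2, x):
    """Limiting cost a*x^2/sigma2 for reaching x from the origin (assumed setting)."""
    return a * x * x / sigma2

def cost_time_reversed(a, sigma2, eps, target, n_steps=20000):
    """Trapezoidal value of (1/(2*sigma2)) * int u^2 dt along x' = a*x,
    driven by the feedback u = 2*a*x, from x = eps up to x = target."""
    t_final = math.log(target / eps) / a
    dt = t_final / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        x = eps * math.exp(a * k * dt)
        u = 2.0 * a * x
        weight = 0.5 if k in (0, n_steps) else 1.0
        total += weight * u * u / (2.0 * sigma2) * dt
    return total

def cost_constant_speed(a, sigma2, eps, target, duration, n_steps=20000):
    """Cost of a sub-optimal path moving at constant speed from eps to target."""
    v = (target - eps) / duration
    dt = duration / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        x = eps + v * k * dt
        u = v + a * x  # control needed so that x' = -a*x + u equals v
        weight = 0.5 if k in (0, n_steps) else 1.0
        total += weight * u * u / (2.0 * sigma2) * dt
    return total

if __name__ == "__main__":
    a, sigma2, eps, target = 0.5, 2.0 / 3.0, 1e-3, 0.3
    print(cost_time_reversed(a, sigma2, eps, target))
    print(quasi_potential(a, sigma2, target) - quasi_potential(a, sigma2, eps))
    print(cost_constant_speed(a, sigma2, eps, target, duration=2.0))
```

The first two printed values agree up to quadrature error, while the constant-speed path is strictly more expensive, in line with the optimality of the time-reversed trajectory.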
4.8 Comparison between the CLT, MD and LD
We carry out this comparison for the SIS model, for which we have explicit expressions for the rate functions and the quasi–potentials. We still translate at the origin, and start our process at the origin: . In contrast with the above, we fix and want to compare (for large) the upper bounds for in the three cases (the central limit theorem), (moderate deviations) and (large deviations).
We start with the central limit theorem. It is easy to see that solves the SDE
[TABLE]
so that the asymptotic variance of is . Consequently for fixed and any , there exist and large enough such that we have the following upper bound for the probability of a positive deviation of
[TABLE]
This bound follows from the following estimate, valid for a random variable : after optimizing over .
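Assuming the estimate referred to above is the usual exponential Chebyshev bound for a centered Gaussian variable Z with variance sigma^2, namely P(Z > a) <= exp(-theta*a + theta^2*sigma^2/2) for every theta > 0, the optimization over theta can be checked numerically. The function names and numerical values below are illustrative assumptions.

```python
import math

def chernoff_bound(a, sigma, theta):
    """Exponential Chebyshev bound exp(-theta*a + theta^2*sigma^2/2)."""
    return math.exp(-theta * a + 0.5 * (theta * sigma) ** 2)

def optimized_bound(a, sigma):
    """Bound at the minimizer theta = a/sigma^2, i.e. exp(-a^2/(2*sigma^2))."""
    return math.exp(-a * a / (2.0 * sigma * sigma))

def exact_tail(a, sigma):
    """Exact value of P(Z > a) for Z ~ N(0, sigma^2)."""
    return 0.5 * math.erfc(a / (sigma * math.sqrt(2.0)))

def grid_minimum(a, sigma, n_points=2001):
    """Minimum of the bound over a grid of theta values in (0, 2*a/sigma^2]."""
    theta_max = 2.0 * a / (sigma * sigma)
    return min(chernoff_bound(a, sigma, theta_max * k / n_points)
               for k in range(1, n_points + 1))

if __name__ == "__main__":
    a, sigma = 1.5, 0.8
    print(exact_tail(a, sigma), optimized_bound(a, sigma), grid_minimum(a, sigma))
```

The grid minimum never falls below the closed-form value exp(-a^2/(2*sigma^2)), and the exact Gaussian tail lies below both.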
Consider next the moderate deviations. Theorem 4.10 combined with the computation from the last subsection indicates that for , any , there exists and large enough such that
[TABLE]
We finally consider the large deviations. Here we need to assume that . We exploit the computations from sections 4.2.6 and A.6 in [1]. The optimal trajectory to go from to is the original ODE, but time reversed, i.e. it follows the ODE . The running cost is , so the total cost is
[TABLE]
Consequently, from Theorem 3.4, for any , there exists and large enough such that
[TABLE]
We note that Moderate Deviations resemble the Central Limit Theorem much more than Large Deviations. The fact that the discontinuity in the form of the rate function is exactly at is typical of random variables with light tails. The situation would be quite different with heavy tails, see e.g. section VIII.4 in Petrov [7].
Note however that for small ,
[TABLE]
which is not too surprising, and in a sense reconciles Large Deviations and Moderate Deviations. Were our driving noises Brownian, the LD rate function would be quadratic, like that of MD; but the LD quasi–potential is the minimal cost when controlling the LLN ODE, while the MD quasi–potential is the minimal cost when controlling the linearized ODE around the endemic equilibrium.
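The agreement of the two costs for small deviations can be illustrated numerically. The sketch below assumes the standard SIS parametrization with infection rate `lam*i*(1-i)` and recovery rate `gamma*i` for the proportion `i` of infectives, with `lam > gamma`; under this assumption the LD cost of a positive deviation `x` from the endemic equilibrium is the integral of -log(1 - lam*y/gamma) over 0 < y < x (valid for x < gamma/lam), and the quadratic cost associated with the linearized dynamics is lam*x^2/(2*gamma). Both expressions are derived under these assumptions and are not copied from the displays above.

```python
import math

def ld_cost(x, lam, gamma, n_steps=20000):
    """Midpoint-rule value of int_0^x -log(1 - lam*y/gamma) dy, 0 < x < gamma/lam."""
    h = x / n_steps
    return h * sum(-math.log(1.0 - lam * (k + 0.5) * h / gamma)
                   for k in range(n_steps))

def ld_cost_closed_form(x, lam, gamma):
    """Closed form of the same integral: x + (1 - c*x)*log(1 - c*x)/c, c = lam/gamma."""
    c = lam / gamma
    return x + (1.0 - c * x) * math.log(1.0 - c * x) / c

def md_cost(x, lam, gamma):
    """Quadratic cost lam*x^2/(2*gamma) of the linearized problem (assumed setting)."""
    return lam * x * x / (2.0 * gamma)

if __name__ == "__main__":
    lam, gamma = 1.5, 1.0
    for x in (0.3, 0.1, 0.01, 0.001):
        ratio = ld_cost_closed_form(x, lam, gamma) / md_cost(x, lam, gamma)
        print(x, ratio)
```

The printed ratio stays above 1 and tends to 1 as `x` decreases, which is the sense in which the two regimes are reconciled for small deviations.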
5 Appendix
In this Appendix, we establish the following technical result.
Proposition 5.1.
Let be a standard Poisson random measure on , and the associated compensated measure. If is an –valued predictable process such that has exponential moments of any order, and , then for any ,
[TABLE]
Proof. Consider with the process
[TABLE]
It follows from Itô’s formula that
[TABLE]
From Lemma 5.2 below, is a martingale. Hence is a martingale if , a submartingale if we replace by , and a supermartingale if we replace by . Consequently if , . Now, using first Doob’s inequality for submartingales, and later Schwarz’s inequality, we have
[TABLE]
If , it follows from the previous argument that the first factor on the second right hand side is less than or equal to , hence the result follows.
In order to complete the proof of Proposition 5.1, we still need to establish
Lemma 5.2.
The process satisfying the same assumptions as in Proposition 5.1, and being given by (5.1), is a martingale.
Proof. It is plain that is a local martingale, whose predictable quadratic variation is given by
[TABLE]
All we need to show is that the above quantity is integrable. It is clearly a consequence of the assumption in case . In case , the second factor of the right hand side has finite exponential moments, so is square integrable, and all we need to show is that
[TABLE]
Using Itô’s formula we have
[TABLE]
The same computation with replaced by , and then replaced by would show that is a martingale satisfying . But a.s., hence Fatou’s Lemma implies that . Since
[TABLE]
it follows from Schwarz’s inequality that
[TABLE]
and the result follows from our assumption on .
Acknowledgement
It is a pleasure to thank Pierre Petit for an inspiring discussion on moderate deviations.
- [1] Tom Britton and Etienne Pardoux, Stochastic Epidemics in a Homogeneous Community, arXiv:1808.0535, submitted.
- [2] Amir Dembo and Ofer Zeitouni, Large Deviations Techniques and Applications, 2nd ed., Applications of Mathematics 38, Springer, New York, 1998.
- [3] Stewart N. Ethier and Thomas G. Kurtz, Markov Processes. Characterization and Convergence, J. Wiley, 1986.
- [4] Mark I. Freidlin and Alexander D. Wentzell, Random Perturbations of Dynamical Systems, 3rd ed., Grundlehren der Mathematischen Wissenschaften 260, Springer, New York, 2012.
- [5] Peter Kratz and Etienne Pardoux, Large deviations for infectious diseases models, in Séminaire de Probabilités XLIX, C. Donati-Martin, A. Lejay, A. Rouault eds., Lecture Notes in Math. 2215, pp. 221–327, 2018.
- [6] Etienne Pardoux and Brice Samegni–Kepgnou, Large deviation principle for epidemic models, Journal of Applied Probability 54, 905–920, 2017.
- [7] Vasily V. Petrov, Sums of Independent Random Variables, Ergebnisse der Mathematik und ihrer Grenzgebiete 82, Springer Verlag, 1975.
- [8] Lev S. Pontryagin, Vladimir G. Boltyanskii, Revaz V. Gamkrelidze and Evgenii F. Mishchenko, The Mathematical Theory of Optimal Processes, transl. by K. N. Trirogoff, ed. by L. W. Neustadt, John Wiley & Sons, 1962.
