Large deviations for i.i.d. replications of the total progeny of a Galton–Watson process

Claudio Macci; Barbara Pacchiarotti

arXiv:1704.02100·math.PR·April 10, 2017


TL;DR

This paper explores large deviation principles for the total progeny in Galton–Watson processes, including cases with random initial populations and estimators of offspring mean, linking branching process theory with large deviation techniques.

Contribution

It introduces large deviation results for total progeny distributions in Galton–Watson processes, including new insights for random initial populations and estimator sequences.

Findings

01

Large deviation rate functions for total progeny are characterized.

02

Results extend to processes with random initial populations.

03

Estimates of offspring mean exhibit specific large deviation behaviors.

Abstract

The Galton–Watson process is the simplest example of a branching process. The relationship between the offspring distribution, and, when the extinction occurs almost surely, the distribution of the total progeny is well known. In this paper, we illustrate the relationship between these two distributions when we consider the large deviation rate function (provided by Cramér's theorem) for empirical means of i.i.d. random variables. We also consider the case with a random initial population. In the final part, we present large deviation results for sequences of estimators of the offspring mean based on i.i.d. replications of total progeny.

Equations

V_{n}:=\sum_{k=1}^{V_{n-1}}X_{n,k}\quad(\mbox{for}\ n\geq 1),

\bigl\{V_{n}^{f,g}:n\geq 0\bigr\}

q_{r}:=P\bigl(V_{0}^{f,g}=r\bigr)\quad(\mbox{for all integer}\ r\geq 0);

V_{n}^{f,g}:=\sum_{i=1}^{V_{n-1}^{f,g}}X_{n,i}\quad(\mbox{for all}\ n\geq 1).

p_{\mathrm{ext}}^{f,g}:=P\bigl(\bigl\{V_{n}^{f,g}=0\ \mbox{for some}\ n\geq 0\bigr\}\bigr),

p_{\mathrm{ext}}^{f,\mathrm{id}}=\min\bigl\{s\in[0,1]:f(s)=s\bigr\};

p_{\mathrm{ext}}^{f,g}:=q_{0}+\sum_{n\geq 1}\bigl(p_{\mathrm{ext}}^{f,\mathrm{id}}\bigr)^{n}q_{n}=g\bigl(p_{\mathrm{ext}}^{f,\mathrm{id}}\bigr),

\begin{array}{ll}p_{\mathrm{ext}}^{f,g}=g(0)=q_{0}&\ \mbox{if}\ p_{0}=0;\\ p_{\mathrm{ext}}^{f,g}=g(1)=1&\ \mbox{if}\ p_{0}>0\ \mbox{and}\ \mu_{f}\leq 1;\\ p_{\mathrm{ext}}^{f,g}\in(q_{0},1)&\ \mbox{if}\ p_{0}>0\ \mbox{and}\ \mu_{f}>1.\end{array}

Y^{f,g}:=\sum_{i=0}^{\tau-1}V_{i}^{f,g},\quad\mbox{where}\ \tau:=\inf\bigl\{n\geq 0:V_{n}^{f,g}=0\bigr\},

\mathcal{G}_{f,g}(s):=\sum_{k\geq 0}s^{k}\pi_{k}^{f,g},

\nu^{f,g}:=\sum_{k\geq 0}k\pi_{k}^{f,g},\quad\mbox{and we have}\ \nu^{f,g}=\frac{\mu_{g}}{1-\mu_{f}};

\nu^{f,g}=\left\{\begin{array}{@{}ll}\infty&\ \mbox{if}\ \mu_{g}>0\ (\mbox{and}\ \mu_{f}=1)\\ 0&\ \mbox{if}\ \mu_{g}=0\ (\mbox{and}\ \mu_{f}=1).\end{array}\right.

\pi_{k}^{f,\mathrm{id}}=\frac{1}{k}\cdot p_{k-1}^{*k},

\mathcal{G}_{f,\mathrm{id}}(s)=sf\bigl(\mathcal{G}_{f,\mathrm{id}}(s)\bigr).

\liminf_{n\to\infty}\frac{1}{n}\log P(W_{n}\in O)\geq-\inf_{w\in O}I(w)\quad\mbox{for all open sets}\ O,

\limsup_{n\to\infty}\frac{1}{n}\log P(W_{n}\in C)\leq-\inf_{w\in C}I(w)\quad\mbox{for all closed sets}\ C.

I(w):=\sup_{\theta\in\mathbb{R}}\bigl\{\theta w-\log\mathbb{E}\bigl[e^{\theta W_{1}}\bigr]\bigr\}.

I(w):=\sup_{\theta\in\mathbb{R}^{d}}\bigl\{\langle\theta,w\rangle-\log\mathbb{E}\bigl[e^{\langle\theta,W_{1}\rangle}\bigr]\bigr\}.

I_{f}(x):=\sup_{\alpha\in\mathcal{D}(f)}\bigl\{\alpha x-\log f\bigl(e^{\alpha}\bigr)\bigr\},

I_{\mathcal{G}_{f,\mathrm{id}}}(y):=\sup_{\beta\in\mathcal{D}(\mathcal{G}_{f,\mathrm{id}})}\bigl\{\beta y-\log\mathcal{G}_{f,\mathrm{id}}\bigl(e^{\beta}\bigr)\bigr\},

\alpha(\beta):=\log\mathcal{G}_{f,\mathrm{id}}\bigl(e^{\beta}\bigr)

\beta(\alpha):=\log\mathcal{G}_{f,\mathrm{id}}^{-1}\bigl(e^{\alpha}\bigr)

I_{f}(x)=\sup_{\beta\in\mathcal{D}(\mathcal{G}_{f,\mathrm{id}})}\bigl\{\log\mathcal{G}_{f,\mathrm{id}}\bigl(e^{\beta}\bigr)x-\log f\bigl(\mathcal{G}_{f,\mathrm{id}}\bigl(e^{\beta}\bigr)\bigr)\bigr\}.

I_{f}(x)=(1-x)I_{\mathcal{G}_{f,\mathrm{id}}}\biggl(\frac{1}{1-x}\biggr).

I_{g}(z):=\sup_{\gamma\in\mathbb{R}}\bigl\{\gamma z-\log g\bigl(e^{\gamma}\bigr)\bigr\}.

I_{\mathcal{G}_{f,g},g}(y,z)=\left\{\begin{array}{@{}ll}yI_{f}(\frac{y-z}{y})+I_{g}(z)&\ \mbox{if}\ y\geq z>0,\\ I_{g}(0)&\ \mbox{if}\ y=z=0,\\ \infty&\ \mbox{otherwise}.\end{array}\right.

I_{\mathcal{G}_{f,g},g}(y,z):=\sup_{\beta,\gamma\in\mathbb{R}}\bigl\{\beta y+\gamma z-\log\mathbb{E}\bigl[e^{\beta Y^{f,g}+\gamma V_{0}^{f,g}}\bigr]\bigr\}.

\mathbb{E}\bigl[e^{\beta Y^{f,g}+\gamma V_{0}^{f,g}}\bigr]=\mathbb{E}\bigl[e^{\gamma V_{0}^{f,g}}\bigl(\mathcal{G}_{f,\mathrm{id}}\bigl(e^{\beta}\bigr)\bigr)^{V_{0}^{f,g}}\bigr]=g\bigl(e^{\gamma}\mathcal{G}_{f,\mathrm{id}}\bigl(e^{\beta}\bigr)\bigr);

I_{\mathcal{G}_{f,g},g}(y,z)=\sup_{\beta,\gamma\in\mathbb{R}}\bigl\{\beta y+\gamma z-\log g\bigl(e^{\gamma+\log\mathcal{G}_{f,\mathrm{id}}(e^{\beta})}\bigr)\bigr\}.

(\beta,\gamma)\mapsto\bigl(\beta,\gamma+\log\mathcal{G}_{f,\mathrm{id}}\bigl(e^{\beta}\bigr)\bigr)


Full text

Large deviations for i.i.d. replications of the total progeny of a Galton–Watson process

Claudio Macci, Barbara Pacchiarotti

Dipartimento di Matematica, Università di Roma Tor Vergata, Via della Ricerca Scientifica, I-00133 Rome, Italy

(2017; 27 September 2016; 15 December 2016; 17 December 2016)


Keywords: Cramér’s theorem, initial random population, estimators of offspring mean

2010 MSC: 60F10, 60J80, 62F10, 62F12

doi: 10.15559/16-VMSTA72

Volume 4, issue 1. Published online 11 January 2017.

1 Introduction

There is a vast literature on branching processes. Here we cite the monographs [1, 3, 12]; moreover, we also cite the monograph [18] for the multitype case, [10], which focuses on statistical inference, and [13] and [15] for applications in biology.

The simplest example of a branching process is the Galton–Watson process. We consider the case of a population that has a unique individual at the beginning and in which all the individuals (of all generations) live for one unit of time; moreover, at the end of its lifetime, every individual of the population (of every generation) produces a random number of new individuals, independently of all the rest, according to a specific fixed distribution. So, if we consider a sequence of random variables $\{V_{n}:n\geq 0\}$ such that $V_{n}$ is the population size at time $n$ (for all $n\geq 0$), we have $V_{0}=1$ and

$$V_{n}:=\sum_{k=1}^{V_{n-1}}X_{n,k}\quad(\mbox{for}\ n\geq 1),$$

where $\{X_{n,i}:n,i\geq 1\}$ is a family of nonnegative integer-valued i.i.d. random variables. In other words, $X_{n,1},\ldots,X_{n,V_{n-1}}$ represent the offspring generated at time $n$ by each of $V_{n-1}$ individuals that live at time $n-1$ . We recall some other preliminaries on the Galton–Watson process in Section 2, where, in particular, we consider a slightly different notation to allow the case with a random initial population (instead of the case with a unitary initial population cited before).
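As an illustration that is not part of the original text, the recursion defining $V_{n}$ as a sum of $V_{n-1}$ i.i.d. offspring counts can be simulated directly. The following is a minimal sketch; the function names and the binary offspring law (two children with probability 0.3, otherwise none) are hypothetical choices made only for this example.

```python
import random


def simulate_generations(offspring_sampler, max_generations, rng):
    """Return [V_0, V_1, ...] with V_0 = 1.

    The path stops at extinction or after max_generations steps.
    offspring_sampler(rng) must return one nonnegative integer offspring count.
    """
    sizes = [1]
    for _ in range(max_generations):
        parents = sizes[-1]
        if parents == 0:
            break
        # V_n is the sum of V_{n-1} i.i.d. offspring counts X_{n,1}, ..., X_{n,V_{n-1}}.
        sizes.append(sum(offspring_sampler(rng) for _ in range(parents)))
    return sizes


def binary_offspring(rng, p_two=0.3):
    """Hypothetical offspring law: 2 children with probability p_two, else 0."""
    return 2 if rng.random() < p_two else 0


if __name__ == "__main__":
    print(simulate_generations(binary_offspring, 20, random.Random(1)))
```

With this offspring law the mean is $0.6<1$, so simulated paths typically reach zero after a few generations.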

In this paper, we present large deviation results. The theory of large deviations is a collection of techniques that gives an asymptotic estimate of small probabilities in an exponential scale (see, e.g., [6] as a reference). We recall some preliminaries in Section 2. The literature on large deviations for branching processes is large. Here we essentially recall some references with results concerning the Galton–Watson process.

In several references, the large-time behavior for the supercritical case is studied, namely the case where the offspring mean $\mu$ is strictly larger than one (in such a case, the extinction probability is strictly less than one). Here we recall [2] (see also [4] for the multitype case), [5], where the main object is the study of the tails of $W:=\lim_{n\to\infty}V_{n}/\mu^{n}$ , [19] with a careful analysis based on harmonic moments of $\{V_{n}:n\geq 0\}$ , [20] (and [21]) with some conditional large deviation results based on some local limit theorems, [8] where the central role of some “lower deviation probabilities” is highlighted for the study of the asymptotic behavior of the Lotka–Nagaev estimator $V_{n+1}/V_{n}$ of $\mu$ .

Other references study the most likely paths to extinction at some time $n_{0}$ when the initial population $k$ is large. The idea is to consider the representation of a branching process with initial population equal to $k$ as a sum of $k$ i.i.d. replications of the process with a unitary initial population; in this case, Cramér’s theorem for empirical means of i.i.d. random variables (on $R^{n_{0}}$ ) plays a crucial role. A most likely path to extinction in [16] (see also [17]) is a trajectory that minimizes the rate function among the paths that reach the level 0 at time $n_{0}$ . A generalization of this concept for the most likely paths to reach a level $b\geq 0$ can be found in [11].

In this paper, we are interested in a different direction. Namely, we are interested in the empirical means of i.i.d. replications of the total progeny of a Galton–Watson process. The total progenies of branching processes are studied in several references: here we cite the old references [7, 14, 22] for a Galton–Watson process, and [9] (see Section 2.2) among the references concerning different branching processes. The total progeny of a Galton–Watson process is an almost surely finite random variable when the extinction occurs almost surely, and therefore the supercritical case will not be considered. Some relationships between the offspring distribution and the total progeny distribution of a Galton–Watson process are well known (see the identities recalled at the end of Section 2.1 for the probability mass functions and for the probability generating functions).

A new relationship is provided by Proposition 1, where we illustrate how the rate function for the empirical means of total progenies can be expressed in terms of the analogous rate function for the empirical means of a single progeny. This is a quite natural problem in large deviations, and, as we can expect, the relationship between the probability generating functions has an important role in the proof; in fact, the large deviation rate function for empirical means of i.i.d. random variables (provided by Cramér’s theorem recalled below; see Theorem 1) is given by the Legendre transform of the logarithm of the (common) moment generating function of the random variables. Moreover, the relationship provided by Proposition 1 can be of interest in information theory because the rate functions involved can be expressed in terms of suitable relative entropies (or Kullback–Leibler divergences); see, for example, [23] for a discussion on the rate function expressions in terms of the relative entropy.

Another result presented in this paper is Proposition 2, which is a version of Proposition 1 where the initial population $V_{0}$ is a random variable with a suitable distribution. Finally, in Propositions 3 and 4, we prove large deviation results for some estimators of the offspring mean $\mu$ in terms of i.i.d. replications of the total progeny and of the initial population (we are considering the case where the initial population $V_{0}$ is a random variable as in Proposition 2).

We conclude with the outline of the paper. We start with some preliminaries in Section 2. In Section 3, we prove the results concerning the large deviation rate functions related to Cramér’s theorem. Finally, in Section 4, we prove the large deviation results for the estimators of the offspring mean $\mu$ .

2 Preliminaries

We start with some preliminaries on the Galton–Watson process. In the second part, we recall some preliminaries on large deviations.

2.1 Preliminaries on Galton–Watson process

Here we introduce a slightly different notation, and, moreover, we recall some preliminaries in order to define the total progeny of a Galton–Watson process.

We start with some notation concerning the offspring distribution (note that $\mu_{f}$ defined further coincides with $\mu$ in the Introduction):

• the probability mass function $p_{h}:=P(X_{n,i}=h)$ (for all integers $h\geq 0$);

• the probability generating function $f(s):=\sum_{h\geq 0}s^{h}p_{h}$;

• the mean value $\mu_{f}:=\sum_{h\geq 0}hp_{h}$ (and we have $\mu_{f}=f^{\prime}(1)$).

Moreover, we introduce the analogous items for the initial population:

• the probability mass function $\{q_{r}:r\geq 0\}$ (see the definition of $q_{r}$ below);

• the probability generating function $g(s):=\sum_{r\geq 0}s^{r}q_{r}$;

• the mean value $\mu_{g}:=\sum_{r\geq 0}rq_{r}$ (and we have $\mu_{g}=g^{\prime}(1)$).

So, from now on, we consider the following slightly different notation:

$$\bigl\{V_{n}^{f,g}:n\geq 0\bigr\}$$

(in place of $\{V_{n}:n\geq 0\}$ presented before). More precisely:

• the probability generating function of $V_{0}^{f,g}$ is $g$ (so $V_{0}^{f,g}$ does not depend on $f$), and therefore

$$q_{r}:=P\bigl(V_{0}^{f,g}=r\bigr)\quad(\mbox{for all integers}\ r\geq 0);$$

• for a family of i.i.d. random variables $\{X_{n,i}:n,i\geq 1\}$ with probability generating function $f$, we have

$$V_{n}^{f,g}:=\sum_{i=1}^{V_{n-1}^{f,g}}X_{n,i}\quad(\mbox{for all}\ n\geq 1).$$

Remark 1.

Note that $\{V_{n}^{f,g}:n\geq 0\}$ here corresponds to $\{V_{n}:n\geq 0\}$ presented in the Introduction if $q_{1}=1$ or, equivalently, if $g=\mathrm{id}$ (i.e. $g(s)=s$ for all $s$ ).

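If we consider the extinction probability

$$p_{\mathrm{ext}}^{f,g}:=P\bigl(\bigl\{V_{n}^{f,g}=0\ \mbox{for some}\ n\geq 0\bigr\}\bigr),$$

then it is known that we have

$$p_{\mathrm{ext}}^{f,\mathrm{id}}=\min\bigl\{s\in[0,1]:f(s)=s\bigr\};$$

moreover, if $p_{0}>0$, then we have $p_{\mathrm{ext}}^{f,\mathrm{id}}=1$ if $\mu_{f}\leq 1$ and $p_{\mathrm{ext}}^{f,\mathrm{id}}\in(0,1)$ if $\mu_{f}>1$. More generally, we have

$$p_{\mathrm{ext}}^{f,g}:=q_{0}+\sum_{n\geq 1}\bigl(p_{\mathrm{ext}}^{f,\mathrm{id}}\bigr)^{n}q_{n}=g\bigl(p_{\mathrm{ext}}^{f,\mathrm{id}}\bigr),$$

and, if $q_{0}<1$ (we obviously have $p_{\mathrm{ext}}^{f,g}=1$ if $q_{0}=1$), then we have the following cases:

$$\begin{array}{ll}p_{\mathrm{ext}}^{f,g}=g(0)=q_{0}&\ \mbox{if}\ p_{0}=0;\\ p_{\mathrm{ext}}^{f,g}=g(1)=1&\ \mbox{if}\ p_{0}>0\ \mbox{and}\ \mu_{f}\leq 1;\\ p_{\mathrm{ext}}^{f,g}\in(q_{0},1)&\ \mbox{if}\ p_{0}>0\ \mbox{and}\ \mu_{f}>1.\end{array}$$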
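The characterization of the extinction probability as the smallest root of $f(s)=s$ in $[0,1]$, and the identity $p_{\mathrm{ext}}^{f,g}=g(p_{\mathrm{ext}}^{f,\mathrm{id}})$, can be checked numerically. The sketch below is an illustration added here, not a method from the paper: it relies on the standard fact that iterating $s\mapsto f(s)$ from $s=0$ increases to the smallest fixed point, and it assumes finitely supported probability mass functions given as Python lists; the function names are hypothetical.

```python
def extinction_probability(offspring_pmf, tol=1e-12, max_iter=100_000):
    """Smallest root of f(s) = s in [0, 1] for a unit initial population.

    offspring_pmf is the list (p_0, p_1, ...); f is its generating function.
    Fixed-point iteration from 0 can be slow when the offspring mean is
    close to 1, hence the iteration cap.
    """
    def f(s):
        return sum(p * s**h for h, p in enumerate(offspring_pmf))

    s = 0.0
    for _ in range(max_iter):
        s_next = f(s)
        if abs(s_next - s) < tol:
            return s_next
        s = s_next
    return s


def extinction_probability_random_start(offspring_pmf, initial_pmf):
    """Evaluate g at the unit-start extinction probability."""
    s = extinction_probability(offspring_pmf)
    return sum(q * s**r for r, q in enumerate(initial_pmf))


if __name__ == "__main__":
    # Supercritical example: p_0 = 1/4, p_2 = 3/4, so f(s) = 1/4 + (3/4) s^2.
    print(extinction_probability([0.25, 0.0, 0.75]))
```

For $f(s)=\frac{1}{4}+\frac{3}{4}s^{2}$ the roots of $f(s)=s$ are $\frac{1}{3}$ and $1$, so the iteration should return a value close to $\frac{1}{3}$.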

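If $p_{0}>0$ and $\mu_{f}\leq 1$, then the random variable $Y^{f,g}$ defined by

$$Y^{f,g}:=\sum_{i=0}^{\tau-1}V_{i}^{f,g},\quad\mbox{where}\ \tau:=\inf\bigl\{n\geq 0:V_{n}^{f,g}=0\bigr\},$$

is almost surely finite and provides the total progeny of $\{V_{n}^{f,g}:n\geq 0\}$. In view of what follows, we consider the probability generating function

$$\mathcal{G}_{f,g}(s):=\sum_{k\geq 0}s^{k}\pi_{k}^{f,g},$$

where $\{\pi_{k}^{f,g}:k\geq 0\}$ is the probability mass function of the random variable $Y^{f,g}$. Moreover, we have the mean value

$$\nu^{f,g}:=\sum_{k\geq 0}k\pi_{k}^{f,g},\quad\mbox{and we have}\ \nu^{f,g}=\frac{\mu_{g}}{1-\mu_{f}};$$

in particular, $\nu^{f,g}=\frac{\mu_{g}}{1-\mu_{f}}$ even if $\mu_{f}=1$, namely

$$\nu^{f,g}=\left\{\begin{array}{@{}ll}\infty&\ \mbox{if}\ \mu_{g}>0\ (\mbox{and}\ \mu_{f}=1)\\ 0&\ \mbox{if}\ \mu_{g}=0\ (\mbox{and}\ \mu_{f}=1).\end{array}\right.$$

Finally, we recall some well-known connections between total progeny and offspring distributions (see e.g. [7]): for the probability mass functions, we have

$$\pi_{k}^{f,\mathrm{id}}=\frac{1}{k}\cdot p_{k-1}^{*k},$$

where $\{p_{h}^{*n}:h\geq 0\}$ is the $n$th convolution power of $\{p_{h}:h\geq 0\}$; for the probability generating functions, we have

$$\mathcal{G}_{f,\mathrm{id}}(s)=sf\bigl(\mathcal{G}_{f,\mathrm{id}}(s)\bigr).$$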
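The link between the probability mass functions, $\pi_{k}^{f,\mathrm{id}}=\frac{1}{k}p_{k-1}^{*k}$, lends itself to a direct numerical evaluation through repeated convolution. The sketch below is an added illustration under stated assumptions: the offspring law is finitely supported, the series is truncated at a hypothetical cutoff `k_max`, and the helper names are not from the paper. It also lets one check the mean $\nu^{f,\mathrm{id}}=\frac{1}{1-\mu_{f}}$ in a subcritical example.

```python
def convolve_truncated(a, b, max_len):
    """Convolution of two sequences, keeping only indices below max_len."""
    out = [0.0] * min(max_len, len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        if i >= len(out):
            break
        for j, b_j in enumerate(b):
            if i + j >= len(out):
                break
            out[i + j] += a_i * b_j
    return out


def total_progeny_pmf(offspring_pmf, k_max):
    """Return [pi_0, ..., pi_{k_max}] for a unit initial population.

    Uses pi_k = (1/k) * p^{*k}_{k-1}, where p^{*k} is the k-th convolution
    power of the offspring pmf; pi_0 = 0 because the ancestor is counted.
    """
    pi = [0.0] * (k_max + 1)
    power = [1.0]  # p^{*0}: unit mass at 0
    for k in range(1, k_max + 1):
        power = convolve_truncated(power, offspring_pmf, k_max)  # now p^{*k}
        if k - 1 < len(power):
            pi[k] = power[k - 1] / k
    return pi


if __name__ == "__main__":
    pmf = total_progeny_pmf([0.5, 0.3, 0.2], 300)  # offspring mean 0.7
    print(sum(pmf), sum(k * p for k, p in enumerate(pmf)))
```

In this example the truncated masses should sum to nearly one and the truncated mean should be close to $\frac{1}{1-0.7}$; the cutoff matters more as $\mu_{f}$ approaches 1, where the tail of the total progeny becomes heavy.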

2.2 Preliminaries on large deviations

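We start with the concept of large deviation principle (LDP). A sequence of random variables $\{W_{n}:n\geq 1\}$ taking values in a topological space $\mathcal{W}$ satisfies the LDP with rate function $I:\mathcal{W}\to[0,\infty]$ if $I$ is a lower semicontinuous function,

$$\liminf_{n\to\infty}\frac{1}{n}\log P(W_{n}\in O)\geq-\inf_{w\in O}I(w)\quad\mbox{for all open sets}\ O,$$

and

$$\limsup_{n\to\infty}\frac{1}{n}\log P(W_{n}\in C)\leq-\inf_{w\in C}I(w)\quad\mbox{for all closed sets}\ C.$$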

We also recall that a rate function $I$ is said to be good if all its level sets $\{\{w\in\mathcal{W}:I(w)\leq\eta\}:\eta\geq 0\}$ are compact.

Remark 2.

If $P(W_{n}\in S)=1$ for some closed set $S$ (at least eventually with respect to $n$ ), then $I(w)=\infty$ for $w\notin S$ ; this can be checked by taking the lower bound for the open set $O=S^{c}$ .

In particular, we refer to Cramér’s theorem on $\mathbb{R}^{d}$ (see e.g. Theorems 2.2.3 and 2.2.30 in [6] for the cases $d=1$ and $d\geq 2$), and we recall its statement. We remark that, in this paper, we consider the cases $d=1$ (in such a case, the rate function need not be a good rate function) and $d=2$. Moreover, we use the symbol $\langle\cdot,\cdot\rangle$ for the inner product in $\mathbb{R}^{d}$.

Theorem 1 (Cramér’s theorem)

Let $\{W_{n}:n\geq 1\}$ be a sequence of i.i.d. $\mathbb{R}^{d}$-valued random variables, and let $\{\bar{W}_{n}:n\geq 1\}$ be the sequence of empirical means defined by $\bar{W}_{n}:=\frac{1}{n}\sum_{k=1}^{n}W_{k}$ (for all $n\geq 1$).

(i) If $d=1$, then $\{\bar{W}_{n}:n\geq 1\}$ satisfies the LDP with rate function $I$ defined by

$$I(w):=\sup_{\theta\in\mathbb{R}}\bigl\{\theta w-\log\mathbb{E}\bigl[e^{\theta W_{1}}\bigr]\bigr\}.$$

(ii) If $d\geq 2$ and the origin of $\mathbb{R}^{d}$ belongs to the interior of the set $\{\theta\in\mathbb{R}^{d}:\log\mathbb{E}[e^{\langle\theta,W_{1}\rangle}]<\infty\}$, then $\{\bar{W}_{n}:n\geq 1\}$ satisfies the LDP with good rate function $I$ defined by

$$I(w):=\sup_{\theta\in\mathbb{R}^{d}}\bigl\{\langle\theta,w\rangle-\log\mathbb{E}\bigl[e^{\langle\theta,W_{1}\rangle}\bigr]\bigr\}.$$
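To make the exponential scale in Cramér’s theorem concrete, one can compare an exact tail probability with the rate function in a case where both are explicit. The sketch below is an added illustration, not taken from the paper: it assumes Bernoulli($p$) summands, for which the rate function $x\log\frac{x}{p}+(1-x)\log\frac{1-x}{1-p}$ is a standard closed form, and it evaluates $-\frac{1}{n}\log P(\bar{W}_{n}\geq a)$ in log space to avoid underflow. The function names are hypothetical.

```python
import math


def log_binomial_upper_tail(n, p, a):
    """log P(S_n / n >= a) for S_n ~ Binomial(n, p), computed in log space."""
    k_min = math.ceil(n * a)
    log_terms = [
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1 - p)
        for k in range(k_min, n + 1)
    ]
    top = max(log_terms)
    # Log-sum-exp: factor out the largest term before exponentiating.
    return top + math.log(sum(math.exp(t - top) for t in log_terms))


def bernoulli_rate(x, p):
    """Cramér rate function of Bernoulli(p) empirical means, for 0 < x < 1."""
    return x * math.log(x / p) + (1 - x) * math.log((1 - x) / (1 - p))


if __name__ == "__main__":
    p, a = 0.3, 0.5
    for n in (200, 2000):
        print(n, -log_binomial_upper_tail(n, p, a) / n, bernoulli_rate(a, p))
```

The normalized log-probability approaches the rate function from above as $n$ grows, the gap being of order $\frac{\log n}{n}$ in this example.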

3 Applications of Cramér’s theorem

The aim of this section is to prove Propositions 1 and 2. In view of this, we recall Lemmas 1 and 2, which give two immediate applications of Cramér’s theorem (Theorem 1) with $d=1$; in Lemma 2, we consider the case with a unitary initial population almost surely (thus, as stated in Remark 1, the case with $q_{1}=1$ or, equivalently, $g=\mathrm{id}$).

Lemma 1 (Cramér’s theorem for offspring distribution)

Let $\{X_{n}:n\geq 1\}$ be i.i.d. random variables with probability generating function $f$. Let $\{\bar{X}_{n}:n\geq 1\}$ be the sequence of empirical means defined by $\bar{X}_{n}:=\frac{1}{n}\sum_{k=1}^{n}X_{k}$ (for all $n\geq 1$). Then $\{\bar{X}_{n}:n\geq 1\}$ satisfies the LDP with rate function $I_{f}$ defined by $I_{f}(x):=\sup_{\alpha\in\mathbb{R}}\{\alpha x-\log f(e^{\alpha})\}$.
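The supremum defining $I_{f}(x)=\sup_{\alpha\in\mathbb{R}}\{\alpha x-\log f(e^{\alpha})\}$ rarely has a closed form, but it can be approximated numerically because the objective is concave in $\alpha$. The sketch below is an added illustration under stated assumptions: the offspring law is finitely supported, the search is restricted to a hypothetical bracket $[-40,40]$, and ternary search is one convenient choice rather than anything prescribed by the paper. When the supremum is only approached as $\alpha\to\pm\infty$ (for instance at the boundary of the support), the bracket truncates it.

```python
import math


def log_pgf(pmf, alpha):
    """log f(e^alpha) for a finitely supported pmf (p_0, p_1, ...)."""
    return math.log(sum(p * math.exp(alpha * h) for h, p in enumerate(pmf)))


def rate_function(pmf, x, lo=-40.0, hi=40.0, iters=200):
    """Approximate I_f(x) = sup_alpha {alpha * x - log f(e^alpha)}.

    The objective is concave in alpha, so ternary search on [lo, hi]
    locates its maximizer within the bracket.
    """
    def objective(alpha):
        return alpha * x - log_pgf(pmf, alpha)

    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    return objective((lo + hi) / 2)


if __name__ == "__main__":
    # Bernoulli(0.3) offspring: compare with the closed form at x = 0.5.
    closed = 0.5 * math.log(0.5 / 0.3) + 0.5 * math.log(0.5 / 0.7)
    print(rate_function([0.7, 0.3], 0.5), closed)
```

As expected for a Cramér rate function, the value at $x=\mu_{f}$ is zero up to rounding.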

Lemma 2 (Cramér’s theorem for total progeny distribution with $g=\mathrm{id}$)

Assume that $p_{0}>0$ and $\mu_{f}\leq 1$. Let $\{Y_{n}:n\geq 1\}$ be i.i.d. random variables with probability generating function $\mathcal{G}_{f,\mathrm{id}}$. Let $\{\bar{Y}_{n}:n\geq 1\}$ be the sequence of empirical means defined by $\bar{Y}_{n}:=\frac{1}{n}\sum_{k=1}^{n}Y_{k}$ (for all $n\geq 1$). Then $\{\bar{Y}_{n}:n\geq 1\}$ satisfies the LDP with rate function $I_{\mathcal{G}_{f,\mathrm{id}}}$ defined by $I_{\mathcal{G}_{f,\mathrm{id}}}(y):=\sup_{\beta\in\mathbb{R}}\{\beta y-\log\mathcal{G}_{f,\mathrm{id}}(e^{\beta})\}$.

Now we can prove our main results. We start with Proposition 1, which provides an expression for $I_{\mathcal{G}_{f,\mathrm{id}}}$ in terms of $I_{f}$ .

Proposition 1

Let $I_{f}$ and $I_{\mathcal{G}_{f,\mathrm{id}}}$ be the rate functions in Lemmas 1 and 2. Then we have $I_{\mathcal{G}_{f,\mathrm{id}}}(y)=yI_{f}(\frac{y-1}{y})$ for all $y\geq 1$ .

Proof.

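We remark that

$$I_{f}(x):=\sup_{\alpha\in\mathcal{D}(f)}\bigl\{\alpha x-\log f\bigl(e^{\alpha}\bigr)\bigr\},$$

where $\mathcal{D}(f):=\{\alpha\in\mathbb{R}:f(e^{\alpha})<\infty\}$, and

$$I_{\mathcal{G}_{f,\mathrm{id}}}(y):=\sup_{\beta\in\mathcal{D}(\mathcal{G}_{f,\mathrm{id}})}\bigl\{\beta y-\log\mathcal{G}_{f,\mathrm{id}}\bigl(e^{\beta}\bigr)\bigr\},$$

where $\mathcal{D}(\mathcal{G}_{f,\mathrm{id}}):=\{\beta\in\mathbb{R}:\mathcal{G}_{f,\mathrm{id}}(e^{\beta})<\infty\}$, by Lemmas 1 and 2, respectively.

Moreover, the function $\alpha:\mathcal{D}(\mathcal{G}_{f,\mathrm{id}})\to\mathcal{D}(f)$ defined by

$$\alpha(\beta):=\log\mathcal{G}_{f,\mathrm{id}}\bigl(e^{\beta}\bigr)$$

is a bijection. This can be checked by noting that $\alpha(\beta)\in\mathcal{D}(f)$ (for all $\beta\in\mathcal{D}(\mathcal{G}_{f,\mathrm{id}})$) because $f(e^{\alpha(\beta)})=f(\mathcal{G}_{f,\mathrm{id}}(e^{\beta}))=\frac{\mathcal{G}_{f,\mathrm{id}}(e^{\beta})}{e^{\beta}}<\infty$ (here we take into account the identity $\mathcal{G}_{f,\mathrm{id}}(s)=sf(\mathcal{G}_{f,\mathrm{id}}(s))$); moreover, its inverse $\beta:\mathcal{D}(f)\to\mathcal{D}(\mathcal{G}_{f,\mathrm{id}})$ is defined by

$$\beta(\alpha):=\log\mathcal{G}_{f,\mathrm{id}}^{-1}\bigl(e^{\alpha}\bigr)$$

(where $\mathcal{G}_{f,\mathrm{id}}^{-1}$ is the inverse of $\mathcal{G}_{f,\mathrm{id}}$), and $\beta(\alpha)\in\mathcal{D}(\mathcal{G}_{f,\mathrm{id}})$ (for all $\alpha\in\mathcal{D}(f)$) because $\mathcal{G}_{f,\mathrm{id}}(e^{\beta(\alpha)})=e^{\alpha}<\infty$.

Thus, we can set $\alpha=\log\mathcal{G}_{f,\mathrm{id}}(e^{\beta})$ (for $\beta\in\mathcal{D}(\mathcal{G}_{f,\mathrm{id}})$) in the expression of $I_{f}(x)$, and we get

$$I_{f}(x)=\sup_{\beta\in\mathcal{D}(\mathcal{G}_{f,\mathrm{id}})}\bigl\{\log\mathcal{G}_{f,\mathrm{id}}\bigl(e^{\beta}\bigr)x-\log f\bigl(\mathcal{G}_{f,\mathrm{id}}\bigl(e^{\beta}\bigr)\bigr)\bigr\}.$$

Then (we take into account the identity $\mathcal{G}_{f,\mathrm{id}}(s)=sf(\mathcal{G}_{f,\mathrm{id}}(s))$ with $s=e^{\beta}$ in the second equality below)

\begin{align*}
I_{f}(x)&=\sup_{\beta\in\mathcal{D}(\mathcal{G}_{f,\mathrm{id}})}\bigl\{\log\mathcal{G}_{f,\mathrm{id}}\bigl(e^{\beta}\bigr)x-\log\bigl(e^{-\beta}e^{\beta}f\bigl(\mathcal{G}_{f,\mathrm{id}}\bigl(e^{\beta}\bigr)\bigr)\bigr)\bigr\}\\
&=\sup_{\beta\in\mathcal{D}(\mathcal{G}_{f,\mathrm{id}})}\bigl\{\log\mathcal{G}_{f,\mathrm{id}}\bigl(e^{\beta}\bigr)x+\beta-\log\mathcal{G}_{f,\mathrm{id}}\bigl(e^{\beta}\bigr)\bigr\}\\
&=\sup_{\beta\in\mathcal{D}(\mathcal{G}_{f,\mathrm{id}})}\bigl\{\beta-(1-x)\log\mathcal{G}_{f,\mathrm{id}}\bigl(e^{\beta}\bigr)\bigr\},
\end{align*}

and, for $x\in[0,1)$, we get

$$I_{f}(x)=(1-x)I_{\mathcal{G}_{f,\mathrm{id}}}\biggl(\frac{1}{1-x}\biggr).$$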

We conclude by taking $x=\frac{y-1}{y}$ for $y\geq 1$ (thus, $x\in[0,1)$ ), and we obtain the desired equality with some easy computations. ∎
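As a sanity check of the identity $I_{\mathcal{G}_{f,\mathrm{id}}}(y)=yI_{f}(\frac{y-1}{y})$, one may work out a case where both rate functions are explicit. The following worked example is an addition, not part of the original proof; it assumes Bernoulli offspring with $p_{1}=p\in(0,1)$ and $p_{0}=1-p$ (so $p_{0}>0$ and $\mu_{f}=p<1$), in which case the total progeny is geometric on $\{1,2,\ldots\}$, and it uses the convention $0\log 0=0$.

```latex
% Worked example under the assumption of Bernoulli(p) offspring (illustrative only).
\begin{align*}
f(s)&=1-p+ps,\qquad
\mathcal{G}_{f,\mathrm{id}}(s)=sf\bigl(\mathcal{G}_{f,\mathrm{id}}(s)\bigr)
\ \Longrightarrow\ \mathcal{G}_{f,\mathrm{id}}(s)=\frac{(1-p)s}{1-ps},\\
I_{f}(x)&=x\log\frac{x}{p}+(1-x)\log\frac{1-x}{1-p}\qquad(x\in[0,1]),\\
I_{\mathcal{G}_{f,\mathrm{id}}}(y)&=(y-1)\log\frac{y-1}{py}-\log\bigl(y(1-p)\bigr)\qquad(y\geq 1),\\
y\,I_{f}\Bigl(\frac{y-1}{y}\Bigr)&=(y-1)\log\frac{y-1}{py}+\log\frac{1}{y(1-p)}
=I_{\mathcal{G}_{f,\mathrm{id}}}(y).
\end{align*}
```

The third line comes from maximizing $\beta y-\beta-\log(1-p)+\log(1-pe^{\beta})$ over $\beta<-\log p$, where the maximizer satisfies $pe^{\beta}=\frac{y-1}{y}$. At $y=1$ both sides reduce to $-\log(1-p)=-\log p_{0}$, the cost of every replication dying out without offspring.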

Now we present Proposition 2, which concerns the LDP for the empirical means of i.i.d. bivariate random variables $\{(Y_{n},Z_{n}):n\geq 1\}$ distributed as $(Y^{f,g},V_{0}^{f,g})$. In particular, we obtain an expression for the rate function $I_{\mathcal{G}_{f,g},g}$ in terms of $I_{f}$ in Lemma 1 and $I_{g}$ defined by

$$I_{g}(z):=\sup_{\gamma\in\mathbb{R}}\bigl\{\gamma z-\log g\bigl(e^{\gamma}\bigr)\bigr\}.$$

Proposition 2

Let $\{(Y_{n},Z_{n}):n\geq 1\}$ be i.i.d. random variables distributed as $(Y^{f,g},V_{0}^{f,g})$. Assume that $\mathbb{E}[e^{\beta Y^{f,g}+\gamma V_{0}^{f,g}}]$ is finite in a neighborhood of $(\beta,\gamma)=(0,0)$. Let $\{(\bar{Y}_{n},\bar{Z}_{n}):n\geq 1\}$ be the sequence of empirical means defined by $(\bar{Y}_{n},\bar{Z}_{n}):=(\frac{1}{n}\sum_{k=1}^{n}Y_{k},\frac{1}{n}\sum_{k=1}^{n}Z_{k})$ (for all $n\geq 1$). Then $\{(\bar{Y}_{n},\bar{Z}_{n}):n\geq 1\}$ satisfies the LDP with good rate function $I_{\mathcal{G}_{f,g},g}$ defined by

$$I_{\mathcal{G}_{f,g},g}(y,z)=\left\{\begin{array}{@{}ll}yI_{f}(\frac{y-z}{y})+I_{g}(z)&\ \mbox{if}\ y\geq z>0,\\ I_{g}(0)&\ \mbox{if}\ y=z=0,\\ \infty&\ \mbox{otherwise}.\end{array}\right.$$

Remark 3.

We are assuming (implicitly) that $p_{0}>0$ and $\mu_{f}\leq 1$ ; in fact, since we require that $\mathbb{E}[e^{\beta Y^{f,g}+\gamma V_{0}^{f,g}}]$ is finite in a neighborhood of $(\beta,\gamma)=(0,0)$ , we are assuming that $\mu_{f}<1$ and $\mu_{g}<\infty$ .

Proof.

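The LDP is a consequence of Cramér’s theorem (Theorem 1) with $d=2$, and the rate function $I_{\mathcal{G}_{f,g},g}$ is defined by

$$I_{\mathcal{G}_{f,g},g}(y,z):=\sup_{\beta,\gamma\in\mathbb{R}}\bigl\{\beta y+\gamma z-\log\mathbb{E}\bigl[e^{\beta Y^{f,g}+\gamma V_{0}^{f,g}}\bigr]\bigr\}.$$

Throughout the proof, we restrict our attention to the pairs $(y,z)$ such that $y\geq z\geq 0$. In fact, almost surely, we have $Y^{f,g}\geq V_{0}^{f,g}\geq 0$, and therefore $\bar{Y}_{n}\geq\bar{Z}_{n}\geq 0$; thus, by Remark 2 we have $I_{\mathcal{G}_{f,g},g}(y,z)=\infty$ if condition $y\geq z\geq 0$ fails.

We remark that $\mathbb{E}[s^{Y^{f,g}}|V_{0}^{f,g}]=(\mathcal{G}_{f,\mathrm{id}}(s))^{V_{0}^{f,g}}$, and therefore

$$\mathbb{E}\bigl[e^{\beta Y^{f,g}+\gamma V_{0}^{f,g}}\bigr]=\mathbb{E}\bigl[e^{\gamma V_{0}^{f,g}}\bigl(\mathcal{G}_{f,\mathrm{id}}\bigl(e^{\beta}\bigr)\bigr)^{V_{0}^{f,g}}\bigr]=g\bigl(e^{\gamma}\mathcal{G}_{f,\mathrm{id}}\bigl(e^{\beta}\bigr)\bigr);$$

thus,

$$I_{\mathcal{G}_{f,g},g}(y,z)=\sup_{\beta,\gamma\in\mathbb{R}}\bigl\{\beta y+\gamma z-\log g\bigl(e^{\gamma+\log\mathcal{G}_{f,\mathrm{id}}(e^{\beta})}\bigr)\bigr\}.$$

Furthermore, the function

$$(\beta,\gamma)\mapsto\bigl(\beta,\gamma+\log\mathcal{G}_{f,\mathrm{id}}\bigl(e^{\beta}\bigr)\bigr)$$

is a bijection defined on $\mathcal{D}(\mathcal{G}_{f,\mathrm{id}})\times\mathbb{R}$, where

$$\mathcal{D}(\mathcal{G}_{f,\mathrm{id}}):=\bigl\{\beta\in\mathbb{R}:\mathcal{G}_{f,\mathrm{id}}\bigl(e^{\beta}\bigr)<\infty\bigr\}$$

as in the proof of Proposition 1; then, for $\delta:=\gamma+\log\mathcal{G}_{f,\mathrm{id}}(e^{\beta})$, we obtain

$$I_{\mathcal{G}_{f,g},g}(y,z)=\sup_{\beta\in\mathcal{D}(\mathcal{G}_{f,\mathrm{id}}),\delta\in\mathbb{R}}\bigl\{\beta y+\bigl(\delta-\log\mathcal{G}_{f,\mathrm{id}}\bigl(e^{\beta}\bigr)\bigr)z-\log g\bigl(e^{\delta}\bigr)\bigr\}.$$

Thus, we have (note that the last equality holds by Proposition 1)

\begin{align*}
I_{\mathcal{G}_{f,g},g}(y,z)&\leq\sup_{\beta\in\mathbb{R}}\bigl\{\beta y-z\log\mathcal{G}_{f,\mathrm{id}}\bigl(e^{\beta}\bigr)\bigr\}+\sup_{\delta\in\mathbb{R}}\bigl\{\delta z-\log g\bigl(e^{\delta}\bigr)\bigr\}\\
&=\left\{\begin{array}{@{}ll}zI_{\mathcal{G}_{f,\mathrm{id}}}(y/z)+I_{g}(z)&\ \mbox{if}\ y\geq z>0,\\ I_{g}(0)&\ \mbox{if}\ y=z=0,\\ \infty&\ \mbox{otherwise}\end{array}\right.\\
&=\left\{\begin{array}{@{}ll}yI_{f}(\frac{y-z}{y})+I_{g}(z)&\ \mbox{if}\ y\geq z>0,\\ I_{g}(0)&\ \mbox{if}\ y=z=0,\\ \infty&\ \mbox{otherwise}.\end{array}\right.
\end{align*}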

We conclude by showing the inverse inequality

[TABLE]

To this end, we take two sequences $\{\beta_{n}:n\geq 1\}$ and $\{\delta_{n}:n\geq 1\}$ such that

[TABLE]

and

[TABLE]

Then we have

[TABLE]

and we get the inverse inequality by letting $n$ go to infinity. ∎

4 Large deviations for estimators of $\mu_{f}$

In this section, we prove two LDPs for two sequences of estimators of the offspring mean $\mu_{f}$ . Namely, if $\{(\bar{Y}_{n},\bar{Z}_{n}):n\geq 1\}$ is the sequence in Proposition 2 (see also the precise assumptions in Remark 3; in particular, we have $\mu_{f}<1$ ), then we consider:

1. $\{\frac{\bar{Y}_{n}-\bar{Z}_{n}}{\bar{Y}_{n}}:n\geq 1\}$;

2. $\{\frac{\bar{Y}_{n}-\mu_{g}}{\bar{Y}_{n}}:n\geq 1\}$.

Obviously, these estimators are well defined if the denominators $\bar{Y}_{n}$ are different from zero; thus, in order to have well-defined estimators, we always assume that $q_{0}=0$ (where $q_{0}=P(V_{0}^{f,g}=0)$), and, noting that, in general, $I_{g}(0)=-\log q_{0}$, we have

$$I_{g}(0)=\infty.$$

Moreover, both sequences converge to $\frac{\nu^{f,g}-\mu_{g}}{\nu^{f,g}}=\mu_{f}$ as $n\to\infty$ (where $\nu^{f,g}=\frac{\mu_{g}}{1-\mu_{f}}$ is the mean value of the total progeny recalled in Section 2.1), and they coincide when the initial population is deterministic (equal to $\mu_{g}$ almost surely).
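The two estimators $\frac{\bar{Y}_{n}-\bar{Z}_{n}}{\bar{Y}_{n}}$ and $\frac{\bar{Y}_{n}-\mu_{g}}{\bar{Y}_{n}}$ can be tried on simulated data. The following sketch is an added illustration, not code from the paper: it assumes finitely supported laws given as lists, a subcritical offspring law (so that each total progeny is almost surely finite) and $q_{0}=0$; the function names and the example laws are hypothetical.

```python
import random


def sample_from_pmf(pmf, rng):
    """Draw an index h with probability pmf[h] by inverting the cdf."""
    u = rng.random()
    acc = 0.0
    for value, prob in enumerate(pmf):
        acc += prob
        if u < acc:
            return value
    return len(pmf) - 1  # guard against rounding in the cumulative sum


def total_progeny(offspring_pmf, v0, rng):
    """Total number of individuals over all generations, starting from v0."""
    total, current = 0, v0
    while current > 0:
        total += current
        current = sum(sample_from_pmf(offspring_pmf, rng) for _ in range(current))
    return total


def estimate_offspring_mean(offspring_pmf, initial_pmf, n, seed=0):
    """Return both estimators of mu_f from n i.i.d. replications of (Y, V_0)."""
    rng = random.Random(seed)
    mu_g = sum(r * q for r, q in enumerate(initial_pmf))
    y_sum = z_sum = 0
    for _ in range(n):
        z = sample_from_pmf(initial_pmf, rng)
        y_sum += total_progeny(offspring_pmf, z, rng)
        z_sum += z
    y_bar, z_bar = y_sum / n, z_sum / n
    return (y_bar - z_bar) / y_bar, (y_bar - mu_g) / y_bar


if __name__ == "__main__":
    # Offspring mean 0.7; initial population 1 or 2 with equal probability.
    print(estimate_offspring_mean([0.5, 0.3, 0.2], [0.0, 0.5, 0.5], n=20000))
```

Note that the first estimator equals the total number of offspring born divided by the total number of individuals, which makes its consistency for $\mu_{f}$ transparent.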

The LDPs of these two sequences are proved in Propositions 3 and 4. Moreover, Corollary 1 and Remark 4 concern the comparison between the convergence of the first sequence $\{\frac{\bar{Y}_{n}-\bar{Z}_{n}}{\bar{Y}_{n}}:n\geq 1\}$ and its analogue when the initial population is deterministic (equal to the mean). Propositions 3 and 4 are proved by combining the contraction principle (see e.g. Theorem 4.2.1 in [6]) and Proposition 2 (note that the rate function $I_{\mathcal{G}_{f,g},g}$ in Proposition 2 is good, as it is required to apply the contraction principle). We remark that, in the proofs of Propositions 3 and 4, we take into account that $I_{\mathcal{G}_{f,g},g}(0,0)=\infty$ by Proposition 2 and $I_{g}(0)=\infty$ . At the end of this section, we present some remarks on the comparison between the rate functions in Propositions 3 and 4 (Remarks 5 and 6).

We start with the LDP of the first sequence of estimators.

Proposition 3

Assume the same hypotheses of Proposition 2 and $q_{0}=0$ . Let $\{(Y_{n},Z_{n}):n\geq 1\}$ be i.i.d. random variables distributed as $(Y^{f,g},V_{0}^{f,g})$ . Let $\{(\bar{Y}_{n},\bar{Z}_{n}):n\geq 1\}$ be the sequence of empirical means defined by $(\bar{Y}_{n},\bar{Z}_{n}):=(\frac{1}{n}\sum_{k=1}^{n}Y_{k},\frac{1}{n}\sum_{k=1}^{n}Z_{k})$ (for all $n\geq 1$ ). Then $\{\frac{\bar{Y}_{n}-\bar{Z}_{n}}{\bar{Y}_{n}}:n\geq 1\}$ satisfies the LDP with good rate function $J_{\mathcal{G}_{f,g},g}$ defined by

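$$J_{\mathcal{G}_{f,g},g}(x):=\left\{\begin{array}{@{}ll}-\log g\bigl(e^{-\frac{I_{f}(x)}{1-x}}\bigr)&\ \mbox{if}\ x\in[0,1),\\ \infty&\ \mbox{otherwise}.\end{array}\right.$$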

Proof.

By Proposition 2 and the contraction principle we have the LDP of $\{\frac{\bar{Y}_{n}-\bar{Z}_{n}}{\bar{Y}_{n}}:n\geq 1\}$ with good rate function $J_{\mathcal{G}_{f,g},g}$ defined by

[TABLE]

The case $x\notin[0,1)$ is trivial because we have the infimum over the empty set. For $x\in[0,1)$, we rewrite this expression as follows (where we take into account the expression of the rate function $I_{\mathcal{G}_{f,g},g}$ in Proposition 2):

\begin{align*}
J_{\mathcal{G}_{f,g},g}(x)&=\inf\biggl\{I_{\mathcal{G}_{f,g},g}\biggl(\frac{z}{1-x},z\biggr):z>0\biggr\}\\
&=\inf\biggl\{\frac{z}{1-x}I_{f}\biggl(\frac{\frac{z}{1-x}-z}{\frac{z}{1-x}}\biggr)+I_{g}(z):z>0\biggr\}\\
&=\inf\biggl\{\frac{z}{1-x}I_{f}(x)+I_{g}(z):z>0\biggr\}\\
&=-\sup\biggl\{-z\frac{I_{f}(x)}{1-x}-I_{g}(z):z>0\biggr\};
\end{align*}

thus, since $I_{g}(z)=\infty$ for $z\leq 0$, we obtain $J_{\mathcal{G}_{f,g},g}(x)=-\log g(e^{-\frac{I_{f}(x)}{1-x}})$ by taking into account the definition of $I_{g}$ (given before Proposition 2) and the well-known properties of Legendre transforms (see e.g. Lemma 4.5.8 in [6]; see also Lemma 2.2.5(a) and Exercise 2.2.22 in [6] for the convexity and the lower semicontinuity of $\gamma\mapsto\log g(e^{\gamma})$). ∎

We have an immediate consequence of this proposition that concerns the case with a deterministic initial population equal to $\mu_{g}$ (almost surely). Namely, if we consider the probability generating function $g_{\diamond}$ defined by $g_{\diamond}(s):=s^{\mu_{g}}$ (for all $s$ ), then we mean the case $g=g_{\diamond}$ , and therefore:

• $V_{0}^{f,g_{\diamond}}=\mu_{g}$ almost surely; thus, $Z_{n}=\mu_{g}$ and $\bar{Z}_{n}=\mu_{g}$ almost surely (for all $n\geq 1$);

• $\{Y_{n}^{f,g_{\diamond}}:n\geq 1\}$ are i.i.d. random variables distributed as $Y^{f,g_{\diamond}}$, that is,

[TABLE]

• the rate function $J_{\mathcal{G}_{f,g_{\diamond}},g_{\diamond}}$ is

$$J_{\mathcal{G}_{f,g_{\diamond}},g_{\diamond}}(x)=\left\{\begin{array}{@{}ll}\frac{\mu_{g}I_{f}(x)}{1-x}&\ \mbox{if}\ x\in[0,1),\\ \infty&\ \mbox{otherwise}\end{array}\right.$$

by Proposition 3.

Corollary 1 (Comparison between $J_{\mathcal{G}_{f,g},g}$ in Proposition 3 and $J_{\mathcal{G}_{f,g_{\diamond}},g_{\diamond}}$)

We have $J_{\mathcal{G}_{f,g},g}(x)\leq J_{\mathcal{G}_{f,g_{\diamond}},g_{\diamond}}(x)$ for all $x\in\mathbb{R}$ . Moreover the inequality turns into an equality if and only if we have one of the following cases:

• $x\notin[0,1)$ and $J_{\mathcal{G}_{f,g},g}(x)=J_{\mathcal{G}_{f,g_{\diamond}},g_{\diamond}}(x)=\infty$;

• $x=\mu_{f}$ and $J_{\mathcal{G}_{f,g},g}(x)=J_{\mathcal{G}_{f,g_{\diamond}},g_{\diamond}}(x)=0$;

• $V_{0}^{f,g}$ is deterministic, equal to $\mu_{g}$, and $J_{\mathcal{G}_{f,g},g}(x)=J_{\mathcal{G}_{f,g_{\diamond}},g_{\diamond}}(x)$ for all $x\in\mathbb{R}$.

Proof.

The case $x\notin[0,1)$ is trivial. If instead $x\in[0,1)$, then by Jensen’s inequality we have

[TABLE]

moreover, the cases where the inequality turns into an equality follow from the well-known properties of Jensen’s inequality. ∎
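The Jensen step can be sketched as follows. The sketch assumes that $g$ is the probability generating function of the initial population, $g(s)=\mathbb{E}[s^{V_{0}^{f,g}}]$, with mean $\mu_{g}=\mathbb{E}[V_{0}^{f,g}]$, and it uses the shorthand $c:=\frac{I_{f}(x)}{1-x}\geq 0$, which is introduced here only for readability.

```latex
\begin{align*}
J_{\mathcal{G}_{f,g},g}(x)
&=-\log g\bigl(e^{-c}\bigr)
 =-\log\mathbb{E}\bigl[e^{-c\,V_{0}^{f,g}}\bigr]\\
&\leq-\log e^{-c\,\mathbb{E}[V_{0}^{f,g}]}
 =c\,\mu_{g}
 =J_{\mathcal{G}_{f,g_{\diamond}},g_{\diamond}}(x),
\end{align*}
```

Here the inequality is Jensen's inequality for the convex function $v\mapsto e^{-cv}$. It is an equality exactly when $c=0$ (that is, $I_{f}(x)=0$, which corresponds to $x=\mu_{f}$) or when $V_{0}^{f,g}$ is almost surely constant, in agreement with the cases listed in the corollary.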

Remark 4 (Comparison between convergence of estimators of $\mu_{f}$ ).

Assume that $\mu_{f}>0$ and the initial population is not deterministic. Then there exists $\eta>0$ such that

[TABLE]

Thus, we can say that $\{\frac{\bar{Y}_{n}^{f,g_{\diamond}}-\mu_{g}}{\bar{Y}_{n}^{f,g_{\diamond}}}:n\geq 1\}$ converges to $\mu_{f}$ (as $n\to\infty$) faster than $\{\frac{\bar{Y}_{n}^{f,g}-\bar{Z}_{n}}{\bar{Y}_{n}^{f,g}}:n\geq 1\}$; in fact, we can find $\varepsilon>0$ such that

[TABLE]

We can repeat the same argument to say that $\{\frac{\bar{Y}_{n}^{f,g_{\diamond}}-\mu_{g}}{\bar{Y}_{n}^{f,g_{\diamond}}}:n\geq 1\}$ converges to $\mu_{f}$ (as $n\to\infty$) faster than $\{\bar{X}_{n}:n\geq 1\}$ in Lemma 1. In fact, we have $V_{0}^{f,g_{\diamond}}=\mu_{g}$ almost surely, $\mu_{g}$ is an integer, and, since $\mu_{g}>0$ because $q_{0}=0$, we have $\mu_{g}\geq 1$; then we have

[TABLE]

(we can also consider the case $x=0$ if $\mu_{g}>1$).
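The ordering of the rate functions discussed in this remark can be illustrated numerically. The sketch below is a toy example under assumptions that are not from the paper: Bernoulli offspring with parameter `P`, an initial population uniform on $\{1,3\}$ (so $\mu_{g}=2$ is an integer and $g(s)=(s+s^{3})/2$), the closed form of Proposition 3 for the random initial population, and $\mu_{g}I_{f}(x)/(1-x)$ for the deterministic one. The function names are hypothetical.

```python
import math

P = 0.3   # hypothetical offspring law Bernoulli(P): mu_f = P
MU_G = 2  # hypothetical V_0 uniform on {1, 3}: g(s) = (s + s**3) / 2


def rate_f(x):
    """I_f(x) for Bernoulli(P) offspring, valid for 0 <= x < 1."""
    value = (1 - x) * math.log((1 - x) / (1 - P))
    if x > 0:
        value += x * math.log(x / P)
    return value


def j_random(x):
    """-log g(exp(-I_f(x)/(1-x))): random initial population."""
    s = math.exp(-rate_f(x) / (1 - x))
    return -math.log((s + s**3) / 2)


def j_deterministic(x):
    """mu_g * I_f(x)/(1-x): deterministic initial population equal to mu_g."""
    return MU_G * rate_f(x) / (1 - x)


if __name__ == "__main__":
    for x in (0.1, 0.2, 0.4, 0.6):
        print(f"x={x:.1f}  I_f={rate_f(x):.4f}  "
              f"J_random={j_random(x):.4f}  J_det={j_deterministic(x):.4f}")
```

A larger rate function means faster exponential decay of the deviation probabilities, which is the sense in which one estimator converges faster than another in this remark.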

Now we present the LDP for the second sequence of estimators.

Proposition 4

Assume the same hypotheses as in Proposition 2 and $q_{0}=0$. Let $\{Y_{n}:n\geq 1\}$ be i.i.d. random variables distributed as $Y^{f,g}$. Let $\{\bar{Y}_{n}:n\geq 1\}$ be the sequence of empirical means defined by $\bar{Y}_{n}:=\frac{1}{n}\sum_{k=1}^{n}Y_{k}$ (for all $n\geq 1$). Then $\{\frac{\bar{Y}_{n}-\mu_{g}}{\bar{Y}_{n}}:n\geq 1\}$ satisfies the LDP with good rate function $J_{\mu_{g}}$ defined by

[TABLE]

Proof.

By Proposition 2 and the contraction principle we have the LDP of $\{\frac{\bar{Y}_{n}-\mu_{g}}{\bar{Y}_{n}}:n\geq 1\}$ with good rate function $J_{\mu_{g}}$ defined by

[TABLE]

The case $x\geq 1$ is trivial because we have the infimum over the empty set (we recall that $\mu_{g}>0$ because $q_{0}=0$ ). For $x<1$ , we have

[TABLE]

and we obtain the desired formula by taking into account the expression of the rate function $I_{\mathcal{G}_{f,g},g}$ in Proposition 2. ∎

Remark 5 (We can have $J_{\mu_{g}}(x)<\infty$ for some $x<0$ ).

We know that, for $J_{\mathcal{G}_{f,g},g}$ in Proposition 3, we have $J_{\mathcal{G}_{f,g},g}(x)=\infty$ for $x\notin[0,1)$. By contrast, we may have $J_{\mu_{g}}(x)<\infty$ for some $x<0$. To explain this fact, let $r_{\mathrm{min}}$ denote the minimum value $r$ such that $q_{r}>0$; then $\mu_{g}\geq r_{\mathrm{min}}$, with $\mu_{g}>r_{\mathrm{min}}$ if $q_{r_{\mathrm{min}}}<1$. In conclusion, if $\mu_{g}>r_{\mathrm{min}}$, then the range of negative values $x$ such that $J_{\mu_{g}}(x)<\infty$ is

[TABLE]

in fact, for $x<1$, both $I_{f}(\frac{\frac{\mu_{g}}{1-x}-z}{\frac{\mu_{g}}{1-x}})$ and $I_{g}(z)$ are finite for $z\in[r_{\mathrm{min}},\frac{\mu_{g}}{1-x}]$, and therefore we can say that $J_{\mu_{g}}(x)<\infty$ if $r_{\mathrm{min}}\leq\frac{\mu_{g}}{1-x}$ or, equivalently, if \eqref{eq:range-of-negative-x} holds.
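The finiteness condition $r_{\mathrm{min}}\leq\frac{\mu_{g}}{1-x}$ rearranges to $x\geq 1-\frac{\mu_{g}}{r_{\mathrm{min}}}$, and this can be illustrated on a toy example. The sketch below is not from the paper: it assumes Bernoulli offspring with parameter `P`, an initial population uniform on $\{1,3\}$ (so $\mu_{g}=2$ and $r_{\mathrm{min}}=1$), the form $I_{\mathcal{G}_{f,g},g}(y,z)=yI_{f}(\frac{y-z}{y})+I_{g}(z)$ used in the proofs above, and a plain grid search over $z$. All names are hypothetical.

```python
import math

P = 0.3                             # hypothetical offspring law Bernoulli(P)
MU_G, R_MIN, R_MAX = 2.0, 1.0, 3.0  # hypothetical V_0 uniform on {1, 3}


def kl(u, q):
    """Relative entropy of Bernoulli(u) w.r.t. Bernoulli(q), with 0 log 0 = 0."""
    out = 0.0
    if u > 0:
        out += u * math.log(u / q)
    if u < 1:
        out += (1 - u) * math.log((1 - u) / (1 - q))
    return out


def rate_f(x):
    """Cramer rate function I_f of the Bernoulli(P) offspring law."""
    return kl(x, P) if 0 <= x <= 1 else math.inf


def rate_g(z):
    """I_g for V_0 = 1 + 2B, B ~ Bernoulli(1/2): finite only on [1, 3]."""
    return kl((z - 1) / 2, 0.5) if R_MIN <= z <= R_MAX else math.inf


def j_mu_g(x, grid=20001):
    """Grid approximation of inf over z of y*I_f((y-z)/y) + I_g(z), y = mu_g/(1-x)."""
    if x >= 1:
        return math.inf  # infimum over the empty set
    y = MU_G / (1 - x)
    best = math.inf
    for k in range(grid):
        z = R_MIN + (R_MAX - R_MIN) * k / (grid - 1)
        if z > y:
            break  # (y - z)/y < 0, where I_f is infinite
        best = min(best, y * rate_f((y - z) / y) + rate_g(z))
    return best


if __name__ == "__main__":
    for x in (-1.5, -1.0, -0.5, 0.0, P):
        print(x, j_mu_g(x))
```

In this example $1-\frac{\mu_{g}}{r_{\mathrm{min}}}=-1$, so the approximation is finite for $x\in[-1,0)$ and infinite for $x<-1$.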

Remark 6 (Estimators of $\mu_{f}$ when $\mu_{f}=0$ ).

If $\mu_{f}=0$ , that is, $f(s)=1$ for all $s$ or, equivalently, $p_{0}=1$ , then the rate function in Proposition 3 is

[TABLE]

Then it is easy to check that $J_{\mathcal{G}_{f,g},g}$ coincides with $I_{f}$, and therefore $J_{\mathcal{G}_{f,g},g}$ coincides with $J_{\mathcal{G}_{f,g_{\diamond}},g_{\diamond}}$ in \eqref{eq:main-estimators-rf-deterministic-initial-population} (note that, in particular, we cannot have the strict inequalities in \eqref{eq:local-strict-inequality-between-rf} in Remark 4 stated for the case $\mu_{f}>0$). Finally, if $\mu_{f}=0$ (and as usual $q_{0}=0$ or, equivalently, $\mu_{g}>0$), then we have $z=\frac{\mu_{g}}{1-x}$ in the variational formula of the rate function in Proposition 4, and therefore

[TABLE]

Note that the rate function in \eqref{eq:rf-prop-minor-estimators-muf=0} can also be derived by combining the contraction principle and the rate function $I_{g}$ for the empirical means $\{\bar{Z}_{n}:n\geq 1\}$; in fact, we have $\{\frac{\bar{Y}_{n}-\mu_{g}}{\bar{Y}_{n}}:n\geq 1\}=\{\frac{\bar{Z}_{n}-\mu_{g}}{\bar{Z}_{n}}:n\geq 1\}$, and the rate function $I_{g}$ is good by the hypotheses of Proposition 4 (see Proposition 2 and Remark 3). We also note that inequality \eqref{eq:range-of-negative-x} appears in the rate function expression \eqref{eq:rf-prop-minor-estimators-muf=0}.
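The contraction argument for $\mu_{f}=0$ can be sketched as follows. The map $\varphi$ is notation introduced here for illustration, and the resulting expression is a reconstruction from the text above (in particular from $z=\frac{\mu_{g}}{1-x}$), not a quotation of the paper's display.

```latex
% Sketch for \mu_f = 0: the estimator is a function of \bar{Z}_n only.
\[
\frac{\bar{Z}_{n}-\mu_{g}}{\bar{Z}_{n}}=\varphi(\bar{Z}_{n}),\qquad
\varphi(z):=1-\frac{\mu_{g}}{z}\quad(z>0),\qquad
\varphi^{-1}(x)=\frac{\mu_{g}}{1-x}\quad(x<1),
\]
\[
J_{\mu_{g}}(x)=\inf\bigl\{I_{g}(z):z>0,\ \varphi(z)=x\bigr\}
=I_{g}\Bigl(\frac{\mu_{g}}{1-x}\Bigr)\quad(x<1),
\qquad
J_{\mu_{g}}(x)=\infty\quad(x\geq 1).
\]
```

Since $I_{g}(z)=\infty$ for $z<r_{\mathrm{min}}$, this expression can be finite for negative $x$ only when $\frac{\mu_{g}}{1-x}\geq r_{\mathrm{min}}$, which is the condition discussed in Remark 5.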

Acknowledgments

The authors thank a referee for suggesting shorter proofs of Propositions 1 and 2. The support of GNAMPA (INDAM) is acknowledged.

Bibliography

[1] Asmussen, S., Hering, H.: Branching Processes. Birkhäuser, Boston (1983). doi:10.1007/978-1-4615-8155-0, MR0701538
[2] Athreya, K.B.: Large deviation rates for branching processes. I. Single type case. Ann. Appl. Probab. 4, 779–790 (1994). doi:10.1214/aoap/1177004971, MR1284985
[3] Athreya, K.B., Ney, P.E.: Branching Processes. Springer, New York, Heidelberg (1972). MR0373040
[4] Athreya, K.B., Vidyashankar, A.N.: Large deviation rates for branching processes. II. The multitype case. Ann. Appl. Probab. 5, 566–576 (1995). doi:10.1214/aoap/1177004778, MR1336883
[5] Biggins, J.D., Bingham, N.H.: Large deviations in the supercritical branching process. Adv. Appl. Probab. 25, 757–772 (1993). doi:10.1017/S0001867800025738, doi:10.2307/1427790, MR1241927
[6] Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications, 2nd edn. Springer, New York (1998). doi:10.1007/978-1-4612-5320-4, MR1619036
[7] Dwass, M.: The total progeny in a branching process and a related random walk. J. Appl. Probab. 6, 682–686 (1969). doi:10.1017/S0021900200026711, MR0253433
[8] Fleischmann, K., Wachtel, V.: Lower deviation probabilities for supercritical Galton–Watson processes. Ann. Inst. Henri Poincaré Probab. Stat. 43, 233–255 (2007). doi:10.1016/j.anihpb.2006.03.001, MR2303121
