Sesqui-type branching processes
Svante Janson, Oliver Riordan, Lutz Warnke

Abstract
We consider branching processes consisting of particles (individuals) of two types (type L and type S) in which only particles of type L have offspring, proving estimates for the survival probability and the (tail of) the distribution of the total number of particles. Such processes are in some sense closer to single- than to multi-type branching processes. Nonetheless, the second, barren, type complicates the analysis significantly. The results proved here (about point and survival probabilities) are a key ingredient in the analysis of bounded-size Achlioptas processes in a recent paper by the last two authors.
AMS 2010 Mathematics Subject Classification: 60J80
Svante Janson: Department of Mathematics, Uppsala University, PO Box 480, SE-751 06 Uppsala, Sweden. Partly supported by the Knut and Alice Wallenberg Foundation.
Oliver Riordan: Mathematical Institute, University of Oxford, Radcliffe Observatory Quarter, Woodstock Road, Oxford OX2 6GG, UK.
Lutz Warnke: School of Mathematics, Georgia Institute of Technology, Atlanta GA 30332, USA; and Peterhouse, Cambridge CB2 1RD, UK.
(June 24, 2017)
1 Introduction
Throughout the paper we consider branching processes in which every particle is of one of two types, called (for compatibility with the notation in [22]) ‘type $L$’ and ‘type $S$’. Particles of type $S$ may be thought of as barren: they have no children. Each particle of type $L$ will have some random number of children of each type; as usual, we have independence between the children of different particles, but the numbers $Y$ and $Z$ of type-$L$ and type-$S$ children of one particle need not be independent. The formal definition is as follows.
Definition 1.1.
Let $(Y,Z)$ and $(Y^0,Z^0)$ be probability distributions on $\mathbb{N}^2$. We write $\mathfrak{X}^1=\mathfrak{X}^1_{Y,Z}$ for the Galton–Watson branching process started with a single particle of type $L$, in which each particle of type $L$ has $Y$ children of type $L$ and $Z$ of type $S$. Particles of type $S$ have no children, and the children of different particles are independent. We write $\mathfrak{X}=\mathfrak{X}_{Y,Z,Y^0,Z^0}$ for the branching process defined as follows: start in generation one with $Y^0$ particles of type $L$ and $Z^0$ of type $S$. Those of type $L$ have children according to $\mathfrak{X}^1_{Y,Z}$, independently of each other and of the first generation. Those of type $S$ have no children. We write $|\mathfrak{X}|$ ($|\mathfrak{X}^1|$) for the total number of particles in $\mathfrak{X}$ ($\mathfrak{X}^1$).
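To make the definition concrete, here is a minimal simulation sketch. The offspring law in `sample_offspring` and the names `total_size`, `first_generation` and `cap` are illustrative assumptions, not taken from the paper; the only structural facts used are that type-L particles reproduce, type-S particles are barren, and the first generation may follow a different law.

```python
import random

def sample_offspring(rng):
    """Hypothetical joint law for (Y, Z): Y ~ Binomial(2, 0.4), Z = Y + Bernoulli(0.5).

    Y and Z are deliberately dependent, which the definition allows.
    """
    y = sum(rng.random() < 0.4 for _ in range(2))
    z = y + (rng.random() < 0.5)
    return y, z

def total_size(first_generation, offspring, rng, cap=10**6):
    """Return (n_L, n_S), the numbers of type-L and type-S particles.

    Returns None if the total exceeds `cap` (a stand-in for survival).
    """
    y0, z0 = first_generation(rng)
    n_l, n_s = y0, z0
    unexplored = y0  # type-L particles whose children are not yet revealed
    while unexplored > 0:
        y, z = offspring(rng)      # reveal the children of one type-L particle
        unexplored += y - 1
        n_l += y
        n_s += z                   # type-S children are counted but never explored
        if n_l + n_s > cap:
            return None
    return n_l, n_s
```

For example, `total_size(lambda r: (1, 0), sample_offspring, random.Random(0))` explores a process started from a single type-L particle; since the hypothetical law has mean 0.8 type-L children, the process is subcritical and the loop terminates almost surely.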
These branching processes are in some sense essentially single-type: one could first generate the tree of type-$L$ particles as a classical single-type Galton–Watson process, and then consider particles of type $S$. However, since the numbers of type-$L$ and type-$S$ children are not necessarily independent, this two-stage description does not seem particularly easy to work with.
The motivation for considering such processes (and in particular for allowing a different rule for the first generation) comes from the application to studying the phase transition in Achlioptas processes in [22]. Achlioptas processes are evolving random graph models that have received considerable attention (see, e.g., [1; 19; 4; 24; 14; 20; 15; 3; 21] and the references therein). We shall say nothing further about these random graph processes here, aiming to keep the paper self-contained, and purely about branching processes.
We shall prove two main results. Firstly, in Section 2, we consider an individual branching process of the type above, giving an asymptotic formula for the point probability under certain conditions on the distributions and . This formula is proved in Sections 2.1–2.3, which are the heart of the paper. Then, in Section 3, we consider families of processes where the offspring distribution varies analytically in an additional parameter . Roughly speaking, we show that the key quantities in the formula in Section 2 then vary analytically in . This result (which in particular implies properties of the near-critical case) is needed in [22]. Finally, in Section 4, we prove corresponding results for the survival probability . Here the barren type plays no role, so the results effectively concern single-type processes and are much simpler.
Remark 1.2.
Although the definition of sesqui-type branching processes is adapted to the application in [22], the results here are applicable, at least in principle, to a more general class of branching processes. Consider a finite-type Galton–Watson process in which there is one special type (type $L$), and all other types are ‘doomed’ (lead to finite trees of descendants a.s.). Such a process may be transformed into a sesqui-type process in a natural way: for each type-$L$ particle replace its children of all doomed types, and their (necessarily doomed) descendants, by type-$S$ children (keeping the same total number of particles). For our results to apply to the transformed process we need further conditions, roughly speaking that the ‘doomed’ subtrees are not too close to critical; but in outline, all processes with (at most) one type that can potentially survive are covered. Branching processes of this type (with one doomed type) have been studied by several authors, giving various results different from ours; see for example [23; 25; 7].
1.1 Some notation and conventions
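Throughout we write $\mathbb{N}$ for the non-negative integers.
Given a two-dimensional random variable $(Y,Z)$ taking values in $\mathbb{N}^2$, we denote its bivariate probability generating function by
$$g_{Y,Z}(y,z) := \mathbb{E}\bigl(y^{Y}z^{Z}\bigr) = \sum_{k,\ell\in\mathbb{N}} \mathbb{P}(Y=k,\,Z=\ell)\,y^{k}z^{\ell} \tag{1.1}$$
for all complex $y$ and $z$ such that the expectation (or sum) converges absolutely. We will also consider the bivariate moment generating function
$$f_{Y,Z}(y,z) := \mathbb{E}\bigl(e^{yY+zZ}\bigr) = g_{Y,Z}(e^{y},e^{z}). \tag{1.2}$$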
When considering a particular branching process as in Definition 1.1, we often write and for brevity.
We denote the coefficient of in a power series by .
We say that a function defined on is analytic if for every there is an and a power series with radius of convergence at least such that and coincide on . A function defined on some domain including is analytic on if is analytic. The definitions for functions of several real or complex variables are analogous.
If is an analytic function of variables, defined in an open set , we denote its derivative by , and its th derivative by . Note that is an analytic function from to the linear space of all (symmetric) -linear forms . In particular, for each , is a linear form, which can also be regarded as a vector (the usual gradient); we write , so $Df(z)=\bigl(D_{1}f(z),\dots,D_{d}f(z)\bigr)$. Similarly, is a bilinear form, which may be regarded as a matrix with entries , where . We denote its determinant by . (This is known as the Hessian of .)
For a vector , let denote , where the vector is repeated times. When using coordinates in the case , we write for , so, regarding as a matrix and as a (column) vector, we have
[TABLE]
We denote the usual Euclidean norm of vectors by . For operators and the multilinear forms we use for the usual norm (any other norm would do as well).
For real symmetric matrices, means that is positive definite, i.e., that for all real vectors . In particular, if is a symmetric matrix and , then
[TABLE]
Remark 1.3.
We adopt the following notational convention regarding constants. and are used ‘locally’ (within a single proof), while numbered constants , etc retain their meaning throughout the paper. The constants , which are numbered in the order they are introduced, obey the inequalities
[TABLE]
We write , , for complex variables, and , , , for real variables. All constants , etc are positive.
2 Point probabilities of a single branching process
In this section we study the point probabilities of the branching process from Definition 1.1. To formulate our main result we need some further definitions (which encapsulate fairly mild and natural conditions for the offspring distributions).
Definition 2.1.
Suppose that , , and .
(i) Let be the set of probability distributions on such that if , then
[TABLE]
(ii) Let be the set of probability distributions on such that
[TABLE]
(iii) Let .
We write if the distribution of is in , and similarly for and . The key condition here is the (uniform) bound (2.1) on the probability generating functions. The condition (2.3) is needed, roughly speaking, to ensure that is not essentially supported on a sublattice of . Note that trivially implies
[TABLE]
and similarly .
The following theorem gives the qualitative behaviour of the size-$N$ point probabilities of the branching process from Definition 1.1. The statement of Theorem 2.2 is not self-contained since the parameters , and are defined (in a rather involved way) from the generating functions of $(Y,Z)$ and $(Y^0,Z^0)$, see (2.43)–(2.44) and Lemma 2.15 in Section 2.3. A key feature of the result is that the estimates and error terms are uniform over all distributions and , i.e., the explicit and implicit constants depend only on and . Note that, from (2.8) below, if and only if , and that decays exponentially in in the near-critical case .
Theorem 2.2 (Point probabilities of $\mathfrak{X}$).
Suppose that , , , and . Writing and , there exists a constant such that if , , and , then for all we have
[TABLE]
where, defining and as in (2.43)–(2.44) and as in Lemma 2.15, we have
[TABLE]
and
[TABLE]
Moreover, the implicit constants in (2.5)–(2.8) depend only on and .
The remainder of this section is devoted to the proof of Theorem 2.2. To this end we fix , , , and , and write and to avoid clutter. Let and denote the total numbers of type- and type- particles in , so , and set
[TABLE]
Of course, depends on the distributions of and . In Section 2.1 we establish a simple integral formula for . Then, in Section 2.2 we use a version of the saddle point method to estimate this integral asymptotically. Finally, in Section 2.3 we prove (2.5) by summing all with .
2.1 An integral formula for
In this section we derive an explicit integral formula for , see (2.14). We start with a simple conditional version of the classical Otter–Dwass formula (see e.g. Dwass [8]), which hinges on the random walk representation of a branching process and a well-known random-walk hitting time result.
Lemma 2.3.
For all integers and ,
[TABLE]
Proof.
Let be independent with each pair having the same distribution as . Since particles of type do not have any children, by exploring the branching process in the usual way (i.e., revealing the offspring of the particles of type one-by-one until none are left to explore), we have
[TABLE]
That the right-hand side of the above expression equals (2.10) is surely folklore (by conditioning on this also follows directly from [17, Theorem 7]); we include a short argument. Namely, by a version of the well-known Cyclic Lemma (sometimes also called Spitzer’s combinatorial lemma), see, e.g., [13, Lemma 15.3] or [18, Lemma 6.1], for any sequence with and , there are exactly cyclic shifts of for which all corresponding partial sums of length satisfy . Hence, taking a uniformly random cyclic shift of the independent variables , the formula (2.10) follows. ∎
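The Cyclic Lemma invoked in the proof above can be checked by brute force. The sketch below uses one classical normalization, which is an assumption here and may differ from the paper's exact phrasing: integer steps $x_i \geqslant -1$ (think of $x_i = y_i - 1$) with total $-k$ for some $k \geqslant 1$; a cyclic shift is counted if its partial sums stay strictly above $-k$ before the final step. The function names are illustrative.

```python
from itertools import product

def good_shifts(x):
    """Count cyclic shifts of x whose partial sums first reach sum(x) at the last step."""
    n, k = len(x), -sum(x)
    count = 0
    for s in range(n):
        shifted = x[s:] + x[:s]
        partial, ok = 0, True
        for step in shifted[:-1]:
            partial += step
            if partial <= -k:      # reached the final level too early
                ok = False
                break
        count += ok
    return count

def check_cycle_lemma(n, max_step=2):
    """Exhaustively test all length-n sequences with steps in {-1, ..., max_step}."""
    for x in product(range(-1, max_step + 1), repeat=n):
        k = -sum(x)
        if k >= 1 and good_shifts(list(x)) != k:
            return False
    return True
```

Calling `check_cycle_lemma(5)` enumerates all 1024 sequences of length 5 and confirms that each sequence with total $-k$ has exactly $k$ good shifts, which is the source of the factor $k/n$ in Otter–Dwass type formulas.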
Remark 2.4.
This two-type version of the Otter–Dwass formula is a simple variation of the usual one-type case; this is because one type is barren and can essentially be ignored. For a much more complicated formula in the general multi-type case, see Chaumont and Liu [5].
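As a sanity check of the usual one-type formula mentioned in this remark, the following sketch compares two computations of the total-progeny law of a single-type Galton–Watson tree. The offspring law Binomial(2, p) and all function names are assumptions chosen for illustration. One computation solves the functional equation $t(z) = z\,g(t(z))$ by truncated power-series iteration; the other uses the Otter–Dwass formula $\mathbb{P}(|T|=n) = \frac1n \mathbb{P}(Y_1+\dots+Y_n = n-1)$.

```python
from math import comb

def total_progeny_by_series(p, n_max):
    """Coefficients of t(z) solving t = z * g(t), g(s) = (1 - p + p*s)**2, up to z**n_max."""
    t = [0.0] * (n_max + 1)
    for _ in range(n_max + 1):          # each pass fixes at least one more coefficient
        a = [p * c for c in t]
        a[0] += 1 - p                   # a = 1 - p + p*t as a truncated series
        sq = [0.0] * (n_max + 1)
        for i, ai in enumerate(a):      # sq = a * a, truncated at degree n_max
            if ai == 0.0:
                continue
            for j in range(n_max + 1 - i):
                sq[i + j] += ai * a[j]
        t = [0.0] + sq[:n_max]          # multiply by z
    return t

def total_progeny_by_dwass(p, n):
    """(1/n) * P(Binomial(2n, p) = n - 1), the Otter-Dwass expression."""
    return comb(2 * n, n - 1) * p ** (n - 1) * (1 - p) ** (n + 1) / n
```

The two functions agree coefficient by coefficient, for instance for `p = 0.3` and `n` up to 8; the two-type formula (2.10) adds the bookkeeping for the barren type on top of this.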
The probability on the right-hand side of (2.10) can be expressed using generating functions as
[TABLE]
For and , recalling the notation (2.9) and summing (2.10) over all , we thus obtain
[TABLE]
where
[TABLE]
For later use, we also define
[TABLE]
Remark 2.5.
Let $G(y,z):=\mathbb{E}\bigl(y^{|\mathfrak{X}^{L}|}z^{|\mathfrak{X}^{S}|}\bigr)$ be the bivariate generating function for the size of the branching process , and let $G_{1}(y,z):=\mathbb{E}\bigl(y^{|\mathfrak{X}^{1,L}|}z^{|\mathfrak{X}^{1,S}|}\bigr)$ be the corresponding generating function when starting with a single particle of type . Then and , and the formula (2.12) can alternatively be obtained by the Lagrange inversion formula in the Bürmann form, see e.g. [9, A.(14)], regarding the generating functions as (formal) power series in with coefficients that are power series in . We omit the details.
The extraction of coefficients in (2.12) can be performed by complex integration in the usual way (e.g., using Cauchy’s integral formula to evaluate $\frac{\partial^{n+m}}{\partial y^{n}\partial z^{m}}\bigl(\tilde{g}_{0}(y,z)g(y,z)^{n}\bigr)\big|_{y=z=0}=n!\,m!\,np_{n,m}$ as in the textbook proof of Cauchy’s estimates), yielding the formula
[TABLE]
where we integrate (for example) over two circles with centre $0$ and radii such that and are defined. In particular, if and are both in , then for any we can integrate over and , and the standard change of variables , then yields
[TABLE]
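The coefficient extraction by Cauchy's integral formula can be illustrated numerically in one variable (the paper's formula is the two-variable analogue). This sketch discretizes the contour integral over the circle $|y| = r$ with the trapezoidal rule; the function name `coefficient`, the test function, and the number of sample points are illustrative assumptions.

```python
import cmath
from math import comb, pi

def coefficient(h, k, radius=1.0, points=256):
    """Approximate the coefficient of y**k in h(y) by sampling h on the circle |y| = radius."""
    total = 0j
    for j in range(points):
        theta = 2 * pi * j / points
        y = radius * cmath.exp(1j * theta)
        # After substituting y = radius * e^{i theta}, the contour integral becomes
        # an average of h(y) * e^{-i k theta} over theta, divided by radius**k.
        total += h(y) * cmath.exp(-1j * k * theta)
    return (total / points).real / radius ** k
```

For a polynomial of degree below `points` the rule is exact up to rounding, so `coefficient(lambda y: (0.5 + 0.5 * y) ** 20, 9)` reproduces `comb(20, 9) / 2 ** 20`, and the answer does not depend on the radius. The freedom to choose the radius is what the saddle point method exploits.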
Remark 2.6.
Alternatively, (2.14) can be obtained from (2.10) by first considering suitably tilted versions of the random variables (cf. Cramér [6]), and then passing to characteristic functions and making a Fourier inversion.
Remark 2.7.
It is not hard to write an integral formula for the final probability that we are aiming to estimate. For example, multiplying (2.12) by and summing we see that , where . Thus one can find by extracting the coefficient of in . However, the corresponding integral does not obviously lend itself to asymptotic evaluation by methods such as those used here. Still, a direct estimate of may perhaps be possible by appropriate singularity analysis.
2.2 An asymptotic estimate of
In this section we estimate the integral (2.14) asymptotically (see Theorem 2.11 below), using parameters defined in terms of the moment generating function $f(y,z)=f_{Y,Z}(y,z)=\mathbb{E}\bigl(e^{yY+zZ}\bigr)$. Whenever $f(y,z)$ is defined and non-zero, let
[TABLE]
taking the principal value of the logarithm; we shall only consider on domains on which . The next lemma simply states that in suitable domains, , and their (partial) derivatives are all bounded.
Lemma 2.8.
There exist constants and , , such that if and , then the following hold.
(i) If with , then .
(ii) If, in addition, , then is defined, and .
(iii) If , then .
Proof.
(i): When , then , which is at most by assumption. Thus when and . Recall that by assumption, so . For any , say, for suitable statement (i) follows by standard Cauchy estimates.
(ii): Let denote the constant from the above proof of (i). Set . Since , it follows from (i) that if , then
[TABLE]
so is defined and bounded. Furthermore, after decreasing and increasing , if necessary, the bounds for the derivatives now again follow by Cauchy’s estimates.
(iii): Let . By our assumption (2.2), . Furthermore, for by part (i). Consequently, after reducing if necessary, we have for , . ∎
The next lemma expresses, in a quantitative form, the unsurprising fact that if we evaluate the probability generating function at , which are not positive real numbers, then there is significant cancellation, i.e., is significantly smaller than . It will be more convenient to write this in terms of the moment generating function rather than .
Lemma 2.9.
There exists a constant such that if and with and , then
[TABLE]
Proof.
Let . Then
[TABLE]
and thus . Then
[TABLE]
Each term on the right-hand side is non-negative, and considering just the cases and , recalling (2.3) we obtain
[TABLE]
since for . Moreover, by Lemma 2.8(i), . Consequently
[TABLE]
for some constant , and thus
[TABLE]
establishing (2.17) since . ∎
We next establish that the symmetric bilinear form is positive-definite; a variant of the lower bound (2.18) could also be proved by first considering and then using continuity. For the interpretation of , see (1.3).
Lemma 2.10.
If and with , then , i.e.,
[TABLE]
In particular, .
Proof.
We first consider only , so Lemma 2.8(ii) applies. Then the estimate (2.17) can be written
[TABLE]
A Taylor expansion yields
[TABLE]
Since is real for real and , all derivatives are real. Hence, when taking the real part, the linear term vanishes, and (2.19) implies
[TABLE]
Exploiting bilinearity, by replacing with and letting , we now obtain (2.18) for all , with room to spare.
Finally, by (1.4), note that (2.18) can be written . This says that both eigenvalues are , and thus the determinant is . ∎
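The positive-definiteness established above has a familiar probabilistic reading, sketched here under the assumption that $\varphi$ is the logarithm of the moment generating function $f$: the Hessian of $\log f$ at a real point $(a,b)$ is the covariance matrix of $(Y,Z)$ under the law reweighted by $e^{aY+bZ}$, and it is positive definite whenever that law is not concentrated on a line. The joint law `LAW` and the function names below are hypothetical.

```python
from math import exp, log

# Hypothetical joint law of (Y, Z) on a few points of N^2; the weights sum to 1.
LAW = {(0, 0): 0.3, (1, 0): 0.2, (0, 1): 0.2, (1, 2): 0.2, (2, 1): 0.1}

def phi(a, b):
    """Log moment generating function log E exp(a*Y + b*Z)."""
    return log(sum(p * exp(a * y + b * z) for (y, z), p in LAW.items()))

def tilted_covariance(a, b):
    """Covariance matrix of (Y, Z) under the law reweighted by exp(a*Y + b*Z)."""
    f = sum(p * exp(a * y + b * z) for (y, z), p in LAW.items())
    w = {yz: p * exp(a * yz[0] + b * yz[1]) / f for yz, p in LAW.items()}
    my = sum(q * y for (y, z), q in w.items())
    mz = sum(q * z for (y, z), q in w.items())
    vyy = sum(q * (y - my) ** 2 for (y, z), q in w.items())
    vzz = sum(q * (z - mz) ** 2 for (y, z), q in w.items())
    vyz = sum(q * (y - my) * (z - mz) for (y, z), q in w.items())
    return [[vyy, vyz], [vyz, vzz]]

def hessian_fd(a, b, h=1e-4):
    """Central finite-difference Hessian of phi at (a, b)."""
    dyy = (phi(a + h, b) - 2 * phi(a, b) + phi(a - h, b)) / h ** 2
    dzz = (phi(a, b + h) - 2 * phi(a, b) + phi(a, b - h)) / h ** 2
    dyz = (phi(a + h, b + h) - phi(a + h, b - h)
           - phi(a - h, b + h) + phi(a - h, b - h)) / (4 * h ** 2)
    return [[dyy, dyz], [dyz, dzz]]
```

Comparing `tilted_covariance(0.2, -0.1)` with `hessian_fd(0.2, -0.1)` shows agreement up to discretization error, and the determinant of either matrix is strictly positive for this law.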
For , define
[TABLE]
We are now ready to estimate the integral (2.14) for using a (two-dimensional) version of the saddle point method (see, e.g., [9, Chapter VIII]). We defer the problem of finding suitable satisfying equation (2.21) to Section 2.3. Recall that $\tilde{f}_{0}(y,z)=\frac{\partial}{\partial y}f_{Y^{0},Z^{0}}(y,z)=\mathbb{E}\bigl(Y^{0}e^{yY^{0}+zZ^{0}}\bigr)$, see (2.13).
Theorem 2.11.
Suppose that and . Suppose further that , are integers and that , are real numbers with such that
[TABLE]
Then
[TABLE]
where the implicit constant depends only on the parameters of and .
Proof.
We write (2.14) as
[TABLE]
where
[TABLE]
Using assumption (2.21) we have , so
[TABLE]
We shall estimate (2.24) using Laplace’s method (in two dimensions), cf. e.g. [9, Appendix B.6]. Roughly speaking, the idea is as follows. We view the integrand as a product of a term independent of with a term that is exponential in . As we shall see, the condition (2.21) ensures that the exponent has a stationary point, in fact a maximum, at . It turns out that the main contribution is near to this point, and here the exponent may be approximated by a quadratic, leading to a (two-dimensional) Gaussian integral.
Applying Lemma 2.8(i) to shows that . Since $\operatorname{Det}\bigl(D^{2}\varphi(\alpha,\beta)\bigr)=\Omega(1)$ by Lemma 2.10, and by (2.20) and Lemma 2.8(ii), the conclusion (2.22) holds for any fixed simply by taking the implicit constant large enough. Thus we may assume that is at least any given constant , and in particular that .
Applying Lemma 2.8(i) to also shows that . Hence, if or , then by Lemma 2.9 the integrand in (2.24) is $O\bigl(e^{-c_{3}n\cdot n^{-0.8}}\bigr)=O\bigl(e^{-c_{3}n^{0.2}}\bigr)=O\bigl(n^{-99}\bigr)$. On the other hand, if , then, since , Lemma 2.8(ii) shows that is defined and we obtain
[TABLE]
with
[TABLE]
Considering a Taylor expansion of around , and noting that the linear terms cancel by our assumption (2.21), we have
[TABLE]
where we used Lemma 2.8(ii) to bound the error term. For , , note that Lemma 2.8(ii) implies , and . Hence, writing for brevity
[TABLE]
the exponential factor in (2.27) is
[TABLE]
Recalling , using Lemma 2.8(i) we also have the Taylor expansion
[TABLE]
Multiplying together (2.29) and (2.30), the integrand in (2.27) is thus
[TABLE]
When we integrate, the terms with and are odd functions of so their integrals vanish. Hence,
[TABLE]
Recalling that , by Lemma 2.10 we have . Since for we have , it follows that
[TABLE]
Since is symmetric and positive-definite by Lemma 2.10, we have the following standard Gaussian integral over :
[TABLE]
Since , the contribution of the range to the above integral (2.32) is again exponentially small. Hence
[TABLE]
The result follows by combining (2.23), (2.25), (2.26) and (2.33). ∎
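The standard two-dimensional Gaussian integral used in the proof above can be checked numerically. For a symmetric positive-definite $2\times 2$ matrix $A$, the integral of $e^{-x^{T}Ax/2}$ over the plane equals $2\pi/\sqrt{\det A}$; the sketch below compares this closed form with a midpoint-rule approximation. The matrix entries, grid size, and function names are illustrative assumptions.

```python
from math import exp, pi, sqrt

def gaussian_integral_grid(a11, a12, a22, half_width=10.0, steps=400):
    """Midpoint-rule approximation of the integral of exp(-x^T A x / 2) over a square."""
    h = 2 * half_width / steps
    total = 0.0
    for i in range(steps):
        u = -half_width + (i + 0.5) * h
        for j in range(steps):
            v = -half_width + (j + 0.5) * h
            total += exp(-0.5 * (a11 * u * u + 2 * a12 * u * v + a22 * v * v))
    return total * h * h

def gaussian_integral_exact(a11, a12, a22):
    """Closed form 2*pi / sqrt(det A) for symmetric positive-definite A."""
    return 2 * pi / sqrt(a11 * a22 - a12 * a12)
```

With `A = [[2, 0.5], [0.5, 1]]` the two values agree to many digits. Replacing $A$ by $nA$ divides the integral by $n$, which is the origin of the polynomial prefactor in saddle point estimates of this kind.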
We next estimate the exponent in (2.22), without assuming that equation (2.21) holds.
Lemma 2.12.
There exists a constant such that if and with , then
[TABLE]
Moreover, , and .
Proof.
We have . Furthermore, differentiating (2.20) yields
[TABLE]
and thus . Differentiating again shows that for all . Hence, using Lemma 2.10,
[TABLE]
Moreover, it follows from Lemma 2.8(ii) that for . Consequently, a Taylor expansion yields (2.34) for sufficiently small. ∎
2.3 Summing : proof of Theorem 2.2
In this section we prove Theorem 2.2 by summing several different estimates of the point probabilities in
[TABLE]
Throughout we consider, as in (2.22), only real inputs , for the various functions , etc. Thus, all relevant functions are treated as mapping from (subdomains in) to for suitable , .
An individual of type has on average children of type and children of type . So, in the near-critical case , we expect that the overall fraction of type individuals in should be close to
[TABLE]
This suggests that the contribution from terms in (2.35) with far from will be negligible, and we shall later confirm this by standard Chernoff-like estimates. Below our main focus is thus on the terms where is close to . Here the plan is to rewrite the asymptotic estimate (2.22) for using the following version of the inverse function theorem, where we explicitly state uniformity for a set of functions. We define
[TABLE]
Lemma 2.13 (Inverse function theorem).
Let be an integer and a real number. For every , there exist and , both depending only on , such that if is twice continuously differentiable and satisfies
(i) ,
(ii) is invertible and , and
(iii) for all ,
then there exists a twice continuously differentiable function with and for . Furthermore, for each , is the unique with such that . Moreover, and , uniformly for and all such , and if is infinitely differentiable or (real) analytic, then so is .
Proof.
This follows by a standard proof of the inverse function theorem; we give some details for completeness.
First, let . If , then by the mean-value theorem . Hence, , and thus is invertible and its inverse has norm at most (e.g., by the von Neumann series representation of the inverse). Consequently, is invertible and
[TABLE]
Next, let . If , define inductively and , where
[TABLE]
Using and that if , it is easy to show by induction that and . Hence is defined for all , and converges to some with . Furthermore, as , and thus by continuity . Define .
This shows that the inverse function exists in . The uniqueness statement is immediate, since any satisfying is a fixed point of , which is a contraction for . Differentiability (and analyticity when is analytic) follows in the usual way (or by appealing to a standard version of the inverse function theorem, locally at ). Finally, , and thus by (2.36). Another differentiation (using the chain rule) then yields . ∎
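The contraction argument in the proof above can be sketched numerically. The iteration below is the standard simplified Newton scheme $x \mapsto x - A^{-1}(F(x) - y)$ with $A = DF(0)$; this specific form is an assumption, since the displayed map in the proof is not reproduced here. The map `example_map` and the constant `EXAMPLE_A_INV` are hypothetical.

```python
def solve_inverse(F, A_inv, y, iterations=60):
    """Iterate x <- x - A_inv (F(x) - y) from x = 0, where A_inv is DF(0)^{-1}."""
    x = [0.0, 0.0]
    for _ in range(iterations):
        r0, r1 = F(x)
        r0, r1 = r0 - y[0], r1 - y[1]          # residual F(x) - y
        x = [x[0] - (A_inv[0][0] * r0 + A_inv[0][1] * r1),
             x[1] - (A_inv[1][0] * r0 + A_inv[1][1] * r1)]
    return x

def example_map(x):
    """Hypothetical map with F(0) = 0 and DF(0) = [[2, 1], [0, 1]]."""
    return (2 * x[0] + x[1] + 0.1 * x[0] * x[1], x[1] + 0.1 * x[0] ** 2)

EXAMPLE_A_INV = [[0.5, -0.5], [0.0, 1.0]]      # inverse of [[2, 1], [0, 1]]
```

For small targets such as `y = (0.1, 0.05)`, `solve_inverse(example_map, EXAMPLE_A_INV, y)` converges geometrically to the unique nearby preimage, because the derivative of the iteration map is small near the origin when the second derivative of $F$ is bounded.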
Our next aim is to construct an (implicit) solution to equation (2.21) when and is close to . We start by applying Lemma 2.13 to the function defined by
[TABLE]
Note that , and thus is well-defined. Furthermore, , and thus . Moreover, using matrix form (where the first column is of the vector valued function and the second is ), we have
[TABLE]
It follows from Lemma 2.10 that $\bigl\|(D^{2}\varphi(\alpha,\beta))^{-1}\bigr\|=O(1)$, and then (2.38) together with Lemma 2.8 yields
[TABLE]
Lemma 2.8 also implies . Consequently, Lemma 2.13 applies (with ) and yields a constant and a function such that
[TABLE]
Recall that . Since by Lemma 2.8 and by (2.4), there exists a constant such that . Let
[TABLE]
Suppose that . If also , then $\bigl(\mathbb{E}Y-1,\,x-x_{0}\bigr)\in B_{c_{5}}$; we then define
[TABLE]
Furthermore, implies and . Now suppose that and that , and let and $(\alpha,\beta):=h\bigl(n/N\bigr)$. Then, by (2.41) and (2.40),
[TABLE]
Definition (2.37) shows that (2.21) holds. Hence, by Theorem 2.11, (2.22) holds. For define
[TABLE]
Recall that , and note that Lemma 2.10 implies $\operatorname{Det}\bigl(D^{2}\varphi(h(x))\bigr)\geqslant c_{3}^{2}$; thus and are well-defined. Then, still assuming , and $(\alpha,\beta):=h\bigl(n/N\bigr)$, we see that (2.22) can be written
[TABLE]
(Here, we use to bound , so an error term is .)
We next show that, in the relevant domains, the functions , and their (partial) derivatives are all bounded.
Lemma 2.14.
For each , there exists a constant such that if and , then and .
Proof.
We saw in the proof of Lemma 2.13 that $DG(y)=\bigl(DF(G(y))\bigr)^{-1}$, which is bounded for by (2.36). By further differentiations, using the chain rule, Lemma 2.8(ii) and induction, it follows that for each ,
[TABLE]
when and . Hence the definition (2.41) yields , and the result follows by (2.43)–(2.44) together with the chain rule and Lemmas 2.8 and 2.10. ∎
Note for later that, since and in , we have
[TABLE]
if .
We now analyze the exponential term of the formula (2.45) for , which is valid for . The next result in particular implies that is a concave function with a unique maximizer close to . As we shall see, this essentially means that the dominant contribution to the sum of the comes from the terms with close to , which is in turn close to .
Lemma 2.15.
There exist constants with such that if , then the following hold.
(i) If with , then
[TABLE]
(ii) There exists with and such that .
(iii) for every with .
(iv) .
As a consequence, is the unique maximum point of in .
Proof.
For , let
[TABLE]
so that . In the proofs below we assume that and are positive constants, chosen later, with , and that and .
(i): Since , and maps into , we have
[TABLE]
Since and in , using we also have . This and Lemma 2.12 imply
[TABLE]
Furthermore, as remarked above, implies . Hence, recalling (2.49),
[TABLE]
which yields (2.48) since .
(iii): Using , which is shorthand for , we have
[TABLE]
Together with (2.51), it follows that, for some constant ,
[TABLE]
The same proof as for Lemma 2.14 shows that
[TABLE]
for every fixed . Using (2.55) with and (2.54), we see that if and hence is small enough, then
[TABLE]
when and . In particular, recalling , by taking we have
[TABLE]
(ii): Similarly, (2.53) and (2.55) with imply that . In particular, . Hence we may choose sufficiently small such that implies . Then the mean value theorem and (2.56) imply and , so for some . Moreover, by the mean value theorem and (2.56) we also have , so (ii) holds.
(iv): Since , by (2.50) and the definition (2.41) of we have , so Lemma 2.8(iii) applied to gives . The other factors in (2.44) are bounded below, using and Hadamard’s inequality together with Lemma 2.8(ii), and thus (iv) follows. ∎
The following technical lemma will be useful for expanding the sum of the estimates (2.45) around (it is easy to give a much more precise formula for , but we do not need this).
Lemma 2.16.
For , and an integer , let
[TABLE]
Then, uniformly for all and ,
[TABLE]
and for every fixed integer ,
[TABLE]
Proof.
We first consider . Applying the well-known Poisson summation formula [26, (II.13.4) or (II.13.14)] and then using the Gaussian integral , a short standard calculation yields the identity
[TABLE]
which for , say, implies (2.58). (In fact, (2.61) is equivalent to a well-known identity for the theta function , see [16, (20.7.32)].)
Moreover, taking the partial derivative of (2.57) with respect to we obtain
[TABLE]
In particular, , and termwise differentiation of the right-hand side in (2.61) (noting that the main term, , is constant) yields
[TABLE]
Repeated differentiation of (2.62) and induction now yield (2.59) and (2.60). ∎
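The mechanism behind the proof above, Poisson summation applied to a Gaussian, can be checked numerically in its simplest form. The sketch assumes the unweighted sum $\sum_{k\in\mathbb{Z}} e^{-(k+a)^2/(2\sigma^2)}$, which may differ in normalization from the sums defined in the lemma; the function names are illustrative. Poisson summation rewrites this sum as $\sqrt{2\pi}\,\sigma\bigl(1+2\sum_{j\geqslant 1}e^{-2\pi^2\sigma^2 j^2}\cos(2\pi j a)\bigr)$.

```python
from math import cos, exp, pi, sqrt

def gaussian_lattice_sum(a, sigma, terms=200):
    """Direct sum of exp(-(k + a)^2 / (2 sigma^2)) over integers |k| <= terms."""
    return sum(exp(-((k + a) ** 2) / (2 * sigma ** 2)) for k in range(-terms, terms + 1))

def poisson_dual_sum(a, sigma, terms=50):
    """The same quantity after Poisson summation: main term plus oscillating corrections."""
    corrections = sum(exp(-2 * pi ** 2 * sigma ** 2 * j ** 2) * cos(2 * pi * j * a)
                      for j in range(1, terms + 1))
    return sqrt(2 * pi) * sigma * (1 + 2 * corrections)
```

For `sigma = 0.5` the corrections are visible and both functions agree; for larger `sigma` the corrections are of order $e^{-2\pi^2\sigma^2}$, so the lattice sum is within an exponentially small error of the Gaussian integral $\sqrt{2\pi}\,\sigma$, uniformly in the shift `a`.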
We also have to estimate the sum of the in (2.35) where is far from . Based on simple Chernoff-type arguments, the next result shows that their contribution is negligible.
Lemma 2.17.
If , then .
Proof.
For any , from (2.12) we have
[TABLE]
Take and , with , and define
[TABLE]
For any , (2.63) yields
[TABLE]
Note that and . Since and , by Lemma 2.8(i) there is a constant such that whenever , and so
[TABLE]
By assumption, . Recalling that and , it follows that
[TABLE]
We now choose where , and the sign is such that . Using (2.64)–(2.66) and , we infer
[TABLE]
completing the proof for .
Finally, in the remaining case we have , since if and only if . Hence
[TABLE]
completing the proof (since ). ∎
We are now ready to prove Theorem 2.2.
Proof of Theorem 2.2.
We suppose throughout that and that .
We start by considering the quantities and defined in (2.6) and (2.7). By Lemma 2.15, has a local maximum point . As in (2.6) and (2.7), let
[TABLE]
By Lemmas 2.14 and 2.15(iii), . By (2.48) we have . Recalling that , see (2.49), by combining (2.52), (2.53) and (2.55) (with ) together with Lemma 2.15(ii), it follows that
[TABLE]
Hence $\xi=\Theta\bigl(|\mathbb{E}Y-1|^{2}\bigr)$, as claimed. That follows from the bound above and Lemmas 2.14 and 2.15(iv), which give .
Since and , which do not depend on , are both , for any fixed , (2.5) holds trivially simply by taking the implicit constant large enough. Thus we may assume throughout that .
We have . We estimate this sum by Laplace’s method, similarly to the argument in the proof of Theorem 2.11, but now for a sum instead of a two-dimensional integral.
We consider first such that , which includes the main terms in the sum. Suppose that . Using Lemma 2.14, a Taylor expansion then yields, cf. (2.28),
[TABLE]
which by exponentiation and a Taylor expansion of yields, cf. (2.31),
[TABLE]
Similar, but simpler, reasoning also shows that if , then
[TABLE]
Consequently, since , if we define
[TABLE]
then (2.45) yields
[TABLE]
(The odd sums and do not vanish as the corresponding integrals in the proof of Theorem 2.11 do, but we shall see that they are exponentially small.) Recall (from the start of the proof) that . It follows that if we extend the summation in the definition (2.68) to all , and denote the result by , then is $O\bigl(e^{-\Omega(N^{0.2})}\bigr)$ for each fixed . Let . In the notation of Lemma 2.16, . The error terms of the form in the conclusion of Lemma 2.16 are and so negligible. Thus, from Lemma 2.16 and (2.69), recalling the definitions (2.6) and (2.7) of and , we find
[TABLE]
Next, consider such that , and recall that . If , then Lemma 2.15 implies that and . Hence, by (2.45) and Lemma 2.14, if , then Lemma 2.15 implies that
[TABLE]
The sum over such is easily absorbed into the error term we are aiming for: we have, say,
[TABLE]
Finally, since by Lemma 2.15(ii) and , using Lemma 2.17 there exists a constant such that, say,
[TABLE]
Recalling that , by (2.67) we may choose sufficiently small so that , and then (2.5) follows from (2.70), (2.71) and (2.72). ∎
3 Application to branching process families
In this section we apply the main result of Section 2 (Theorem 2.2) to a family of branching processes. The goal is to prove Theorem 3.4 below, giving estimates for the point probabilities in a form suitable for the application to Achlioptas processes in [22].
3.1 Properties of general parameterized families
By a branching process family we simply mean a family of branching processes of the type in Definition 1.1, one for each in some interval . Given such a family, we write
[TABLE]
for the corresponding probability generating functions. Note that the branching process family is fully specified by the interval and the functions and .
The following auxiliary result shows that the associated parameters and defined as in Theorem 2.2 vary smoothly in . This will later allow us to compare the parameters and resulting from different probability distributions and (by integrating linear mixtures that interpolate between them); here the extra factor in (3.2) is crucial.
Lemma 3.1.
Suppose that , , , and . Set and . Let be a branching process family such that, for every , we have , , and , where is the constant appearing in Theorem 2.2. Suppose that and are analytic as functions of in the domain
[TABLE]
and that for some ,
[TABLE]
for all . Let
[TABLE]
be defined as in Theorem 2.2. Then and are (real) analytic as functions of . Furthermore,
[TABLE]
where the implicit constants in (3.2) and (3.3) depend only on .
Proof.
By assumption, the conditions of Theorem 2.2 hold for each . For any of the quantities or functions defined in previous sections for a single branching process, we use a subscript to denote the corresponding quantity or function associated to . As in previous sections, and always denote real numbers.
The idea of the proof is as follows. For a given , the functions defined in the previous sections are defined, either explicitly or implicitly, in terms of and (or their reparameterizations and ). Roughly speaking, since and vary analytically in by assumption (and with -derivative ), it follows that the same is true for the derived quantities. There are various steps where we must be slightly careful; for example, when taking logs (there is no problem as we stick to the domain ), or dividing by the square root of a certain determinant (there is no problem since this determinant is by Lemma 2.10). We must also be careful with the implicit definitions of and ; the hardest part of the argument is to establish (3.2) with instead of .
Turning to the details, from (3.1) and standard Cauchy estimates we see that for each fixed we have
[TABLE]
whenever , say. (Here and below, does not include derivatives with respect to .) Since , the same estimates hold for the derivatives of and in the domain ; from now on we work over the reals. Recalling the definition (2.37) and , from (3.4) it follows that for .
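For reference, the one-variable form of the Cauchy estimates invoked here can be stated as follows. This is the standard textbook statement in generic notation ($f$, $z_0$, $r$, $M$ are placeholders, not the paper's symbols); the proof applies it to the generating functions and to their $u$-derivatives on the domain where they are assumed analytic.

```latex
% Cauchy's estimates (standard complex analysis; generic notation).
% Assumption: f is analytic on a neighbourhood of the closed disc
% |z - z_0| \le r, and |f(z)| \le M on that disc.
\[
  \bigl| f^{(k)}(z_0) \bigr| \;\le\; \frac{k!\, M}{r^{k}},
  \qquad k = 0, 1, 2, \dots
\]
% In particular, a uniform bound on f in a slightly larger domain gives
% bounds of the same order on every fixed derivative in a smaller one.
```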
From the definition (2.37), the function is a (real) analytic function of . For each , by (2.40) we have an inverse of the 2-variable function . Applying a standard version of the implicit function theorem locally, we see that is analytic as a function of . (Footnote 1: Fix and in the relevant domain. By (2.36), is invertible at , so there is an analytic function defined in a neighbourhood of such that . By local uniqueness, near , so is indeed analytic at this point.)
Noting $\mathbb{E} Y_{u}=\frac{\partial}{\partial y}g_{u}(y,z)\big|_{y=z=1}$, by definition (2.41) and it follows that is an analytic function of for and ; we consider in the sequel only such and . Inspecting the definitions (2.43) and (2.44), using Lemma 2.10 (to ensure that the determinant is not degenerate), we see that and are well-defined compositions of analytic functions, and thus analytic as functions of .
Since is independent of , writing and differentiating yields and thus, recalling (2.39), for ,
[TABLE]
Recalling the definition (2.41), note that (3.5) implies . Since is defined in terms of and its derivatives, see (2.20), using it follows that . Furthermore, since estimates analogous to (3.4) also hold for , we have . Hence, recalling (2.43) and writing , we have
[TABLE]
Recalling the definitions (2.43) and (2.44), and the estimates in Section 2.3, we similarly deduce , and .
Since is defined by , we have . It follows, using the lower bound from Lemma 2.15(iii) and the implicit function theorem, that is an analytic function of , and that
[TABLE]
As (3.5) implies , and by (2.46), it follows that
[TABLE]
Similarly, from the definitions (2.6) and (2.7), using (3.7) and the bounds above on , and it follows that and are analytic functions of , with and . It remains only to establish (3.2).
For this final step, recalling (2.20) and , note that follows from (3.4). Since , we thus have $\frac{\partial}{\partial u}\psi_{u}(x)=O\bigl(\lambda|x|\bigr)$. Similarly, as a consequence of Lemma 2.12, $\psi_{u}(x)=O\bigl(|x|\bigr)$ and $\|D\psi_{u}(x)\|=O\bigl(|x|\bigr)$. Writing for , it follows from the definition (2.43) and (3.7)–(3.8) that
[TABLE]
Since , from the definition (2.41) of , the bound (2.47), and, for the final step, Lemma 2.15(ii), it follows that
[TABLE]
completing the proof of the lemma. ∎
3.2 A specific result suitable for application to Achlioptas processes
In this section we use Theorem 2.2 and Lemma 3.1 to prove the case , of Theorem A.10 of [22], used there for the analysis of Achlioptas processes. To formulate this main application, i.e., our point probability result for certain (perturbed) branching process families, we need some further definitions.
Definition 3.2.
Let be real numbers. The branching process family is -critical if the following hold:
- (i) There exist and with such that the probability generating functions
[TABLE]
are defined and analytic on the domain
[TABLE]
- (ii) We have
[TABLE]
- (iii) There exists some such that
[TABLE]
Definition 3.3.
Let be a -critical branching process family, and let , and be as in Definition 3.2. Given with , we say that the branching process is of type (with respect to , , , and ) if the following hold:
- (i) Writing , the expectations
[TABLE]
are defined (i.e., the expectations converge absolutely) for all .
- (ii) For all we have
[TABLE]
Note that is itself of type for any . The following result relates the point probabilities from with those from branching processes of type . A key feature is the form of the uniform error term in (3.15). In (3.14) and (3.15) below, we have and for (using ), and and for any branching process of type with . In the near-critical case , the size– point probabilities of and thus both decay exponentially in .
Theorem 3.4 (Point probabilities of of type ).
Let be a -critical branching process family. Then there exist constants and analytic functions , on the interval such that
[TABLE]
uniformly over all , , and all branching processes of type (with respect to ), where the parameters and , which depend on the distributions of and of , satisfy
[TABLE]
Moreover, , , , and .
Proof.
Fix a -critical branching process family , and let and be as in the definitions above. We pick , and decrease slightly, keeping . Then and are continuous on the compact domain , and so bounded, say by . Let . Then, provided , by (3.13) any of type with satisfies
[TABLE]
For any integers , we have
[TABLE]
Since is analytic in , this probability varies continuously in . Moreover, since can analogously be written as a derivative of evaluated at , using standard Cauchy estimates and (3.13) we infer
[TABLE]
A similar argument shows that $\mathbb{E} Y_{t}=\frac{\partial}{\partial y}g_{t}(y,z)\big|_{y=z=1}$ is continuous in , and that Cauchy’s estimates imply
[TABLE]
Analogous reasoning applies to and .
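The step of reading point probabilities off a generating function, and bounding them by Cauchy's estimates, can be illustrated numerically. The sketch below is a univariate simplification (the paper works with bivariate generating functions) and uses a hypothetical Poisson example. It recovers a coefficient of a probability generating function by discretizing the Cauchy integral over a circle, and compares it with the Cauchy bound given by the maximum modulus on the circle divided by the radius to the power $k$.

```python
import cmath
import math


def point_probability(pgf, k, radius=1.0, n_points=64):
    """Approximate the k-th power series coefficient of pgf at 0.

    Uses the trapezoidal rule for the Cauchy integral
        p_k = (1 / (2*pi*i)) * contour integral of pgf(z) / z**(k+1) dz
    over the circle |z| = radius; pgf must be analytic on that closed disc.
    """
    total = 0j
    for j in range(n_points):
        z = radius * cmath.exp(2j * cmath.pi * j / n_points)
        total += pgf(z) / z**k
    return (total / n_points).real


def cauchy_bound(pgf, k, radius, n_points=64):
    """Upper bound max|pgf| / radius**k, with the max sampled on the circle."""
    sup = max(abs(pgf(radius * cmath.exp(2j * cmath.pi * j / n_points)))
              for j in range(n_points))
    return sup / radius**k


if __name__ == "__main__":
    # Hypothetical example: Poisson(2), whose pgf is exp(2 * (z - 1)).
    pgf = lambda z: cmath.exp(2.0 * (z - 1.0))
    print(point_probability(pgf, 3), cauchy_bound(pgf, 3, radius=1.5))
```

The same mechanism explains why a small perturbation of the generating function in a complex neighbourhood changes each point probability by a correspondingly small amount.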
By definition of a -critical branching process family, there is some such that for all of , , and are at least , say. Furthermore, at we have . From the argument above these quantities all vary continuously in , and change by when we move from to some of type . It follows that there is a constant such that, after reducing if necessary, whenever and , then any of type satisfies the conditions of Theorem 2.2, namely that , and .
Now, applying Theorem 2.2 to each branching process in the family , and Lemma 3.1 to the family itself, establishes the case of Theorem 3.4 with and . Indeed, Theorem 2.2 gives that , so we do have , while (2.8) gives , which is since (3.10) implies, after reducing if necessary, that
[TABLE]
It follows that and .
To complete the proof, assume now that is of type , with and . As noted above, Theorem 2.2 applies to , giving (3.14); it remains to establish (3.15). We do this by interpolating between and , and applying Lemma 3.1. Consider the branching process family defined by the mixtures
[TABLE]
(As noted earlier, the probability generating functions , and the interval fully specify the family.) Since the assumptions of Theorem 2.2 are preserved by taking mixtures, every branching process in this family satisfies these assumptions. (In fact, they are all clearly of type too.) Moreover, the assumption (3.13) implies that (3.1) holds with , and since $\mathbb{E}\bar{Y}_{u}=\frac{\partial}{\partial y}\bar{g}_{u}(y,z)\big|_{y=z=1}$ we have
[TABLE]
by (3.16) and (3.17). Thus we may apply Lemma 3.1, and, by integrating (3.2) with , we infer
[TABLE]
Finally, follows similarly by integrating (3.3). ∎
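The interpolation idea used in this proof can be sketched numerically. Assuming the mixtures are linear in the parameter $u$ (as the phrase "linear mixtures" in Section 3.1 suggests; the display (3.18) is not reproduced here), the generating function of the mixture is the corresponding convex combination, and so the mean of the first coordinate varies linearly in $u$. All names and the bivariate Poisson examples below are hypothetical.

```python
import math


def poisson_pair_pgf(mean_y, mean_z):
    """pgf of independent Poisson(mean_y), Poisson(mean_z) (illustrative only)."""
    return lambda y, z: math.exp(mean_y * (y - 1.0) + mean_z * (z - 1.0))


def mixture_pgf(g_a, g_b, u):
    """pgf of the law that equals g_a's law w.p. 1 - u and g_b's law w.p. u."""
    return lambda y, z: (1.0 - u) * g_a(y, z) + u * g_b(y, z)


def mean_y(g, step=1e-6):
    """E[Y] as the y-derivative of the pgf at y = z = 1 (central difference)."""
    return (g(1.0 + step, 1.0) - g(1.0 - step, 1.0)) / (2.0 * step)


if __name__ == "__main__":
    g_a = poisson_pair_pgf(0.9, 1.0)
    g_b = poisson_pair_pgf(1.3, 2.0)
    for u in (0.0, 0.25, 0.5, 1.0):
        print(u, mean_y(mixture_pgf(g_a, g_b, u)))
```

Because every quantity of the mixture depends on $u$ only through this convex combination, a bound on the $u$-derivative of a parameter can be integrated over the interpolation interval to compare the two endpoint processes.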
Theorem 3.4 immediately implies the key case , of Theorem A.10 of [22] with any positive value of the constant . Indeed, after reducing if necessary, the assumption in the latter theorem implies the assumption of Theorem 3.4. Moreover, the same assumption together with (3.15) implies the bound in Theorem A.10 of [22].
4 The survival probability
In this section we study the survival probability of the branching process from Definition 1.1 and the branching process family from Section 3. The goal is to prove Theorem 4.5 below, i.e., to give estimates for suitable for the application to Achlioptas processes in [22].
Our strategy mimics the general approach used in Sections 2–3 for point probabilities, though the technical details are much simpler. In Section 4.1 we first prove a technical result for the survival probability of a single branching process (Lemma 4.2). Then we show that in a branching process family certain parameters related to the survival probability vary smoothly in (Lemma 4.4). Finally, in Section 4.2 we combine these two auxiliary results to prove Theorem 4.5.
4.1 Properties of a single process and general parameterized families
As far as the survival of is concerned, particles of type are irrelevant and may be ignored, so we may consider a standard single-type Galton–Watson branching process with offspring distribution and initial distribution , which we henceforth denote by . Thus
[TABLE]
Writing as shorthand for the distribution with constant value one, it similarly follows that
[TABLE]
Throughout this section, we shall work with the univariate probability generating functions and . By standard branching process arguments (see, e.g., [10, Theorem 5.4.5]), we have
[TABLE]
where the extinction probability is the smallest non-negative solution to
[TABLE]
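The fixed-point characterization above lends itself to a short numerical sketch. Iterating a probability generating function from 0 produces the probabilities of extinction by generation $n$, which increase to the smallest non-negative fixed point. The code assumes the standard relation that, with a random number of initial particles, the survival probability is one minus the initial generating function evaluated at the extinction probability of a single particle; the function names and the Poisson example are hypothetical.

```python
import math


def extinction_probability(offspring_pgf, tol=1e-14, max_iter=1_000_000):
    """Smallest non-negative solution q of q = offspring_pgf(q).

    Starting from q = 0, the iterates are P(extinct by generation n) and
    increase monotonically to the smallest fixed point in [0, 1].
    """
    q = 0.0
    for _ in range(max_iter):
        q_next = offspring_pgf(q)
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q  # not converged within max_iter (slow near criticality)


def survival_probability(offspring_pgf, initial_pgf=None):
    """Survival probability, optionally with a random initial generation."""
    q = extinction_probability(offspring_pgf)
    if initial_pgf is None:          # a single initial particle
        return 1.0 - q
    return 1.0 - initial_pgf(q)      # all initial lineages must die out


if __name__ == "__main__":
    # Hypothetical example: Poisson(2) offspring, started from one particle
    # and from exactly two particles.
    pgf = lambda x: math.exp(2.0 * (x - 1.0))
    print(survival_probability(pgf), survival_probability(pgf, lambda x: x * x))
```

Plain iteration converges geometrically with rate equal to the derivative of the generating function at the fixed point, so it slows down as the process approaches criticality.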
Fix , , , and . We henceforth assume that and . Since , by (2.1) the function is analytic in , with . A Taylor expansion of at yields, for ,
[TABLE]
Define
[TABLE]
removing the removable singularity at . Then is analytic in , and
[TABLE]
Observe that if , then (4.4) is equivalent to . Furthermore,
[TABLE]
We next derive bounds on the derivatives of valid for small .
Lemma 4.1.
Suppose that , , , and . There exist constants and such that if , then the following hold.
- (i) If and , then .
- (ii) If , then and .
- (iii) If and , then .
Proof.
(i): By (4.6) and (2.1), if , say. Hence the result, with , say, follows by Cauchy’s estimates.
(ii): If (2.3) holds with , then , and thus .
If instead (2.3) holds with , then . Since , then implies , and thus .
In both cases, follows by (4.8), and holds, too.
(iii): Follows by (ii) and (i) (with ), replacing by . ∎
We next characterize the survival probability in terms of the (unique) solution to .
Lemma 4.2.
Suppose that , , , and . There exists a constant such that the following holds. If and , then there is a unique such that
[TABLE]
Furthermore, , , and , where the implicit constants depend only on and .
Proof.
We apply the inverse function theorem, Lemma 2.13, with , and
[TABLE]
using (4.8) and Lemma 4.1 to verify the assumptions; we shall ensure that , so implies . Writing to avoid clutter (as before), Lemma 2.13 shows the existence of a constant , which we may assume to be at most , and an inverse function with and . We define
[TABLE]
so that . Since in by Lemma 2.13 and in by Lemma 4.1(i), using we have and , establishing .
We relate and by a variant of the usual fixed point analysis of in . Since by Lemma 4.1(ii), is strictly convex on , which implies that has at most two solutions in this interval, and exactly one solution if , since and . Now and are solutions. Since , is also a solution (see (4.6)); since , we have .
If , then and are two distinct solutions; thus , and by strict convexity. Similarly, if , then and thus , and by strict convexity. Finally, if , then by (4.8), so that (since then is the only solution to in ). Hence in all cases. It follows also that is unique, and that has the same sign as . ∎
Remark 4.3.
Since , when it follows easily that . In particular as , assuming, as always here, that . This holds under much weaker conditions on , see [12] and [2] for precise conditions; see also [11, Section 3].
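The near-critical behaviour discussed in this remark can be checked numerically in a simple special case. The sketch below is not the paper's argument: it takes a single-ancestor Galton–Watson process with Poisson offspring of mean $1+\varepsilon$ (a hypothetical example), for which the survival equation $1-\rho = e^{-(1+\varepsilon)\rho}$ can be solved explicitly for $\varepsilon$ in terms of $\rho$. Since the offspring variance tends to 1 in this example, the standard asymptotic $\rho \sim 2\varepsilon/\sigma^2$ predicts that the ratio $\rho/\varepsilon$ approaches 2.

```python
import math


def excess_mean_for_survival(rho):
    """For Poisson(1 + eps) offspring, return the eps with survival probability rho.

    Solves 1 - rho = exp(-(1 + eps) * rho) for eps, where 0 < rho < 1.
    """
    return -math.log1p(-rho) / rho - 1.0


if __name__ == "__main__":
    for rho in (0.1, 0.01, 0.001):
        eps = excess_mean_for_survival(rho)
        print(rho, eps, rho / eps)
```

Parameterizing by the survival probability rather than by $\varepsilon$ avoids any iteration, in the same spirit as the proof of Lemma 4.2, which obtains the relevant quantity by inverting a function near the critical point.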
We next consider a branching process family as in Section 3; as there we indicate the parameter by subscripts. Thus, for example, is defined as in Lemma 4.2, with replaced by . Furthermore, in analogy to (4.3), we also define
[TABLE]
Thus, by combining (4.3) with Lemma 4.2, when we have and . Mimicking Lemma 3.1, the following auxiliary result shows that and both vary smoothly in .
Lemma 4.4.
Suppose that , , , and . Set and . Let be a branching process family satisfying the assumptions of Lemma 3.1, with replaced by . Let and be defined as in Lemma 4.2 and (4.9). Then and are analytic functions of . Furthermore,
[TABLE]
where the implicit constants depend only on and .
Proof.
Let be the equivalent of (4.6) for , again removing the removable singularity at . Then is an analytic function of . Note that (3.1) implies if , say. Since , by the maximum modulus principle (applied with fixed) it follows that
[TABLE]
for all and .
By Lemma 4.2, for every there is a unique with such that
[TABLE]
Since by Lemma 4.1(iii) and , the implicit function theorem shows that is an analytic function of . That is analytic then follows from (4.9) and the assumption that is analytic. By differentiating (4.12) we obtain . So, using and (4.11),
[TABLE]
Finally, follows from (2.1) and Cauchy’s estimates (recall that ). By differentiating (4.9) and then using (3.1) and (4.13), we obtain
[TABLE]
completing the proof. ∎
4.2 A specific result suitable for application to Achlioptas processes
We are now ready to prove our main result, concerning the -dependence of the survival probability of when is a -critical branching process family, as well as the survival probability of branching processes of type ; see Section 3.2 for the relevant definitions. Two key features are the convergent power series expansion (4.14), and the uniform error term in (4.15). In particular, we have for any branching process of type with . In the supercritical case , the survival probabilities of and thus both grow linearly in .
Theorem 4.5 (Survival probabilities).
Let be a -critical branching process family. Then there exist constants with the following properties. Firstly, the survival probability is zero for , and is positive for . Secondly, is analytic on . More precisely, there are constants with such that
[TABLE]
for . Thirdly, for any , with and , and any branching process of type (with respect to ), the survival probability is zero if , and is positive and satisfies
[TABLE]
if , where the implicit constant depends only on the family , not on or . Moreover, analogous statements hold for the survival probabilities and .
Proof.
We argue as in the proof of Theorem 3.4. In particular, we may assume that and for some . We shall also assume that .
We consider only with ; we may assume that is small enough that this implies , and, by (3.10), that , and that
[TABLE]
By (4.2) and Lemma 4.2 and it follows that is zero for , and positive for . Since , now (4.1) and (4.3) imply an analogous statement for . Lemmas 4.2 and 4.4 also imply that
[TABLE]
are both analytic for . Hence (4.14) holds if is sufficiently small.
Next, for a branching process of type , by (3.16) we have . Since , it follows from (4.16) that if is small enough, then . Moreover, since , using (4.16) we also have if is small enough. Mimicking the above reasoning for and , using (4.1)–(4.3) and Lemma 4.2 it follows for that and satisfy if , and if ; furthermore,
[TABLE]
for and .
Finally, we consider the interpolating branching process family defined by (3.18), for which, as noted in Section 3.2, (3.1) holds with and . Note that (3.19) and imply provided is small enough. Integrating (4.10) of Lemma 4.4 over similarly to (3.20) in the proof of Theorem 3.4, using the identities (4.17)–(4.18) we infer and for and , completing the proof. ∎
Theorem 4.5 immediately implies the key case , of Theorem A.11 of [22], used there for the analysis of Achlioptas processes.
Acknowledgement. The last two authors are grateful to Christina Goldschmidt for useful pointers to the local limit theorem literature, which were helpful for developing parts of the slightly more involved (large deviation based) point probability analysis contained in an earlier version of [22].
- [1] D. Achlioptas, R.M. D’Souza, and J. Spencer. Explosive percolation in random networks. Science 323 (2009), 1453–1455.
- [2] K.B. Athreya. Rates of decay for the survival probability of a mutant gene. J. Math. Biol. 30 (1992), 577–581.
- [3] S. Bhamidi, A. Budhiraja, and X. Wang. The augmented multiplicative coalescent, bounded size rules and critical dynamics of random graphs. Probab. Theory Related Fields 160 (2014), 733–796.
- [4] T. Bohman and D. Kravitz. Creating a giant component. Combin. Probab. Comput. 15 (2006), 489–511.
- [5] L. Chaumont and R. Liu. Coding multitype forests: Application to the law of the total population of branching forests. Trans. Amer. Math. Soc. 368 (2016), 2723–2747.
- [6] H. Cramér. Sur un nouveau théorème-limite de la théorie des probabilités. Actualités Scientifiques et Industrielles 736 (1938), 5–23.
- [7] M. Drmota and V. Vatutin. Limiting distributions in branching processes with two types of particles. In Classical and modern branching processes (Minneapolis, MN, 1994), IMA Vol. Math. Appl., 84, Springer, New York (1997), 89–110.
- [8] M. Dwass. The total progeny in a branching process and a related random walk. J. Appl. Probab. 6 (1969), 682–686.
