A central limit theorem for the gossip process

A. D. Barbour; A. Röllin

arXiv:1706.05483·math.PR·November 26, 2018

TL;DR

This paper proves a central limit theorem for the Aldous gossip process, showing that the random time shift in information dissemination is approximately normally distributed, with computable mean and variance.

Contribution

It extends the understanding of the gossip process by establishing a normal approximation for the initial stochastic delay, enhancing predictive accuracy.

Findings

01. The random time shift follows an approximately normal distribution.

02. The mean and variance of the time shift can be explicitly computed.

03. The broad deterministic description remains valid with increased precision.

Abstract

The Aldous gossip process represents the dissemination of information in geographical space as a process of locally deterministic spread, augmented by random long range transmissions. Starting from a single initially informed individual, the proportion of individuals informed follows an almost deterministic path, but for a random time shift, caused by the stochastic behaviour in the very early stages of development. In this paper, it is shown that, even with the extra information available after a substantial development time, this broad description remains accurate to first order. However, the precision of the prediction is now much greater, and the random time shift is shown to have an approximately normal distribution, with mean and variance that can be computed from the current state of the process.

Equations

\frac{L_{t_{L}+u/\lambda}}{L}\ \approx\ \ell(u+U)\quad\mbox{for any }u\in{\mathbb{R}}

L_{t_{L}+u/\lambda}/L\ =\ {\mathbb{P}}[K\in{\cal L}_{t_{L}+u/\lambda}\,|\,{\cal L}_{t_{L}+u/\lambda}].

{\mathbb{E}}_{v}\bigl{|}(L_{t_{L}+u/\lambda}/L)-{\mathbb{E}}\{L_{t_{L}+u/\lambda}/L\,|\,{\cal L}_{s}\}\bigr{|}\ \ll\ {\rm SD}_{v}(L_{t_{L}+u/\lambda}/L),

\sigma^{-1}\bigl{(}L_{t_{L}+u/\lambda}/L-\ell(\log[CW(v,v)]+u)\bigr{)}\ \approx_{d}\ {\cal N}(0,1),

{\cal L}_{t}\ :=\ \bigcup_{j\colon\tau_{j}\leq t}{\cal K}(P_{j},t-\tau_{j})\quad\mbox{and}\quad L_{t}\ :=\ |{\cal L}_{t}|.

\bigg{|}\frac{{\nu}_{s}}{s^{d}\,{\nu}}-1\bigg{|}\ \leq\ c_{g}\bigg{(}\frac{s^{d}\,{\nu}}{L}\bigg{)}^{\gamma_{g}/d},\quad s>0.

\frac{{\nu}_{s}}{s^{d}\,{\nu}}-1\ =\ \frac{dR^{d}}{s^{d}}\,\int_{0}^{s/R}(\sin t)^{d-1}\,dt-1\ =\ O\bigl{(}(s/R)^{2}\bigr{)},

\exp\Bigl{\{}-\int_{0}^{u}\rho{\nu}_{s}\,ds\Bigr{\}}\ \approx\ \exp\Bigl{\{}-\int_{0}^{u}\rho s^{d}{\nu}\,ds\Bigr{\}}\ =\ \exp\bigl{\{}-\rho{\nu}u^{d+1}/(d+1)\bigr{\}},

\int_{0}^{\infty}\exp\bigl{\{}-\rho{\nu}u^{d+1}/(d+1)\bigr{\}}\,du\ =\ (\rho{\nu})^{-1/(d+1)}\int_{0}^{\infty}e^{-w^{d+1}/(d+1)}\,dw.

\lambda\ :=\ (\rho\,d!\,{\nu})^{1/(d+1)},

\Lambda\ :=\ L\lambda^{d}/{\nu},

L_{t}/L\ \sim\ K\Lambda^{-1}e^{\lambda t+\log W},\quad t\to\infty,

t\ =\ t_{\Lambda}(u)\ :=\ \lambda^{-1}(\log\Lambda+u),

u\ \mapsto\ \ell_{0}(u),

L_{t_{\Lambda}(u-\log W)}/L\ \approx\ \ell(u+\log{\hat{c}}_{d})

{\widehat{W}}(v)\ :=\ e^{-\lambda v}\sum_{l=0}^{d}\sum_{j\in{\widehat{J}}_{v}}\frac{\{\lambda(v-\tau_{j})\}^{l}}{l!}

\zeta(d)\ :=\ \begin{cases}1/2&\mbox{if }d\leq 6,\\ 1-\cos(2\pi/d)&\mbox{if }d\geq 7,\end{cases}

\ell(u)\ :=\ 1-\phi_{\infty}(e^{u}),\quad\mbox{where}\quad\phi_{\infty}(\theta)\ :=\ {\mathbb{E}}\{e^{-\theta W}\},

d_{{\rm BW}}(P,Q)\ :=\ \sup_{f\in F_{{\rm BW}}}\Bigl{\{}\Bigl{|}\int f\,dP-\int f\,dQ\Bigr{|}\Bigr{\}},

\displaystyle d_{{\rm BW}}\bigl{(}{\cal L}\bigl{\{}e^{\lambda v/2}\{L_{t_{\Lambda}(u)}/L-\ell(u+\log[{\hat{c}}_{d}{\widehat{W}}(v)])\}\,\big{|}\,{\cal F}_{v}\cap E^{*}(v)\bigr{\}},{\cal N}(0,\sigma^{2}(u,{\widehat{W}}(v)))\bigr{)}

\sigma^{2}(u,w)\ :=\ \frac{\{D\ell(u+\log[{\hat{c}}_{d}w])\}^{2}}{(d+1)w}.

\int_{0}^{\infty}e^{-\lambda s}\rho{\nu}s^{d}\,ds\ =\ 1

(\xi(t),\,t\geq 0)\ =_{d}\ (\xi^{1}(\lambda t),\,t\geq 0),

e^{-\lambda u}{\mathbb{E}}M_{0}(u)\ \leq\ c_{1};\qquad e^{-2\lambda u}{\mathbb{E}}\{M_{0}^{2}(u)\}\ \leq\ c_{2};

M_{d}(u)\ =\ u^{d}+\int_{(0,u]}(u-v)^{d}M_{0}(dv)\ =\ d\int_{0}^{u}(u-v)^{d-1}M_{0}(v)\,dv.

e^{-\lambda u}{\mathbb{E}}M_{d}(u)\ \leq\ c_{1}d!\lambda^{-d};\qquad e^{-2\lambda u}{\mathbb{E}}\{M_{d}^{2}(u)\}\ \leq\ c_{2}\{d!\lambda^{-d}\}^{2},\quad u>0,

M_{l}(t)\ =\ \sum_{j=1}^{M_{0}(t)}(t-\tau_{j-1})^{l},\quad l\geq 1,

\begin{array}[]{rl}\dfrac{d}{dt}M_{1}(t)&=\ M_{0}(t)\quad\mbox{for a.e. }t;\\[8.61108pt] \dfrac{d}{dt}M_{i}(t)&=\ iM_{i-1}(t),\quad i\geq 2.\end{array}

M_{0}(t)\ =\ M_{0}(0)+Z\Bigl(\rho{\nu}\int_{0}^{t}M_{d}(u)\,du\Bigr).

\begin{array}[]{rl}\dfrac{d}{dt}H_{1}(t)&=\ \lambda H_{0}(t)\quad\mbox{for a.e. }t;\\[8.61108pt] \dfrac{d}{dt}H_{i}(t)&=\ \lambda H_{i-1}(t),\quad i\geq 2;\end{array}

Taxonomy

Topics: Complex Network Analysis Techniques · Evolutionary Game Theory and Cooperation · Opinion Dynamics and Social Influence

Full text

A central limit theorem for the gossip process

A. D. Barbour (Institut für Mathematik, Universität Zürich, Winterthurerstrasse 190, CH-8057 ZÜRICH. Work begun while ADB was Saw Swee Hock Professor of Statistics at the National University of Singapore, and supported in part by Australian Research Council Grants Nos DP120102728, DP120102398, DP150101459 and DP150103588.)

and A. Röllin (Department of Statistics and Applied Probability, National University of Singapore, 6 Science Drive 2, 117546 Singapore. Supported in part by NUS Research Grant R-155-000-167-112 and Australian Research Council Grant No. DP150101459.)

Universität Zürich and National University of Singapore

Keywords.

Gossip process, deterministic approximation, branching processes, central limit theorem

MRC subject classification.

92H30; 60K35, 60J85.

1 Introduction

A model for the dissemination of information in space, in which random long-range contacts facilitate spread, was introduced in Aldous (2012). In an idealized version, proposed by Chatterjee & Durrett (2011), individuals are represented as a continuum, evenly distributed over a two-dimensional torus of large area $L$ . Information spreads locally at constant rate from individuals to their neighbours, so that a disc of informed individuals, centred on an initial informant, grows steadily in the torus. However, information is also spread by long range transmissions to other, randomly chosen points of the torus, according to a Poisson process, whose rate is proportional to the area of currently informed individuals. Any such transmission initiates a new disc of informed individuals. The process can also be interpreted as a model of the spread of an SI disease, in which local infection is supplemented by occasional long-range contacts.

With $L_{t}$ denoting the area of informed individuals by time $t$, Chatterjee & Durrett (2011) showed that, after some randomness in the initial stages of the process, the proportion of the torus $L_{t}/L$ that has been informed by time $t$ closely follows a particular, deterministic path. The time taken for $L_{t}/L$ to increase from almost zero to almost one is relatively short, and the increase occurs around a time $t_{L}$, which is a fixed multiple of $\log L$. In what follows, we therefore concentrate on times relative to $t_{L}$. Roughly speaking, Chatterjee & Durrett (2011) showed that, for large $L$, we have

$$\frac{L_{t_{L}+u/\lambda}}{L}\ \approx\ \ell(u+U)\quad\mbox{for any }u\in{\mathbb{R}}$$

for some function $\ell$ , where $\lambda$ is a scaling factor related to the speed of spread of information, and where $U$ is a random variable. The path $\ell$ is the same for all realizations of the process, but the position on the path at a particular time varies from realization to realization because of the random time shift $U$ . This result was generalized to gossip processes on rather general homogeneous Riemannian manifolds by Barbour & Reinert (2013), hereafter referred to as [BR], as well as to related ‘small world’ processes; they also derived a uniform bound on the approximation error. In addition, the equation describing the deterministic development was interpreted in terms of the Laplace transform of the limiting random variable corresponding to an associated Crump–Mode–Jagers (CMJ) branching process (Jagers, 1975).

By analogy with the theory of Markov population processes (Kurtz 1970, 1971), one might expect that the fluctuations around the deterministic path of the proportions informed would be approximately Gaussian, with standard deviation $O(L^{-1/2})$ , at least while the proportion informed is not too small or too close to $1$ . Here, however, the random quantity of most interest — the difference between the actual course of the process and a prediction of the course based on information available early in its development — involves the fluctuations of the process while the proportion informed is rather small, and the standard analogy does not apply. Instead, in view of the approximation already established, it seems reasonable at times $v\ll t_{L}$ to predict the value of $L_{t_{L}+u/\lambda}/L$ by $\ell(u+{\widehat{U}}(v))$ , where ${\widehat{U}}(v)$ is the expected value of $U$ , given the information at time $v$ , and to augment the point prediction with a confidence interval around $\ell(u+{\widehat{U}}(v))$ , derived from the (approximate) conditional distribution of $L_{t_{L}+u/\lambda}/L$ , given the current information.

The validity of the procedure is justified in detail in Section 3. The broad argument is to exploit the fact that $L_{t_{L}+u/\lambda}/L$ is the probability that a point $K$ , chosen independently and uniformly at random in ${\mathcal{C}}$ , belongs to the informed set ${\cal L}_{t_{L}+u/\lambda}$ :

$$L_{t_{L}+u/\lambda}/L\ =\ {\mathbb{P}}[K\in{\cal L}_{t_{L}+u/\lambda}\,|\,{\cal L}_{t_{L}+u/\lambda}].$$

As it stands, this changes nothing. However, it indicates that a good approximation might be obtained by replacing ${\mathbb{P}}[K\in{\cal L}_{t_{L}+u/\lambda}\,|\,{\cal L}_{t_{L}+u/\lambda}]$ by ${\mathbb{P}}[K\in{\cal L}_{t_{L}+u/\lambda}\,|\,{\cal L}_{s}]$ , or, equivalently, replacing $L_{t_{L}+u/\lambda}/L$ by ${\mathbb{E}}\{L_{t_{L}+u/\lambda}/L\,|\,{\cal L}_{s}\}$ , for $s<t_{L}+u/\lambda$ chosen so that $s$ is close enough to $t_{L}+u/\lambda$ . In particular, for prediction from $v$ , we need to choose $s\in(v,t_{L}+u/\lambda)$ so that

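$${\mathbb{E}}_{v}\bigl|(L_{t_{L}+u/\lambda}/L)-{\mathbb{E}}\{L_{t_{L}+u/\lambda}/L\,|\,{\cal L}_{s}\}\bigr|\ \ll\ {\rm SD}_{v}(L_{t_{L}+u/\lambda}/L),$$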

where ${\mathbb{E}}_{v}$ and ${\rm SD}_{v}$ denote expectation and standard deviation given the information at time $v$ .

The advantage of using ${\mathbb{E}}\{L_{t_{L}+u/\lambda}/L\,|\,{\cal L}_{s}\}$ is that ${\mathbb{P}}[K\in{\cal L}_{t_{L}+u/\lambda}\,|\,{\cal L}_{s}]$ can be approximated as the probability of at least one of many small balls, with centres chosen independently and at random in ${\mathcal{C}}$ , intersecting ${\cal L}_{s}$ . These balls are the islands in an independent ‘backwards’ gossip process, run for a length of time $t_{L}+(u/\lambda)-s$ from $K$ . There are many such balls if $t_{L}+(u/\lambda)-s$ is not too small, and the intersection probability can be approximated by a Poisson probability, using the Stein–Chen method; see Lemma 3.3. The mean of the Poisson distribution can, with considerable effort, be shown to be close to $\ell(\log[CW(s,v)]+u)$ , where $W(s,v)$ is a quantity that can be simply expressed in terms of a carefully chosen branching process, and $C$ is a constant. Now, given the information available at time $v$ , the quantity $W(v,v)$ (which loosely corresponds to $\exp\{{\widehat{U}}(v)\}$ ) is known, and the conditional distribution of the difference $W(s,v)-W(v,v)$ is approximately normal, as is shown in Theorem 2.8 in Section 2. This, in turn, leads to a normal approximation for the difference between $\ell(\log[CW(s,v)]+u)$ and its prediction $\ell(\log[CW(v,v)]+u)$ at time $v$ . This implies the main result of the paper, that

$$\sigma^{-1}\bigl(L_{t_{L}+u/\lambda}/L-\ell(\log[CW(v,v)]+u)\bigr)\ \approx_{d}\ {\cal N}(0,1),$$

for suitable choice of the standard deviation $\sigma$ depending on $u$ and $W(v,v)$ ; a precise statement is given in Theorem 1.1. The error in the normal approximation is shown to be small if the number of individuals informed at time $v$ is large, even if their proportion in the whole population may be very small. For practical purposes, in an epidemic, the very earliest development may well pass almost unnoticed — the origins are often obscure — but prediction on the basis of the information gained from the first few hundred cases is an important public health goal, in which case using the normal approximation is reasonable.

1.1 Detailed formulation

We now describe the problem in more detail. We consider the gossip process $({\cal L}_{t},\,t\geq 0)$ evolving on a smooth closed homogeneous Riemannian manifold ${\mathcal{C}}$ of dimension $d$, such as a sphere or a torus, having large finite volume $|{\mathcal{C}}|=:L$ with respect to its intrinsic metric. An individual at point $P\in{\mathcal{C}}$ informed at time $0$ gives rise to deterministic local spread that informs the set ${\cal K}(P,s)$ by time $s>0$; in addition, random ‘long range transmissions’ to independent and uniformly distributed points of ${\mathcal{C}}$ occur at rate $\rho$ times the intrinsic volume of the set currently informed. Thus the process can be constructed from knowledge of the points $0=\tau_{0}<\tau_{1}<\cdots$ of a point process $\Pi$ on ${\mathbb{R}}_{+}$ (characterized immediately below), together with an independent sequence of independent points $P_{1},P_{2},\ldots$, uniformly distributed in ${\mathcal{C}}$, and an initial point $P=P_{0}$. The informed set and its volume are denoted by

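$${\cal L}_{t}\ :=\ \bigcup_{j\colon\tau_{j}\leq t}{\cal K}(P_{j},t-\tau_{j})\quad\mbox{and}\quad L_{t}\ :=\ |{\cal L}_{t}|.$$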

The point process $\Pi$ is simple, and has conditional intensity $\rho L_{t}$ at time $t$ with respect to the filtration $({\cal F}_{t},\,t\geq 0)$ , where ${\cal F}_{t}:=\sigma((\tau_{j},P_{j}),\,j\geq 0,\tau_{j}\leq t)$ .

The sets ${\cal K}(P,s)$ are assumed to be closed balls, centred at $P$ and of radius $s$ , with respect to a metric that makes ${\mathcal{C}}$ a geodesic space: $P^{\prime}\in{\cal K}(P,2t)$ exactly when ${\cal K}(P,t)\cap{\cal K}(P^{\prime},t)\neq\emptyset$ . Since ${\mathcal{C}}$ is assumed to be homogeneous, the volume of ${\cal K}(P,s)$ is independent of $P$ , and we will therefore denote it by ${\nu}_{s}={\nu}_{s}({\cal K})$ . The sets ${\cal K}(P,s)$ are also assumed to be locally almost Euclidean in the sense that ${\nu}_{s}\approx s^{d}{\nu}$ for some constant ${\nu}={\nu}({\cal K})>0$ . More precisely, we will assume that, for constants $c_{g},\gamma_{g}>0$ ,

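$$\bigg|\frac{{\nu}_{s}}{s^{d}\,{\nu}}-1\bigg|\ \leq\ c_{g}\bigg(\frac{s^{d}\,{\nu}}{L}\bigg)^{\gamma_{g}/d},\quad s>0. \tag{1.4}$$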

The quantity ${\nu}>0$ has physical dimensions $(\mbox{length}/\mbox{time})^{d}$ , so that ${\nu}^{1/d}$ can be interpreted as a local velocity of spread of information in any particular direction. Assumption (1.4) is satisfied, for instance, for balls with respect to geodesic distance on the surface of a $(d+1)$ -dimensional sphere of large radius $R$ , when $L=c_{d}R^{d}$ and

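$$\frac{{\nu}_{s}}{s^{d}\,{\nu}}-1\ =\ \frac{dR^{d}}{s^{d}}\,\int_{0}^{s/R}(\sin t)^{d-1}\,dt-1\ =\ O\bigl((s/R)^{2}\bigr),$$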

(Li, 2011), in which case we can take $\gamma_{g}=2$ in all dimensions $d\geq 2$ .
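The $O((s/R)^{2})$ rate in the sphere example can be checked by a short Taylor expansion. The following worked step is a sketch, assuming the sphere formula reads ${\nu}_{s}/(s^{d}{\nu})=dR^{d}s^{-d}\int_{0}^{s/R}(\sin t)^{d-1}\,dt$; the abbreviation $x=s/R$ is introduced here for convenience only.

```latex
\begin{align*}
(\sin t)^{d-1} &= t^{d-1}\Bigl(1-\tfrac{d-1}{6}\,t^{2}+O(t^{4})\Bigr),\\
\int_{0}^{x}(\sin t)^{d-1}\,dt &= \frac{x^{d}}{d}-\frac{(d-1)\,x^{d+2}}{6(d+2)}+O(x^{d+4}),\qquad x=s/R,\\
\frac{d}{x^{d}}\int_{0}^{x}(\sin t)^{d-1}\,dt-1 &= -\frac{d(d-1)}{6(d+2)}\,x^{2}+O(x^{4}).
\end{align*}
```

Since $L=c_{d}R^{d}$, the quantity $(s/R)^{2}$ is a constant multiple of $(s^{d}{\nu}/L)^{2/d}$, which is how the exponent $\gamma_{g}=2$ arises in this example.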

Using (1.4), the probability of there being no long range transmission before time $u$ is given by

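$$\exp\Bigl\{-\int_{0}^{u}\rho{\nu}_{s}\,ds\Bigr\}\ \approx\ \exp\Bigl\{-\int_{0}^{u}\rho s^{d}{\nu}\,ds\Bigr\}\ =\ \exp\bigl\{-\rho{\nu}u^{d+1}/(d+1)\bigr\},$$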

so that the mean time to the first long range transmission is approximately

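$$\int_{0}^{\infty}\exp\bigl\{-\rho{\nu}u^{d+1}/(d+1)\bigr\}\,du\ =\ (\rho{\nu})^{-1/(d+1)}\int_{0}^{\infty}e^{-w^{d+1}/(d+1)}\,dw.$$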

Thus

$$\lambda\ :=\ (\rho\,d!\,{\nu})^{1/(d+1)},$$

having physical dimensions $(1/\mbox{time})$ , is such that $1/\lambda$ represents the time scale for the first long range transmission, and then $\lambda^{-d}{\nu}$ reflects the size of the initial neighbourhood when the first long range transmission occurs; the exact specification of $\lambda$ is to make it equal to the growth rate of the associated CMJ process ([BR], p.986). For our approximations to be good, the size of the initial neighbourhood when the first long range transmission occurs should be small compared to $L$ , so that, defining

$$\Lambda\ :=\ L\lambda^{d}/{\nu},$$

a quantity without physical dimension, we shall take $\Lambda$ to be large. Note that, if this is so, the approximations made above have small error, in view of (1.4).
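The scales introduced above are easy to evaluate numerically. The sketch below assumes the definitions $\lambda=(\rho\,d!\,{\nu})^{1/(d+1)}$ and $\Lambda=L\lambda^{d}/{\nu}$, and uses the closed form $\int_{0}^{\infty}e^{-w^{d+1}/(d+1)}\,dw=(d+1)^{1/(d+1)}\,\Gamma(1+1/(d+1))$ for the mean time to the first long range transmission. The function names and the parameter values are illustrative assumptions, not taken from the paper.

```python
import math


def growth_rate(rho: float, nu: float, d: int) -> float:
    """lambda := (rho * d! * nu)^(1/(d+1)), with dimension 1/time."""
    return (rho * math.factorial(d) * nu) ** (1.0 / (d + 1))


def scaled_size(L: float, rho: float, nu: float, d: int) -> float:
    """Lambda := L * lambda^d / nu, a dimensionless measure of the size of C."""
    lam = growth_rate(rho, nu, d)
    return L * lam ** d / nu


def mean_first_transmission_time(rho: float, nu: float, d: int) -> float:
    """(rho*nu)^(-1/(d+1)) * int_0^inf exp(-w^(d+1)/(d+1)) dw, in closed form."""
    a = 1.0 / (d + 1)
    return (rho * nu) ** (-a) * (d + 1) ** a * math.gamma(1.0 + a)


if __name__ == "__main__":
    # Illustrative values only: discs in the plane growing at unit speed (nu = pi).
    rho, nu, d, L = 0.5, math.pi, 2, 1.0e6
    lam = growth_rate(rho, nu, d)
    print("lambda =", lam)
    print("Lambda =", scaled_size(L, rho, nu, d))
    # lambda times the mean waiting time depends only on d.
    print("lambda * mean time =", lam * mean_first_transmission_time(rho, nu, d))
```

The last printed quantity does not depend on $\rho$ or ${\nu}$, which is one way to see that $1/\lambda$ is the natural time scale for the first long range transmission.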

To start with, the points of $\Pi$ closely match the birth events of a CMJ process ${\overline{X}}$ , whose birth intensity as a function of age $s$ is given by $\rho{\nu}_{s}$ . In fact, the approximation ${\overline{\cal L}}_{t}$ of ${\cal L}_{t}$ , constructed by using the CMJ process ${\overline{X}}$ to approximate $\Pi$ and with the same sequence of points $(P_{j},\,j\geq 1)$ , is excellent for times $t\leq\alpha\lambda^{-1}\log\Lambda$ if $\alpha<1/2$ ([BR], §2.2), and still gives an approximation to the volume $L_{t}$ of ${\cal L}_{t}$ at time $t$ that is accurate to the first order if $\alpha<1$ ([BR], Theorem 3.2 and (2.23)). This CMJ approximation takes the form

$$L_{t}/L\ \sim\ K\Lambda^{-1}e^{\lambda t+\log W},\quad t\to\infty,$$

for a constant $K$ , where $W$ is a limiting random variable associated with the CMJ process ${\overline{X}}$ . Taking

$$t\ =\ t_{\Lambda}(u)\ :=\ \lambda^{-1}(\log\Lambda+u), \tag{1.8}$$

with $u\leq(\alpha-1)\log\Lambda$ large and negative in the range in which this approximation holds, this implies that $L_{t_{\Lambda}(u-\log W)}/L$ closely follows the curve

$$u\ \mapsto\ \ell_{0}(u),$$

where $\ell_{0}(u):=Ke^{u}$ .
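The substitution behind this curve can be written out explicitly. The following is a sketch, assuming the CMJ approximation reads $L_{t}/L\sim K\Lambda^{-1}e^{\lambda t+\log W}$ and that $t_{\Lambda}(u)=\lambda^{-1}(\log\Lambda+u)$:

```latex
\begin{align*}
\lambda\,t_{\Lambda}(u-\log W) &= \log\Lambda+u-\log W,\\
K\Lambda^{-1}\exp\{\lambda\,t_{\Lambda}(u-\log W)+\log W\}
  &= K\Lambda^{-1}\exp\{\log\Lambda+u\}\ =\ Ke^{u}\ =\ \ell_{0}(u).
\end{align*}
```

The random factor $W$ cancels, so the curve is the same for every realization; the randomness enters only through the time shift $\log W$.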

In [BR], Theorem 3.2, an analogous approximation

$$L_{t_{\Lambda}(u-\log W)}/L\ \approx\ \ell(u+\log{\hat{c}}_{d})$$

is established, with uniformly small error, for all values of $u$ , with ${\hat{c}}_{d}$ defined before (1.11), and with the time shift $U$ given by $\lambda^{-1}\log W+c$ , for a suitably chosen constant $c$ . Clearly, to be compatible with (1.9), $\ell(u)\sim Ke^{u}$ as $u\to-\infty$ , as follows from ([BR], following (2.23)).

For any fixed $u$, the distribution of $L_{t_{\Lambda}(u)}/L$ is close to that of $\ell(u+\log W+\log{\hat{c}}_{d})$, and is a bounded random variable. Hence it can only be approximately normally distributed, after appropriate centring and normalization, in circumstances in which the distribution of $\log W$ is concentrated close to some fixed value. This is not true of the distribution of $W$ at time $0$. However, when predicting from a time $v=\alpha\lambda^{-1}\log\Lambda$ for any fixed $\alpha$, $0<\alpha<1$, the conditional distribution of $W$, given the information up to time $v$, is concentrated close to an approximation $W(v,v)$ provided only that $\alpha>0$, even though the size of the informed set is still relatively small when compared to $L$ for any $\alpha<1$. The aim is now to show that the difference $\Delta(v):=W(v,v)-W$, suitably normalized, is approximately normally distributed.

It turns out to be easier to work with a ‘flattened’ CMJ process ${\widehat{X}}$ , rather than with the original CMJ process ${\overline{X}}$ . The process ${\widehat{X}}$ has birth rate at age $s$ given by $\rho s^{d}{\nu}$ , and is thus the same process for all $L$ , whereas ${\overline{X}}$ depends implicitly on $L$ through the function ${\nu}_{s}$ . The quantity $\lambda$ then turns out to be the Malthusian parameter of ${\widehat{X}}$ . In a CMJ process with Malthusian parameter $\mu$ , at large times, a randomly sampled individual has average age approximately $1/\mu$ . For ${\widehat{X}}$ , $\mu=\lambda$ , and replacing $s$ by $1/\lambda$ in (1.4) confirms that the two CMJ processes ${\overline{X}}$ and ${\widehat{X}}$ have birth rates that are close to each other if $\Lambda$ is large. The essentials of the proof of the normal approximation to $\Delta(v)$ are carried out in Section 2. The argument hinges on examining a collection of (complex valued) martingales $(W_{j}(\cdot),\,0\leq j\leq d)$ associated with ${\widehat{X}}$ , that are defined in (2.13) below. In particular, $W(t,v):=W_{0}(t)$ , $t\geq v$ , is non-negative and square integrable, having limit $W_{0}(\infty)=:W$ . It is then shown that $W_{0}(v)-W$ , suitably normalized, is close enough to the integral of a function $f(W_{0}(v),u)$ with respect to an independent standard Brownian motion $B(u)$ , giving the normal approximation.

The arguments in Section 3, as outlined before (1.2), rely heavily on comparisons between birth and growth processes. The actual process $({\cal L}_{t},\,t\geq 0)$ is compared with the branching approximation ${\overline{X}}$ , and ${\overline{X}}$ is compared to its flattened version ${\widehat{X}}$ . Further (flattened) CMJ processes ${\widehat{X}}^{+}$ and ${\widehat{X}}^{-}$ are then introduced, to act as upper and lower bounds for ${\overline{X}}$ ; the comparison is formalized in Lemma 3.1. All the detailed computations in Section 3 are made using these processes, including the reduction of the intersection probability in Lemma 3.3 to a tractable form in Lemma 3.6.

To state our theorem, we take

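$${\widehat{W}}(v)\ :=\ e^{-\lambda v}\sum_{l=0}^{d}\sum_{j\in{\widehat{J}}_{v}}\frac{\{\lambda(v-\tau_{j})\}^{l}}{l!}$$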

as an approximation to $W$ , where the set ${\widehat{J}}_{v}$ indexes the set of all non-intersecting neighbourhoods of ${\cal L}_{v}$ . For each of these, the radii $(v-\tau_{j})$ can be determined, and so ${\widehat{W}}(v)$ can be derived from ${\cal L}_{v}$ . Then let ${\hat{c}}_{d}:=d!/(d+1)$ , and

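$$\zeta(d)\ :=\ \begin{cases}1/2&\mbox{if }d\leq 6,\\ 1-\cos(2\pi/d)&\mbox{if }d\geq 7,\end{cases} \tag{1.11}$$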

and define

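$$\ell(u)\ :=\ 1-\phi_{\infty}(e^{u}),\quad\mbox{where}\quad\phi_{\infty}(\theta)\ :=\ {\mathbb{E}}\{e^{-\theta W}\}, \tag{1.12}$$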

where $W$ is as above; see also (2.13) and (2.18). Let $d_{{\rm BW}}$ denote the bounded Wasserstein distance between probability measures on ${\mathbb{R}}$ :

$$d_{{\rm BW}}(P,Q)\ :=\ \sup_{f\in F_{{\rm BW}}}\Bigl\{\Bigl|\int f\,dP-\int f\,dQ\Bigr|\Bigr\},$$

where $F_{{\rm BW}}$ consists of all Lipschitz functions $f\colon{\mathbb{R}}\to[-1,1]$ whose Lipschitz constant is at most $1$ . The theorem is as follows.

Theorem 1.1

With the above definitions, suppose that $v=\alpha\lambda^{-1}\log\Lambda$ for $0<\alpha<2\min\{\gamma_{g}/d,\zeta(d)/(1+\zeta(d))\}$ , where $\gamma_{g}$ is as in (1.4). Then, for any $u_{1}<u_{0}\in{\mathbb{R}}$ , there exists a $\gamma>0$ and an event $E^{*}(v)\in\sigma({\cal L}_{v})$ with ${\mathbb{P}}[E^{*}(v)^{c}]=\mathop{{}\mathrm{O}}(\Lambda^{-\gamma})$ such that

[TABLE]

uniformly in $u_{1}\leq u\leq u_{0}$ , where $t_{\Lambda}(u)=\lambda^{-1}(\log\Lambda+u)$ as in (1.8) and

$$\sigma^{2}(u,w)\ :=\ \frac{\{D\ell(u+\log[{\hat{c}}_{d}w])\}^{2}}{(d+1)w}.$$

So, for instance, for spherical neighbourhoods in $d\leq 6$, it is possible to take any $\alpha$ strictly between $0$ and $2/3$ in Theorem 1.1. The order statements can be replaced by inequalities, valid for all $\Lambda$ sufficiently large, in which the constants depend only on $d,u_{1}$ and $u_{0}$; however, the lower bound on the value of $\Lambda$ then also involves $\alpha$ and the constants $c_{g}$ and $\gamma_{g}$ from (1.4).
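The admissible range of $\alpha$ in Theorem 1.1 is a simple function of $d$ and $\gamma_{g}$. The sketch below evaluates the upper limit $2\min\{\gamma_{g}/d,\zeta(d)/(1+\zeta(d))\}$, assuming that $\zeta(d)$ in (1.11) reads $1/2$ for $d\leq 6$ and $1-\cos(2\pi/d)$ for $d\geq 7$, as recovered from the extracted equation list. The function names are hypothetical.

```python
import math


def zeta(d: int) -> float:
    """zeta(d), assumed form of (1.11): 1/2 for d <= 6, else 1 - cos(2*pi/d)."""
    return 0.5 if d <= 6 else 1.0 - math.cos(2.0 * math.pi / d)


def alpha_upper_bound(d: int, gamma_g: float) -> float:
    """Upper limit 2 * min{gamma_g / d, zeta(d) / (1 + zeta(d))} for alpha."""
    z = zeta(d)
    return 2.0 * min(gamma_g / d, z / (1.0 + z))


if __name__ == "__main__":
    # Spherical neighbourhoods: gamma_g = 2 in all dimensions d >= 2.
    for d in range(2, 10):
        print(d, alpha_upper_bound(d, 2.0))
```

With $\gamma_{g}=2$ the bound equals $2/3$ for $2\leq d\leq 6$, matching the remark above, and is strictly smaller for $d\geq 7$.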

In fact, the proof shows a little more: that we could realize the normal random variables ${\cal N}(0,\sigma^{2}(u,W(v,v)))$ , for different values of $u$ , as $\sigma(u,W(v,v))N$ for the same standard normal random variable $N$ . The interpretation of this is that the fluctuations in $L_{t_{\Lambda}(u)}/L$ are essentially those of $\ell(u+\log[{\hat{c}}_{d}W])$ , and that the remaining randomness after time $v$ is overwhelmingly that of the difference $W-W(v,v)$ , a single random variable. This, at first sight surprising, result reflects the phenomenon common to branching processes, that the randomness determining the growth of a super-critical branching process occurs at the very beginning of its development.

2 The branching process

In this section, we investigate the limit $W$ , as $t\to\infty$ , of a martingale $W(t)$ associated with a particular CMJ branching process. We show that $(W(t)-W)$ is approximately normally distributed, and give an explicit bound on the accuracy of the approximation. Although, for a (multitype) Galton–Watson process, a central limit theorem of this sort is not difficult to establish (Asmussen & Hering, Theorem 7.1), the corresponding theorems for general CMJ processes seem not to be available. Here, we are able to exploit the particular structure of our CMJ process to prove what we need.

We start by identifying the branching process that we work with, which can be expressed as a Markov process in a $(d+1)$ -dimensional space. The properties of the coordinate processes $(H_{j}(t),\,0\leq j\leq d)$ , and of some equivalent (complex valued) martingales $(W_{j}(t),\,0\leq j\leq d)$ are established in Lemma 2.1. The component $W_{0}$ is a non-negative real valued martingale, and $W$ is its limit as $t\to\infty$ . Using Kolmogorov’s inequality, the fluctuations of the sample paths of the processes $W_{j}$ are controlled in Lemma 2.2, and this in turn gives control over the processes $H_{j}$ .

The martingale difference $W_{0}(v+t)-W_{0}(v)$ is written in (2.23) as an integral of an explicit function of the process $H_{d+1}(u):=\lambda\int_{0}^{u}H_{d}(w)\,dw$ with respect to a standard compensated Poisson process. Using the control that we have over the $H_{j}$, we determine successively simpler approximations to this process, in (2.29) and (2.31), at each stage making sure that the error incurred is sufficiently small (Lemma 2.4 and Corollary 2.6). Finally, in (2.35), an expression is obtained in which integration with respect to the compensated Poisson process has been replaced by integration with respect to standard Brownian motion, and this can be used with an error controlled in Lemma 2.7. The results of these steps are collected as a functional approximation in Theorem 2.8. The version that is used to prove Theorem 3.9 in Section 3 is given as Corollary 2.10.

2.1 Properties of the flattened process

The first step is to determine a suitable $W$ . We do so by way of a ‘flattened’ version ${\widehat{X}}$ of the CMJ branching process ${\overline{X}}$ . The process ${\widehat{X}}$ is the counting process associated with a point process $({\hat{\tau}}_{j},\,j\geq 0)$ on ${\mathbb{R}}_{+}$ , with ${\hat{\tau}}_{0}=0$ a.s., whose compensator is given by ${\widehat{A}}(t):=\int_{0}^{t}{\hat{a}}(u)\,du$ , where ${\hat{a}}(u):=\rho{\nu}\sum_{j:\,{\hat{\tau}}_{j}\leq u}(u-{\hat{\tau}}_{j})^{d}$ , and where $\rho$ , as before, denotes the intensity per unit volume. At time $t$ , ${\widehat{X}}(t)$ can be thought of as consisting of $M_{0}(t):=1+\max\{r\geq 0\colon\,{\hat{\tau}}_{r}\leq t\}$ neighbourhoods, whose volumes at time $t$ are given by $(t-{\hat{\tau}}_{r})^{d}{\nu}$ , asymptotically close to, but not the same as the volume ${\nu}_{t-{\hat{\tau}}_{r}}$ . The intensity ${\hat{a}}$ is then precisely that of a CMJ process, in which neighbourhoods play the part of individuals, and the point process $\xi$ of an individual’s offspring is an inhomogeneous Poisson process with rate $\rho{\nu}s^{d}$ at age $s$ . The mean number of offspring of an individual is thus infinite, but the Malthusian parameter $\lambda$ , chosen so that the equation

$$\int_{0}^{\infty}e^{-\lambda s}\rho{\nu}s^{d}\,ds\ =\ 1$$

is satisfied, is finite, and is given by $\lambda:=(d!\rho{\nu})^{1/(d+1)}$ . Note that

$$(\xi(t),\,t\geq 0)\ =_{d}\ (\xi^{1}(\lambda t),\,t\geq 0),$$

where $\xi^{1}$ is the inhomogeneous Poisson process with rate $s^{d}/d!$ at age $s$ .
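The flattened process described above is straightforward to simulate. The sketch below is an illustration only, not the paper's construction: it tracks the power sums $M_{k}=\sum_{j}(\text{age}_{j})^{k}$, $0\leq k\leq d$, draws each waiting time by inverting the compensator increment $\rho{\nu}\int_{0}^{h}\sum_{j}(\text{age}_{j}+x)^{d}\,dx$ against a unit exponential, and evaluates $e^{-\lambda t}\sum_{l=0}^{d}M_{l}\lambda^{l}/l!$, which has the same form as ${\widehat{W}}(v)$. All function names, the bisection solver and the parameter values are assumptions made for the example.

```python
import math
import random


def simulate_flattened_cmj(rho_nu: float, d: int, horizon: float, rng: random.Random):
    """Simulate birth times of the flattened process on [0, horizon].

    Returns (birth_times, M), where M[k] is the sum over neighbourhoods of
    (age at the horizon)^k for k = 0..d.
    """
    M = [1.0] + [0.0] * d  # a single neighbourhood of age 0 at time 0
    t, births = 0.0, [0.0]

    def advance(M, h):
        # Binomial expansion: shift every age by h and recompute the power sums.
        return [sum(math.comb(k, i) * h ** (k - i) * M[i] for i in range(k + 1))
                for k in range(d + 1)]

    def compensator_increment(M, h):
        # rho*nu * int_0^h sum_j (age_j + x)^d dx, expanded in powers of h.
        return rho_nu / (d + 1) * sum(
            math.comb(d + 1, i) * h ** (d + 1 - i) * M[i] for i in range(d + 1))

    while True:
        target = rng.expovariate(1.0)  # unit exponential clock
        lo, hi = 0.0, 1.0
        while compensator_increment(M, hi) < target:
            hi *= 2.0
        for _ in range(60):  # bisection for the waiting time h
            mid = 0.5 * (lo + hi)
            if compensator_increment(M, mid) < target:
                lo = mid
            else:
                hi = mid
        h = 0.5 * (lo + hi)
        if t + h > horizon:
            return births, advance(M, horizon - t)
        M = advance(M, h)
        M[0] += 1.0  # the newborn neighbourhood has age 0
        t += h
        births.append(t)


def martingale_value(M, lam: float, t: float) -> float:
    """e^{-lam t} * sum_l M_l lam^l / l!, the same form as W-hat(v)."""
    return math.exp(-lam * t) * sum(
        M[l] * lam ** l / math.factorial(l) for l in range(len(M)))


if __name__ == "__main__":
    d, rho_nu, horizon = 2, 0.5, 6.0  # illustrative values; here lambda = 1
    lam = (math.factorial(d) * rho_nu) ** (1.0 / (d + 1))
    births, M = simulate_flattened_cmj(rho_nu, d, horizon, random.Random(0))
    print(len(births), martingale_value(M, lam, horizon))
```

Tracking power sums rather than individual ages keeps the cost per birth independent of the population size, and it mirrors the finite-dimensional structure that the rest of this section exploits.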

We can immediately deduce some useful general properties of the process ${\widehat{X}}$ . To start with, because the variance of the discounted offspring number $\int_{0}^{\infty}e^{-\lambda s}\xi(ds)$ is finite, being given by $\int_{0}^{\infty}e^{-2\lambda s}\rho{\nu}s^{d}\,ds$ , it follows from Ganuza & Durham (1974, Theorem 1) that there exist finite constants $c_{1}$ and $c_{2}$ such that, for all $u>0$ ,

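$$e^{-\lambda u}{\mathbb{E}}M_{0}(u)\ \leq\ c_{1};\qquad e^{-2\lambda u}{\mathbb{E}}\{M_{0}^{2}(u)\}\ \leq\ c_{2}; \tag{2.2}$$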

in view of (2.1), $c_{1}$ and $c_{2}$ depend only on $d$ . Then the intensity ${\hat{a}}(u)$ can be expressed as $\rho{\nu}M_{d}(u)$ , where

$$M_{d}(u)\ =\ u^{d}+\int_{(0,u]}(u-v)^{d}M_{0}(dv)\ =\ d\int_{0}^{u}(u-v)^{d-1}M_{0}(v)\,dv.$$

This in turn implies from (2.2) that

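$$e^{-\lambda u}{\mathbb{E}}M_{d}(u)\ \leq\ c_{1}d!\lambda^{-d};\qquad e^{-2\lambda u}{\mathbb{E}}\{M_{d}^{2}(u)\}\ \leq\ c_{2}\{d!\lambda^{-d}\}^{2},\quad u>0,$$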

using Cauchy–Schwarz for the second inequality.

However, ${\widehat{X}}$ also has special structure that will prove useful in what follows, relating to the sums

$$M_{l}(t)\ =\ \sum_{j=1}^{M_{0}(t)}(t-\tau_{j-1})^{l},\quad l\geq 1,$$

of the $l$ -th powers of the ages of the neighbourhoods. Note that $M_{d}(t)$ is as defined previously, and that

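$$\begin{array}{rl}\dfrac{d}{dt}M_{1}(t)&=\ M_{0}(t)\quad\mbox{for a.e. }t;\\[8pt] \dfrac{d}{dt}M_{i}(t)&=\ iM_{i-1}(t),\quad i\geq 2.\end{array} \tag{2.6}$$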

Since $M_{0}$ has intensity ${\hat{a}}=\rho{\nu}M_{d}$ , letting $Z$ denote a unit rate Poisson process, we can write

$$M_{0}(t)\ =\ M_{0}(0)+Z\Bigl(\rho{\nu}\int_{0}^{t}M_{d}(u)\,du\Bigr). \tag{2.7}$$

Defining $H_{i}(t):=M_{i}(t)\lambda^{i}/i!$ , for any $\lambda>0$ , the equations (2.6) reduce to

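$$\begin{array}{rl}\dfrac{d}{dt}H_{1}(t)&=\ \lambda H_{0}(t)\quad\mbox{for a.e. }t;\\[8pt] \dfrac{d}{dt}H_{i}(t)&=\ \lambda H_{i-1}(t),\quad i\geq 2;\end{array} \tag{2.8}$$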

with the particular choice $\lambda:=(d!\rho{\nu})^{1/(d+1)}$ , equation (2.7) becomes

[TABLE]

so that ${\widehat{A}}(t)=H_{d+1}(t)$ . In particular, from (2.8) and (2.9), it follows that the process ${\widetilde{H}}$ defined by

[TABLE]

is a Markov process. It also follows directly from (2.8) and (2.9), or as a consequence of (2.1), that

[TABLE]

where ${\widetilde{H}}^{1}$ denotes the process with $\lambda=1$ . Note that $\rho$ may depend on $L$ , as also may $\lambda$ .
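The reduction from the $M_{i}$ to the $H_{i}$ is a one-line computation, sketched here under the assumptions that $\frac{d}{dt}M_{i}(t)=iM_{i-1}(t)$ for $i\geq 1$ (almost everywhere for $i=1$) and that $H_{i}(t)=M_{i}(t)\lambda^{i}/i!$:

```latex
\begin{align*}
\frac{d}{dt}H_{i}(t) &= \frac{\lambda^{i}}{i!}\,\frac{d}{dt}M_{i}(t)
   \ =\ \frac{\lambda^{i}}{i!}\,i\,M_{i-1}(t)
   \ =\ \lambda\,\frac{\lambda^{i-1}}{(i-1)!}\,M_{i-1}(t)
   \ =\ \lambda H_{i-1}(t),\qquad i\geq 1,\\
\rho{\nu}M_{d}(u) &= \rho{\nu}\,d!\,\lambda^{-d}H_{d}(u)
   \ =\ \lambda^{d+1}\lambda^{-d}H_{d}(u)\ =\ \lambda H_{d}(u)
   \qquad\text{when }\lambda^{d+1}=d!\,\rho{\nu}.
\end{align*}
```

The second line shows why the particular choice of $\lambda$ matters: the birth intensity $\rho{\nu}M_{d}$ becomes $\lambda H_{d}$, so the compensator ${\widehat{A}}(t)=\int_{0}^{t}\lambda H_{d}(u)\,du$ is the next coordinate $H_{d+1}(t)$ in the same chain of equations.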

In order to describe the properties of the process ${\widehat{X}}$ in more detail, we introduce the (complex valued) processes

[TABLE]

where $x_{j}:=\exp\{2\pi{\imath}j/(d+1)\}\in{\mathbb{C}}$ , $j\in\{0,1,\ldots,d\}$ , which are martingales with respect to the natural filtration $({\widehat{\cal F}}_{t},\,t\geq 0)$ of ${\widehat{X}}$ . In particular, for $j=0$ , we have $x_{j}=1$ , and

[TABLE]

is a real valued, càdlàg martingale, and plays a key part in our arguments. It is shown in the next lemma that it is also non-negative, and the rest of the section is then devoted to proving a normal approximation to $e^{\lambda t/2}(W(t)-W(\infty))$, which is the basis for the central limit theorem for the gossip process itself. Note that the distribution of $W(\cdot)$ can be derived from the corresponding martingale $W^{1}(\cdot)$ for the process with $\lambda=1$, since, from (2.11),

[TABLE]

from this, it also follows that the distribution of $W(\infty)$ is the same for all $\lambda$ . The remaining martingales $W_{j}$ are useful, because they enable the quantities $H_{j}(\cdot)$ to be expressed in a tractable form, as in the next lemma.
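For intuition, the flattened process can be simulated through its time change by a unit rate Poisson process. The sketch below rests on two assumptions that are not confirmed by the displays reproduced here: that births after the initial one occur when the integrated intensity $\rho{\nu}M_{d+1}(t)/(d+1)$ crosses the successive points of a unit rate Poisson process, and that $W(t)=e^{-\lambda t}\sum_{l=0}^{d}H_{l}(t)$, a form suggested by the later bound $H_{d}(u)\leq e^{\lambda u}W(u)$. All function names are hypothetical.

```python
import math
import random


def integrated_intensity(births, t, d, rho, nu):
    """Integral over [0, t] of rho * nu * M_d(u) for the births recorded so far."""
    return rho * nu * sum((t - tau) ** (d + 1) for tau in births if tau <= t) / (d + 1)


def simulate_births(d, rho, nu, t_end, rng):
    """Birth times on [0, t_end], starting from one neighbourhood born at time 0."""
    births = [0.0]
    target = 0.0
    while True:
        # Next point of the unit rate Poisson process on the time-changed scale.
        target += rng.expovariate(1.0)
        if integrated_intensity(births, t_end, d, rho, nu) < target:
            return births
        # Solve integrated_intensity(t) = target by bisection (it is increasing in t).
        lo, hi = births[-1], t_end
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if integrated_intensity(births, mid, d, rho, nu) < target:
                lo = mid
            else:
                hi = mid
        births.append(hi)


def martingale_W(births, t, d, lam):
    """Assumed form W(t) = exp(-lambda t) * sum_{l=0}^{d} H_l(t)."""
    total = 0.0
    for l in range(d + 1):
        m_l = sum((t - tau) ** l for tau in births if tau <= t)
        total += m_l * lam ** l / math.factorial(l)
    return math.exp(-lam * t) * total


d, rho, nu = 2, 1.0, 1.0  # hypothetical parameter values
lam = (math.factorial(d) * rho * nu) ** (1.0 / (d + 1))
rng = random.Random(2018)
path = simulate_births(d, rho, nu, 3.0, rng)
print(len(path), martingale_W(path, 3.0, d, lam))
```

Averaging `martingale_W` over many independent runs gives a rough check that the assumed $W$ has constant mean, as a martingale started at $1$ should.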

Lemma 2.1

With notation as above, we have

[TABLE]

and

[TABLE]

Proof: It follows from (2.8) that, for any $x\in{\mathbb{C}}$ ,

[TABLE]

and, by partial integration, that

[TABLE]

Hence

[TABLE]

and thus

[TABLE]

Taking $x=x_{j}$ for any $j\in\{0,1,\ldots,d\}$ , we have $x^{d+1}=1$ , making the right hand side equal to $W_{j}(t)$ , because $\lambda H_{d}(u)\,du=H_{d+1}(du)={\widehat{A}}(du)$ , by (2.8) and (2.9); hence

[TABLE]

The first statement of the lemma follows by taking $j=0$ , and the second by using the orthogonality relation $\sum_{l=0}^{d}x_{j}^{l}x_{l}^{-r}=(d+1)\delta_{jr}$ .

Now, writing $r_{j}:=\Re{x_{j}}$ and noting that ${\hat{a}}(u)=\lambda H_{d}(u)\leq\lambda e^{\lambda u}W(u)$ , it follows from (2.12) that, for $0\leq j\leq d$ and for $v<t<w$ ,

[TABLE]

Using this bound with $v=0$ , we see that the variances of the terms with $1\leq l\leq d$ in the sum in Lemma 2.1 converge to zero as $t\to\infty$ . However, the term with $l=0$ remains significant as $t\to\infty$ , since, by (2.17) with $v=0$ and $j=0$ , it follows that $W(\cdot)$ is square integrable, and that

[TABLE]

Note that the distribution of $W$ , through its Laplace transform $\phi_{\infty}$ as in (1.12), already appears in the statement of Theorem 1.1, and is the same for all $\lambda$ , as remarked following (2.14). Thus each of the $H_{j}$ satisfies

[TABLE]

We shall exploit more detailed versions of these asymptotics in Section 3.

In order to use Lemma 2.1 to describe further the behaviour of the $H_{j}(t)$ , we need good control of the fluctuations of the processes $(W_{l},\,0\leq l\leq d)$ . As indicated by (2.17), their asymptotic behaviour depends substantially on whether or not $r_{l}>1/2$ . Note, for future reference, that $\min\{(1-r_{1}),1/2\}=\zeta(d)$ , where $\zeta(d)$ is as in (1.11).
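The trichotomy in $r_{l}$ is easy to tabulate. The sketch below, with hypothetical function names, computes the roots of unity $x_{j}=\exp\{2\pi{\imath}j/(d+1)\}$, their real parts $r_{j}$, and the quantity $\min\{(1-r_{1}),1/2\}$ that the text identifies with $\zeta(d)$. It also checks the orthogonality of the roots numerically, written with a complex conjugate on one factor.

```python
import cmath
import math


def roots_of_unity(d):
    """x_j = exp(2*pi*i*j/(d+1)) for j = 0, ..., d."""
    return [cmath.exp(2j * math.pi * j / (d + 1)) for j in range(d + 1)]


def zeta(d):
    """min{1 - r_1, 1/2}, with r_1 the real part of x_1 (requires d >= 1)."""
    r1 = roots_of_unity(d)[1].real
    return min(1.0 - r1, 0.5)


def regime(d, l):
    """Classify r_l relative to the threshold 1/2."""
    r = roots_of_unity(d)[l].real
    if abs(r - 0.5) < 1e-12:
        return "r_l = 1/2"
    return "r_l > 1/2" if r > 0.5 else "r_l < 1/2"


def orthogonality_holds(d, tol=1e-9):
    """Check sum_l x_j^l * conj(x_l^r) = (d+1) * delta_{jr} for all j, r."""
    x = roots_of_unity(d)
    for j in range(d + 1):
        for r in range(d + 1):
            s = sum(x[j] ** l * (x[l] ** r).conjugate() for l in range(d + 1))
            expected = d + 1 if j == r else 0
            if abs(s - expected) > tol:
                return False
    return True


for dim in range(1, 9):
    print(dim, regime(dim, 1), zeta(dim))
```

Since $r_{1}=\cos\{2\pi/(d+1)\}$, the boundary case $r_{1}=1/2$ occurs at $d=5$, and $r_{1}>1/2$ for $d\geq 6$.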

Lemma 2.2

For any $1\leq l\leq d$ and $0<\eta<\min\{(1-r_{l}),1/2\}$ , and for any $K>0$ , define the events

[TABLE]

similarly, for $0<\eta<1/2$ , define

[TABLE]

Then there exist constants $C(l,\eta)$ , $0\leq l\leq d$ , such that, for all $K>0$ ,

[TABLE]

Proof: Combining (2.16) with (2.10), it follows that ${\cal L}\bigl{(}(W_{0}(s),\ldots,W_{d}(s)),\,s\geq v\,|\,{\widehat{\cal F}}_{v}\bigr{)}$ depends on ${\widehat{\cal F}}_{v}$ only through the value of ${\widetilde{H}}(v)$ . Then, noting that, for $r+\eta\leq 1$ , $1\leq l\leq d$ and for any $w>t\geq v$ ,

[TABLE]

and using Kolmogorov’s inequality on the real and imaginary parts of $W_{l}$ , it follows that

[TABLE]

For $r_{l}>1/2$ , taking $w=\infty$ , it follows from (2.17) that

[TABLE]

For $r_{l}=1/2$ , taking $t=v+j\lambda^{-1}$ and $w=v+(j+1)\lambda^{-1}$ , it follows from (2.17) that

[TABLE]

and adding over $j\in{\mathbb{Z}}_{+}$ gives

[TABLE]

For $r_{l}<1/2$ , taking $t=v+j\lambda^{-1}$ and $w=v+(j+1)\lambda^{-1}$ , it follows from (2.17) that

[TABLE]

and adding over $j\in{\mathbb{Z}}_{+}$ gives

[TABLE]

For $l=0$ , the result is proved in analogous fashion, starting from

[TABLE]

and observing that, from (2.17),

[TABLE]

As a result of this lemma, we can sharpen (2.19) by giving an explicit bound on the error made when approximating $e^{-\lambda t}H_{j}(t)$ by $W(v)/(d+1)$ for any $t\geq v$ . To state the bound, we define

[TABLE]

noting that, on $E_{1}^{\eta}(v)$ , $Q(t)\leq Q(v)+d$ for all $t\geq v$ . Then for all $t\geq v$ and $0\leq j\leq d$ , and if $\eta<\zeta(d)$ , we have

[TABLE]

on $E_{1}^{\eta}(v)$ . Furthermore, from Lemma 2.2,

[TABLE]

2.2 Approximating an integral representation of $W(v+t)-W(v)$

The aim of this section is to prove an approximation theorem, when $v$ is large, for the process $X_{v}^{(0)}(t):=W(v+t)-W(v)$ in $t\geq 0$ . We recall (2.7) and (2.9), and use the representation (2.12), writing

[TABLE]

where $Z^{(1)}$ is a unit rate Poisson process, with increments independent of ${\widehat{\cal F}}_{v}$ , starting with $Z^{(1)}(H_{d+1}(v))=M_{0}(v)=H_{0}(v)$ , and where $H_{l}(u)$ , $l\geq 0$ , are constructed in $u\geq v$ from the Poisson process $Z^{(1)}$ , using (2.8) and (2.9), with initial values $H_{l}(v)$ , $0\leq l\leq d$ . Once again, the process $X_{v}^{(0)}$ depends on its past ${\widehat{\cal F}}_{v}$ only through ${\widetilde{H}}(v)$ . Since the expression (2.23) is too complicated to use directly, we simplify it in a series of stages.

We start by approximating $H_{d+1}^{-1}(w)$ in $w\geq H_{d+1}(v)$ . In view of (2.21), we have $H_{d+1}(t)\approx e^{\lambda t}W(v)/(d+1)$ , or $w\approx e^{\lambda H_{d+1}^{-1}(w)}W(v)/(d+1)$ ; the precise result is as follows. Note that, for our purposes, $\gamma^{\eta}(v)$ can be thought of as small.

Lemma 2.3

Fix any $\eta<\zeta(d)$ . Then, on the event $E_{1}^{\eta}(v)$ , we have

[TABLE]

for all $w\geq\{W(v)/(d+1)\}e^{\lambda v}$ , where $\gamma^{\eta}(v):=(d+1)\{Q(v)/W(v)\}e^{-\lambda\eta v}$ , $H^{*}(v):=H_{d+1}(v)-e^{\lambda v}W(v)/(d+1)$ , and $Q(v)$ is as defined in (2.20).

Proof: We begin by noting that $H_{d+1}(u)=\int_{0}^{u}\lambda H_{d}(t)\,dt$ , so that, from (2.21), for $u\geq v$ ,

[TABLE]

So, defining

[TABLE]

it follows that, on $E^{\eta}_{1}(v)$ ,

[TABLE]

Now substitute $s=t_{v}^{-1}(u)$ into (2.26) for $u\geq 0$ , giving

[TABLE]

Writing $w=H_{d+1}(u+v)$ and inverting, it then follows immediately that

[TABLE]

establishing the lemma.

This now allows (2.23) to be rewritten in the form

[TABLE]

where $Z^{(2)}$ is a unit rate Poisson process, with respect to which both upper limit and integrand are predictable, the latter being decreasing in $w$ and bounded between

[TABLE]

for all $w\geq 0$ , on the event $E_{1}^{\eta}(v)$ . In order to show that we can replace both the integrand and the upper limit of integration in (2.27) with simpler expressions, without making too great an error, we use Lemma 4.1 from the Appendix.

We first replace the integrand in (2.27), showing that $X_{v}^{(0)}$ is close to $X_{v}^{(1)}$ , defined by

[TABLE]

using (2.28). We set

[TABLE]

Lemma 2.4

With the above definitions, for any $\eta<\zeta(d)$ and any $v\geq v_{-}(\eta)$ , we have

[TABLE]

where $\theta_{1}(v)$ is as in (2.22), and $\tilde{\theta}_{2}(v):=2e^{-W(v)e^{\lambda\eta v}/\{2e\}}$ .

Proof: It follows from (2.27) that $X(t)=X_{v}^{(0)}(t)-X_{v}^{(1)}(t)$ is an integral of the form considered in Lemma 4.1, albeit with a random upper limit, and its corresponding function $F$ satisfies

[TABLE]

on $E_{1}^{\eta}(v)$ , in view of (2.28). We can thus apply Lemma 4.1 to the process $\widetilde{X}$ with $\widetilde{F}(t):=F(t){\bf{1}}\{|F(u)|\leq G(u),\,0\leq u<t\}$ and with $\widetilde{G}(u):=G(u)$ as in (2.30), noting that then, recalling (2.22),

[TABLE]

Now, from (2.30), we have $\widetilde{G}_{2}(0,\infty)=\{\gamma^{\eta}(v)\}^{2}\{W(v)/(d+1)\}e^{-\lambda v}$ . We can then choose $a:=e^{-\lambda v/2}\{W(v)Q(v)\gamma^{\eta}(v)\}^{1/2}$ in Lemma 4.1, because

[TABLE]

if $v\geq v_{-}(\eta)$ , and the result follows.

The next step is to simplify the upper limit in (2.29), using Lemma 4.1 to show that, with $t_{v}(s)$ as defined in (2.25), $(X_{v}^{(1)}(t_{v}(s)),\,s\geq 0)$ is close to the process $(X_{v}^{(2)}(s),\,s\geq 0)$ given by

[TABLE]

For this, we need to control $\sup_{s\geq 0,\,|z|<h_{v}(s)}|X_{v}^{(2)}(s+z)-X_{v}^{(2)}(s)|$ , for $h_{v}(s)$ defined in (2.26).

Lemma 2.5

With the definitions given in (2.26), (2.29) and (2.31), and for any $\eta<\zeta(d)$ , we have

[TABLE]

where $\varepsilon^{\eta}(v):=\{W(v)Q(v)\}^{1/2}e^{-\lambda\eta v/3}$ , $g(v):=Q(v)(d+2)e^{-\lambda\eta v}$ and

[TABLE]

Proof: We consider the ranges $0\leq s\leq W(v)e^{\lambda v}$ and $s>W(v)e^{\lambda v}$ separately. In the first range of $s$ , define $s_{j}:=je^{\lambda v}g(v)$ for $0\leq j\leq M:=\lfloor W(v)/g(v)\rfloor$ , and set $s_{M+1}:=W(v)e^{\lambda v}$ : then $s_{j+1}-s_{j}\geq h_{v}(s_{j})$ for each $j$ . By Lemma 4.1, with $G(u)$ the constant $e^{-\lambda v}$ and $a:=e^{-\lambda v/2}\varepsilon^{\eta}(v)$ , we have

[TABLE]

for $0\leq j\leq M$ , since $a\leq eg(v)=eG(s_{j})(s_{j+1}-s_{j})$ on $E_{21}^{\eta}(v)$ . Hence, by a standard argument,

[TABLE]

In the second range of $s$ , we define

[TABLE]

noting that $s_{j+1}-s_{j}=s_{j}{\tilde{g}}(v)\geq h_{v}(s_{j})$ . By Lemma 4.1 with $G(u):=s_{j}^{-1}\{W(v)/(d+1)\}$ , we have

[TABLE]

since $a:=e^{-\lambda v/2}\varepsilon^{\eta}(v)\leq eg(v)=e\{W(v)/(d+1)\}{\tilde{g}}(v)=eG(s_{j})(s_{j+1}-s_{j})$ on $E_{21}^{\eta}(v)$ , and hence

[TABLE]

since also $\{\varepsilon^{\eta}(v)^{2}\}(d+1)^{2}/(2eW(v))\leq 1$ on $E_{21}^{\eta}(v)$ . We need $4\varepsilon^{\eta}(v)$ here as the bound on the supremum difference, rather than the usual $3\varepsilon^{\eta}(v)$ , because it is possible to have $s(1-{\tilde{g}}(v))<s_{j-1}$ for some $s_{j}<s<s_{j+1}$ ; however, it then has to be the case that, for such $s$ , $s(1-{\tilde{g}}(v))\geq s_{j-2}$ if ${\tilde{g}}(v)\leq 1/2$ , which is the case on $E_{21}^{\eta}(v)$ .

In view of Lemma 2.5 and (2.26), we immediately have the following corollary.

Corollary 2.6

With the definitions of Lemma 2.5,

[TABLE]

We now show that $X_{v}^{(2)}$ is close in distribution to the process $X_{v}^{(3)}$ defined by

[TABLE]

where, for the integrator, the compensated Poisson process $Z^{(2)}(w)-w$ from $X_{v}^{(2)}$ has been replaced by a standard Brownian motion $B(w)$ . Note that $e^{\lambda v/2}X_{v}^{(3)}$ is itself just a time-changed Brownian motion:

[TABLE]

and so, conditional on $W(v)$ , $e^{\lambda v/2}X_{v}^{(3)}(\infty)\sim{\cal N}(0,W(v)/(d+1))$ .
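The limiting variance can be checked directly. The sketch below assumes, as an inference from the integrand $F(u)=W(v)/\{u(d+1)+W(v)e^{\lambda v}\}$ used in the proof of Lemma 2.7, that $X_{v}^{(3)}(s)=\int_{0}^{s}F(u)\,dB(u)$; the defining display is not reproduced here, so this form is an assumption. Under it, the variance of $X_{v}^{(3)}(s)$ given $W(v)$ is $\int_{0}^{s}F(u)^{2}\,du$, which is compared with its closed form. The numerical values are hypothetical.

```python
import math


def integrand_F(u, w_v, lam, v, d):
    """Assumed integrand F(u) = W(v) / (u*(d+1) + W(v)*exp(lambda*v))."""
    return w_v / (u * (d + 1) + w_v * math.exp(lam * v))


def variance_closed_form(s, w_v, lam, v, d):
    """Exact value of the integral of F(u)^2 over [0, s]."""
    base = w_v * math.exp(lam * v)
    return (w_v ** 2 / (d + 1)) * (1.0 / base - 1.0 / (s * (d + 1) + base))


def variance_numeric(s, w_v, lam, v, d, n=20000):
    """Composite Simpson approximation to the same integral (n must be even)."""
    h = s / n
    total = integrand_F(0.0, w_v, lam, v, d) ** 2 + integrand_F(s, w_v, lam, v, d) ** 2
    for k in range(1, n):
        weight = 4 if k % 2 else 2
        total += weight * integrand_F(k * h, w_v, lam, v, d) ** 2
    return total * h / 3.0


w_v, lam, v, d = 0.8, 1.0, 2.0, 2  # hypothetical values
print(variance_numeric(50.0, w_v, lam, v, d), variance_closed_form(50.0, w_v, lam, v, d))
# Normalized limiting variance, to be compared with W(v)/(d+1).
print(math.exp(lam * v) * variance_closed_form(1e12, w_v, lam, v, d), w_v / (d + 1))
```

As $s\to\infty$ the closed form tends to $W(v)e^{-\lambda v}/(d+1)$, so that, under the stated assumption, multiplying by $e^{\lambda v}$ gives the variance $W(v)/(d+1)$ on the normalized scale.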

Lemma 2.7

Fix $r\geq 1$ . Then there are constants $c_{r1}$ and $c_{r2}$ , depending only on $d$ , with the following properties. For all $v$ such that $\lambda v\geq c_{r1}$ , it is possible to construct $X_{v}^{(2)}$ and $X_{v}^{(3)}$ on the same probability space, in such a way that

[TABLE]

Proof: For any $r\geq 1$ , there are constants $C_{r},K_{r}$ with the property that, for any $n\geq 1$ , a standard Poisson process $Z$ and a standard Brownian motion $B$ can be constructed on the same probability space in such a way that ${\mathbb{P}}[A_{r}^{c}(n)]\leq K_{r}n^{-(r+1)}$ , where

[TABLE]

This follows from Komlós, Major & Tusnády (1975, Theorem 1 (ii)), together with elementary exponential bounds for the fluctuations of the standard Poisson process and Brownian motion over the time interval $[0,1]$ . Fix $r$ , and take $n:=e^{3\lambda v}$ for $v\geq v_{1}$ , where $v_{1}$ is chosen so that $e^{3\lambda v_{1}}\geq 2K_{r}$ , implying that ${\mathbb{P}}[A_{r}^{c}(n)]\leq\frac{1}{2}e^{-3r\lambda v}$ . Then use the corresponding choices of $Z$ and $B$ to realize $X_{v}^{(2)}$ and $X_{v}^{(3)}$ , which we express, by partial integration, in the form

[TABLE]

Taking the difference, it is immediate that, for $0\leq s\leq e^{3\lambda v}$ and on $A_{r}(e^{3\lambda v})$ ,

[TABLE]

and that

[TABLE]

This shows that, on $A_{r}(e^{3\lambda v})$ ,

[TABLE]

Then, taking $F(u)=W(v)/\{u(d+1)+W(v)e^{\lambda v}\}$ , $a=eC_{r}\{W(v)/(d+1)\}\lambda ve^{-\lambda v}$ , $t_{1}=e^{3\lambda v}$ and $t_{2}=\infty$ in Lemma 4.1, with the choice of $a$ permissible for all $v\geq v_{2}$ , where $v_{2}\geq\lambda^{-1}$ is chosen such that $\lambda v_{2}e^{-\lambda v_{2}}\leq 1/C_{r}$ , we have

[TABLE]

The same bound is satisfied also for $\sup_{e^{3\lambda v}\leq s<\infty}|X_{v}^{(3)}(s)-X_{v}^{(3)}(e^{3\lambda v})|$ , as can be deduced from the representation (2.36). Now choose $v_{3}\geq\lambda^{-1}$ so that $8\exp\{-(e/2)(C_{r}\lambda v_{3})^{2}e^{\lambda v_{3}}\}\leq e^{-3r\lambda v}$ , and set $v_{0}:=\max\{v_{1},v_{2},v_{3}\}$ .
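The probability bound for $A_{r}^{c}(n)$ used at the start of the proof follows by direct substitution. The short calculation below spells it out, using only $n=e^{3\lambda v}$, $v\geq v_{1}$ and $e^{3\lambda v_{1}}\geq 2K_{r}$ from the proof.

```latex
\begin{align*}
\mathbb{P}[A_{r}^{c}(n)]
  &\le K_{r}\,n^{-(r+1)}
   = K_{r}\,e^{-3\lambda v}\,e^{-3r\lambda v} \\
  &\le K_{r}\,e^{-3\lambda v_{1}}\,e^{-3r\lambda v}
   \le \tfrac{1}{2}\,e^{-3r\lambda v},
   \qquad v\ge v_{1}.
\end{align*}
```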

Summarizing the conclusions of Lemmas 2.4 and 2.7 and of Corollary 2.6, we have the following theorem. In the error terms, $\theta_{1}(v)$ is defined in (2.22), $\theta_{2}(v)$ in Lemma 2.4, $\theta_{3}(v)$ in Lemma 2.5 and $\theta_{4}(v)$ in Lemma 2.7.

Theorem 2.8

With the definitions (2.12), (2.25) and (2.35), fixing any $\eta<\zeta(d)$ , we can construct $W$ and a time changed Brownian motion $X_{v}^{(3)}$ on the same probability space, in such a way that, for all $v\geq\lambda^{-1}c_{1*}$ ,

[TABLE]

where $K(v):=4\{W(v)Q(v)\}^{1/2}+Q(v)\sqrt{d+1}+c_{2*}(1+W(v)e^{-\lambda v/3})$ , $E_{21}^{\eta}(v)\in\sigma({\widetilde{H}}(v))$ is as defined in (2.32), and the constants $c_{1*}$ and $c_{2*}$ , which depend only on $d$ , can be deduced from Lemma 2.7 with $r=1$ .

2.3 Consequences for the gossip process

Theorem 2.8 is not yet in a form easily applied to the gossip process. To start with, the statement of the theorem involves the $\sigma({\widetilde{H}}(v))$ -measurable random variables $W(v)$ , $Q(v)$ , $K(v)$ and $\theta_{i}(v)$ , $1\leq i\leq 4$ , and it is useful to have some idea of their magnitude. It is also useful to specify how big the probability ${\mathbb{P}}[E_{21}^{\eta}(v)]$ may be. To derive appropriate statements, we begin with the random elements $W(v)$ and $W_{l}(v)$ , $1\leq l\leq d$ .

Lemma 2.9

For any $0<\eta<\zeta(d)$ , we have

[TABLE]

for $1\leq l\leq d$ . Furthermore, for any $s>0$ ,

[TABLE]

for a suitably chosen $w_{0}>0$ .

Proof: The first part follows from (2.17) and Chebyshev’s inequality, and, for $W(v)$ , the bound on the upper tail holds because ${\rm Var\,}W(v)\leq{\rm Var\,}W(\infty)\leq 1$ and ${\mathbb{E}}W(v)=1$ . For the lower tail, note that $W(\infty)>0$ a.s., so that, because $W(\cdot)$ is càdlàg and positive on ${\mathbb{R}}_{+}$ , we have $W_{*}:=\inf_{t>0}W(t)>0$ a.s. also. Suppose that $w_{0}>0$ is such that ${\mathbb{P}}[W_{*}\geq w_{0}]\geq 1/2$ . Then, for $0<x\leq w_{0}$ , $W(t)>x$ if any of the offspring of the initial individual that are born before time $t_{x}$ generate families with $W_{*}>w_{0}$ , where $e^{-\lambda t_{x}}=x/w_{0}$ . The probability that there are no such offspring is just $\exp\{-\rho{\nu}t_{x}^{d+1}/\{2(d+1)\}\}$ . Hence, for $t\geq t_{x}$ and $x\leq w_{0}$ ,

[TABLE]
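The probability of no suitable offspring in the proof above can be evaluated explicitly. The sketch below, with hypothetical function names and a hypothetical value of $w_{0}$, computes $t_{x}$ from $e^{-\lambda t_{x}}=x/w_{0}$ and then $\exp\{-\rho{\nu}t_{x}^{d+1}/\{2(d+1)\}\}$. Using $\rho{\nu}d!=\lambda^{d+1}$, the same quantity can be written as $\exp\{-[\log(w_{0}/x)]^{d+1}/\{2(d+1)!\}\}$, which no longer involves $\lambda$; the code checks that the two forms agree.

```python
import math


def t_x(x, w0, lam):
    """Solve exp(-lambda * t_x) = x / w0 for t_x (requires 0 < x <= w0)."""
    return math.log(w0 / x) / lam


def no_offspring_probability(x, w0, d, rho, nu, lam):
    """exp{-rho * nu * t_x^(d+1) / (2(d+1))}, as in the proof."""
    t = t_x(x, w0, lam)
    return math.exp(-rho * nu * t ** (d + 1) / (2 * (d + 1)))


def no_offspring_probability_log_form(x, w0, d):
    """Same quantity after substituting rho * nu = lambda^(d+1) / d!."""
    return math.exp(-math.log(w0 / x) ** (d + 1) / (2 * math.factorial(d + 1)))


d, rho, nu = 2, 1.0, 3.0  # hypothetical parameter values
lam = (math.factorial(d) * rho * nu) ** (1.0 / (d + 1))
w0 = 0.5                  # hypothetical threshold
for x in (0.25, 0.05, 0.01, 0.001):
    print(x, no_offspring_probability(x, w0, d, rho, nu, lam),
          no_offspring_probability_log_form(x, w0, d))
```

The second form makes visible that the bound decays faster than any power of $x$ as $x\to 0$.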

In view of (2.20), if $0<\eta<\zeta(d)$ , then $Q(v)\leq 3(d+1)$ on the event

[TABLE]

and the first part (2.38) of Lemma 2.9 directly implies that

[TABLE]

for a suitable constant $c(d)$ ; of course, by definition, $Q(v)\geq d+2$ . The second part of Lemma 2.9 implies that $E_{23}(v):=\{W(v)\leq 1+e^{\lambda\eta v/3}\}$ is such that ${\mathbb{P}}[\{E_{23}(v)\}^{c}]\leq e^{-2\lambda\eta v/3}$ . From these observations and (2.32), it follows that

[TABLE]

if $v$ is such that $e^{2\lambda\eta v/3}\geq(d+1)^{3}$ , and hence, for such $v$ ,

[TABLE]

in addition,

[TABLE]

on $E_{22}^{\eta}(v)\cap E_{23}(v)$ also.

For the quantities $\theta_{i}$ , $1\leq i\leq 4$ , note that, from (2.22),

[TABLE]

and that ${\mathbb{P}}[\{E_{24}^{\eta}(v)\}^{c}]\leq e^{-2\lambda v(\zeta(d)-\eta)}$ . Then, as in Lemma 2.7, $\theta_{4}(v)=e^{-3\lambda v}$ if we take $r=1$ . From Lemma 2.4, $\theta_{2}(v)=\theta_{1}(v)+{\tilde{\theta}}_{2}(v)$ , and both ${\tilde{\theta}}_{2}(v)$ and $\theta_{3}(v)$ , defined in Lemma 2.5, are super-exponentially small in $\lambda\eta v$ on the event $E_{25}^{\eta}(v):=\{W(v)\geq e^{-\lambda\eta v/6}\}$ . Finally, by the last inequality in Lemma 2.9,

[TABLE]

which is also super-exponentially small in $\lambda\eta v$ . Hence, taking

[TABLE]

for which ${\mathbb{P}}[\{E^{\eta}(v)\}^{c}]\leq C(d)(\lambda ve^{-2\lambda v(\zeta(d)-\eta)}+e^{-2\lambda\eta v/3})$ , and assuming that $v$ is such that $e^{2\lambda\eta v/3}\geq(d+1)^{3}$ , we have the following consequence of Theorem 2.8. To state it, and for future use, we define

[TABLE]

an upper bound for the times to be considered in proving the central limit theorem.

Corollary 2.10

For any $0<\eta<\zeta(d)$ and $v\leq t_{{\rm max}}(\Lambda)$ such that $e^{2\lambda\eta v/3}\geq(d+1)^{3}$ and $\lambda v\geq c_{1*}$ , there are constants $C=C(d,\eta)$ and $C^{\prime}=C^{\prime}(d)$ and an event $E^{\eta}(v)\in\sigma({\widetilde{H}}(v))$ , with ${\mathbb{P}}[\{E^{\eta}(v)\}^{c}]\leq C^{\prime}(\lambda ve^{-2\lambda v(\zeta(d)-\eta)}+e^{-2\lambda\eta v/3})$ , such that, for any $u\geq 0$ such that $t_{\Lambda}(u)\leq t_{{\rm max}}(\Lambda)$ ,

[TABLE]

uniformly for all $f\in F_{{\rm BW}}$ .

Taking any $c_{0},\ldots,c_{d}\in{\mathbb{R}}_{+}$ and setting $C(x):=\sum_{l=0}^{d}c_{l}x^{l}$ , we also observe from Lemma 2.1 that

[TABLE]

on $E_{22}^{\eta}(s)$ , the probability of whose complement is bounded in (2.41).

3 The central limit theorem

In this section, the central limit theorem is proved much as outlined in the introduction. With $\sigma^{2}_{L}(v,u):={\rm Var\,}\{L_{t_{\Lambda}(u)}/L\,|\,{\cal F}_{v}\}$ , we show in Lemma 3.2 that

[TABLE]

if $s$ is chosen to be sufficiently long after $v$ . The approximation of ${\mathbb{E}}\{L_{t_{\Lambda}(u)}/L\,|\,{\cal F}_{s}\}$ as a Poisson probability is then accomplished in Lemma 3.3, with an error that is small if ${t_{\Lambda}(u)}-s$ is sufficiently large. Lemmas 3.4–3.6 approximate the mean of the Poisson distribution by successively simpler quantities, and bound the errors involved in the approximations. The combined result of these steps is summarized in Corollary 3.7, showing that, given ${\cal F}_{v}$ , the distribution of $L_{t_{\Lambda}(u)}/L$ is close to that of $\ell(\log[{\hat{c}}_{d}W(s,v)]+u)$ .

Now the normalized difference $e^{\lambda v/2}(W(s,v)-W(v,v))$ can be shown, using Corollary 2.10, to have a normal approximation. Because of the normalization, it is important at this point to check that the approximation errors in the previous steps are all much smaller than $e^{-\lambda v/2}$ ; this places some restrictions on how large $v$ may be. The linearization of the difference $\ell(\log[{\hat{c}}_{d}W(s,v)]+u)-\ell(\log[{\hat{c}}_{d}W(v,v)]+u)$ , needed to show that it is itself approximately normally distributed, is accomplished in Lemma 3.8, and the final result is given in Theorem 3.9.

3.1 Comparisons of processes

The detailed calculations make heavy use of comparisons between a number of processes, that we justify in Lemma 3.1 by realizing them on the same probability spaces. The process ${\cal L}$ itself can be realized by starting with the times $({\bar{\tau}}_{j},\,j\geq 0)$ of the branching process ${\overline{X}}$ , paired with a sequence of independent uniform points $({\overline{P}}_{j},\,j\geq 0)$ of ${\mathcal{C}}$ . This yields a process

[TABLE]

in terms of which we define

[TABLE]

We can then define the set valued process

[TABLE]

obtained by taking the unions of the neighbourhoods generated by $Y(t)$ . The process $Y$ can be augmented to a process ${\widetilde{Y}}$ of quadruples, by including a set of pairs $(K(j),{\overline{Q}}_{j})$ , $j\geq 0$ , where $0\leq K(j)<j$ and ${\overline{Q}}_{j}\in{\mathcal{C}}$ , denoting the subsets from which the long range contacts were made and the positions of the individuals within them: given $Y({\bar{\tau}}_{j}-)$ ,

[TABLE]

and ${\overline{Q}}_{j}$ is then chosen uniformly from the set ${\cal K}({\overline{P}}_{K(j)},{\bar{\tau}}_{j}-{\bar{\tau}}_{K(j)})$ . The process ${\cal L}$ is derived from ${\widetilde{Y}}$ sequentially, by thinning. The pair $({\bar{\tau}}_{j},{\overline{P}}_{j})$ is not included in ${\cal L}$ unless $K(j)=\min\{l\geq 0\colon{\overline{Q}}_{j}\in{\cal K}({\overline{P}}_{l},{\bar{\tau}}_{j}-{\bar{\tau}}_{l})\}$ . This thinning process ensures that, when neighbourhoods overlap in ${\mathcal{C}}$ , only contacts from the neighbourhood that was informed earliest are allowed, ensuring that the rate of long range transmissions from ${\cal L}_{t}$ remains equal to $\rho L_{t}$ . Note that, if ${\overline{P}}_{j}\in{\cal L}_{{\bar{\tau}}_{j}-}$ , the pair $({\bar{\tau}}_{j},{\overline{P}}_{j})$ is included in defining ${\cal L}$ ; however, it is redundant in (1.3), the newly informed individual having previously been informed, and it never contributes to further transmission, because of the definition of the thinning step. The resulting set of times and positions we denote by $((\tau_{j},P_{j}),\,j\geq 0)$ , with

[TABLE]

and ${\cal L}$ is as given by (1.3); it satisfies ${\cal L}_{t}\subset{\overline{\cal L}}_{t}$ , with strict inclusion for all large enough times.

The process ${\overline{\cal L}}$ acts as a tractable upper bound for ${\cal L}$ , and it is useful also to have tractable lower bounds. In particular, when calculating the probability that a neighbourhood ${\cal K}(P,s)$ intersects ${\cal L}_{t}$ , where $s$ is fixed and $P$ is a uniform random point of ${\mathcal{C}}$ , the way in which the neighbourhoods of ${\cal L}_{t}$ intersect one another enters in a complicated way. However, if ${\cal L}_{t}$ happened to consist of a union of non-intersecting neighbourhoods, which were also separated from one another by distance at least $2s$ , then the probability could be deduced by simply adding the intersection probabilities for the individual neighbourhoods. Then, because the neighbourhoods ${\cal K}$ are balls in a geodesic metric space, the probability of two neighbourhoods ${\cal K}(P,s)$ and ${\cal K}(Q,t)$ intersecting, if one or both of $P$ and $Q$ are chosen uniformly and independently in ${\mathcal{C}}$ , is given by

[TABLE]

where ${\nu}_{s+t}$ can be estimated in terms of ${\nu}(s+t)^{d}$ , in view of (1.4). Of course, as $t$ grows, intersections occur in ${\cal L}_{t}$ , but, at least for a while, their effect may not be too large. So the next step is to construct subsets of ${\cal L}_{t}$ with the necessary separation properties, and which are amenable to analysis.
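The geometric fact behind this probability is that two balls of radii $s$ and $t$ intersect exactly when their centres are within distance $s+t$, so that, with one centre uniform, the probability is the normalized volume of a ball of radius $s+t$. The Monte Carlo sketch below illustrates this on a circle of circumference `L`, a hypothetical one-dimensional stand-in for ${\mathcal{C}}$ in which a ball of radius $s+t$ has volume $2(s+t)$; it is not the setting of the paper in general.

```python
import random


def circle_distance(p, q, L):
    """Geodesic distance between two points on a circle of circumference L."""
    gap = abs(p - q) % L
    return min(gap, L - gap)


def intersect_probability_mc(s, t, L, n, rng):
    """Fraction of uniform centre pairs whose arcs of radii s and t intersect."""
    hits = 0
    for _ in range(n):
        p = rng.uniform(0.0, L)
        q = rng.uniform(0.0, L)
        if circle_distance(p, q, L) <= s + t:
            hits += 1
    return hits / n


def intersect_probability_exact(s, t, L):
    """Normalized volume of a ball of radius s + t on the circle, capped at 1."""
    return min(1.0, 2.0 * (s + t) / L)


rng = random.Random(0)
print(intersect_probability_mc(0.5, 1.0, 10.0, 50000, rng),
      intersect_probability_exact(0.5, 1.0, 10.0))
```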

Fix any $s,t>0$ , and thin the process ${\widetilde{Y}}$ to obtain a set valued process ${\cal L}^{s,t}$ as follows. Start with $\tau_{0}^{s,t}=0$ and $P^{s,t}_{0}=P_{0}$ , defining

[TABLE]

let $R^{s,t}_{0}:=\emptyset$ denote the initial set of indices of censored points of ${\widetilde{Y}}$ . Then proceed sequentially. Suppose that the quadruples $(({\bar{\tau}}_{l},{\overline{P}}_{l},K(l),{\overline{Q}}_{l}),\,0\leq l\leq j-1)\subset{\widetilde{Y}}$ have already been considered. If $K(j)\in R^{s,t}_{j-1}$ , set $R^{s,t}_{j}:=R^{s,t}_{j-1}\cup\{j\}$ and proceed to the next quadruple; descendants of censored points are also censored. If not, thin much as in the construction of ${\cal L}$ , except that a point ${\overline{P}}_{j}$ is also thinned if it belongs to $N_{2s+t-{\bar{\tau}}_{j}}({\cal L}^{s,t}_{{\bar{\tau}}_{j}-})$ , where, for $V\subset{\mathcal{C}}$ and $u>0$ ,

[TABLE]

set

[TABLE]

The extra thinning in (3.7) ensures that the neighbourhoods in ${\cal L}^{s,t}_{t}$ are at distance at least $2s$ from one another. If $J^{s,t}_{u}$ denotes the set of indices of the points of ${\widetilde{Y}}$ that enter ${\cal L}^{s,t}$ up to time $u$ , then ${\cal L}_{u}^{s,t}$ consists of disjoint neighbourhoods $({\cal K}({\overline{P}}_{j},u-{\bar{\tau}}_{j}),\,j\in J^{s,t}_{u})$ , and new points are generated at rate $\rho\sum_{j\in J^{s,t}_{u}}{\nu}_{u-{\bar{\tau}}_{j}}(1-\pi^{s,t}_{u})$ , where the censoring probability $\pi^{s,t}_{u}$ is given by

[TABLE]

In our applications, we can find suitably small bounds for $\pi^{s,t}_{u}$ , so that the growth of the numbers of neighbourhoods in ${\cal L}^{s,t}$ is still reasonably close to that of the CMJ process ${\overline{X}}$ . In view of the ‘hard core’ censoring, the points $({\overline{P}}_{j},\,j\in J^{s,t}_{u})$ are no longer independent of one another, but their marginal distribution is still uniform on ${\mathcal{C}}$ if $P_{0}$ is chosen at random. Note also that ${\cal L}^{s,t}_{u}\subset{\cal L}_{u}$ for each $s,t\geq 0$ and $0<u\leq t$ .

We shall also use comparisons between the CMJ process ${\overline{X}}$ and ‘flattened’ versions ${\widehat{X}}_{-}$ , ${\widehat{X}}_{0}$ and ${\widehat{X}}_{+}$ that are of the form discussed in the previous section. We start by noting that, from the inequality (1.4),

[TABLE]

where $t_{{\rm max}}(\Lambda):=\frac{3}{2\lambda}\log\Lambda$ is as in (2.45), and

[TABLE]

Hence, up to time $t_{{\rm max}}(\Lambda)$ , the process ${\overline{X}}$ is stochastically dominated by the flattened process ${\widehat{X}}_{+}$ , defined as in the previous section, having intensity $\rho_{+}:=\rho(1+\eta_{\Lambda})$ per unit volume, and hence growth rate $\lambda_{+}:=\lambda\{1+\eta_{\Lambda}\}^{1/d}$ ; similarly, it stochastically dominates the flattened process ${\widehat{X}}_{-}$ with $\rho_{-}:=\rho(1-\eta_{\Lambda})$ and $\lambda_{-}:=\lambda\{1-\eta_{\Lambda}\}^{1/d}$ . We also define the flattened process ${\widehat{X}}_{0}$ with intensity $\rho$ per unit volume, and with growth rate $\lambda$ . The quantities $M_{j}^{+}$ , $M_{j}^{0}$ and $M_{j}^{-}$ , and their standardized versions $H_{j}^{+}$ , $H_{j}^{0}$ and $H_{j}^{-}$ , correspond to these processes. We make the relationships between the processes precise with the following construction.

Lemma 3.1

Let the successive birth times in the branching processes ${\overline{X}}$ , ${\widehat{X}}_{-}$ , ${\widehat{X}}_{0}$ and ${\widehat{X}}_{+}$ be denoted by $({\bar{\tau}}_{j},{\hat{\tau}}_{j}^{-},{\hat{\tau}}_{j}^{0},{\hat{\tau}}_{j}^{+},\,j\geq 0)$ , respectively, and let $(T_{t},T_{t}^{-},T_{t}^{0},T_{t}^{+})$ denote the sets of birth times up to time $t$ in each of the processes. If, for some $0\leq s<t_{{\rm max}}(\Lambda)$ , $T_{s}^{-}\subset T_{s}\subset T_{s}^{+}$ and $T_{s}^{-}\subset T_{s}^{0}\subset T_{s}^{+}$ , then the processes ${\overline{X}}$ , ${\widehat{X}}_{-}$ , ${\widehat{X}}_{0}$ and ${\widehat{X}}_{+}$ can be defined on the same probability space, in such a way that, for all $s\leq t\leq t_{{\rm max}}(\Lambda)$ ,

[TABLE]

Proof: The birth rate of ${\overline{X}}$ at time $t$ is given by

[TABLE]

and of ${\widehat{X}}_{0}$ by

[TABLE]

with analogous representations for $r({\widehat{X}}_{-},t)$ and $r({\widehat{X}}_{+},t)$ . Thus, for any time $t$ such that

[TABLE]

we have $r({\widehat{X}}_{-},t)\leq r({\overline{X}},t)\leq r({\widehat{X}}_{+},t)$ and $r({\widehat{X}}_{-},t)\leq r({\widehat{X}}_{0},t)\leq r({\widehat{X}}_{+},t)$ . Hence, for $s$ as given, we can construct all four processes on the same probability space, for $s\leq t\leq t_{{\rm max}}(\Lambda)$ , by realizing ${\widehat{X}}_{+}$ on $[s,t_{{\rm max}}(\Lambda)]$ together with an independent sequence of independent random variables $(U_{j},\,j\geq 1)$ uniformly distributed on $[0,1]$ , and then thinning in the following way. At each successive point ${\hat{\tau}}_{j}^{+}>s$ , include it as a point of ${\overline{X}}$ if $U_{j}r({\widehat{X}}_{+},t)\leq r({\overline{X}},t)$ ; similarly, if $U_{j}r({\widehat{X}}_{+},t)\leq r({\widehat{X}}_{-},t)$ , include ${\hat{\tau}}_{j}^{+}$ as a point of ${\widehat{X}}_{-}$ , and if $U_{j}r({\widehat{X}}_{+},t)\leq r({\widehat{X}}_{0},t)$ , include ${\hat{\tau}}_{j}^{+}$ as a point of ${\widehat{X}}_{0}$ . This construction preserves the inclusions (3.12) for all times up to $t_{{\rm max}}(\Lambda)$ , and, because independently thinned Poisson processes are again Poisson processes, also yields the right distributions for the processes ${\overline{X}}$ , ${\widehat{X}}_{0}$ and ${\widehat{X}}_{-}$ .
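The thinning step in this proof can be sketched for a single candidate point. In the code below, the function name and the numerical rates are hypothetical; in the paper the rates depend on the current state of each process, whereas here they are simply passed in as numbers evaluated at the candidate point. Using the same uniform variable for all three comparisons is what preserves the inclusions between the processes.

```python
import random


def thin_point(u, r_plus, r_bar, r_zero, r_minus):
    """Decide, for one point of the dominating process, which thinned processes keep it.

    u is uniform on [0, 1]; r_plus is the birth rate of the dominating process, and
    r_bar, r_zero, r_minus are the birth rates of the three processes obtained by thinning.
    """
    return {
        "X_bar": u * r_plus <= r_bar,
        "X_hat_0": u * r_plus <= r_zero,
        "X_hat_minus": u * r_plus <= r_minus,
    }


rng = random.Random(0)
# Hypothetical rates satisfying r_minus <= r_bar, r_zero <= r_plus.
r_plus, r_bar, r_zero, r_minus = 2.0, 1.7, 1.8, 1.5
kept = [thin_point(rng.random(), r_plus, r_bar, r_zero, r_minus) for _ in range(10000)]
for name, rate in (("X_bar", r_bar), ("X_hat_0", r_zero), ("X_hat_minus", r_minus)):
    frequency = sum(k[name] for k in kept) / len(kept)
    print(name, frequency, rate / r_plus)
```

Each point is retained with probability equal to the ratio of the rates, and a point kept for ${\widehat{X}}_{-}$ is automatically kept for ${\overline{X}}$ and ${\widehat{X}}_{0}$.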

In what follows, we shall use ${{\cal F}_{t}^{++}}$ to denote the filtration for the combined construction in Lemma 3.1. We shall henceforth only consider times in $[0,t_{{\rm max}}(\Lambda)]$ , and will take $\Lambda$ large enough that

[TABLE]

3.2 Relating the proportion informed to the function $\ell$

The first step in our detailed calculations is to replace $L_{t}/L$ with ${\mathbb{E}}\{L_{t}/L\,|\,{\widetilde{\cal F}}_{s}\}$ , where ${\widetilde{\cal F}}_{s}:=\sigma({\widetilde{Y}}_{u},\,0\leq u\leq s)$ , for suitable $s<t$ ; this conditional expectation is easier to handle. We start by bounding the conditional variance ${\rm Var\,}\{L_{t}/L\,|\,{\widetilde{\cal F}}_{s}\}$ , for suitable values of $s<t$ .

The basis for our argument is given by the observations that

[TABLE]

where $K$ and $K^{\prime}$ are chosen independently and uniformly in ${\mathcal{C}}$ , implying that

[TABLE]

On the other hand,

[TABLE]

where ${\widetilde{\cal L}}_{t,s}^{K}$ denotes the set of all points at time $s$ that, if informed, would inform $K$ by time $t$ . Now, for the gossip process, ${\widetilde{\cal L}}_{t,s}^{K}$ is independent of ${\widetilde{\cal F}}_{s}$ , and has the same distribution as ${\cal L}_{t-s}$ . In view of (3.16), we thus have

[TABLE]

where ${\cal L}_{s}$ is ${\widetilde{\cal F}}_{s}$ -measurable and ${\widetilde{\cal L}}_{t,s}^{K}$ is independent of ${\widetilde{\cal F}}_{s}$ , and

[TABLE]

with ${\widetilde{\cal L}}_{t,s}^{K}$ and ${\widetilde{\cal L}}_{t,s}^{K^{\prime}}$ independent of ${\widetilde{\cal F}}_{s}$ , but not of each other. Indeed, in view of (3.15), it is the extent of their dependence that measures ${\rm Var\,}\{L_{t}/L\,|\,{\widetilde{\cal F}}_{s}\}$ .

Writing $t_{s}:=t-s$ , our argument now involves bounding the differences

[TABLE]

between the probabilities (3.17) and (3.18) and the smaller ones obtained by replacing ${\widetilde{\cal L}}_{t,s}^{K}$ and ${\widetilde{\cal L}}_{t,s}^{K^{\prime}}$ by their related (independent) branching and growth processes ${\overline{\cal L}}^{K}$ and ${\overline{\cal L}}^{K^{\prime}}$ .

These, as observed in the joint construction at the beginning of the section, give rise to stochastically larger sets than ${\widetilde{\cal L}}_{t,s}^{K}$ and ${\widetilde{\cal L}}_{t,s}^{K^{\prime}}$ . If both of the differences (3.19) and (3.20) are smaller than some $\varepsilon$ , then the independence of ${\overline{\cal L}}^{K}$ and ${\overline{\cal L}}^{K^{\prime}}$ immediately implies that ${\rm Var\,}\{L_{t}/L\,|\,{\widetilde{\cal F}}_{s}\}\leq 4\varepsilon$ . Using this strategy, we prove the following lemma.

Lemma 3.2

Under the above assumptions, there is a constant $C_{3.2}=C_{3.2}(d)$ such that

[TABLE]

Proof: To control the differences (3.19) and (3.20), we begin by running a process ${\widetilde{Y}}^{K}$ , defined following (3.2), until time $t_{s}$ , and thin to obtain ${\widetilde{\cal L}}_{t,s}^{K}$ . As in (3.3), let ${\overline{J}}^{K}_{u}:=\{j\geq 0\colon{\bar{\tau}}_{j}^{K}\leq u\}$ , and set ${\overline{N}}^{K}_{u}:=|{\overline{J}}^{K}_{u}|$ and ${\overline{M}}^{K}_{u}:=\sum_{j\in{\overline{J}}^{K}_{u}}(u-{\bar{\tau}}_{j}^{K})^{d}$ . We then thin ${\widetilde{Y}}^{K}$ further to construct the process $({\cal L}^{0,t_{s},K}(u),\,0\leq u\leq t_{s})$ , by the method used to construct ${\cal L}^{s,t}$ in (3.8).

We now consider the difference

[TABLE]

which is an upper bound for the real quantity (3.19) of interest to us. The quantity $\Delta_{s,t}$ is no larger than the conditional expectation given ${\widetilde{\cal F}}_{s}$ of the number $Z_{t,s}^{K}$ of intersections between censored islands of ${\overline{\cal L}}^{K}_{t_{s}}$ and the islands of ${\cal L}_{s}$ . If an island born in ${\overline{X}}^{K}$ at $u$ is censored, the expected number of censored islands that result at $t_{s}$ is at most $c_{1}e^{\lambda_{+}(t_{s}-u)}$ , by (2.2) and because ${\overline{X}}^{K}$ is stochastically dominated by ${\widehat{X}}_{+}$ . These islands each have radius at most $(t_{s}-u)$ . Hence, given ${\widetilde{\cal F}}_{s}$ , the expected number of intersections resulting from a censored island born at $u$ is at most

[TABLE]

in view of (3.6), (1.4) and (3.13); ${\overline{N}}$ and ${\overline{M}}$ are as in (3.3). Similarly, using (3.9), the conditional probability $\pi_{u}^{0,t_{s},K}$ of an island born in ${\overline{X}}^{K}$ at $u$ being censored for ${\cal L}^{0,t_{s},K}$ , given the history up to $u$ , is bounded above by

[TABLE]

Hence, again using ${\overline{N}}^{K}$ as an upper bound for the number of uncensored islands, and noting that the birth intensity in ${\overline{X}}^{K}$ at time $u$ is at most

[TABLE]

we have

[TABLE]

Now, by (2.2), (2.4) and Cauchy–Schwarz, and because ${\overline{X}}^{K}$ is stochastically dominated by ${\widehat{X}}_{+}$ ,

[TABLE]

Using this in (3.2), and noting that $\lambda_{+}\leq\lambda(1+\eta_{\Lambda})$ and that $\rho{\nu}d!=\lambda^{d+1}$ , gives the following bound for (3.19):

[TABLE]

We now need to bound (3.20). This can be done by introducing a process ${\cal L}^{0,t_{s},K,K^{\prime}}$, constructed in the same way as ${\cal L}^{0,t_{s},K}$, but starting from two initial points $K,K^{\prime}$ and using a CMJ process ${\overline{X}}^{K,K^{\prime}}$, which is the same as using two independent CMJ processes ${\overline{X}}^{K}$ and ${\overline{X}}^{K^{\prime}}$, by the branching property. Now ${\cal L}^{0,t_{s},K,K^{\prime}}(t_{s})\subset({\widetilde{\cal L}}_{t,s}^{K}\cup{\widetilde{\cal L}}_{t,s}^{K^{\prime}})$, and the conditional expectation given ${\widetilde{\cal F}}_{s}$ of the number $Z_{t,s}^{K,K^{\prime}}$ of intersections between censored islands of ${\overline{X}}^{K,K^{\prime}}_{t_{s}}$ and the islands of ${\cal L}_{s}$ satisfies

[TABLE]

by an argument exactly as before, but for a larger constant $C_{2}(d)$ than $C_{1}(d)$ appearing in (3.23). Since ${\mathbb{E}}\{Z_{t,s}^{K,K^{\prime}}\,|\,{\widetilde{\cal F}}_{s}\}$ is a bound for the difference in (3.20), we have enough to prove the lemma.

Remark. With $s=\alpha_{1}\lambda^{-1}\log\Lambda$ and $t=\alpha_{2}\lambda^{-1}\log\Lambda$ , where $\alpha_{1}<\alpha_{2}\leq 1$ , and since ${\mathbb{E}}(\lambda^{d}{\overline{M}}_{s}+{\overline{N}}_{s})=O(e^{\lambda_{+}s})$ , from (2.4), it follows that ${\rm Var\,}\{L_{t}/L\,|\,{\widetilde{\cal F}}_{s}\}$ is typically of order $O\bigl{(}\Lambda^{2\alpha_{2}-\alpha_{1}-2}(\log\Lambda)^{d}\bigr{)}$ .

Our main interest is in approximating the distribution of $L_{t}/L$ when

[TABLE]

for $u$ fixed. This is because the times $(t_{\Lambda}(u),\,u\in{\mathbb{R}})$ asymptotically represent the period in which $L_{t}/L$ increases from $0$ to $1$. Taking $\alpha_{1}=\alpha<1$ and $\alpha_{2}=1$ in the remark, it follows that ${\rm Var\,}\{L_{t_{\Lambda}(u)}/L\,|\,{\widetilde{\cal F}}_{s}\}$ is typically of order $O(\Lambda^{-\alpha})$ for $s:=\alpha\lambda^{-1}\log\Lambda$. Now pick $v:=\alpha_{1}\lambda^{-1}\log\Lambda$ and $s:=\alpha_{2}\lambda^{-1}\log\Lambda$, with $\alpha_{1}<\alpha_{2}<1$. Then

[TABLE]

in which the latter term, again by the remark, is typically of order $O(\Lambda^{-\alpha_{2}})$ if $t=t_{\Lambda}(u)$ . Supposing that ${\rm Var\,}\{L_{t}/L\,|\,{\widetilde{\cal F}}_{v}\}$ is actually of magnitude $\Lambda^{-\alpha_{1}}$ , this indicates that the conditional distribution of $L_{t}/L$ given ${\widetilde{\cal F}}_{v}$ is essentially that of the conditional distribution of ${\mathbb{E}}(L_{t}/L\,|\,{\widetilde{\cal F}}_{s})$ given ${\widetilde{\cal F}}_{v}$ . So the next step is to examine ${\mathbb{E}}\{(1-L_{t}/L)\,|\,{\widetilde{\cal F}}_{s}\}$ in detail, for $t=t_{\Lambda}(u)$ , and to express it in more amenable form.
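For orientation, the decomposition invoked here is presumably the law of total variance; the display below is a sketch under that assumption, not a transcription of the paper's formula. Writing $X:=L_{t}/L$ and using ${\widetilde{\cal F}}_{v}\subset{\widetilde{\cal F}}_{s}$,

$$
{\rm Var\,}\{X\,|\,{\widetilde{\cal F}}_{v}\}
\;=\; {\rm Var\,}\bigl\{{\mathbb{E}}(X\,|\,{\widetilde{\cal F}}_{s})\,\big|\,{\widetilde{\cal F}}_{v}\bigr\}
\;+\; {\mathbb{E}}\bigl\{{\rm Var\,}(X\,|\,{\widetilde{\cal F}}_{s})\,\big|\,{\widetilde{\cal F}}_{v}\bigr\}.
$$

The exponent arithmetic from the remark is as follows: for $t=t_{\Lambda}(u)$ the later time corresponds to the value $1$, and conditioning at time $\alpha\lambda^{-1}\log\Lambda$ gives the exponent $2\cdot 1-\alpha-2=-\alpha$. With conditioning at $s=\alpha_{2}\lambda^{-1}\log\Lambda$, the second term above is therefore of order $\Lambda^{-\alpha_{2}}$, up to the logarithmic factor in the remark, and this is negligible relative to $\Lambda^{-\alpha_{1}}$ because $\alpha_{1}<\alpha_{2}$.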

The next lemma once again uses the backward branching process ${\overline{\cal L}}^{K}$ from a randomly chosen point $K$ . We define ${\cal F}^{K}_{s,t}:={\widetilde{\cal F}}_{s}\bigvee{\cal F}^{K}_{t-s;0}$ , where ${\cal F}^{K}_{v;0}:=\sigma({\overline{N}}^{K}_{u},\,0\leq u\leq v)$ contains the information about when the islands of ${\overline{\cal L}}^{K}$ were formed, up to time $v$ , but not where they are centred. We then write $Z^{s,t}$ for the number of islands of ${\overline{\cal L}}^{K}_{t_{s}}$ that intersect ${\cal L}_{s}$ .

Lemma 3.3

With the definitions above, there is a constant $C_{3.3}=C_{3.3}(d)$ such that

[TABLE]

where $M^{K}_{s,t}:={\mathbb{E}}\{Z^{s,t}\,|\,{\cal F}^{K}_{s,t}\}$ .

Proof: We start by using (3.14), (3.16) and (3.23) to show that, for $t>s$ ,

[TABLE]

We now use Poisson approximation to approximate the probability ${\mathbb{P}}[{\overline{\cal L}}^{K}_{t_{s}}\cap{\cal L}_{s}=\emptyset\,|\,{\widetilde{\cal F}}_{s}]$ , using the conditional independence between the locations of the islands of ${\overline{\cal L}}^{K}_{t_{s}}$ , given ${\cal F}^{K}_{s,t}$ , as the basis of the approximation.

We first observe that the conditional probability that an island of ${\overline{\cal L}}^{K}_{t_{s}}$ with radius $v$ intersects ${\cal L}_{s}$ , given ${\cal F}^{K}_{s,t}$ , is at most

[TABLE]

in view of (3.6), by (1.4), (3.11) and (3.13), and because $v\leq t-s$ . This, using $Z^{s,t}$ to denote the number of islands of ${\overline{\cal L}}^{K}_{t_{s}}$ that intersect ${\cal L}_{s}$ , implies that

[TABLE]

by Barbour, Holst & Janson (1992, (1.23)), where $M^{K}_{s,t}:={\mathbb{E}}\{Z^{s,t}\,|\,{\cal F}^{K}_{s,t}\}$ . Hence, from (3.28),

[TABLE]

and combining this with (3.26) gives the lemma.
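The Poisson approximation step can be illustrated numerically. The sketch below is not taken from the paper: it assumes a hypothetical list `probs` of conditional intersection probabilities for independent islands, and compares the exact probability of no intersection, $\prod_i(1-p_i)$, with the Poisson value $e^{-\sum_i p_i}$. The error is checked against the standard bound $\min(1,1/\mu)\sum_i p_i^{2}$ for sums of independent indicators with mean $\mu=\sum_i p_i$, which is the kind of bound that the reference to Barbour, Holst & Janson (1992) provides.

```python
import math


def no_hit_probability(probs):
    """Exact probability that none of the independent events occurs."""
    return math.prod(1.0 - p for p in probs)


def poisson_no_hit(probs):
    """Poisson approximation exp(-mean) to the probability of no event."""
    return math.exp(-sum(probs))


def poisson_error_bound(probs):
    """Bound min(1, 1/mean) * sum(p_i^2) on the total variation error."""
    mean = sum(probs)
    if mean == 0.0:
        return 0.0
    return min(1.0, 1.0 / mean) * sum(p * p for p in probs)


# Hypothetical intersection probabilities, chosen only for illustration.
probs = [0.01 * (1 + (i % 5)) for i in range(40)]
exact = no_hit_probability(probs)
approx = poisson_no_hit(probs)
bound = poisson_error_bound(probs)
```

Since $1-p\leq e^{-p}$, the exact value never exceeds the Poisson value, and the gap is small whenever each individual $p_i$ is small, which is the situation in the proof.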

We now define

[TABLE]

as an approximation to $M^{K}_{s,t}$ . The following lemma bounds the accuracy of the approximation for $t=t_{\Lambda}(u)$ .

Lemma 3.4

For any $\gamma>0$, there is an event $B_{3.4}(\gamma,s)\in{\widetilde{\cal F}}_{s}$ with ${\mathbb{P}}[\{B_{3.4}(\gamma,s)\}^{c}]\leq C_{3.4}\Lambda^{-\gamma}$ such that, for $t=t_{\Lambda}(u)$,

[TABLE]

where $C_{3.4}$ and $C^{\prime}_{3.4}$ depend only on $d$.

Proof: We begin by introducing the censored version ${\overline{\cal L}}^{s,s}$ of the process ${\overline{\cal L}}$ . We denote the indices of islands in ${\overline{\cal L}}^{s,s}_{s}$ by $J^{s,s}_{s}\subset J_{s}$ , and write $r_{js}:=s-{\bar{\tau}}_{j}$ . It then follows that

[TABLE]

with the lower bound using the separation between the islands of ${\overline{\cal L}}^{s,s}$ . Now, from (3.10), (3.11) and (3.30),

[TABLE]

and

[TABLE]

Hence

[TABLE]

and

[TABLE]

This implies that

[TABLE]

where ${N}^{s,s}_{s}:=|J^{s,s}_{s}|$ . Thus we need to bound the conditional expectation given ${\widetilde{\cal F}}_{s}$ of the right hand side of (3.33).

Define $B_{1}(\gamma,s)$ by

[TABLE]

Since ${\overline{\cal L}}^{K}$ is independent of ${\cal L}$ in (3.33), it follows that we can easily take the expectation, given ${\widetilde{\cal F}}_{s}$ , of its second term. For $t=t_{\Lambda}(u)$ , and using (2.2) and (2.4), this gives

[TABLE]

where we have twice used $e^{(\lambda_{+}-\lambda)t}\leq 2$ for $t\leq t_{{\rm max}}(\Lambda)$ , as follows from (3.13). For the first term in (3.33), from (3.29), we have

[TABLE]

Defining

[TABLE]

it thus follows from the independence of ${\cal L}$ and ${\overline{\cal L}}^{K}$ that, for $t=t_{\Lambda}(u)$ ,

[TABLE]

using (3.13) to bound $e^{(\lambda_{+}-\lambda)t}$ .

To complete the proof of the lemma, we need to show that

[TABLE]

For ${\mathbb{P}}[(B_{1}(\gamma,s))^{c}]$ , we bound ${\mathbb{E}}\{{\overline{N}}_{s}-{N}^{s,s}_{s}\}$ and ${\mathbb{E}}\bigl{\{}\sum_{j\in{\overline{J}}_{s}\setminus J^{s,s}_{s}}r_{js}^{d}\bigr{\}}$ , and then use Markov’s inequality. We begin by bounding the conditional probability $\pi^{s,s}_{u}$ , given the past up to time $u-<s$ , that an island of ${\overline{X}}$ , born to an uncensored parent at $u$ , is censored in ${\overline{\cal L}}^{s,s}$ . Using (3.9), it is no greater than

[TABLE]

If it is censored, bounding ${\overline{X}}^{K}$ by the branching process ${\widehat{X}}_{+}$ and using (2.2) and (2.4), the expected number of its offspring by time $s$ , all of which are also censored, is at most $c_{1}e^{\lambda_{+}(s-u)}$ , and the expected volume censored at most $c_{1}d!\lambda_{+}^{-d}e^{\lambda_{+}(s-u)}$ . Hence

[TABLE]

again by (2.2) and (2.4), and from Cauchy–Schwarz. Then, by a similar argument,

[TABLE]

Combining (3.37) and (3.38) and using Markov’s inequality, ${\mathbb{P}}[\{B_{1}(\gamma,s)\}^{c}]\leq c\Lambda^{-\gamma}$ , for a constant $c$ depending only on $d$ .

For ${\mathbb{P}}[(B_{2}(\gamma,s))^{c}]$ , we again bound ${\overline{X}}^{K}$ by the branching process ${\widehat{X}}_{+}$ and use (2.2) and (2.4), giving

[TABLE]

Hence, from Markov’s inequality, ${\mathbb{P}}[\{B_{2}(\gamma,s)\}^{c}]\leq c^{\prime}\Lambda^{-\gamma}$, for a constant $c^{\prime}$ depending only on $d$, and the lemma is proved by taking $B_{0}(\gamma,s)=B_{1}(\gamma,s)\cap B_{2}(\gamma,s)$.

We now replace ${\mathbb{E}}\{\exp(-{\widetilde{M}}^{K}_{s,t})\,|\,{\widetilde{\cal F}}_{s}\}$ by an expression involving the function $\ell$ defined in (1.12), and using the quantity $W^{*}(s)$ defined by

[TABLE]

where the inequality follows from Lemma 3.1, so that, from (3.13), ${\mathbb{E}}W^{*}(s)\ \leq\ 2$ .

Lemma 3.5

Take $s\leq\lambda^{-1}\log\Lambda$, and let ${\widetilde{M}}^{K}_{s,t}$ be defined as in (3.29), $W^{*}(s)$ as in (3.40) and $\ell$ as in (1.12). Then, for any $\gamma>0$ and $0<\eta<\zeta(d)$, there is an event $B_{3.5}(\gamma,\eta,s)\in{\widetilde{\cal F}}_{s}$ and constants $C_{3.5}$ and $C^{\prime}_{3.5}$, depending only on $d$, such that

[TABLE]

and that

[TABLE]

uniformly in $t_{\Lambda}(u)\leq t_{{\rm max}}(\Lambda)$ , where ${\hat{c}}_{d}:=d!/(d+1)$ .

Proof: We first observe, from (3.29) and (1.6) that

[TABLE]

with $r_{js}:=s-{\bar{\tau}}_{j}$ as before. Now realize ${\widehat{X}}_{-}$, ${\overline{X}}$ and ${\widehat{X}}_{+}$ together as in Lemma 3.1, so that

[TABLE]

Then, for such $s$ , it follows from (2.47), then using Lemma 4.2, (2.40) and (2.41), that, on an event $B_{1}^{+}(\eta,s)\in{{\cal F}_{s}^{++}}$ such that ${\mathbb{P}}[\{B_{1}^{+}(\eta,s)\}^{c}]\leq c(d)(1+\lambda s)e^{-2\lambda(\zeta(d)-\eta)s}$ , we have

[TABLE]

and

[TABLE]

for all choices of $c_{0},\ldots,c_{d}$ , where $C(1):=\sum_{l=0}^{d}c_{l}$ . Define

[TABLE]

Then, on $B_{1}^{+}(\eta,s)\cap B_{2}^{+}(\gamma,s)$ and for $0\leq s\leq t_{{\rm max}}(\Lambda)$ , we have

[TABLE]

for all $0\leq l\leq d$ , from (3.42), (3.43), (3.40) and (3.45), where

[TABLE]

Arguing analogously, we also deduce that

[TABLE]

Now ${\mathbb{P}}[\{B_{1}^{+}(\eta,s)\}^{c}]\leq c(d)(1+\lambda s)e^{-2\lambda(\zeta(d)-\eta)s}$ . Then, since

[TABLE]

and using (3.13), we have

[TABLE]

in $0\leq s\leq t_{{\rm max}}(\Lambda)$ , and hence, by Markov’s inequality,

[TABLE]

Thus the event

[TABLE]

is such that

[TABLE]

for a suitable constant $C_{1}(d)$ .

Now, taking $c_{l}:=\Lambda^{-1}C_{l}(s,t)$ , where

[TABLE]

(3.41) implies that

[TABLE]

Hence also

[TABLE]

Now, because ${\overline{X}}^{K}$ can also be bounded between copies ${\widehat{X}}_{-}^{K}$ and ${\widehat{X}}_{+}^{K}$ of ${\widehat{X}}_{-}$ and ${\widehat{X}}_{+}$ , using Lemma 3.1, we have the inequality

[TABLE]

Hence, since the $K$ -processes can be chosen to be independent of ${\widetilde{\cal F}}_{s}$ , it follows that

[TABLE]

for any $0<s\leq t\leq t_{{\rm max}}(\Lambda)$ . Thus, from (3.51), it follows that

[TABLE]

The next step is to examine the difference

[TABLE]

where $\phi^{1}_{s}(\theta):={\mathbb{E}}\{e^{-\theta W^{1}(s)}\}$ . To start with, from (3.52) and Lemma 2.1,

[TABLE]

Hence, for any non-negative and ${\widetilde{\cal F}}_{s}$ -measurable random variable $\Theta_{s}$ , we have

[TABLE]

where

[TABLE]

and $\phi^{1}$ is as above, with the final equalities a consequence of (2.14). Since $\lambda(1-\eta_{\Lambda})\leq\lambda_{-}\leq\lambda_{+}\leq\lambda(1+\eta_{\Lambda})$ , we conclude from Lemma 4.2 and (3.13) that

[TABLE]

as long as $t\leq t_{{\rm max}}(\Lambda)$ . Taking $\Theta(s):=\{(d+1)\Lambda\}^{-1}e^{\lambda s}W^{*}(s)$ and $t=t_{\Lambda}(u)$ , and using (3.55), (3.57) and (3.13), this gives

[TABLE]

From (3.40), we have ${\mathbb{E}}\{W^{*}(s)\}\leq 2$ . Thus, defining

[TABLE]

it follows that ${\mathbb{P}}[\{B_{4}(\gamma,s)\}^{c}]\leq 2\Lambda^{-\gamma}$, and, combining (3.54) and (3.58), that

[TABLE]

uniformly in $t_{\Lambda}(u)\leq t_{{\rm max}}(\Lambda)$ . But now, from Lemma 4.2, on the event $B_{4}(\gamma,s)$ ,

[TABLE]

and $\phi^{1}_{\infty}({\hat{c}}_{d}e^{u}W^{*}(s))=1-\ell(\log({\hat{c}}_{d}W^{*}(s))+u)$ by (1.12), (2.14) and (2.18). This establishes the lemma, with $B_{3.5}(\gamma,\eta,s):=B_{3}(\gamma,\eta,s)\cap B_{4}(\gamma,s)$, in view of (3.49) and (3.59).

3.3 Replacing $W^{*}(s)$ by $W(s,v)$

Our aim is to approximate the conditional distribution of $L_{t_{\Lambda}(u)}/L$ , given ${\widetilde{\cal F}}_{v}$ , for suitably chosen $v$ . After Lemma 3.5, the problem has largely been reduced to considering the conditional distribution of $W^{*}(s)$ . However, in order to use the results of Section 2, it is advantageous to replace $W^{*}(s)$ by a function of a flattened branching process; $W^{*}(s)$ is constructed from the birth times ${\bar{\tau}}_{j}$ of the original branching process ${\overline{X}}$ . Accordingly, we define

[TABLE]

for $H_{l}^{0}(\cdot,v)$ , $0\leq l\leq d$ , corresponding to the (flattened) branching process ${\widehat{X}}_{0}$ of Lemma 3.1, taken to have initial condition $H_{l}^{0}(0,v)=\sum_{j\in{\overline{J}}_{v}}(\lambda(v-{\bar{\tau}}_{j}))^{l}/l!\in\sigma({\widetilde{H}}(v))$ , $0\leq l\leq d$ . Note that $W(v,v)=W^{*}(v)$ . The error involved in replacing $W^{*}(s)$ by $W(s,v)$ is bounded in the following lemma.

Lemma 3.6

For $v\leq s\leq\lambda^{-1}\log\Lambda$ , we have

[TABLE]

Proof: We once more use Lemma 3.1 to justify that both $W^{*}(s)$ and $W(s,v)$ belong to the interval

[TABLE]

where the processes ${\widehat{X}}_{-}(\cdot,v)$ and ${\widehat{X}}_{+}(\cdot,v)$ both have the same initial condition as ${\widehat{X}}_{0}(\cdot,v)$ . Now

[TABLE]

and

[TABLE]

hence

[TABLE]

by (3.13). This, together with (1.12) and Lemma 4.2, implies that

[TABLE]

as required.

We now combine the results of Lemmas 3.2–3.6 to give the following result, relating the distribution of $L_{t_{\Lambda}(u)}/L$ to that of $\ell(\log[{\hat{c}}_{d}W(s,v)]+u)$.

Corollary 3.7

Take $v:=\alpha_{1}\lambda^{-1}\log\Lambda$ and $s:=\alpha_{2}\lambda^{-1}\log\Lambda$ for $0<\alpha_{1}<\alpha_{2}<1$, and fix $0<\eta<\zeta(d)$. Then there is an event $B_{3.7}(\gamma,\eta,v)\in{\widetilde{\cal F}}_{v}$, and constants $C^{0}_{3.7}:=C^{0}_{3.7}(u_{0},d)$, $C^{1}_{3.7}:=C^{1}_{3.7}(u_{0},d)$ and $C^{2}_{3.7}:=C^{2}_{3.7}(d)$, such that

[TABLE]

and such that ${\mathbb{P}}[\{B_{3.7}(\gamma,\eta,v)\}^{c}]\ \leq\ C^{2}_{3.7}p_{\Lambda}$, where

[TABLE]

Proof: We take the results of Lemmas 3.2–3.6 in turn. Using Lemma 3.1, we have

[TABLE]

Define the event $B_{3.7}^{(1)}(\gamma,v):=\{W^{*}(v)\leq\Lambda^{\gamma}\}$, whose complement has probability at most $2\Lambda^{-\gamma}$, by Markov’s inequality and because ${\mathbb{E}}W^{*}(v)\leq 2$. Then, from Lemma 3.2 and (3.63), it follows that

[TABLE]

implying that, on $B_{3.7}^{(1)}(\gamma,v)$, we have

[TABLE]

Next, from Lemma 3.3 and (3.63) and on the event $B_{3.7}^{(1)}(\gamma,v)$, we have

[TABLE]

Turning to Lemma 3.4, we find that

[TABLE]

Then, from Lemma 3.5, we have

[TABLE]

Finally, from Lemma 3.6, on the event $B_{3.7}^{(1)}(\gamma,v)$, we have

[TABLE]

Combining (3.64) to (3.68), we deduce that, on the event $B_{3.7}^{(1)}(\gamma,v)$, and uniformly in $u\leq u_{0}$,

[TABLE]

where ${\widehat{B}}(\gamma,\eta,s):=B_{3.5}(\gamma,\eta,s)\cap B_{3.4}(\gamma,s)$.

For the exceptional set, from Lemmas 3.5 and 3.4, we have

[TABLE]

On the other hand, for any set $B\in{\cal F}$ with ${\mathbb{P}}[B]=p$ , and for any $\sigma$ -field ${\mathcal{G}}\subset{\cal F}$ , we have

[TABLE]

by the total probability formula, implying, by Markov’s inequality, that ${\mathbb{P}}[B\,|\,{\mathcal{G}}]\leq\sqrt{p}$ with probability at least $1-\sqrt{p}$. Hence there is an event $B_{3.7}^{(2)}(\gamma,\eta,v)\in{\widetilde{\cal F}}_{v}$, whose complement has probability at most

[TABLE]

on which ${\mathbb{P}}[\{{\widehat{B}}(\gamma,\eta,s)\}^{c}\,|\,{\widetilde{\cal F}}_{v}]\leq(C_{e}(d))^{1/2}p_{\Lambda}$. Now define $Z_{u}:=\ell(\log[{\hat{c}}_{d}W(s,v)]+u)$ and $Y_{u}:=L_{t_{\Lambda}(u)}/L$. Then, for any bounded Lipschitz function $f$, we conclude from (3.69) and (3.70) that, for $u\leq u_{0}$ and on the event

[TABLE]

we have

[TABLE]

This proves the corollary.
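The conditioning step used for the exceptional set can be spelled out as follows; this is a sketch of the standard argument, with the displayed identity assumed to be the one intended in the proof. For $B\in{\cal F}$ with ${\mathbb{P}}[B]=p>0$ and a $\sigma$-field ${\mathcal{G}}\subset{\cal F}$,

$$
{\mathbb{E}}\bigl\{{\mathbb{P}}[B\,|\,{\mathcal{G}}]\bigr\} \;=\; {\mathbb{P}}[B] \;=\; p,
\qquad\text{so that}\qquad
{\mathbb{P}}\bigl[{\mathbb{P}}[B\,|\,{\mathcal{G}}]>\sqrt{p}\,\bigr]
\;\leq\; \frac{{\mathbb{E}}\{{\mathbb{P}}[B\,|\,{\mathcal{G}}]\}}{\sqrt{p}}
\;=\; \sqrt{p},
$$

by Markov’s inequality applied to the non-negative random variable ${\mathbb{P}}[B\,|\,{\mathcal{G}}]$. An unconditional probability bound of order $p$ thus becomes a conditional bound of order $\sqrt{p}$, outside a ${\mathcal{G}}$-measurable set of probability at most $\sqrt{p}$.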

3.4 The main theorem

We now use Corollary 3.7 to compare the conditional distributions, given ${\widetilde{\cal F}}_{v}$ , of the normalized random variables $Y(u,v)$ and $Z(u,v)$ , where

[TABLE]

for a careful choice of $s$ , with the centring constant $\ell(\log[{\hat{c}}_{d}W^{*}(v)]+u)$ chosen because $W^{*}(v)={\mathbb{E}}\{W(s,v)\,|\,{\widetilde{\cal F}}_{v}\}$ . These are the correct standardizations to achieve a non-trivial limit. Thus we wish to compare ${\mathbb{E}}f(Y(u,v))$ with ${\mathbb{E}}f(Z(u,v))$ , for Lipschitz functions $f$ that have $\|f\|_{\infty}\leq 1$ and $\|f^{\prime}\|_{\infty}\leq 1$ . This corresponds to taking $\|f\|_{\infty}\leq 1$ and $\|f^{\prime}\|_{\infty}\leq e^{\lambda v/2}$ in Corollary 3.7, because of the pre-factors $e^{\lambda v/2}$ in the definitions of $Y(u,v)$ and $Z(u,v)$ . Thus, although $p_{\Lambda}$ is already small for large $\Lambda$ , if $\eta<\zeta(d)$ , we need also to show that, for $v=\alpha_{1}\lambda^{-1}\log\Lambda$ , it is possible to choose $\alpha_{2}$ , $\eta$ and $\gamma$ so as to make $e^{\lambda v/2}\varepsilon_{\Lambda}=\Lambda^{\alpha_{1}/2}\varepsilon_{\Lambda}$ small with $\Lambda$ . Recalling the definition (3.11) of $\eta_{\Lambda}$ , the expression for $\varepsilon_{\Lambda}$ in Corollary 3.7 shows that this is the case, for $\gamma>0$ chosen small enough, if,

[TABLE]

So, for

[TABLE]

choose $0<\eta<\zeta(d)$ so that $2\eta/(1+\eta)>\alpha_{1}$ and then $\alpha_{2}$ so that $\alpha_{1}/(2\eta)<\alpha_{2}<1-\alpha_{1}/2$ ; then, if we choose

[TABLE]

it follows that there are constants $C=C(d,u_{0})$ and $C^{\prime}=C^{\prime}(d)$ such that

[TABLE]

for all $f\in F_{{\rm BW}}$ , except on an event of probability at most $C^{\prime}\{\Lambda^{-\gamma/2}+\Lambda^{-(\zeta(d)-\eta)}\}$ . Particular choices are to take

[TABLE]

in which case we can take any $0<\gamma^{\prime}<\min\{\gamma/2,(\zeta(d)-\eta)\}$ , and express the error in (3.73) as $C\Lambda^{-\gamma^{\prime}}$ , except on an event of probability at most $C^{\prime}\Lambda^{-\gamma^{\prime}}$ , albeit with different constants $C=C(u_{0},d)$ and $C^{\prime}(d)$ .
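The constraints on $\eta$ and $\alpha_{2}$ quoted above can be checked mechanically. The sketch below is illustrative only: the function name `choose_parameters` and the midpoint choices are assumptions, not the particular choices made in (3.74). It uses the fact that $2\eta/(1+\eta)>\alpha_{1}$ is equivalent to $\eta>\alpha_{1}/(2-\alpha_{1})$, and that this same condition is exactly what makes the interval $\alpha_{1}/(2\eta)<\alpha_{2}<1-\alpha_{1}/2$ non-empty.

```python
def choose_parameters(alpha1: float, zeta: float) -> tuple[float, float]:
    """Return (eta, alpha2) with 0 < eta < zeta, 2*eta/(1+eta) > alpha1
    and alpha1/(2*eta) < alpha2 < 1 - alpha1/2.

    The midpoints used here are a hypothetical choice for illustration.
    """
    # A valid eta exists only if alpha1/(2 - alpha1) < zeta,
    # that is, alpha1 < 2*zeta/(1 + zeta).
    if not 0.0 < alpha1 < 2.0 * zeta / (1.0 + zeta):
        raise ValueError("need 0 < alpha1 < 2*zeta/(1+zeta)")
    eta_min = alpha1 / (2.0 - alpha1)   # smallest admissible eta (excluded)
    eta = 0.5 * (eta_min + zeta)        # strictly between eta_min and zeta
    lower = alpha1 / (2.0 * eta)        # lower limit for alpha2
    upper = 1.0 - alpha1 / 2.0          # upper limit for alpha2
    alpha2 = 0.5 * (lower + upper)      # strictly inside the interval
    return eta, alpha2


eta, alpha2 = choose_parameters(alpha1=0.3, zeta=0.5)
```

The feasibility condition $\alpha_{1}<2\zeta/(1+\zeta)$ enforced in the sketch matches the $\zeta(d)$ part of the range of $\alpha$ allowed in Theorem 3.9.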

Corollary 3.7 and (3.73) compare the distribution of $L_{t_{\Lambda}(u)}/L$ with that of the quantity $\ell(\log[{\hat{c}}_{d}W(s,v)]+u)$ , for any $u\leq u_{0}$ . The path of $L_{t_{\Lambda}(u)}/L$ is approximated, to first order, by a time shift of the deterministic path $\ell(u)$ , and the shift is the same throughout the path, being determined by the value of the single ${\widetilde{\cal F}}_{s}$ -measurable random variable $W(s,v)$ . In the remaining argument, we exploit this to show that, to a good approximation, the path after time $v$ is that of the approximation $\ell(\log[{\hat{c}}_{d}W^{*}(v)]+\cdot)$ , together with a perturbation that can be expressed in the form $e^{-\lambda v/2}Nh_{v}(\cdot)$ , where $h_{v}(\cdot)$ is an ${\widetilde{\cal F}}_{v}$ -measurable function depending on the value of $W^{*}(v)$ , and ${\cal L}(N\,|\,{\widetilde{\cal F}}_{v})$ is the standard normal distribution.

To do so, in view of (3.73), we now need a central limit theorem for $Z(u,v)$ as defined in (3.71). Writing

[TABLE]

where the final equality holds for all $k>0$ , the next lemma shows that $Z(u,v)$ is close in distribution to $K_{2}(u,v)\,e^{\lambda v/2}\{W(s,v)-W^{*}(v)\}$ .

Lemma 3.8

Let $Z(u,v)$ be defined as in (3.71), and let $v:=\alpha_{1}\lambda^{-1}\log\Lambda$ and $s:=\alpha_{2}\lambda^{-1}\log\Lambda$ ; suppose that $\gamma$ is as for (3.72) and $\gamma^{\prime}={\textstyle{\frac{1}{2}}}\min\{\gamma/2,(\zeta(d)-\eta)\}$ , where $\eta$ is as in (3.74). Then there is a constant $C=C(d,u_{0})$ such that, for all $f\in F_{{\rm BW}}$ , and on the event $\{W^{*}(v)\leq\Lambda^{\gamma}\}$ ,

[TABLE]

uniformly in $u\leq u_{0}$ .

Proof: From (1.12), we have $g(x):=\ell(\log x)=1-{\mathbb{E}}\{e^{-xW}\}$ , so that, by Taylor’s expansion, for any $x,y>0$ , we can write

[TABLE]

from (2.17). Thus, in making a linear approximation to

[TABLE]

the remainder term can be bounded by ${\textstyle{\frac{1}{2}}}k^{2}(W(s,v)-W^{*}(v))^{2}$ . Now, because $W^{*}(v)={\mathbb{E}}\{W(s,v)\,|\,{\widetilde{\cal F}}_{v}\}$ , we have

[TABLE]

where the inequality follows using (2.17). Hence, for any $k>0$ , and using (3.75), we have

[TABLE]

Thus, taking $k={\hat{c}}_{d}e^{u}$ in (3.76), and on $\{W^{*}(v)\leq\Lambda^{\gamma}\}$ , it follows that

[TABLE]

and the lemma follows because $\gamma^{\prime}+\gamma<3\gamma/2\leq\alpha_{1}/2$ , from (3.72).

We are now in a position to prove a central limit theorem, with an error bound expressed in terms of the bounded Wasserstein distance.

Theorem 3.9

Suppose that $v=\alpha\lambda^{-1}\log\Lambda$ for $0<\alpha<2\min\{\gamma_{g}/d,\zeta(d)/(1+\zeta(d))\}$, where $\gamma_{g}$ is as in (1.4) and $\zeta(d)$ as in (1.11) (so that $\zeta(d)=1/2$ for $d\leq 6$). Suppose that $\gamma$ is as for (3.72) and $\gamma^{\prime}={\textstyle{\frac{1}{2}}}\min\{\gamma/2,(\zeta(d)-\eta^{\prime}),(\alpha_{2}-\alpha)\}$, where $\eta^{\prime}$ and $\alpha_{2}$ are as in (3.74), with $\alpha_{1}=\alpha$. Suppose that $\Lambda$ is large enough that (3.13) is satisfied, and that $\Lambda^{4\alpha\zeta(d)/7}\geq(d+1)^{3}$ and $\alpha\log\Lambda>c_{1*}$, where $c_{1*}$ is as in Theorem 2.8. Then, for any $u_{1}<u_{0}\in{\mathbb{R}}$, there exist constants $C(d,u_{1},u_{0})$ and $C^{\prime}(d,u_{1},u_{0})$ and an event $E^{*}(v)\in\sigma({\widetilde{H}}(v))$ with ${\mathbb{P}}[E^{*}(v)^{c}]\leq C^{\prime}(d,u_{1},u_{0})\Lambda^{-\gamma^{\prime}}$ such that

[TABLE]

uniformly in $u_{1}\leq u\leq u_{0}$ , where $K_{2}(u,v)$ is defined in (3.75), ${\hat{c}}_{d}$ in Lemma 3.5 and $t_{\Lambda}(u)$ in (3.25).

Proof: In view of (3.73) and Lemma 3.8, it suffices to show that

[TABLE]

with $s=\alpha_{2}\lambda^{-1}\log\Lambda$ and $\alpha_{2}$ as in (3.74). Corollary 2.10, with $\eta=6\zeta(d)/7$, shows that there is an event $E^{\eta}(v)\in\sigma({\widetilde{H}}(v))$ with ${\mathbb{P}}[\{E^{\eta}(v)\}^{c}]\leq C^{\prime}(d)\Lambda^{-2\alpha\zeta(d)/7}$ such that, on $E^{\eta}(v)$,

[TABLE]

provided that $\Lambda^{4\alpha\zeta(d)/7}\geq(d+1)^{3}$ . Then, from (2.25) and (2.36),

[TABLE]

and the theorem follows because $d_{{\rm BW}}({\cal N}(0,\sigma_{1}^{2}),{\cal N}(0,\sigma_{2}^{2}))=O(|\sigma_{1}-\sigma_{2}|)$ and

[TABLE]

on $\{W^{*}(v)\leq\Lambda^{\gamma^{\prime}}\}$ , and $\gamma^{\prime}\leq{\textstyle{\frac{1}{2}}}(\alpha_{2}-\alpha)$ , from (3.72).
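The bound on the bounded Wasserstein distance between two centred normal distributions follows from a one-line coupling; the derivation below is a standard argument supplied for convenience, not quoted from the paper. Let $N$ be standard normal and let $f$ satisfy $\|f\|_{\infty}\leq 1$ and $\|f^{\prime}\|_{\infty}\leq 1$. Then

$$
\bigl|{\mathbb{E}}f(\sigma_{1}N)-{\mathbb{E}}f(\sigma_{2}N)\bigr|
\;\leq\; \|f^{\prime}\|_{\infty}\,|\sigma_{1}-\sigma_{2}|\,{\mathbb{E}}|N|
\;\leq\; \sqrt{2/\pi}\;|\sigma_{1}-\sigma_{2}|,
$$

so that $d_{{\rm BW}}({\cal N}(0,\sigma_{1}^{2}),{\cal N}(0,\sigma_{2}^{2}))\leq\sqrt{2/\pi}\,|\sigma_{1}-\sigma_{2}|$.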

This theorem is not quite the same as Theorem 1.1, because both mean and variance are expressed in terms of $W^{*}(v)=W(v,v)$ , which, as is seen from its definition in (3.40), is not necessarily determined by knowledge of ${\cal L}_{v}$ alone, because all the birth times of ${\overline{X}}$ come into its definition. Instead, one can observe ${\widehat{W}}(v)$ as in (1.10). We now show that this is enough.

We construct a lower bound ${\widehat{W}}_{-}(v)$ for ${\widehat{W}}(v)$ by summing over the subset of the birth times ${\widehat{J}}_{v}\subset J_{v}$ in (1.10) that belong to $J_{v}\cap{\widetilde{J}}_{v}$ , where $J_{v}$ is defined in (3.5), and

[TABLE]

with ${\overline{J}}_{v}$ the birth times of ${\overline{X}}$ before $v$ , defined in (3.3). These give rise to non-intersecting neighbourhoods at time $v$ , though not necessarily to all such, and they form a subset more amenable to calculation. Then it is immediate from (1.4) that, for all $\Lambda$ sufficiently large,

[TABLE]

the final inequality following from (2.2). Then, using arguments analogous to those in Lemma 3.2, we have

[TABLE]

Hence, for $v\leq t_{{\rm max}}(\Lambda)$ ,

[TABLE]

and, for $v=\alpha\lambda^{-1}\log\Lambda$, this is of order $O(\Lambda^{-1+\alpha}(\log\Lambda)^{2d})$. The most sensitive place where this enters is into $\ell(\log[{\hat{c}}_{d}W^{*}(v)]+u)$, when the difference has to be small relative to $\Lambda^{-\alpha/2}$, because of the factor $e^{\lambda v/2}$; but this is the case if $\alpha<2/3$, as in the statement of the theorem, by Lemma 4.2. The conversion of $E^{*}(v)$ into an event that can be determined from ${\cal L}_{v}$ can be accomplished in similar fashion, by modifying the definitions of its constituent events in terms of $W_{j}(v)$, $0\leq j\leq d$.

Appendix

We note here two technical lemmas that are used in the previous arguments. The first establishes a bound on the extreme fluctuations of an integral with respect to a compensated Poisson process.

Lemma 4.1

Let $X(t):=\int_{0}^{t}F(u)\{Z(du)-du\}$ , where $Z$ is a Poisson process and the process $F$ is predictable and a.s. bounded in modulus by the deterministic function $G$ . Define $G_{2}(s,t):=\int_{s}^{t}\{G(u)\}^{2}\,du$ and $G^{*}(s,t):=\sup_{s\leq u\leq t}G(u)$ . Then

[TABLE]

for all $0\leq a\leq eG_{2}(t_{1},t_{2})/G^{*}(t_{1},t_{2})$ . If $G$ is decreasing, we have

[TABLE]

for all $0\leq a\leq eG(t_{1})(t_{2}-t_{1})$ .

Proof: For any $\theta$ , the process

[TABLE]

is a supermartingale (van de Geer (1995, p. 1795)), and stopping at $a$ easily yields

[TABLE]

if $0\leq\theta G^{*}(t_{1},t_{2})\leq 1$. The corresponding bound for $\inf_{t_{1}\leq t\leq t_{2}}(X(t)-X(t_{1}))$ is proved in analogous fashion. Now, if $a\leq eG_{2}(t_{1},t_{2})/G^{*}(t_{1},t_{2})$, choose $\theta=a/\{eG_{2}(t_{1},t_{2})\}$, giving the first conclusion of the lemma. The second follows by choosing $\theta=a/\{eG(t_{1})^{2}(t_{2}-t_{1})\}$.
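The admissibility of the two choices of $\theta$ can be verified directly against the requirement $0\leq\theta G^{*}(t_{1},t_{2})\leq 1$; the following check is a short sketch using only the definitions in the statement of the lemma. For the first choice,

$$
\theta G^{*}(t_{1},t_{2}) \;=\; \frac{a\,G^{*}(t_{1},t_{2})}{e\,G_{2}(t_{1},t_{2})}\;\leq\;1
\quad\Longleftrightarrow\quad
a\;\leq\;\frac{e\,G_{2}(t_{1},t_{2})}{G^{*}(t_{1},t_{2})}.
$$

If $G$ is decreasing, then $G^{*}(t_{1},t_{2})=G(t_{1})$ and $G_{2}(t_{1},t_{2})\leq G(t_{1})^{2}(t_{2}-t_{1})$, and for the second choice

$$
\theta G(t_{1}) \;=\; \frac{a}{e\,G(t_{1})(t_{2}-t_{1})}\;\leq\;1
\quad\Longleftrightarrow\quad
a\;\leq\;e\,G(t_{1})(t_{2}-t_{1}),
$$

which are the ranges of $a$ given in the lemma.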

The second lemma establishes some smoothness of the function $\phi^{1}_{s}(\theta):={\mathbb{E}}\{e^{-\theta W^{1}(s)}\}$ .

Lemma 4.2

With $\phi^{1}_{s}$ defined as above, and for any $s,h,\theta>0$ , we have

[TABLE]

Proof: We note that $W^{1}(s)\geq 0$ and that ${\mathbb{E}}W^{1}(s)=1$ for all $s$ . Then, writing $X_{s}(h):=W^{1}(s+h)-W^{1}(s)$ and using (2.17), we have

[TABLE]

for any $s,h>0$ . Hence, using (4.1), and taking expectations first conditional on ${\widehat{\cal F}}_{s}$ , we have

[TABLE]

This implies that

[TABLE]

since $xe^{-x}\leq e^{-1}$ , proving the first inequality.

For the second, since $e^{-x}(1-e^{-\delta x})\leq\delta e^{-1}$ in $x\geq 0$ and ${\mathbb{E}}W^{1}(s)=1$ ,

[TABLE]
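The elementary inequality used for the second bound can be justified in one line; this step is supplied here for completeness. For $x\geq 0$ and $\delta\geq 0$, the inequality $1-e^{-y}\leq y$ with $y=\delta x$ gives

$$
e^{-x}\bigl(1-e^{-\delta x}\bigr)\;\leq\;\delta\,x e^{-x}\;\leq\;\delta e^{-1},
$$

since $x e^{-x}$ attains its maximum value $e^{-1}$ at $x=1$.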

Acknowledgement

ADB thanks the Department of Statistics and Applied Probability at the National University of Singapore, and the mathematics departments of the University of Melbourne and Monash University, for their kind hospitality while much of the work was undertaken. AR thanks the School of Mathematics and Statistics at the University of Melbourne for their kind hospitality.

Bibliography

[1] D. J. Aldous (2012) When knowing early matters: gossip, percolation and Nash equilibria. In: Prokhorov and Contemporary Probability Theory, Eds. A.N. Shiryaev, S.R.S. Varadhan and E.L. Presman, pp. 3–28. Springer Proceedings in Mathematics & Statistics 33, Springer Verlag, Heidelberg.
[2] S. Asmussen & H. Hering (1983) Branching processes. Progress in Probability and Statistics 3, Birkhäuser, Boston.
[3] F. G. Ball (1983) The threshold behaviour of epidemic models. J. Appl. Probab. 20, 227–241.
[4] A. D. Barbour, L. Holst & S. Janson (1992) Poisson approximation. Oxford University Press.
[5] A. D. Barbour & G. Reinert (2013) Asymptotic behaviour of gossip processes and small world networks. Adv. Appl. Probab. 45, 981–1010.
[6] S. Chatterjee & R. Durrett (2011) Asymptotic behavior of Aldous’ gossip process. Ann. Appl. Probab. 21, 2447–2482.
[7] E. Z. Ganuza & S. D. Durham (1974) Mean–square and almost–sure convergence of supercritical age–dependent branching processes. J. Appl. Probab. 11, 678–686.
[8] S. van de Geer (1995) Exponential inequalities for martingales, with application to maximum likelihood estimation for counting processes. Ann. Statist. 23, 1779–1801.
