Recursive tree processes and the mean-field limit of stochastic flows

Tibor Mach; Anja Sturm; Jan M. Swart

arXiv:1812.10787·math.PR·March 19, 2020

TL;DR

This paper develops a continuous-time theory of recursive tree processes, relates it to the mean-field limits of interacting particle systems, and illustrates it with a cooperative branching model that yields a recursive tree process which is not endogenous.

Contribution

It introduces a continuous-time analogue of recursive tree processes and uses it to analyze how several coupled interacting particle systems behave in the mean-field limit.

Findings

01 Developed a continuous-time recursive tree process theory.

02 Connected recursive tree processes with mean-field limits of particle systems.

03 Provided an example of a non-endogenous recursive tree process.

Abstract

Interacting particle systems can often be constructed from a graphical representation, by applying local maps at the times of associated Poisson processes. This leads to a natural coupling of systems started in different initial states. We consider interacting particle systems on the complete graph in the mean-field limit, i.e., as the number of vertices tends to infinity. We are not only interested in the mean-field limit of a single process, but mainly in how several coupled processes behave in the limit. This turns out to be closely related to recursive tree processes as studied by Aldous and Bandyopadhyay in discrete time. Here we develop an analogous theory for recursive tree processes in continuous time. We illustrate the abstract theory on an example of a particle system with cooperative branching. This yields an interesting new example of a recursive tree process that is not…

Equations

T(\mu):=\mbox{the law of }\gamma[\bm{\omega}](X_{1},X_{2},\ldots),

{\textstyle\frac{\partial}{\partial t}}\mu_{t}=|{\mathbf{r}}|\big\{T(\mu_{t})-\mu_{t}\big\}\qquad(t\geq 0).

\gamma[\omega](x_{1},x_{2},\ldots)=\gamma[\omega](x_{1},\ldots,x_{\kappa(\omega)})\qquad(\omega\in\Omega,\ x\in S^{{\mathbb{N}}_{+}})

\int_{\Omega}{\mathbf{r}}(\mathrm{d}\omega)\,\kappa(\omega)<\infty.

K(x,A):={\mathbb{P}}\big[\gamma[\bm{\omega}](x)\in A\big]\qquad(x\in S,\ A\subset S\mbox{ measurable}).

T_{t}(\mu):=\mu_{t}\quad\mbox{where }(\mu_{t})_{t\geq 0}\mbox{ solves (1.2) with }\mu_{0}=\mu.

T_{t}(\mu)=\mbox{the law of }G_{t}\big((X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}\big),

g^{(n)}\big(x_{1},\ldots,x_{k}\big)=g^{(n)}\big(x^{1},\ldots,x^{n}\big):=\big(g(x^{1}),\ldots,g(x^{n})\big)\qquad(x^{1},\ldots,x^{n}\in S^{k}),

T^{(n)}(\mu):=\mbox{the law of }\gamma^{(n)}[\bm{\omega}](X_{1},\ldots,X_{\kappa(\bm{\omega})}),

{\cal G}:=\big\{\gamma[\omega]:\omega\in\Omega\big\}

{\textstyle\frac{\partial}{\partial t}}\mu_{t}=\int_{\cal G}\pi(\mathrm{d}g)\big\{T_{g}(\mu_{t})-\mu_{t}\big\}\qquad(t\geq 0),

T_{g}(\mu):=\mbox{the law of }g(X_{1},\ldots,X_{k}),\quad\mbox{where }(X_{i})_{i=1,\ldots,k}\mbox{ are i.i.d.\ with law }\mu.

{\mbox{\tt cob}}(x_{1},x_{2},x_{3}):=x_{1}\vee(x_{2}\wedge x_{3})\quad\mbox{and}\quad{\mbox{\tt dth}}(\varnothing):=0.

\pi\big(\{{\mbox{\tt cob}}\}\big):=\alpha\geq 0\quad\mbox{and}\quad\pi\big(\{{\mbox{\tt dth}}\}\big):=1.

{\textstyle\frac{\partial}{\partial t}}\mu_{t}=\alpha\big\{T_{\mbox{\tt cob}}(\mu_{t})-\mu_{t}\big\}+\big\{T_{\mbox{\tt dth}}(\mu_{t})-\mu_{t}\big\}.

{\textstyle\frac{\partial}{\partial t}}\langle\mu_{t},\phi\rangle=\int_{\Omega}{\mathbf{r}}(\mathrm{d}\omega)\big\{\langle T_{\gamma[\omega]}(\mu_{t}),\phi\rangle-\langle\mu_{t},\phi\rangle\big\}\qquad(t\geq 0).

\big\|T_{t}(\mu)-T_{t}(\nu)\big\|\leq e^{Kt}\|\mu-\nu\|\qquad(\mu,\nu\in{\cal P}(S),\ t\geq 0),

K:=\int_{\Omega}{\mathbf{r}}(\mathrm{d}\omega)\,\big(\kappa(\omega)-1\big).

{\mathbf{r}}\big(\{\omega:\kappa(\omega)=k,\ \gamma[\omega]\mbox{ is discontinuous at }x\}\big)=0\qquad(k\geq 0,\ x\in S^{k}).

\Omega^{\prime}_{l}\times S^{l}\ni(\omega,x)\mapsto\gamma_{i}[\omega](x)\quad\mbox{and}\quad\Omega^{\prime}_{l}\ni\omega\mapsto 1_{\{j\in K_{i}(\omega)\}}\quad\mbox{are measurable}

\vec{\gamma}[\omega](x):=\big(\gamma_{1}[\omega](x),\ldots,\gamma_{\lambda(\omega)}[\omega](x)\big)

{\textstyle\frac{\partial}{\partial t}}\mu_{t}=\int_{\Omega^{\prime}}\!{\mathbf{q}}(\mathrm{d}\omega)\sum_{i=1}^{\lambda(\omega)}\big\{T_{\gamma_{i}[\omega]}(\mu_{t})-\mu_{t}\big\}.

{\rm(i)}\ \int_{\Omega^{\prime}}{\mathbf{q}}(\mathrm{d}\omega)\,\lambda(\omega)<\infty\quad\mbox{and}\quad{\rm(ii)}\ \int_{\Omega^{\prime}}{\mathbf{q}}(\mathrm{d}\omega)\sum_{i=1}^{\lambda(\omega)}\kappa_{i}(\omega)<\infty.

{\mathbf{q}}\big(\{\omega:\lambda(\omega)=l,\ \gamma_{i}[\omega]\mbox{ is discontinuous at }x\}\big)=0\qquad(1\leq i\leq l,\ x\in S^{l}),

m_{\omega,\mathbf{i}}(x)_{j}:=\left\{\begin{array}{ll}\gamma_{k}[\omega](x_{i_{1}},\ldots,x_{i_{\lambda(\omega)}})&\mbox{if }j=i_{k}\mbox{ for some }1\leq k\leq\lambda(\omega),\\ x_{j}&\mbox{otherwise,}\end{array}\right.\qquad(x\in S^{N}).

\big\{(\omega,\mathbf{i},t):\omega\in\Omega^{\prime},\ \mathbf{i}\in[N]^{\langle\lambda(\omega)\rangle},\ t\in{\mathbb{R}}\big\}

{\mathbf{q}}(\mathrm{d}\omega)\,\frac{1_{\{\lambda(\omega)\leq N\}}}{N^{\langle\lambda(\omega)\rangle}}\,\mathrm{d}t.

\Pi_{s,u}=\big\{(\omega_{1},\mathbf{i}_{1},t_{1}),\ldots,(\omega_{n},\mathbf{i}_{n},t_{n})\big\}\quad\mbox{with}\quad t_{1}<\cdots<t_{n}

{\mathbf{X}}_{s,u}=m_{\omega_{n},\mathbf{i}_{n}}\circ\cdots\circ m_{\omega_{1},\mathbf{i}_{1}}

{\mathbf{X}}_{s,s}=1\quad\mbox{and}\quad{\mathbf{X}}_{t,u}\circ{\mathbf{X}}_{s,t}={\mathbf{X}}_{s,u}\qquad(s\leq t\leq u),

Full text

Recursive tree processes and the

mean-field limit of stochastic flows

Tibor Mach (The Czech Academy of Sciences, Institute of Information Theory and Automation, Pod vodárenskou věží 4, 18208 Praha 8, Czech Republic; [email protected])

Anja Sturm (Institute for Mathematical Stochastics, Georg-August-Universität Göttingen, Goldschmidtstr. 7, 37077 Göttingen, Germany; [email protected])

Jan M. Swart $\;{}^{\ast}$

Abstract

Interacting particle systems can often be constructed from a graphical representation, by applying local maps at the times of associated Poisson processes. This leads to a natural coupling of systems started in different initial states. We consider interacting particle systems on the complete graph in the mean-field limit, i.e., as the number of vertices tends to infinity. We are not only interested in the mean-field limit of a single process, but mainly in how several coupled processes behave in the limit. This turns out to be closely related to recursive tree processes as studied by Aldous and Bandyopadhyay in discrete time. Here we develop an analogous theory for recursive tree processes in continuous time. We illustrate the abstract theory on an example of a particle system with cooperative branching. This yields an interesting new example of a recursive tree process that is not endogenous.

MSC 2010. Primary: 82C22, Secondary: 60J25, 60J80, 60K35.

Keywords. Mean-field limit, recursive tree process, recursive distributional equation, endogeny, interacting particle systems, cooperative branching.

Acknowledgement. Work sponsored by grant 16-15238S of the Czech Science Foundation (GA CR) and by grant STU 527/1-2 of the German Research Foundation (DFG) within the Priority Programme 1590 “Probabilistic Structures in Evolution”.

1 Introduction and main results
1.1 Introduction
1.2 The mean-field equation
1.3 The mean-field limit
1.4 A recursive tree representation
1.5 Recursive tree processes
1.6 Endogeny and bivariate uniqueness
1.7 The higher-level mean-field equation
1.8 Lower and upper solutions
1.9 Conditions for uniqueness
2 Discussion
2.1 A Moran model with frequency-dependent selection
2.2 Mean-field limits
2.3 Open problems
2.4 Outline of the proofs
3 The mean-field equation
3.1 Preliminaries
3.2 Uniqueness
3.3 The stochastic representation
3.4 Continuity in the initial state
4 Approximation by finite systems
4.1 Main line of the proof
4.2 The state at sampled sites
4.3 Tightness in total variation
4.4 Convergence to the mean-field equation
5 Recursive Tree Processes
5.1 Construction of RTPs
5.2 Continuous-time RTPs
5.3 Endogeny, bivariate uniqueness, and the higher-level equation
6 Further results
6.1 Monotonicity
6.2 Conditions for uniqueness
6.3 Duality
7 Cooperative branching
7.1 The bivariate mean-field equation
7.2 The higher-level mean-field equation
7.3 Root-determining and open subtrees

1 Introduction and main results

1.1 Introduction

Let $\Omega$ and $S$ be Polish spaces, let ${\mathbf{r}}$ be a finite measure on $\Omega$ with total mass $|{\mathbf{r}}|:={\mathbf{r}}(\Omega)>0$ , and let $\gamma:\Omega\times S^{{\mathbb{N}}_{+}}\to S$ be measurable, where ${\mathbb{N}}_{+}:=\{1,2,\dots\}$ . Let $T$ be the operator acting on probability measures on $S$ defined as

$$T(\mu):=\mbox{the law of }\gamma[\bm{\omega}](X_{1},X_{2},\ldots),\tag{1.1}$$

where $\bm{\omega}$ is an $\Omega$ -valued random variable with law $|{\mathbf{r}}|^{-1}{\mathbf{r}}$ and $(X_{i})_{i\geq 1}$ are i.i.d. with law $\mu$ . In this paper, we will be interested in the differential equation

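$${\textstyle\frac{\partial}{\partial t}}\mu_{t}=|{\mathbf{r}}|\big\{T(\mu_{t})-\mu_{t}\big\}\qquad(t\geq 0).\tag{1.2}$$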

In Theorem 1 below, we will prove existence and uniqueness of solutions to (1.2) under the assumption that there exists a measurable function $\kappa:\Omega\to{\mathbb{N}}$ such that

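$$\gamma[\omega](x_{1},x_{2},\ldots)=\gamma[\omega](x_{1},\ldots,x_{\kappa(\omega)})\qquad(\omega\in\Omega,\ x\in S^{{\mathbb{N}}_{+}})\tag{1.3}$$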

depends only on the first $\kappa(\omega)\in{\mathbb{N}}$ coordinates, and

$$\int_{\Omega}{\mathbf{r}}(\mathrm{d}\omega)\,\kappa(\omega)<\infty.\tag{1.4}$$

Our interest in equation (1.2) stems from the fact that, as we will prove in Theorem 5 below, the mean-field limits of a large class of interacting particle systems are described by equations of the form (1.2). In view of this, we call (1.2) a mean-field equation. Analyzing equations of this sort is commonly the first step towards understanding a given interacting particle system. Some illustrative examples of mean-field equations in the literature are [DN97, (1.1)], [NP99, (1.2)], and [FL17, (4)].

In the special case that $\kappa(\omega)=1$ for all $\omega\in\Omega$ , we observe that $T(\mu)=\int_{S}\mu(\mathrm{d}x)K(x,\,\cdot\,)$ , where $K$ is the probability kernel on $S$ defined as

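$$K(x,A):={\mathbb{P}}\big[\gamma[\bm{\omega}](x)\in A\big]\qquad(x\in S,\ A\subset S\mbox{ measurable}).\tag{1.5}$$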

In view of this, if $\bm{\omega}_{1},\bm{\omega}_{2},\ldots$ are i.i.d. with law $|{\mathbf{r}}|^{-1}{\mathbf{r}}$ and $X_{0}$ has law $\mu$ , then setting $X_{k}:=\gamma[\bm{\omega}_{k}](X_{k-1})$ $(k\geq 1)$ inductively defines a Markov chain with transition kernel $K$ , such that $X_{k}$ has law $T^{k}(\mu)$ , where $T^{k}$ denotes the $k$ -th iterate of the map $T$ . Also, (1.2) describes the forward evolution of a continuous-time Markov chain where random maps $\gamma[\omega]$ are applied with Poisson rate ${\mathbf{r}}(\mathrm{d}\omega)$ . A representation of a probability kernel $K$ in terms of a random map $\gamma[\bm{\omega}]$ as in (1.5) is called a random mapping representation.
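As a minimal sketch of a random mapping representation, assume a finite state space $S=\{0,\ldots,m-1\}$ and a kernel stored as a row-stochastic matrix. The kernel `KERNEL`, the function names, and the inverse-CDF choice of $\gamma$ are illustrative assumptions, not taken from the paper; the inverse-CDF construction is simply one standard way to obtain ${\mathbb{P}}[\gamma[\bm{\omega}](x)=y]=K(x,\{y\})$ when $\bm{\omega}$ is uniform on $[0,1)$.

```python
import random
from itertools import accumulate

# Hypothetical two-state kernel K(x, {y}); rows are indexed by x (illustration only).
KERNEL = [[0.3, 0.7],
          [0.6, 0.4]]

def gamma(omega, x, kernel=KERNEL):
    """Random map gamma[omega](x): inverse-CDF lookup in row x of the kernel."""
    for y, cumulative in enumerate(accumulate(kernel[x])):
        if omega < cumulative:
            return y
    return len(kernel[x]) - 1  # guard against floating-point round-off

def run_chain(x0, steps, rng):
    """Iterate X_k = gamma[omega_k](X_{k-1}) with i.i.d. uniform omega_k."""
    path = [x0]
    for _ in range(steps):
        path.append(gamma(rng.random(), path[-1]))
    return path
```

Feeding the same sequence of $\bm{\omega}_{k}$ to chains started in different states couples them, which is the mechanism the rest of the paper exploits.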

More generally, when the function $\kappa$ is not identically one, Aldous and Bandyopadhyay [AB05] have shown that the iterates $T^{k}$ of the map $T$ from (1.1) can be represented in terms of a Finite Recursive Tree Process (FRTP), which is a generalization of a discrete-time Markov chain where time has a tree-like structure. More precisely, they construct a finite tree of depth $k$ where the state of each internal vertex is a random function of the states of its offspring. If the states of the leaves are i.i.d. with law $\mu$ , they show that the state at the root has law $T^{k}(\mu)$ . They are especially interested in fixed points of $T$ , which generalize the concept of an invariant law of a Markov chain. They show that each such fixed point $\nu$ gives rise to a Recursive Tree Process (RTP), which is a process on an infinite tree where the state of each vertex has law $\nu$ . One can think of such an RTP as a generalization of a stationary backward Markov chain $(\ldots,X_{-2},X_{-1},X_{0})$ . A fixed point equation of the form $T(\nu)=\nu$ is called a Recursive Distributional Equation (RDE). Studying RDEs and their solutions is of independent interest as they appear naturally in many applications, see for example [AB05, Als12].
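The finite recursive tree construction can be sketched for the cooperative branching maps cob and dth that are introduced later in this subsection. The sketch assumes that each internal vertex picks cob with probability $\alpha/(\alpha+1)$ and dth otherwise, which is the normalized law $|{\mathbf{r}}|^{-1}{\mathbf{r}}$ for that example; the function names and parameters are hypothetical. Since $S=\{0,1\}$, a law $\mu$ is encoded by $p=\mu(\{1\})$, and a direct computation gives $T(\mu)(\{1\})=\frac{\alpha}{\alpha+1}\big(p+(1-p)p^{2}\big)$.

```python
import random

def cob(x1, x2, x3):
    """Cooperative branching map: x1 or (x2 and x3)."""
    return x1 | (x2 & x3)

def sample_root(depth, p, alpha, rng):
    """State at the root of a finite recursive tree of the given depth.

    Leaves are i.i.d. Bernoulli(p). Each internal vertex applies cob to its
    three children with probability alpha / (alpha + 1), and dth otherwise.
    """
    if depth == 0:
        return 1 if rng.random() < p else 0
    if rng.random() < alpha / (alpha + 1.0):
        return cob(*(sample_root(depth - 1, p, alpha, rng) for _ in range(3)))
    return 0  # dth takes no arguments, so no subtree has to be sampled

def iterate_T(p, alpha, k):
    """Exact k-th iterate of T on S = {0, 1}, tracking the mass at state 1."""
    for _ in range(k):
        p = alpha / (alpha + 1.0) * (p + (1.0 - p) * p * p)
    return p
```

Averaging `sample_root(k, p, alpha, rng)` over many independent trees should approximate `iterate_T(p, alpha, k)`, in line with the statement that the root has law $T^{k}(\mu)$.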

In the present paper, we develop an analogous theory in continuous time, generalizing the concept of a continuous-time Markov chain to chains where time has a tree-like structure. Let $(T_{t})_{t\geq 0}$ be the semigroup defined by

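$$T_{t}(\mu):=\mu_{t}\quad\mbox{where }(\mu_{t})_{t\geq 0}\mbox{ solves (1.2) with }\mu_{0}=\mu.\tag{1.6}$$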

In Theorem 6, we show that $T_{t}$ has a representation similar to (1.1), namely

$$T_{t}(\mu)=\mbox{the law of }G_{t}\big((X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}\big),\tag{1.7}$$

where ${\mathbb{T}}$ is a countable set, $G_{t}:S^{\mathbb{T}}\to S$ is a random map, and the $(X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ are i.i.d. with law $\mu$ and independent of $G_{t}$ . Similar to what we have in (1.3), the map $G_{t}$ does not depend on all coordinates in ${\mathbb{T}}$ but only on a finite subcollection $(X_{\mathbf{i}})_{\mathbf{i}\in\nabla{\mathbb{S}}_{t}}$ . Here $(\nabla{\mathbb{S}}_{t})_{t\geq 0}$ turns out to be a branching process and condition (1.4) (which is not needed in the discrete-time theory) guarantees that the offspring distribution of this branching process has finite mean. Similarly to (1.5), we can view (1.7) as a random mapping representation of the operator in (1.6).

As we have already mentioned, in Theorem 5 below, we prove that the mean-field limits of a large class of interacting particle systems are described by equations of the form (1.2). These interacting particle systems are constructed by applying local maps at the times of associated Poisson processes, which are introduced in detail in Section 1.3.

We are not only interested in the mean-field limit of a single process, but mainly in the mean-field limit of $n$ coupled processes that are constructed from the same Poisson processes. For each $n\geq 1$ , a measurable map $g:S^{k}\to S$ gives rise to an $n$ -variate map $g^{(n)}:(S^{n})^{k}\to S^{n}$ defined as

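$$g^{(n)}\big(x_{1},\ldots,x_{k}\big)=g^{(n)}\big(x^{1},\ldots,x^{n}\big):=\big(g(x^{1}),\ldots,g(x^{n})\big)\qquad(x^{1},\ldots,x^{n}\in S^{k}),\tag{1.8}$$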

where we denote an element of $(S^{n})^{k}$ as $(x^{m}_{i})^{m=1,\ldots,n}_{i=1,\ldots,k}$ with $x_{i}=(x^{1}_{i},\ldots,x^{n}_{i})$ and $x^{m}=(x^{m}_{1},\ldots,x^{m}_{k})$ . Let ${\cal P}(S)$ denote the space of probability measures on a space $S$ . Letting $\gamma^{(n)}[\omega]$ denote the $n$ -variate map associated with $\gamma[\omega]$ then, in analogy to (1.1),

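$$T^{(n)}(\mu):=\mbox{the law of }\gamma^{(n)}[\bm{\omega}](X_{1},\ldots,X_{\kappa(\bm{\omega})}),\tag{1.9}$$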

defines an $n$ -variate map $T^{(n)}:{\cal P}(S^{n})\to{\cal P}(S^{n})$ , which as in (1.2) gives rise to an $n$ -variate mean-field equation, which describes the mean-field limit of $n$ coupled processes.
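The passage from $g$ to $g^{(n)}$ can be sketched in a few lines of Python. The helper name `n_variate` and the tuple encoding of elements of $S^{n}$ are assumptions made for illustration; cob and dth are the two maps of the cooperative branching example discussed later in this subsection.

```python
def cob(x1, x2, x3):
    """Cooperative branching map on S = {0, 1}."""
    return x1 | (x2 & x3)

def dth():
    """Death map: takes no arguments and returns 0."""
    return 0

def n_variate(g, n):
    """Return g^(n): apply g separately in each of the n coupled coordinates."""
    def g_n(*xs):
        # xs = (x_1, ..., x_k), where each x_i = (x_i^1, ..., x_i^n) lies in S^n.
        return tuple(g(*(x[m] for x in xs)) for m in range(n))
    return g_n
```

For instance, `n_variate(cob, 2)((0, 0), (1, 0), (1, 1))` evaluates cob on the first coordinates $(0,1,1)$ and on the second coordinates $(0,0,1)$ separately. Inputs of the form $(x,\ldots,x)$ are mapped to outputs of the same form, which is why perfectly coupled processes stay perfectly coupled.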

If $X$ is an $S$ -valued random variable whose law $\nu:={\mathbb{P}}[X\in\,\cdot\,]$ is a fixed point of $T$ , then $\overline{\nu}^{(n)}:={\mathbb{P}}[(X,\ldots,X)\in\,\cdot\,]$ is a fixed point of $T^{(n)}$ that describes $n$ perfectly coupled processes. We will be interested in the stability (or instability) of $\overline{\nu}^{(n)}$ under the $n$ -variate mean-field equation. In other words, for our mean-field interacting particle systems, we fix the Poisson processes used in the construction and want to know if small changes in the initial state lead to small (or large) changes in the final state. Aldous and Bandyopadhyay [AB05] define an RTP to be endogenous if the state at the root is a measurable function of the random maps attached to all vertices of the tree. They showed, in some precise sense (see Theorem 10 below), that endogeny is equivalent to stability of $\overline{\nu}^{(n)}$ . In Theorem 11, we generalize their result to the continuous-time setting.

The $n$ -variate map $T^{(n)}$ is well-defined even for $n=\infty$ , and $T^{(\infty)}$ maps the space of all exchangeable probability laws on $S^{{\mathbb{N}}_{+}}$ into itself. Let $\xi$ be a ${\cal P}(S)$ -valued random variable with law $\rho\in{\cal P}({\cal P}(S))$ , and conditional on $\xi$ , let $(X^{m})^{m=1,2,\ldots}$ be i.i.d. with common law $\xi$ . Then the unconditional law of $(X^{m})^{m=1,2,\ldots}$ is exchangeable, and by De Finetti, each exchangeable law on $S^{{\mathbb{N}}_{+}}$ is of this form. In view of this, $T^{(\infty)}$ naturally gives rise to a map $\check{T}:{\cal P}({\cal P}(S))\to{\cal P}({\cal P}(S))$ which is the higher-level map defined in [MSS18], and which analogously to (1.2) gives rise to a higher-level mean-field equation. For any $\nu\in{\cal P}(S)$ , let ${\cal P}({\cal P}(S))_{\nu}$ denote the set of all $\rho\in{\cal P}({\cal P}(S))$ with mean $\int\rho(\mathrm{d}\mu)\mu=\nu$ . In [MSS18] it is shown that if $\nu$ is a fixed point of $T$ , then the corresponding higher-level map $\check{T}$ has two fixed points $\underline{\nu}$ and $\overline{\nu}$ in ${\cal P}({\cal P}(S))_{\nu}$ that are minimal and maximal with respect to the convex order, defined in Theorem 14 below. Moreover, $\underline{\nu}=\overline{\nu}$ if and only if the RTP corresponding to $\nu$ is endogenous.

We will apply the theory developed here as well as in [MSS18] to the higher-level mean-field equation for a particular interacting particle system with cooperative branching and deaths; see also [SS15, Mac17, BCH18] for several different variants of the model. To formulate this properly, it is useful to introduce some more general notation. Recall that for each $\omega\in\Omega$ , $\gamma[\omega]$ is a map from $S^{\kappa(\omega)}$ into $S$ . We let

$${\cal G}:=\big\{\gamma[\omega]:\omega\in\Omega\big\}\tag{1.10}$$

denote the set of all maps that can be obtained by varying $\omega$ . Here, elements of ${\cal G}$ are measurable maps $g:S^{k}\to S$ where $k=k_{g}\geq 0$ may depend on $g$ . If $k=0$ , then $S^{0}$ is defined to be a set with just one element, which we denote by $\varnothing$ (the empty sequence, which we distinguish notationally from the empty set $\emptyset$ ). We equip ${\cal G}$ with the final $\sigma$ -field for the map $\omega\mapsto\gamma[\omega]$ and let $\pi$ denote the image of the measure ${\mathbf{r}}$ under this map. Then the mean-field equation (1.2) can be rewritten as

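$${\textstyle\frac{\partial}{\partial t}}\mu_{t}=\int_{\cal G}\pi(\mathrm{d}g)\big\{T_{g}(\mu_{t})-\mu_{t}\big\}\qquad(t\geq 0),\tag{1.11}$$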

where for any measurable map $g:S^{k}\to S$ ,

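$$T_{g}(\mu):=\mbox{the law of }g(X_{1},\ldots,X_{k}),\quad\mbox{where }(X_{i})_{i=1,\ldots,k}\mbox{ are i.i.d.\ with law }\mu.\tag{1.12}$$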

In the concrete example that we are interested in, $S:=\{0,1\}$ and ${\cal G}:=\{{\mbox{\tt cob}},{\mbox{\tt dth}}\}$ each have just two elements. Here ${\mbox{\tt cob}}:S^{3}\to S$ and ${\mbox{\tt dth}}:S^{0}\to S$ are maps defined as

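$${\mbox{\tt cob}}(x_{1},x_{2},x_{3}):=x_{1}\vee(x_{2}\wedge x_{3})\quad\mbox{and}\quad{\mbox{\tt dth}}(\varnothing):=0.\tag{1.13}$$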

We choose

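$$\pi\big(\{{\mbox{\tt cob}}\}\big):=\alpha\geq 0\quad\mbox{and}\quad\pi\big(\{{\mbox{\tt dth}}\}\big):=1.\tag{1.14}$$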

Then the mean-field equation (1.11) takes the form

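$${\textstyle\frac{\partial}{\partial t}}\mu_{t}=\alpha\big\{T_{\mbox{\tt cob}}(\mu_{t})-\mu_{t}\big\}+\big\{T_{\mbox{\tt dth}}(\mu_{t})-\mu_{t}\big\}.\tag{1.15}$$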

which describes the mean-field limit of a particle system with cooperative branching (with rate $\alpha$ ) and deaths (with rate 1). We will see that for $\alpha>4$ , (1.11) has two stable fixed points $\nu_{\rm low},\nu_{\rm upp}$ , and an unstable fixed point $\nu_{\rm mid}$ that separates the domains of attraction of the stable fixed points.
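Because $S=\{0,1\}$, a solution of this equation is determined by the scalar $p_{t}:=\mu_{t}(\{1\})$. A short computation from the definitions of cob and dth (ours, not quoted from the paper) gives $T_{\tt cob}(\mu)(\{1\})=p+(1-p)p^{2}$ and $T_{\tt dth}(\mu)(\{1\})=0$, hence $\frac{\partial}{\partial t}p_{t}=\alpha p_{t}^{2}(1-p_{t})-p_{t}$. The nonzero fixed points solve $\alpha p(1-p)=1$, i.e. $p=\frac{1}{2}\pm\sqrt{\frac{1}{4}-\frac{1}{\alpha}}$, and these are real and distinct exactly when $\alpha>4$. The following Python sketch assumes this reduction; the function names and the explicit Euler scheme are illustrative choices.

```python
import math

def drift(p, alpha):
    """Right-hand side of dp/dt = alpha * p^2 * (1 - p) - p."""
    return alpha * p * p * (1.0 - p) - p

def fixed_points(alpha):
    """Zeros of the drift in [0, 1]: p = 0 always, two more when alpha >= 4.

    At alpha = 4 the two nonzero roots coincide at p = 1/2.
    """
    if alpha < 4.0:
        return [0.0]
    root = math.sqrt(0.25 - 1.0 / alpha)
    return [0.0, 0.5 - root, 0.5 + root]

def solve(p0, alpha, t_end, dt=0.01):
    """Explicit Euler integration of the reduced mean-field equation."""
    p = p0
    for _ in range(int(round(t_end / dt))):
        p += dt * drift(p, alpha)
    return p
```

With $\alpha=4.5$ the nonzero roots are $1/3$ and $2/3$. Starting slightly below $1/3$ the numerical solution decays to $0$, and starting slightly above it converges to $2/3$, which matches the picture of an unstable middle fixed point separating two domains of attraction.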

In Theorem 17 below, we find all fixed points of the corresponding higher-level mean-field equation, and determine their domains of attraction. Note that solutions of the higher-level mean-field equation take values in the probability measures on ${\cal P}(\{0,1\})\cong[0,1]$ . As mentioned before, each fixed point $\nu$ of the original mean-field equation gives rise to two fixed points $\underline{\nu},\overline{\nu}$ of the higher-level mean-field equation, which are minimal and maximal in ${\cal P}({\cal P}(S))_{\nu}$ with respect to the convex order. Moreover, $\underline{\nu}=\overline{\nu}$ if and only if the RTP corresponding to $\nu$ is endogenous. In our example, we find that the stable fixed points $\nu_{\rm low},\nu_{\rm upp}$ give rise to endogenous RTPs, but the RTP associated with $\nu_{\rm mid}$ is not endogenous. The higher-level equation has no other fixed points in ${\cal P}({\cal P}(S))_{\nu_{\rm mid}}$ except for $\underline{\nu}_{\rm mid}$ and $\overline{\nu}_{\rm mid}$ , of which the former is stable and the latter unstable. Numerical data for the nontrivial fixed point $\underline{\nu}_{\rm mid}$ (viewed as a probability measure on $[0,1]$ ) are plotted in Figure 2.

1.2 The mean-field equation

In this subsection, we collect some basic results about the mean-field equation (1.2) that form the basis for all that follows. We interpret (1.2) in the following sense: letting $\langle\mu,\phi\rangle:=\int\phi\,\mathrm{d}\mu$ , we say that a process $(\mu_{t})_{t\geq 0}$ solves (1.2) if for each bounded measurable function $\phi:S\to{\mathbb{R}}$ , the function $t\mapsto\langle\mu_{t},\phi\rangle$ is continuously differentiable and

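$${\textstyle\frac{\partial}{\partial t}}\langle\mu_{t},\phi\rangle=\int_{\Omega}{\mathbf{r}}(\mathrm{d}\omega)\big\{\langle T_{\gamma[\omega]}(\mu_{t}),\phi\rangle-\langle\mu_{t},\phi\rangle\big\}\qquad(t\geq 0).\tag{1.16}$$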

Our first result gives sufficient conditions for existence and uniqueness of solutions to (1.2).

Theorem 1 (Mean-field equation)

Let $S$ and $\Omega$ be Polish spaces, let ${\mathbf{r}}$ be a nonzero finite measure on $\Omega$ , and let $\gamma:\Omega\times S^{{\mathbb{N}}_{+}}\to S$ be measurable. Assume that there exists a measurable function $\kappa:\Omega\to{\mathbb{N}}$ such that (1.3) and (1.4) hold. Then the mean-field equation (1.2) has a unique solution $(\mu_{t})_{t\geq 0}$ for each initial state $\mu_{0}\in{\cal P}(S)$ .

Theorem 1 allows us to define a semigroup $(T_{t})_{t\geq 0}$ of operators $T_{t}:{\cal P}(S)\to{\cal P}(S)$ as in (1.6). It is often useful to know that solutions to (1.2) are continuous as a function of their initial state. The following proposition gives continuity w.r.t. the total variation norm $\|\,\cdot\,\|$ and moreover shows that if the constant $K$ from (1.18) is negative, then the operators $(T_{t})_{t\geq 0}$ form a contraction semigroup.

Proposition 2 (Continuity in total variation norm)

Under the assumptions of Theorem 1, one has

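$$\big\|T_{t}(\mu)-T_{t}(\nu)\big\|\leq e^{Kt}\|\mu-\nu\|\qquad(\mu,\nu\in{\cal P}(S),\ t\geq 0),\tag{1.17}$$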

where

$$K:=\int_{\Omega}{\mathbf{r}}(\mathrm{d}\omega)\,\big(\kappa(\omega)-1\big).\tag{1.18}$$
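For orientation, here is a worked instance of the constant $K$ in Proposition 2 for the cooperative branching example of Subsection 1.1. It assumes that ${\mathbf{r}}$ is identified with the measure $\pi$ chosen there (rate $\alpha$ for cob, rate $1$ for dth), so that $\kappa=3$ for cob and $\kappa=0$ for dth:

$$K=\alpha\,(3-1)+1\cdot(0-1)=2\alpha-1.$$

Under this identification, $K<0$ exactly when $\alpha<1/2$, which is the regime in which Proposition 2 yields a contraction in total variation norm.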

Continuity w.r.t. weak convergence needs an additional assumption.

Proposition 3 (Continuity w.r.t. weak convergence)

Assume that

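$${\mathbf{r}}\big(\{\omega:\kappa(\omega)=k,\ \gamma[\omega]\mbox{ is discontinuous at }x\}\big)=0\qquad(k\geq 0,\ x\in S^{k}).\tag{1.19}$$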

Then the operator $T$ in (1.1) and the operators $T_{t}$ $(t\geq 0)$ in (1.6) are continuous w.r.t. the topology of weak convergence.

The condition (1.19) is considerably weaker than the condition that $\gamma[\omega]$ is continuous for all $\omega\in\Omega$ . A simple example is $\Omega=S=[0,1]$ , ${\mathbf{r}}$ the Lebesgue measure, $\kappa\equiv 1$ , and $\gamma[\omega](x):=1_{\{x\geq\omega\}}$ .
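A short check of this example (our own, under the stated choices): for fixed $x\in[0,1]$, the map $y\mapsto 1_{\{y\geq\omega\}}$ can be discontinuous at $y=x$ only if $\omega=x$, so

$${\mathbf{r}}\big(\{\omega:\gamma[\omega]\mbox{ is discontinuous at }x\}\big)\leq{\mathbf{r}}(\{x\})=0,$$

even though $\gamma[\omega]$ is discontinuous for every $\omega\in(0,1]$. The associated kernel from (1.5) is $K(x,\{1\})={\mathbb{P}}[\bm{\omega}\leq x]=x$.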

1.3 The mean-field limit

In this subsection, we show that equations of the form (1.2) arise as the mean-field limits of a large class of interacting particle systems. In order to be reasonably general, and in particular to allow for systems in which more than one site can change its value at the same time, we will introduce quite a bit of notation that will not be needed anywhere else in Section 1, so impatient readers can just glance at Theorem 5 and the discussion surrounding (1.36) and skip the rest of this subsection.

Let $S$ be a Polish space as before, and let $N\in{\mathbb{N}}_{+}$ . We will be interested in continuous-time Markov processes taking values in $S^{N}$ , where $N$ is large. Denoting an element of $S^{N}$ by $x=(x_{1},\ldots,x_{N})$ , we will focus on processes with a high degree of symmetry, in the sense that their dynamics are invariant under a permutation of the coordinates. It is instructive, though not necessary for what follows, to view $\{1,\ldots,N\}$ as the vertex set of a complete graph, where all vertices are neighbors of each other. The basic ingredients we will use to describe our processes are:

(i) a Polish space $\Omega^{\prime}$ equipped with a finite nonzero measure ${\mathbf{q}}$ ,

(ii) a measurable function $\lambda:\Omega^{\prime}\to{\mathbb{N}}_{+}$ ,

as well as, for each $\omega\in\Omega^{\prime}$ and $1\leq i\leq\lambda(\omega)$ ,

(iii) a function $\gamma_{i}[\omega]:S^{\lambda(\omega)}\to S$ ,

(iv) a finite set $K_{i}(\omega)\subset\{1,\ldots,\lambda(\omega)\}$ such that $\gamma_{i}[\omega](x_{1},\ldots,x_{\lambda(\omega)})=\gamma_{i}[\omega]\big((x_{j})_{j\in K_{i}(\omega)}\big)$ depends only on the coordinates in $K_{i}(\omega)$ .

Setting $\Omega^{\prime}_{l}:=\{\omega\in\Omega^{\prime}:\lambda(\omega)=l\}$ , we assume that the functions

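$$\Omega^{\prime}_{l}\times S^{l}\ni(\omega,x)\mapsto\gamma_{i}[\omega](x)\quad\mbox{and}\quad\Omega^{\prime}_{l}\ni\omega\mapsto 1_{\{j\in K_{i}(\omega)\}}\quad\mbox{are measurable}\tag{1.20}$$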

for each $1\leq i,j\leq l$ . We let $\vec{\gamma}[\omega]:S^{\lambda(\omega)}\to S^{\lambda(\omega)}$ denote the function

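$$\vec{\gamma}[\omega](x):=\big(\gamma_{1}[\omega](x),\ldots,\gamma_{\lambda(\omega)}[\omega](x)\big)\tag{1.21}$$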

and let $\kappa_{i}(\omega):=|K_{i}(\omega)|$ denote the cardinality of the set $K_{i}(\omega)$ .

The space $\Omega^{\prime}$ , measure ${\mathbf{q}}$ , and functions $\lambda$ and $\vec{\gamma}$ play roles similar, but not quite identical to $\Omega,{\mathbf{r}},\kappa$ , and $\gamma$ from Subsection 1. We can use $\Omega^{\prime},{\mathbf{q}},\lambda$ , and $\vec{\gamma}$ to define the following mean-field equation:

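$${\textstyle\frac{\partial}{\partial t}}\mu_{t}=\int_{\Omega^{\prime}}\!{\mathbf{q}}(\mathrm{d}\omega)\sum_{i=1}^{\lambda(\omega)}\big\{T_{\gamma_{i}[\omega]}(\mu_{t})-\mu_{t}\big\}.\tag{1.22}$$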

The following lemma says that (1.22) is really a mean-field equation of the form we have already seen in (1.2). This is why in subsequent sections we will only work with equations of this form.

Lemma 4 (Simplified equation)

Assume that

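$${\rm(i)}\ \int_{\Omega^{\prime}}{\mathbf{q}}(\mathrm{d}\omega)\,\lambda(\omega)<\infty\quad\mbox{and}\quad{\rm(ii)}\ \int_{\Omega^{\prime}}{\mathbf{q}}(\mathrm{d}\omega)\sum_{i=1}^{\lambda(\omega)}\kappa_{i}(\omega)<\infty.\tag{1.23}$$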

Then equation (1.22) can be cast in the simpler form (1.2) for a suitable choice of $\Omega,{\mathbf{r}},\kappa$ , and $\gamma$ , where (1.23) (i) guarantees that ${\mathbf{r}}$ is a finite measure and (1.23) (ii) implies that (1.4) holds. If

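$${\mathbf{q}}\big(\{\omega:\lambda(\omega)=l,\ \gamma_{i}[\omega]\mbox{ is discontinuous at }x\}\big)=0\qquad(1\leq i\leq l,\ x\in S^{l}),\tag{1.24}$$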

then moreover (1.19) can be satisfied.

We now use the ingredients $\Omega^{\prime},{\mathbf{q}},\lambda$ , and $\vec{\gamma}$ to define the class of Markov processes we are interested in. We construct these processes by applying local maps, that affect only finitely many coordinates, at the times of associated Poisson processes. In the context of interacting particle systems, such constructions are called graphical representations.

For any $N\in{\mathbb{N}}_{+}$ we set $[N]:=\{1,\ldots,N\}$ . We let $[N]^{\langle l\rangle}$ denote the set of all sequences $\mathbf{i}=(i_{1},\ldots,i_{l})$ for which $i_{1},\ldots,i_{l}\in[N]$ are all different. Note that $[N]^{\langle l\rangle}$ has $N^{\langle l\rangle}:=N(N-1)\cdots(N-l+1)$ elements. We will consider Markov processes $X=(X(t))_{t\geq 0}$ with values in $S^{N}$ that evolve in the following way:

(i) At the times of a Poisson process with intensity $|{\mathbf{q}}|:={\mathbf{q}}(\Omega^{\prime})$ , an element $\omega\in\Omega^{\prime}$ is chosen according to the probability law $|{\mathbf{q}}|^{-1}{\mathbf{q}}$ .

(ii) If $\lambda(\omega)>N$ , nothing happens.

(iii) Otherwise, an element $\mathbf{i}\in[N]^{\langle\lambda(\omega)\rangle}$ is selected according to the uniform distribution on $[N]^{\langle\lambda(\omega)\rangle}$ , and the previous values $\big(X_{i_{1}}(t-),\ldots,X_{i_{\lambda(\omega)}}(t-)\big)$ of $X$ at the coordinates $i_{1},\ldots,i_{\lambda(\omega)}$ are replaced by $\big(X_{i_{1}}(t),\ldots,X_{i_{\lambda(\omega)}}(t)\big)=\vec{\gamma}[\omega]\big(X_{i_{1}}(t-),\ldots,X_{i_{\lambda(\omega)}}(t-)\big)$ .
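The index set $[N]^{\langle l\rangle}$ and the uniform choice of $\mathbf{i}$ in step (iii) can be sketched as follows. The function names are hypothetical; `math.perm` (Python 3.8 or later) returns the falling factorial $N^{\langle l\rangle}$, and `random.Random.sample` draws an ordered sequence of distinct elements without replacement.

```python
import math
import random
from itertools import permutations

def falling_factorial(n, l):
    """N^<l> = N (N-1) ... (N-l+1): the number of sequences of l distinct sites."""
    return math.perm(n, l)

def sample_sites(n, l, rng):
    """Uniform draw from [N]^<l>, returned as a tuple of l distinct elements of {1, ..., N}.

    Requires l <= n; the case l > n corresponds to step (ii), where nothing happens.
    """
    return tuple(rng.sample(range(1, n + 1), l))
```

For small $N$ the count can be confirmed by brute force, since `permutations(range(1, N + 1), l)` enumerates exactly the elements of $[N]^{\langle l\rangle}$.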

More formally, we can construct our Markov process $X=(X(t))_{t\geq 0}$ as follows. For each $\omega\in\Omega^{\prime}$ with $\lambda(\omega)\leq N$ , and for each $\mathbf{i}\in[N]^{\langle\lambda(\omega)\rangle}$ , define a map $m_{\omega,\mathbf{i}}:S^{N}\to S^{N}$ by

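$$m_{\omega,\mathbf{i}}(x)_{j}:=\left\{\begin{array}{ll}\gamma_{k}[\omega](x_{i_{1}},\ldots,x_{i_{\lambda(\omega)}})&\mbox{if }j=i_{k}\mbox{ for some }1\leq k\leq\lambda(\omega),\\ x_{j}&\mbox{otherwise,}\end{array}\right.\qquad(x\in S^{N}).\tag{1.25}$$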

Let $\Pi$ be a Poisson point set on

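$$\big\{(\omega,\mathbf{i},t):\omega\in\Omega^{\prime},\ \mathbf{i}\in[N]^{\langle\lambda(\omega)\rangle},\ t\in{\mathbb{R}}\big\}\tag{1.26}$$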

with intensity

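$${\mathbf{q}}(\mathrm{d}\omega)\,\frac{1_{\{\lambda(\omega)\leq N\}}}{N^{\langle\lambda(\omega)\rangle}}\,\mathrm{d}t.\tag{1.27}$$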

Since ${\mathbf{q}}$ is a finite measure, the set $\Pi_{s,u}:=\{(\omega,\mathbf{i},t)\in\Pi:s<t\leq u\}$ is a.s. finite for each $-\infty<s\leq u<\infty$ , so we can order its elements as

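$$\Pi_{s,u}=\big\{(\omega_{1},\mathbf{i}_{1},t_{1}),\ldots,(\omega_{n},\mathbf{i}_{n},t_{n})\big\}\quad\mbox{with}\quad t_{1}<\cdots<t_{n}\tag{1.28}$$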

and use this to define

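$${\mathbf{X}}_{s,u}=m_{\omega_{n},\mathbf{i}_{n}}\circ\cdots\circ m_{\omega_{1},\mathbf{i}_{1}}\tag{1.29}$$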

In words, $\Pi$ is a list of triples $(\omega,\mathbf{i},t)$ . Here $\omega$ represents some external input that tells us that we need to apply the map $\vec{\gamma}[\omega]$ . The coordinates where and the time when this map needs to be applied are given by $\mathbf{i}$ and $t$ , respectively. It is easy to see that the random maps $({\mathbf{X}}_{s,u})_{s\leq u}$ form a stochastic flow, i.e.,

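$${\mathbf{X}}_{s,s}=1\quad\mbox{and}\quad{\mathbf{X}}_{t,u}\circ{\mathbf{X}}_{s,t}={\mathbf{X}}_{s,u}\qquad(s\leq t\leq u),\tag{1.30}$$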

where $1$ denotes the identity map. Moreover $({\mathbf{X}}_{s,u})_{s\leq u}$ has independent increments in the sense that

[TABLE]

for each $t_{1}<\cdots<t_{k}$ . It is well-known (see, e.g., [SS18, Lemma 1]) that if $X(0)$ is an $S^{N}$ -valued random variable, independent of the Poisson set $\Pi$ , then setting

[TABLE]

defines a Markov process $X=(X(t))_{t\geq 0}$ with values in $S^{N}$ . Note that $(X(t))_{t\geq 0}$ has piecewise constant sample paths, which are right-continuous because of the way we have defined $\Pi_{s,u}$ .
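The construction of the flow from a list of triples can be sketched in Python. Several simplifications are assumptions made for illustration: events are stored as `(t, vec_gamma, sites)` with the local map itself in place of $\omega$, sites are numbered from $0$, the event list is fixed by hand where the paper uses a Poisson point set, and `cob_vec` and `dth_vec` are hypothetical local maps on $\{0,1\}$.

```python
def cob_vec(x1, x2, x3):
    """Hypothetical local map on three sites: only the first site may change."""
    return (x1 | (x2 & x3), x2, x3)

def dth_vec(x1):
    """Hypothetical local map on one site: the site becomes empty."""
    return (0,)

def make_local_map(vec_gamma, sites):
    """Map m_{omega,i}: apply vec_gamma at the listed sites, leave all others unchanged."""
    def m(x):
        x = list(x)
        new_values = vec_gamma(*(x[j] for j in sites))
        for j, value in zip(sites, new_values):
            x[j] = value
        return tuple(x)
    return m

def flow(events, s, u):
    """X_{s,u}: compose the local maps of all events with s < t <= u, in time order."""
    selected = sorted((e for e in events if s < e[0] <= u), key=lambda e: e[0])
    def X(x):
        for _, vec_gamma, sites in selected:
            x = make_local_map(vec_gamma, sites)(x)
        return tuple(x)
    return X
```

Because the selection uses $s<t\leq u$, composing `flow(events, s, t)` with `flow(events, t, u)` applies each event exactly once, which is the flow property, and an empty time interval gives the identity map.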

We now formulate our result about the mean-field limit of Markov processes as defined in (1.32). For any $x\in S^{N}$ , we define an empirical measure $\mu\{x\}$ on $S$ by

[TABLE]

Below, $\mu^{\otimes n}:=\mu\otimes\cdots\otimes\mu$ denotes the product measure of $n$ copies of $\mu$ . The expectation ${\mathbb{E}}[\mu]$ of a random measure $\mu$ on a Polish space $S$ is defined in the usual way, i.e., ${\mathbb{E}}[\mu]$ is the deterministic measure defined by $\int\!\phi\,\mathrm{d}{\mathbb{E}}[\mu]:={\mathbb{E}}[\int\!\phi\,\mathrm{d}\mu]$ for any bounded measurable $\phi:S\to{\mathbb{R}}$ .

Theorem 5 (Mean-field limit)

Let $S$ be a Polish space, let $\Omega^{\prime},{\mathbf{q}},\lambda$ , and $\vec{\gamma}$ be as above, and assume (1.23). For each $N\in{\mathbb{N}}_{+}$ , let $(X^{(N)}(t))_{t\geq 0}$ be Markov processes with state space $S^{N}$ as defined in (1.32), and let $\mu^{N}_{t}:=\mu\{X^{(N)}(t)\}$ denote their associated empirical measures. Let $d$ be any metric on ${\cal P}(S)$ that generates the topology of weak convergence. Fix some (deterministic) $\mu_{0}\in{\cal P}(S)$ and assume that (at least) one of the following two conditions is satisfied.

(i) $\displaystyle{\mathbb{P}}\big{[}d(\mu^{N}_{0},\mu_{0})\geq\varepsilon\big{]}\underset{{N}\to\infty}{\longrightarrow}0$ for all $\varepsilon>0$, and (1.24) holds.

(ii) $\big{\|}{\mathbb{E}}[(\mu^{N}_{0})^{\otimes n}]-\mu_{0}^{\otimes n}\big{\|}\underset{{N}\to\infty}{\longrightarrow}0$ for all $n\geq 1$, where $\|\,\cdot\,\|$ denotes the total variation norm.

Then

[TABLE]

where $(\mu_{t})_{t\geq 0}$ is the unique solution to the mean-field equation (1.22) with initial state $\mu_{0}$ .

Condition (ii) is in particular satisfied if $X^{(N)}_{1}(0),\ldots,X^{(N)}_{N}(0)$ are i.i.d. with common law $\mu_{0}$. Note that in (1.34), we rescale time by a factor $N$.

It is instructive to demonstrate the general set-up on our concrete example of a particle system with cooperative branching and deaths. As before, we have $S=\{0,1\}$ . We choose for $\Omega^{\prime}$ a set with just two elements, say $\Omega^{\prime}=\{1,2\}$ , and we set ${\mathbf{q}}(\{1\}):=\alpha\geq 0$ and ${\mathbf{q}}(\{2\}):=1$ . We let $\lambda(1):=3$ , $\lambda(2):=1$ , and define $\vec{\gamma}[1]:S^{3}\to S^{3}$ and $\vec{\gamma}[2]:S\to S$ by

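$$\vec{\gamma}[1](x_{1},x_{2},x_{3}):=\big(x_{1}\vee(x_{2}\wedge x_{3}),\,x_{2},\,x_{3}\big)\quad\mbox{and}\quad\vec{\gamma}[2](x_{1}):=0.\qquad(1.35)$$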

Then the particle system in (1.32) has the following description. Let us say that a site $i$ is occupied at time $t$ if $X_{i}(t)=1$ . Then, with rate $\alpha$ , three sites $(i_{1},i_{2},i_{3})\in[N]^{\langle 3\rangle}$ are selected at random. If the sites $i_{2}$ and $i_{3}$ are both occupied, then the particles at these sites cooperate to produce a third particle at $i_{1}$ , provided this site is empty. In addition, with rate 1, a site $i$ is selected at random, and any particle that is present there dies.
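The verbal description above can be turned into a short simulation. The following is a minimal sketch, not the construction via the Poisson point set $\Pi$ itself: it assumes that branching events occur at total rate $\alpha$ and death events at total rate 1 for the whole system (consistent with the rescaling of time by a factor $N$ in Theorem 5), and that the three selected sites are distinct. The names `cob` and `simulate` are illustrative.

```python
import random


def cob(x1, x2, x3):
    """Cooperative branching: site 1 becomes occupied if sites 2 and 3 both are."""
    return x1 | (x2 & x3)


def simulate(x0, alpha, t_max, rng):
    """Run the particle system on N = len(x0) sites up to time t_max.

    Assumptions of this sketch: branching events occur at total rate alpha,
    death events at total rate 1, each acting on uniformly chosen sites,
    and the three sites of a branching event are distinct.
    """
    x = list(x0)
    n = len(x)
    total_rate = alpha + 1.0
    t = rng.expovariate(total_rate)
    while t <= t_max:
        if rng.random() < alpha / total_rate:
            i1, i2, i3 = rng.sample(range(n), 3)  # three distinct sites
            x[i1] = cob(x[i1], x[i2], x[i3])
        else:
            x[rng.randrange(n)] = 0  # any particle at the chosen site dies
        t += rng.expovariate(total_rate)
    return x
```

To compare with the mean-field limit one would run `simulate` up to time $N t$ and look at the fraction of occupied sites, for instance `sum(x) / len(x)`.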

It is not hard to see that for our choice of $\Omega^{\prime},{\mathbf{q}},\lambda$ , and $\vec{\gamma}$ , the mean-field equation (1.22) simplifies to (1.15). Note that since $\gamma_{2}[1]$ and $\gamma_{3}[1]$ are the identity map, they drop out of (1.22), so only $\gamma_{1}[1]={\mbox{\tt cob}}$ and $\gamma_{1}[2]={\mbox{\tt dth}}$ remain. Since $\gamma_{1}[2](x_{1})=0$ regardless of the value of $x_{1}$ , we can choose for $K_{1}(1)$ the empty set and view $\gamma_{1}[2]={\mbox{\tt dth}}$ as a function ${\mbox{\tt dth}}:S^{0}\to S$ .

Solutions of (1.15) take values in the probability measures on $S=\{0,1\}$ , which are uniquely characterized by their value at 1. Rewriting (1.15) in terms of $p_{t}:=\mu_{t}(\{1\})$ yields the equation

$$\frac{\partial}{\partial t}p_{t}=\alpha p_{t}^{2}(1-p_{t})-p_{t}\qquad(t\geq 0).\qquad(1.36)$$

This equation can also be found in [Nob92, (1.11)], [Neu94, (1.2)], [BW97, (3.1)], [FL17, (4)], and [BCH18, (2.1)]. It is not hard to check that for $\alpha<4$ , the only fixed point of (1.36) is $z_{\rm low}:=0$ , while for $\alpha\geq 4$ , there are additional fixed points

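$$z_{\rm mid}:=\frac{1}{2}-\sqrt{\frac{1}{4}-\frac{1}{\alpha}}\quad\mbox{and}\quad z_{\rm upp}:=\frac{1}{2}+\sqrt{\frac{1}{4}-\frac{1}{\alpha}}.\qquad(1.37)$$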

If $\alpha<4$ , then solutions to (1.36) converge to $z_{\rm low}$ regardless of the initial state. On the other hand, for $\alpha\geq 4$ , solutions to (1.36) with $p_{0}>z_{\rm mid}$ converge to the upper fixed point $z_{\rm upp}$ while solutions to (1.36) with $p_{0}<z_{\rm mid}$ converge to the lower fixed point $z_{\rm low}$ . In particular, if $\alpha>4$ , then $z_{\rm low}$ and $z_{\rm upp}$ are stable fixed points while $z_{\rm mid}$ is an unstable fixed point separating the domains of attraction of $z_{\rm low}$ and $z_{\rm upp}$ .
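The fixed-point picture can be checked numerically. The sketch below assumes that (1.36) reads $\frac{\partial}{\partial t}p_{t}=\alpha p_{t}^{2}(1-p_{t})-p_{t}$, which is what the verbal description of the dynamics gives: a branching event occupies an empty site when two other sampled sites are occupied, and particles die at rate 1. The nonzero fixed points are then the roots of $\alpha z(1-z)=1$. The explicit Euler scheme is a numerical shortcut chosen for brevity, and all function names are illustrative.

```python
import math


def drift(p, alpha):
    """Right-hand side alpha * p^2 * (1 - p) - p of the assumed equation for p_t."""
    return alpha * p * p * (1.0 - p) - p


def fixed_points(alpha):
    """Fixed points in [0, 1]: zero, plus the real roots of alpha * z * (1 - z) = 1."""
    if alpha < 4.0:
        return [0.0]
    root = math.sqrt(0.25 - 1.0 / alpha)
    return sorted({0.0, 0.5 - root, 0.5 + root})


def solve(p0, alpha, t_max, dt=0.01):
    """Explicit Euler approximation of p at time t_max."""
    p = p0
    for _ in range(int(round(t_max / dt))):
        p += dt * drift(p, alpha)
    return p
```

For $\alpha=4.5$ the roots of $\alpha z(1-z)=1$ are $1/3$ and $2/3$, so `solve(0.4, 4.5, 100.0)` starts above the middle fixed point and `solve(0.3, 4.5, 100.0)` starts below it.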

1.4 A recursive tree representation

In this subsection we formally introduce Finite Recursive Tree Processes (FRTPs) and state the random mapping representation of solutions to the mean-field equation (1.2) anticipated in (1.7).

For $d\in{\mathbb{N}}_{+}$ , let ${\mathbb{T}}^{d}$ denote the space of all finite words $\mathbf{i}=i_{1}\cdots i_{n}$ $(n\in{\mathbb{N}})$ made up from the alphabet $\{1,\ldots,d\}$ , and define ${\mathbb{T}}^{\infty}$ similarly, using the alphabet ${\mathbb{N}}_{+}$ . If $\mathbf{i},\mathbf{j}\in{\mathbb{T}}^{d}$ with $\mathbf{i}=i_{1}\cdots i_{m}$ and $\mathbf{j}=j_{1}\cdots j_{n}$ , then we define the concatenation $\mathbf{i}\mathbf{j}\in{\mathbb{T}}^{d}$ by $\mathbf{i}\mathbf{j}:=i_{1}\cdots i_{m}j_{1}\cdots j_{n}$ . We denote the length of a word $\mathbf{i}=i_{1}\cdots i_{n}$ by $|\mathbf{i}|:=n$ and let $\varnothing$ denote the word of length zero. We view ${\mathbb{T}}^{d}$ as a tree with root $\varnothing$ , where each vertex $\mathbf{i}\in{\mathbb{T}}^{d}$ has $d$ children $\mathbf{i}1,\mathbf{i}2,\ldots$ , and each vertex $\mathbf{i}=i_{1}\cdots i_{n}$ except the root has precisely one ancestor ${\accentset{\leftarrow}{\mathbf{i}}}:=i_{1}\cdots i_{n-1}$ . For each rooted subtree of ${\mathbb{T}}^{d}$ , i.e., a subtree ${\mathbb{U}}\subset{\mathbb{T}}^{d}$ that contains $\varnothing$ , we let $\partial{\mathbb{U}}:=\{\mathbf{i}\in{\mathbb{T}}^{d}:{\accentset{\leftarrow}{\mathbf{i}}}\in{\mathbb{U}},\ \mathbf{i}\not\in{\mathbb{U}}\}$ denote the boundary of ${\mathbb{U}}$ relative to ${\mathbb{T}}^{d}$ . We write

[TABLE]

and use the convention $\partial\emptyset:=\{\varnothing\}$ , so that (1.38) holds also for $n=0$ .
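The tree notation is easy to mirror in code. In the sketch below, words are represented as Python tuples, the root $\varnothing$ is the empty tuple, and the helper names `tree_level` and `boundary` are hypothetical. It assumes that (1.38) expresses that the boundary of the set of words of length less than $n$ is the set of words of length exactly $n$, which is consistent with the convention for $n=0$ stated above.

```python
from itertools import product


def tree_level(d, n):
    """All words of length exactly n over the alphabet {1, ..., d}, as tuples."""
    return set(product(range(1, d + 1), repeat=n))


def boundary(subtree, d):
    """Boundary of a rooted subtree U of T^d: words outside U whose parent is in U.

    By the convention in the text, the boundary of the empty set is {root}.
    """
    if not subtree:
        return {()}
    return {w + (j,) for w in subtree for j in range(1, d + 1)
            if w + (j,) not in subtree}


# Words of length less than 2 in the binary tree, and their boundary.
U = tree_level(2, 0) | tree_level(2, 1)
dU = boundary(U, 2)
```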

We return to the set-up of Subsection 1.1, i.e., $S$ and $\Omega$ are Polish spaces, ${\mathbf{r}}$ is a nonzero finite measure on $\Omega$ , and $\gamma:\Omega\times S^{{\mathbb{N}}_{+}}\to S$ and $\kappa:\Omega\to{\mathbb{N}}$ are measurable functions such that (1.3) holds. We fix some $d\in{\mathbb{N}}_{+}\cup\{\infty\}$ such that $\kappa(\omega)\leq d$ for all $\omega\in\Omega$ and set ${\mathbb{T}}:={\mathbb{T}}^{d}$ . Let $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ be an i.i.d. collection of $\Omega$ -valued r.v.’s with common law $|{\mathbf{r}}|^{-1}{\mathbf{r}}$ . Fix $n\geq 1$ and assume that

[TABLE]

Then it is easy to see that the law of $X_{\varnothing}$ is given by $T^{n}(\mu)$ , where $T^{n}$ is the $n$ -th iterate of the operator in (1.1). We call the collection of random variables

[TABLE]

a Finite Recursive Tree Process (FRTP). We can think of $(X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}_{(n)}\cup\partial{\mathbb{T}}_{(n)}}$ as a generalization of a Markov chain, where time has a tree-like structure.
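For the running example, an FRTP can be sampled recursively from the boundary towards the root. The sketch below makes the following assumptions: ${\mathbf{r}}$ gives mass $\alpha$ to the map cob (with $\kappa=3$) and mass 1 to the map dth (with $\kappa=0$), and the boundary values are Bernoulli with mass $p$ at 1. Subtrees below a dth vertex are not sampled, since they do not influence the root. The function names are illustrative.

```python
import random


def cob(x1, x2, x3):
    """Cooperative branching map on {0, 1}^3."""
    return x1 | (x2 & x3)


def sample_root(n, p, alpha, rng):
    """Sample the root value of a depth-n FRTP for cooperative branching.

    Boundary values (depth n) are i.i.d. Bernoulli(p). Each internal vertex
    uses cob with probability alpha / (alpha + 1) and dth otherwise.
    """
    if n == 0:
        return 1 if rng.random() < p else 0
    if rng.random() < alpha / (alpha + 1.0):
        return cob(*(sample_root(n - 1, p, alpha, rng) for _ in range(3)))
    return 0  # dth ignores its (empty) input


def T(p, alpha):
    """Mass at 1 of T(mu) when mu has mass p at 1, under the assumptions above."""
    return alpha / (alpha + 1.0) * (p + (1.0 - p) * p * p)


def T_iter(p, alpha, n):
    """Mass at 1 of the n-th iterate T^n(mu)."""
    for _ in range(n):
        p = T(p, alpha)
    return p
```

Averaging many independent calls of `sample_root(n, p, alpha, rng)` should approximate `T_iter(p, alpha, n)`, in line with the statement that the root has law $T^{n}(\mu)$.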

We now aim to give a similar representation of the semigroup $(T_{t})_{t\geq 0}$ from (1.6). To do this, we let $(\sigma_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ be i.i.d. exponentially distributed random variables with mean $|{\mathbf{r}}|^{-1}$ . We interpret $\sigma_{\mathbf{i}}$ as the lifetime of the individual with index $\mathbf{i}$ and let

[TABLE]

denote the times when the individual $\mathbf{i}$ is born and dies, respectively. Then

[TABLE]

are the (random) subtrees of ${\mathbb{T}}$ consisting of all individuals that have died before time $t$, resp. are alive at time $t$. If the function $\kappa$ from (1.3) is bounded, then we can choose ${\mathbb{T}}:={\mathbb{T}}^{d}$ with $d<\infty$. In this case it is easy to check that $(\partial{\mathbb{T}}_{t})_{t\geq 0}$ is a continuous-time branching process in which each particle is replaced, at rate $|{\mathbf{r}}|$, by $d$ new particles. In particular, ${\mathbb{T}}_{t}$ is a.s. finite for each $t>0$. On the other hand, when $\kappa$ is unbounded, we need to choose ${\mathbb{T}}:={\mathbb{T}}^{\infty}$, and as a consequence ${\mathbb{T}}_{t}$ is a.s. infinite for each $t>0$. Nevertheless, under the assumption (1.4), it turns out that only a finite subtree of ${\mathbb{T}}_{t}$ is relevant for the state at the root $X_{\varnothing}$, as we explain now.

Let ${\mathbb{S}}$ be the random subtree of ${\mathbb{T}}$ defined as

[TABLE]

and for each subtree ${\mathbb{U}}\subset{\mathbb{S}}$ , let $\nabla{\mathbb{U}}:=\{\mathbf{i}\in{\mathbb{S}}:{\accentset{\leftarrow}{\mathbf{i}}}\in{\mathbb{U}},\ \mathbf{i}\not\in{\mathbb{U}}\}$ denote the outer boundary of ${\mathbb{U}}$ relative to ${\mathbb{S}}$ , where again we use the convention that $\nabla{\mathbb{U}}:=\{\varnothing\}$ if ${\mathbb{U}}$ is the empty set. Then, under condition (1.4),

[TABLE]

are a.s. finite for all $t\geq 0$ . Indeed, $(\nabla{\mathbb{S}}_{t})_{t\geq 0}$ is a branching process where for each individual $\mathbf{i}$ , with Poisson rate ${\mathbf{r}}(\mathrm{d}\omega)$ , an element $\omega\in\Omega$ is selected and $\mathbf{i}$ is replaced by new individuals $\mathbf{i}1,\ldots,\mathbf{i}\kappa(\omega)$ . The condition on the rates (1.4) guarantees that this branching process has finite mean and in particular does not explode, so that ${\mathbb{S}}_{t}$ is a.s. a finite subtree of ${\mathbb{S}}$ .

Let $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ be i.i.d. with common law $|{\mathbf{r}}|^{-1}{\mathbf{r}}$ , independent of the lifetimes $(\sigma_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ . For any finite rooted subtree ${\mathbb{U}}\subset{\mathbb{S}}$ and for each $(x_{\mathbf{i}})_{\mathbf{i}\in\nabla{\mathbb{U}}}=x\in S^{\nabla{\mathbb{U}}}$ , we can inductively define $x_{\mathbf{i}}$ for $\mathbf{i}\in{\mathbb{U}}$ by

[TABLE]

Then the value $x_{\varnothing}$ we obtain at the root is a function of $(x_{\mathbf{i}})_{\mathbf{i}\in\nabla{\mathbb{U}}}$ . Let us denote this function by $G_{\mathbb{U}}:S^{\nabla{\mathbb{U}}}\to S$ , i.e.,

[TABLE]

We can think of $G_{\mathbb{U}}$ as the “concatenation” of the maps $(\gamma[\bm{\omega}_{\mathbf{i}}])_{\mathbf{i}\in{\mathbb{U}}}$ . We will in particular be interested in the random maps

[TABLE]

with ${\mathbb{S}}_{t}$ as in (1.44). For our running example of a system with cooperative branching and deaths, these definitions are illustrated in Figure 1.

Let $({\cal F}_{t})_{t\geq 0}$ , defined as

[TABLE]

be the natural filtration associated with our evolving marked tree, which contains information about which individuals are alive at time $t$, as well as the random elements $\bm{\omega}_{\mathbf{i}}$ and lifetimes $\sigma_{\mathbf{i}}$ associated with all individuals that have died by time $t$. In particular, $G_{t}$ is measurable w.r.t. ${\cal F}_{t}$. The following theorem is a precise formulation of the random mapping representation of solutions of the mean-field equation (1.2), anticipated in (1.7).

Theorem 6 (Recursive tree representation)

Let $S$ and $\Omega$ be Polish spaces, let ${\mathbf{r}}$ be a nonzero finite measure on $\Omega$ , and let $\gamma:\Omega\times S^{{\mathbb{N}}_{+}}\to S$ and $\kappa:\Omega\to{\mathbb{N}}$ be measurable functions satisfying (1.3) and (1.4). Let $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{S}}}$ be i.i.d. with common law $|{\mathbf{r}}|^{-1}{\mathbf{r}}$ and let $(\sigma_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{S}}}$ be an independent i.i.d. collection of exponentially distributed random variables with mean $|{\mathbf{r}}|^{-1}$ . Fix $t\geq 0$ and let $G_{t}$ and ${\cal F}_{t}$ be defined as in (1.47) and (1.48). Conditional on ${\cal F}_{t}$ , let $(X_{\mathbf{i}})_{\mathbf{i}\in\nabla{\mathbb{S}}_{t}}$ be i.i.d. $S$ -valued random variables with common law $\mu$ . Then

[TABLE]

where $T_{t}$ is defined in (1.6).

Recalling the definition of $G_{t}$ , we can also formulate Theorem 6 as follows. With $(\bm{\omega}_{\mathbf{i}},\sigma_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{S}}}$ as above, fix $t>0$ and let $(X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{S}}_{t}\cup\nabla{\mathbb{S}}_{t}}$ be random variables such that

[TABLE]

Then (1.49) says that the state at the root $X_{\varnothing}$ has law $T_{t}(\mu)$ . This is a continuous-time analogue of the FRTP (1.39).
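The continuous-time representation can be sampled in the same recursive fashion as an FRTP, now with exponential lifetimes. The sketch below treats the cooperative branching example and assumes that ${\mathbf{r}}$ gives mass $\alpha$ to cob and mass 1 to dth, so that lifetimes have rate $\alpha+1$, and that the mass $p_{t}$ at 1 of $T_{t}(\mu)$ solves $\frac{\partial}{\partial t}p_{t}=\alpha p_{t}^{2}(1-p_{t})-p_{t}$. An individual that is still alive at the time horizon belongs to $\nabla{\mathbb{S}}_{t}$ and receives an independent value with law $\mu$. Function names are illustrative.

```python
import random


def cob(x1, x2, x3):
    """Cooperative branching map on {0, 1}^3."""
    return x1 | (x2 & x3)


def sample_root_ct(t, p, alpha, rng):
    """Sample the root value in the continuous-time recursive tree representation.

    An individual with remaining time t lives for an Exp(alpha + 1) time sigma.
    If sigma > t it is alive at the time horizon and gets a Bernoulli(p) value.
    Otherwise it applies cob (probability alpha / (alpha + 1)) or dth to its
    children, which are born at its death and have remaining time t - sigma.
    """
    total_rate = alpha + 1.0
    sigma = rng.expovariate(total_rate)
    if sigma > t:
        return 1 if rng.random() < p else 0
    if rng.random() < alpha / total_rate:
        return cob(*(sample_root_ct(t - sigma, p, alpha, rng) for _ in range(3)))
    return 0  # dth has no children that matter


def p_at(t, p0, alpha, dt=1e-4):
    """Euler approximation of the assumed mean-field solution p_t."""
    p = p0
    for _ in range(int(round(t / dt))):
        p += dt * (alpha * p * p * (1.0 - p) - p)
    return p
```

The time horizon should be kept small in experiments, since the expected number of individuals in the tree grows exponentially in $t$.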

In our proofs, we will first prove Theorem 6 and then use this to prove Theorem 5 about the mean-field limit of interacting particle systems. Recall that these particle systems are constructed from a stochastic flow $({\mathbf{X}}_{s,t})_{s\leq t}$ as in (1.32). To find the empirical measure of $X(t)={\mathbf{X}}_{0,t}(X(0))$ , we pick a site $i\in[N]$ at random and ask for its type $X_{i}(t)$ which via ${\mathbf{X}}_{0,t}$ is a function of the initial state $X(0)$ . When $N$ is large, $X_{i}(t)$ does not depend on all coordinates $(X_{j}(0))_{j\in[N]}$ but only on a random subset of them, and indeed one can show that the map that gives $X_{i}(t)$ as a function of these coordinates approximates the map $G_{t}$ from Theorem 6, in an appropriate sense. The heuristics behind this are explained in some more detail in Subsection 4.1 below.

Remark Another way to write (1.49) is

[TABLE]

where $T_{G_{t}}$ is defined as in (1.12) for the random map $G_{t}$ and $(\mu_{t})_{t\geq 0}$ is a solution to (1.2). One can check that $(\nabla{\mathbb{S}}_{t},G_{t})_{t\geq 0}$ is a Markov process. Let us informally denote this process by $(G_{t})_{t\geq 0}$ and its state space by ${\cal G}$ . Then equation (1.49) can be understood as a (generalized) duality relationship between $(G_{t})_{t\geq 0}$ and $(\mu_{t})_{t\geq 0}$ with (generalized) duality function $H:{\cal G}\times{\cal P}(S)\to{\cal P}(S)$ given by

[TABLE]

With this definition, using the fact that $G_{0}$ is the identity map, (1.51) reads

[TABLE]

and we can obtain a family of usual (real-valued) dualities by integrating against a test function $\phi$ .

1.5 Recursive tree processes

Recall the definition of the operator $T$ in (1.1) and the semigroup $(T_{t})_{t\geq 0}$ in (1.6). It is clear from (1.2) that for a measure $\nu\in{\cal P}(S)$ , the following two conditions are equivalent:

[TABLE]

We call such a measure $\nu$ a fixed point of the mean-field equation (1.2). Condition (ii) is equivalent to saying that a random variable $X$ with law $\nu$ satisfies

$$X\stackrel{{\scriptstyle\scriptstyle\rm d}}{{=}}\gamma[\omega](X_{1},X_{2},\ldots),\qquad(1.54)$$

where $\stackrel{{\scriptstyle\scriptstyle\rm d}}{{=}}$ denotes equality in distribution, $X_{1},X_{2},\ldots$ are i.i.d. copies of $X$ , and $\omega$ is an independent $\Omega$ -valued random variable with law $|{\mathbf{r}}|^{-1}{\mathbf{r}}$ . Equations of this type are called Recursive Distributional Equations (RDEs).

FRTPs as in (1.39) are consistent in the sense that if $(X_{\mathbf{i}})_{\mathbf{i}\in{\partial{\mathbb{T}}_{(n)}}}$ are as in (1.39), then for any $1\leq m\leq n$ ,

[TABLE]

The following lemma states a similar consistency property in the continuous-time setting.

Lemma 7 (Consistency)

Fix $t>0$ and let $(X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{S}}_{t}\cup\nabla{\mathbb{S}}_{t}}$ be as in (1.50). Then, for each $s\in[0,t]$,

[TABLE]

where $(T_{t})_{t\geq 0}$ is defined in (1.6).

Using the consistency relation (1.56) and Kolmogorov’s extension theorem, it is not hard to see that if $\nu$ solves the RDE (1.54), then it is possible to define a stationary recursive process on an infinite tree such that each vertex has law $\nu$ . This was already observed in [AB05]. The following lemma is a slight reformulation of their observation.

Lemma 8 (Recursive Tree Process)

Let $\nu$ be a solution to the RDE (1.54). Then there exists a collection $(\bm{\omega}_{\mathbf{i}},X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ of random variables whose joint law is uniquely characterized by the following requirements.

[TABLE]

We call a collection of random variables $(\bm{\omega}_{\mathbf{i}},X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ as in Lemma 8 the Recursive Tree Process (RTP) corresponding to the map $\gamma$ and the solution $\nu$ of the RDE (1.54). We can view such an RTP as a generalization of a stationary backward Markov chain. For most purposes, we will only need the random variables $\bm{\omega}_{\mathbf{i}},X_{\mathbf{i}}$ with $\mathbf{i}\in{\mathbb{S}}$ , the random subtree defined in (1.43). The following proposition shows that by adding independent exponential lifetimes to an RTP, we obtain a stationary version of (1.57).

Proposition 9 (Continuous-time RTP)

Let $(\bm{\omega}_{\mathbf{i}},X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ be an RTP corresponding to a solution $\nu$ of the RDE (1.54), and let $(\sigma_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ be an independent i.i.d. collection of exponentially distributed random variables with mean $|{\mathbf{r}}|^{-1}$ . Then, for each $t\geq 0$ ,

[TABLE]

At the end of Subsection 1.3 we have seen that in our example of a system with cooperative branching, the RDE (1.54) has three solutions when the branching rate satisfies $\alpha>4$ , two solutions for $\alpha=4$ , and only one solution for $\alpha<4$ . For $\alpha>4$ , the solutions to the RDE are $\nu_{\rm low},\nu_{\rm mid}$ , and $\nu_{\rm upp}$ , where we let $\nu_{\rm\ldots}$ denote the probability measure on $\{0,1\}$ with mean $\nu_{\rm\ldots}(\{1\})=z_{\rm\ldots}$ $(\ldots={\rm low},{\rm mid},{\rm upp})$ as defined around (1.37). By Lemma 8, each of these solutions to the RDE defines an RTP.
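The count of solutions can be verified directly on the level of the map $T$. The sketch below assumes, as in the running example, that $T$ acts on the mass $p$ at 1 by $p\mapsto\frac{\alpha}{\alpha+1}\big(p+(1-p)p^{2}\big)$, i.e., cob is applied with probability $\alpha/(\alpha+1)$ and dth otherwise. Under this assumption $T(z)=z$ is equivalent to $z=0$ or $\alpha z(1-z)=1$. The name `rde_solutions` is illustrative.

```python
import math


def T(p, alpha):
    """Mass at 1 of T(mu) when mu has mass p at 1 (assumed form for this example)."""
    return alpha / (alpha + 1.0) * (p + (1.0 - p) * p * p)


def rde_solutions(alpha):
    """Masses at 1 of the solutions of the RDE, i.e. the solutions of T(z) = z."""
    if alpha < 4.0:
        return [0.0]
    root = math.sqrt(0.25 - 1.0 / alpha)
    return sorted({0.0, 0.5 - root, 0.5 + root})


# Number of solutions below, at, and above the critical branching rate.
counts = [len(rde_solutions(a)) for a in (3.0, 4.0, 5.0)]
```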

1.6 Endogeny and bivariate uniqueness

In [AB05, Def 7], an RTP $(\bm{\omega}_{\mathbf{i}},X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ corresponding to a solution $\nu$ of the RDE (1.54) is called endogenous if $X_{\varnothing}$ is a.s. measurable w.r.t. the $\sigma$ -field generated by the random variables $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ . In Lemma 46 below, we will show that this is equivalent to $X_{\varnothing}$ being a.s. measurable w.r.t. the $\sigma$ -field generated by the random variables ${\mathbb{S}}$ and $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{S}}}$ , where ${\mathbb{S}}$ is the random tree defined in (1.43). Aldous and Bandyopadhyay have shown that endogeny is equivalent to bivariate uniqueness, which we now explain.

Let ${\cal P}_{\rm sym}(S^{n})$ denote the space of probability measures on $S^{n}$ that are symmetric with respect to permutations of the coordinates. Let $\pi_{m}:S^{n}\to S$ denote the projection on the $m$-th coordinate, i.e., $\pi_{m}(x^{1},\ldots,x^{n}):=x^{m}$, and let $\mu^{(n)}\circ\pi_{m}^{-1}$ denote the $m$-th marginal of a measure $\mu^{(n)}\in{\cal P}(S^{n})$. For any $\mu\in{\cal P}(S)$, we define

[TABLE]

to be the set of probability measures on $S^{n}$ whose one-dimensional marginals are all equal to $\mu$ , and we denote ${\cal P}_{\rm{sym}}(S^{n})_{\mu}:={\cal P}_{\rm{sym}}(S^{n})\cap{\cal P}(S^{n})_{\mu}$ . Finally, we define a “diagonal” set

[TABLE]

and given a measure $\mu\in{\cal P}(S)$ , we let $\overline{\mu}^{(n)}$ denote the unique element of ${\cal P}(S^{n})_{\mu}\cap{\cal P}(S^{n}_{\rm diag})$ , i.e.,

[TABLE]

Recall the definition of the $n$ -variate map $T^{(n)}$ in (1.9). The following theorem has been proved in [MSS18, Thm 1], and in a slightly weaker form in [AB05, Thm 11]. Below, $\Rightarrow$ denotes weak convergence of probability measures.

Theorem 10 (Endogeny and $n$ -variate uniqueness)

Let $\nu$ be a solution of the RDE (1.54). Then the following statements are equivalent.

(i) The RTP corresponding to $\nu$ is endogenous.

(ii) $\displaystyle(T^{(n)})^{m}(\mu)\underset{{m}\to\infty}{\Longrightarrow}\overline{\nu}^{(n)}$ for all $\mu\in{\cal P}(S^{n})_{\nu}$ and $n\geq 1$.

(iii) $\overline{\nu}^{(2)}$ is the only fixed point of $T^{(2)}$ in the space ${\cal P}_{\rm{sym}}(S^{2})_{\nu}$.

We remark that bivariate uniqueness as introduced in [AB05] refers to $\overline{\nu}^{(2)}$ being the only fixed point of $T^{(2)}$ in the space ${\cal P}(S^{2})_{\nu}$. The equivalences in the above theorem tell us that bivariate uniqueness already follows from the weaker condition (iii), since (iii) implies (ii), which in turn implies $n$-variate uniqueness for any $n\geq 1$.

We will prove a continuous-time extension of Theorem 10, relating endogeny to solutions of the $n$ -variate mean-field equation

[TABLE]

where we have replaced $T$ in (1.2) by $T^{(n)}$ and we write $\mu^{(n)}_{t}$ to remind ourselves that this is a measure on $S^{n}$ , rather than on $S$ .

This equation has the following interpretation. As in Subsection 1.3, let $({\mathbf{X}}_{s,u})_{s\leq u}$ be a stochastic flow on $S^{N}$ constructed from a Poisson point set $\Pi$ . Let $(X^{1}(0),\ldots,X^{n}(0))$ be a random variable with values in $S^{n}$ , independent of $({\mathbf{X}}_{s,u})_{s\leq u}$ . Then setting

[TABLE]

defines a Markov process $(X^{1}(t),\ldots,X^{n}(t))_{t\geq 0}$ that consists of $n$ Markov processes with initial states $X^{1}(0),\ldots,X^{n}(0)$ that are coupled in such a way that they are constructed using the same stochastic flow. Applying Theorem 5 to this $n$ -variate Markov process, we see that the mean-field equation for the $n$ -variate process takes the form (1.63).

We note that if $\mu^{(n)}_{t}$ solves the $n$ -variate mean-field equation, then any $m$ -dimensional marginal of $\mu^{(n)}_{t}$ solves the $m$ -variate mean-field equation. Also, solutions to (1.63) started in an initial condition $\mu^{(n)}_{0}\in{\cal P}_{\rm sym}(S^{n})$ satisfy $\mu^{(n)}_{t}\in{\cal P}_{\rm sym}(S^{n})$ for all $t\geq 0$ . Finally, it is easy to see that $\mu^{(n)}_{0}\in{\cal P}(S^{n}_{\rm diag})$ implies $\mu^{(n)}_{t}\in{\cal P}(S^{n}_{\rm diag})$ for all $t\geq 0$ .

We now formulate a continuous-time extension of Theorem 10. Note that in view of (1.54), a measure $\nu^{(2)}$ is a fixed point of the bivariate mean-field equation (i.e., (1.63) with $n=2$) if and only if it is a fixed point of $T^{(2)}$. Therefore, the equivalence of points (i) and (iii) from Theorem 10 immediately implies an analogous statement in the continuous-time setting.

Theorem 11 (Endogeny and the $n$-variate mean-field equation)

Under the assumptions of Theorem 10, the following conditions are equivalent.

(i) The RTP corresponding to $\nu$ is endogenous.

(ii) For any $\mu^{(n)}_{0}\in{\cal P}(S^{n})_{\nu}$ and $n\geq 1$, the solution $(\mu^{(n)}_{t})_{t\geq 0}$ to the $n$-variate equation (1.63) started in $\mu^{(n)}_{0}$ satisfies $\displaystyle\mu^{(n)}_{t}\underset{{t}\to\infty}{\Longrightarrow}\overline{\nu}^{(n)}$.

Theorem 11 motivates us to study the bivariate mean-field equation in our example of a particle system with cooperative branching. Recall that in this example, ${\cal G}:=\{{\mbox{\tt cob}},{\mbox{\tt dth}}\}$ with cob and dth as in (1.13), and $\pi$ is defined in (1.14). In line with (1.15) we write the bivariate mean-field equation as

[TABLE]

For simplicity, we restrict ourselves to symmetric solutions, i.e., solutions that take values in ${\cal P}_{\rm sym}(\{0,1\}^{2})$ . For any probability measure $\mu^{(2)}\in{\cal P}_{\rm sym}(\{0,1\}^{2})$ , we let $\mu^{(1)}$ denote its one-dimensional marginals, which are equal by symmetry. We let $\nu_{\rm low},\nu_{\rm mid},\nu_{\rm upp}$ denote the probability measures on $\{0,1\}$ with mean $\nu_{\rm\ldots}(\{1\})=z_{\rm\ldots}$ $(\ldots={\rm low},{\rm mid},{\rm upp})$ as defined around (1.37).

Proposition 12 (Bivariate equation for cooperative branching)

For $\alpha>4$ , the bivariate mean-field equation (1.65) has precisely four fixed points in ${\cal P}_{\rm sym}(\{0,1\}^{2})$ , namely

[TABLE]

which are uniquely characterized by their respective marginals $\nu_{\rm low},\nu_{\rm mid},\nu_{\rm mid},\nu_{\rm upp}$ , as well as the fact that $\overline{\nu}^{(2)}_{\rm low},\overline{\nu}^{(2)}_{\rm mid}$ , and $\overline{\nu}^{(2)}_{\rm upp}$ are concentrated on $\{0,1\}^{2}_{\rm diag}=\{(0,0),(1,1)\}$ , but $\underline{\nu}^{(2)}_{\rm mid}$ is not.

For any $\mu^{(2)}_{0}\in{\cal P}_{\rm sym}(\{0,1\}^{2})$ , the solution to (1.65) started in $\mu^{(2)}_{0}$ converges as $t\to\infty$ to one of the fixed points in (1.66), the respective domains of attraction being

[TABLE]

For $\alpha=4$ , there are two fixed points $\overline{\nu}^{(2)}_{\rm low}$ and $\overline{\nu}^{(2)}_{\rm upp}$ with respective domains of attraction

[TABLE]

while for $\alpha<4$ all solutions converge to $\overline{\nu}^{(2)}_{\rm low}$ .

Combining Proposition 12 with Theorem 11, we see that the RTPs corresponding to $\nu_{\rm low}$ and $\nu_{\rm upp}$ are endogenous, but for $\alpha>4$ , the RTP corresponding to $\nu_{\rm mid}$ is not. As is clear from [AB05, Table 1], few examples of nonendogenous RTPs were known at the time. Contrary to what is stated in [AB05, Table 1], frozen percolation is now generally conjectured to be nonendogenous, but until recently few “natural” examples of nonendogenous RTPs have appeared in the literature. In fact, the RTP corresponding to $\nu_{\rm mid}$ seems to be one of the simplest nontrivial examples of a nonendogenous RTP discovered so far. Another nice class of nonendogenous RTPs has recently been described in [MS18].

1.7 The higher-level mean-field equation

Following [MSS18, formula (1.1)], if $S$ is a Polish space and $g:S^{k}\to S$ is a measurable map, then we define a measurable map $\check{g}:{\cal P}(S)^{k}\to{\cal P}(S)$ by

[TABLE]

Note that in this notation, the map $T_{g}:{\cal P}(S)\to{\cal P}(S)$ from (1.12) is given by $T_{g}(\mu)=\check{g}(\mu,\ldots,\mu)$ . As in [MSS18, formula (4.2)], we define a higher-level map $\check{T}:{\cal P}({\cal P}(S))\to{\cal P}({\cal P}(S))$ by

[TABLE]

where $\bm{\omega}$ is an $\Omega$ -valued random variable with law $|{\mathbf{r}}|^{-1}{\mathbf{r}}$ and $(\xi_{i})_{i\geq 1}$ are i.i.d. ${\cal P}(S)$ -valued random variables with law $\rho$ . Iterates of the map $\check{T}$ have been studied in [MSS18, Section 4]. We will be interested in the higher-level mean-field equation

[TABLE]
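For a finite state space, the map $\check{g}$ introduced at the start of this subsection can be computed by enumeration. The sketch below assumes that $\check{g}(\mu_{1},\ldots,\mu_{k})$ is the law of $g(X_{1},\ldots,X_{k})$ for independent $X_{i}$ with laws $\mu_{i}$, which is consistent with the identity $T_{g}(\mu)=\check{g}(\mu,\ldots,\mu)$ noted above. It is written for $S=\{0,1\}$, and the function name `check` is a hypothetical choice.

```python
from itertools import product


def cob(x1, x2, x3):
    """Cooperative branching map on {0, 1}^3."""
    return x1 | (x2 & x3)


def check(g, laws):
    """Law of g(X_1, ..., X_k) for independent X_i with the given laws on {0, 1}.

    Each law is a dict {0: mass, 1: mass}; the result has the same form.
    """
    out = {0: 0.0, 1: 0.0}
    for xs in product((0, 1), repeat=len(laws)):
        weight = 1.0
        for x, law in zip(xs, laws):
            weight *= law[x]
        out[g(*xs)] += weight
    return out


# Three Bernoulli laws with masses 0.2, 0.5, 0.7 at 1, pushed through cob.
laws = [{0: 1.0 - q, 1: q} for q in (0.2, 0.5, 0.7)]
res = check(cob, laws)
```

Under the identification ${\cal P}(\{0,1\})\cong[0,1]$, the mass `res[1]` is a function of the three input masses only.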

A measure $\rho\in{\cal P}({\cal P}(S))$ is the law of a random probability measure $\xi$ on $S$ . We denote the $n$ -th moment measure of such a random measure $\xi$ by

[TABLE]

(Here ${\mathbb{E}}[\,\cdot\,]$ denotes the expectation of a random measure; see the remark above Theorem 5.) Our notation for moment measures is on purpose similar to our earlier notation for solutions to the $n$ -variate equation, because of the following proposition.

Proposition 13 (Moment measures)

If $(\rho_{t})_{t\geq 0}$ solves the higher-level mean-field equation (1.71), then its $n$ -th moment measures $(\rho^{(n)}_{t})_{t\geq 0}$ solve the $n$ -variate equation (1.63).

Similarly to Proposition 13, it has been shown in [MSS18, Lemma 2] that $\check{T}(\rho)^{(n)}=T^{(n)}(\rho^{(n)})$, and this formula holds even for $n=\infty$. In view of this, as discussed in Subsection 1.1, the higher-level map $\check{T}$ is effectively equivalent to the $\infty$-variate map $T^{(\infty)}:{\cal P}_{\rm sym}(S^{\infty})\to{\cal P}_{\rm sym}(S^{\infty})$. It follows from Proposition 13 that if $\rho$ solves the higher-level RDE

[TABLE]

then its $n$ -th moment measures solve the $n$ -variate RDE $T^{(n)}(\rho^{(n)})=\rho^{(n)}$ , with $T^{(n)}$ as in (1.9).

If $X$ is an $S$-valued random variable defined on some probability space $(\Omega,{\cal F},{\mathbb{P}})$ and ${\cal H}\subset{\cal F}$ is a sub-$\sigma$-field, then ${\mathbb{P}}[X\in\,\cdot\,|{\cal H}]$ is a random probability measure on $S$. (Here we use that since $S$ is Polish, regular versions of conditional expectations exist.) As a consequence, the law of ${\mathbb{P}}[X\in\,\cdot\,|{\cal H}]$ is an element of ${\cal P}({\cal P}(S))$. In the following theorem, which is based on [Str65, Thm 2] and which in its present form we cite from [MSS18, Thm 13], we use the fact that each Polish space $S$ has a metrizable compactification $\overline{S}$ [Bou58, §6 No. 1, Theorem 1]. Moreover, we naturally identify ${\cal P}(S)$ with the space of all probability measures on $\overline{S}$ that are concentrated on $S$.

Theorem 14 (The convex order for laws of random probability measures)

Let $S$ be a Polish space, let $\overline{S}$ be a metrizable compactification of $S$ , and let ${\cal C}_{\rm cv}\big{(}{\cal P}(\overline{S})\big{)}$ denote the space of all convex continuous functions $\phi:{\cal P}(\overline{S})\to{\mathbb{R}}$ . Then, for $\rho_{1},\rho_{2}\in{\cal P}({\cal P}(S))$ , the following statements are equivalent.

(i) $\displaystyle\int\phi\,\mathrm{d}\rho_{1}\leq\int\phi\,\mathrm{d}\rho_{2}$ for all $\phi\in{\cal C}_{\rm cv}\big{(}{\cal P}(\overline{S})\big{)}$.

(ii) There exists an $S$-valued random variable $X$ defined on some probability space $(\Omega,{\cal F},{\mathbb{P}})$ and sub-$\sigma$-fields ${\cal H}_{1}\subset{\cal H}_{2}\subset{\cal F}$ such that $\displaystyle\rho_{i}={\mathbb{P}}\big{[}{\mathbb{P}}[X\in\,\cdot\,|{\cal H}_{i}]\in\,\cdot\,\big{]}$ $(i=1,2)$.

If $\rho_{1},\rho_{2}\in{\cal P}({\cal P}(S))$ satisfy the equivalent conditions of Theorem 14, then we say that they are ordered in the convex order and denote this as $\rho_{1}\leq_{\rm cv}\rho_{2}$ . It follows from [MSS18, Lemma 15] that $\leq_{\rm cv}$ is a partial order; in particular, $\rho_{1}\leq_{\rm cv}\rho_{2}$ and $\rho_{2}\leq_{\rm cv}\rho_{1}$ imply $\rho_{1}=\rho_{2}$ .

Recall that in Subsection 1.1, we defined ${\cal P}({\cal P}(S))_{\mu}$ , which is $\big{\{}\rho\in{\cal P}({\cal P}(S)):\rho^{(1)}=\mu\big{\}}$ . We define $\overline{\mu}\in{\cal P}({\cal P}(S))_{\mu}$ by $\overline{\mu}:={\mathbb{P}}[\delta_{X}\in\,\cdot\,]$ , where $X$ has law $\mu$ . It is easy to see that the $n$ -th moment measures of $\overline{\mu}$ are given by (1.62), so our present notation is consistent with earlier notation introduced there. By [MSS18, formula (4.7)], the measures $\delta_{\mu},\overline{\mu}$ are the extremal elements of ${\cal P}({\cal P}(S))_{\mu}$ w.r.t. the convex order, i.e.,

[TABLE]
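The two extremal elements can be told apart by their moment measures. The sketch below assumes the standard definition of the $n$-th moment measure as ${\mathbb{E}}[\xi^{\otimes n}]$ and works with $S=\{0,1\}$, where a law $\rho\in{\cal P}({\cal P}(S))$ with finite support is encoded as a list of pairs (weight, mass at 1). This encoding and the name `moment_measure` are choices made for this illustration only.

```python
from itertools import product


def moment_measure(rho, n):
    """n-th moment measure of a finitely supported law rho on P({0, 1}).

    rho is a list of (weight, q) pairs: with probability `weight` the random
    measure xi is the Bernoulli law with mass q at 1. The result is the
    expectation of the n-fold product of xi, as a dict on {0, 1}^n.
    """
    out = {}
    for xs in product((0, 1), repeat=n):
        ones = sum(xs)
        out[xs] = sum(w * q ** ones * (1.0 - q) ** (n - ones) for w, q in rho)
    return out


p = 0.3
delta_mu = [(1.0, p)]                  # xi = mu deterministically
mu_bar = [(1.0 - p, 0.0), (p, 1.0)]    # xi = delta_X with X distributed as mu

second_low = moment_measure(delta_mu, 2)
second_high = moment_measure(mu_bar, 2)
```

Here `second_low` is the product measure $\mu\otimes\mu$, while `second_high` puts no mass off the diagonal, in line with the remark that the moment measures of $\overline{\mu}$ are given by (1.62).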

The following proposition is a continuous-time version of [MSS18, Prop 3].

Proposition 15 (Extremal solutions in the convex order)

If $(\rho^{i}_{t})_{t\geq 0}$ $(i=1,2)$ are solutions to the higher-level mean-field equation (1.71) such that $\rho^{1}_{0}\leq_{\rm cv}\rho^{2}_{0}$ , then $\rho^{1}_{t}\leq_{\rm cv}\rho^{2}_{t}$ for all $t\geq 0$ . If $\nu$ solves the RDE (1.54), then $\overline{\nu}$ solves the higher-level RDE (1.73) and there exists a solution $\underline{\nu}$ of (1.73) such that

[TABLE]

Here $\Rightarrow$ denotes weak convergence of measures on ${\cal P}(S)$ , equipped with the topology of weak convergence. Any solution $\rho\in{\cal P}({\cal P}(S))_{\nu}$ to the higher-level RDE (1.73) satisfies

[TABLE]

The following result, which we cite from [MSS18, Prop. 4], describes the higher-level RTPs associated with the solutions $\underline{\nu}$ and $\overline{\nu}$ of the higher-level RDE.

Proposition 16 (Higher-level RTPs)

Let $\nu$ be a solution of the RDE (1.54) and let $\underline{\nu}$ and $\overline{\nu}$ as in (1.76) be the corresponding minimal and maximal solutions to the higher-level RDE, with respect to the convex order. Let $(\bm{\omega}_{\mathbf{i}},X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ be an RTP corresponding to $\gamma$ and $\nu$ and set

[TABLE]

Then $(\bm{\omega}_{\mathbf{i}},\xi_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ is an RTP corresponding to $\check{\gamma}$ and $\underline{\nu}$ . Also, $(\bm{\omega}_{\mathbf{i}},\delta_{X_{\mathbf{i}}})_{\mathbf{i}\in{\mathbb{T}}}$ is an RTP corresponding to $\check{\gamma}$ and $\overline{\nu}$ .

Proposition 16 gives a more concrete interpretation of the solutions $\underline{\nu}$ and $\overline{\nu}$ to the higher-level RDE from (1.76). Indeed, if $(\bm{\omega}_{\mathbf{i}},X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ is an RTP corresponding to $\nu$ , then

[TABLE]

which corresponds to “perfect knowledge” about the state $X_{\varnothing}$ of the root, while

[TABLE]

corresponds to the knowledge about $X_{\varnothing}$ that is contained in the random variables $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$. Since $X_{\varnothing}$ is a measurable function of $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ if and only if its conditional law given $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ equals $\delta_{X_{\varnothing}}$, it follows from (1.78) and (1.79) that the RTP corresponding to $\nu$ is endogenous if and only if $\underline{\nu}=\overline{\nu}$.

It is instructive to demonstrate the general theory on our concrete example of a system with cooperative branching and deaths. Recall that for $\alpha>4$ , the mean-field equation (1.15) has three fixed points $\nu_{\rm low},\nu_{\rm mid},\nu_{\rm upp}$ . We denote the corresponding minimal and maximal solutions to the higher-level RDE in the sense of (1.76) by $\underline{\nu}_{\rm\ldots}$ and $\overline{\nu}_{\rm\ldots}$ $(\ldots={\rm low},{\rm mid},{\rm upp})$ . The following theorem lifts the results from Proposition 12 about the bivariate equation to a higher level. Indeed, using the theorem below, it is easy to see that the measures $\overline{\nu}^{(2)}_{\rm low},\underline{\nu}^{(2)}_{\rm mid},\overline{\nu}^{(2)}_{\rm mid}$ and $\overline{\nu}^{(2)}_{\rm upp}$ from Proposition 12 are in fact the second moment measures of the measures $\overline{\nu}_{\rm low},\underline{\nu}_{\rm mid},\overline{\nu}_{\rm mid}$ and $\overline{\nu}_{\rm upp}$ .
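As a quick numerical illustration of the three regimes $\alpha>4$, $\alpha=4$, and $\alpha<4$, the following sketch lists the fixed points of the one-dimensional mean-field ODE. It assumes that $p_{t}=\mu_{t}(\{1\})$ solves $\frac{\partial}{\partial t}p_{t}=\alpha p_{t}^{2}(1-p_{t})-p_{t}$, which is what one obtains from the rates $\alpha$ for cob and $1$ for dth; the function name `fixed_points` is ours and purely illustrative.

```python
import math

def fixed_points(alpha):
    """Fixed points in [0, 1] of dp/dt = alpha * p**2 * (1 - p) - p.

    Assumption: this is the ODE for p_t = mu_t({1}) in the cooperative
    branching example (cob at rate alpha, dth at rate 1).
    """
    points = [0.0]                  # p = 0 is always a fixed point
    disc = 0.25 - 1.0 / alpha       # discriminant of alpha * p * (1 - p) = 1
    if disc > 0:
        root = math.sqrt(disc)
        points += [0.5 - root, 0.5 + root]   # middle and upper fixed points
    elif disc == 0:
        points.append(0.5)          # the two nonzero fixed points merge
    return points
```

Under this assumption, the nonzero fixed points solve $\alpha p(1-p)=1$, so they exist precisely when $\alpha\geq 4$ and coincide at $p=\frac{1}{2}$ when $\alpha=4$.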

Theorem 17 (Higher-level equation for cooperative branching)

Let $\nu_{\rm low},\nu_{\rm mid}$ , and $\nu_{\rm upp}$ denote the fixed points of the mean-field equation (1.15) defined above Proposition 12. Then we have for the corresponding minimal and maximal solutions to the higher-level RDE that

[TABLE]

For $\alpha>4$ , the higher-level RDE (1.73) has four solutions, namely

[TABLE]

Any solution $(\rho_{t})_{t\geq 0}$ to the higher-level mean-field equation (1.71) converges as $t\to\infty$ to one of the fixed points in (1.81), the respective domains of attraction being

[TABLE]

For $\alpha=4$ , there are two fixed points $\overline{\nu}_{\rm low}$ and $\overline{\nu}_{\rm upp}$ with respective domains of attraction

[TABLE]

while for $\alpha<4$ all solutions converge to $\overline{\nu}_{\rm low}$ .

Since a probability measure $\mu\in{\cal P}(\{0,1\})$ is uniquely characterized by $\mu(\{1\})\in[0,1]$ , there is a natural identification ${\cal P}(\{0,1\})\cong[0,1]$ . Let $\widehat{\mbox{\tt cob}}$ and $\widehat{\mbox{\tt dth}}$ denote the higher-level maps $\check{g}$ corresponding to $g={\mbox{\tt cob}},{\mbox{\tt dth}}$ , which using the identification ${\cal P}(\{0,1\})\cong[0,1]$ we view as maps $\widehat{\mbox{\tt cob}}:[0,1]^{3}\to[0,1]$ and $\widehat{\mbox{\tt dth}}:[0,1]^{0}\to[0,1]$ . One can check that

[TABLE]
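Such identities can be checked mechanically. The sketch below computes the higher-level map of an arbitrary $g:\{0,1\}^{k}\to\{0,1\}$ by brute force, under the assumption that $\check{g}(\mu_{1},\ldots,\mu_{k})$ is the law of $g(X_{1},\ldots,X_{k})$ for independent $X_{i}$ with laws $\mu_{i}$, and that ${\mbox{\tt cob}}(x_{1},x_{2},x_{3})=x_{1}\vee(x_{2}\wedge x_{3})$ and ${\mbox{\tt dth}}=0$. The helper name `higher_level` is hypothetical.

```python
from itertools import product

def cob(x1, x2, x3):
    # Assumed form of the cooperative branching map on {0, 1}.
    return x1 | (x2 & x3)

def dth():
    # Death map: takes no arguments and always returns 0.
    return 0

def higher_level(g, ps):
    """P[g(X_1, ..., X_k) = 1] for independent X_i ~ Bernoulli(ps[i])."""
    total = 0.0
    for xs in product((0, 1), repeat=len(ps)):
        weight = 1.0
        for x, p in zip(xs, ps):
            weight *= p if x == 1 else 1.0 - p
        total += weight * g(*xs)
    return total
```

With these assumptions, `higher_level(cob, (p1, p2, p3))` agrees with $p_{1}+(1-p_{1})p_{2}p_{3}$, and `higher_level(dth, ())` is $0$.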

Identifying ${\cal P}({\cal P}(\{0,1\}))\cong{\cal P}[0,1]$ , we can identify the measures $\overline{\nu}_{\rm low},\underline{\nu}_{\rm mid},\overline{\nu}_{\rm mid}$ , and $\overline{\nu}_{\rm upp}$ with probability laws on $[0,1]$ . Letting $\eta$ denote a random variable with law $\nu\in{\cal P}[0,1]$ , the higher-level RDE, written in the form (1.55), then reads

[TABLE]

where $\eta_{1},\eta_{2},\eta_{3}$ are independent copies of $\eta$ and $\chi$ is an independent Bernoulli random variable with ${\mathbb{P}}[\chi=1]=\alpha/(\alpha+1)$ . Theorem 17 says that for $\alpha>4$ , this equation has four solutions. Three “trivial” solutions $\overline{\nu}_{\rm low},\overline{\nu}_{\rm mid},\overline{\nu}_{\rm upp}$ that correspond to Bernoulli $\eta$ with parameters

[TABLE]

and a “nontrivial” solution $\underline{\nu}_{\rm mid}$ for which ${\mathbb{P}}[0<\eta<1]>0$ . In view of Proposition 16, we can interpret this nontrivial solution (viewed as a probability law on $[0,1]$ ) as

[TABLE]

where $(\bm{\omega}_{\mathbf{i}},X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ is the RTP corresponding to $\nu_{\rm mid}$ . The following lemma summarizes some elementary facts about the law $\underline{\nu}_{\rm mid}$ . We note that by solving the $n$ -variate RDE for $n\geq 3$ , one should in principle be able to calculate higher moments of $\underline{\nu}_{\rm mid}$ , although the formulas quickly become unwieldy.

Lemma 18 (Nontrivial solution of the higher-level RDE)

Let $\alpha>4$ and let $\eta$ be a random variable with law $\underline{\nu}_{\rm mid}$ . Then

[TABLE]

Moreover,

[TABLE]

It is not too hard to obtain numerical data for $\underline{\nu}_{\rm mid}$ , see Figure 2. These data suggest that apart from the atom at $0$ , the measure $\underline{\nu}_{\rm mid}$ has a smooth density with respect to the Lebesgue measure, but we have no proof for this. We have tried to find an explicit formula for the density but have not been successful.
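One possible way to generate such data, which is not necessarily the method used for Figure 2, is to sample the tree down to a fixed depth, put the value $\nu_{\rm mid}(\{1\})$ at the leaves, and evaluate the concatenated higher-level maps. This yields samples of the conditional probability that $X_{\varnothing}=1$ given the maps up to that depth, whose law should approach $\underline{\nu}_{\rm mid}$ as the depth grows. The sketch assumes $\widehat{\mbox{\tt cob}}(p_{1},p_{2},p_{3})=p_{1}+(1-p_{1})p_{2}p_{3}$ and $\nu_{\rm mid}(\{1\})=\frac{1}{2}-\sqrt{\frac{1}{4}-\frac{1}{\alpha}}$; all function names are illustrative.

```python
import math
import random

def sample_eta(alpha, depth, leaf_value, rng):
    """One sample of the depth-`depth` approximation of eta."""
    if depth == 0:
        return leaf_value
    if rng.random() >= alpha / (alpha + 1.0):
        return 0.0                     # chi = 0: the map at this vertex is dth
    # chi = 1: the map is cob, applied to three independent subtrees
    e1, e2, e3 = (sample_eta(alpha, depth - 1, leaf_value, rng) for _ in range(3))
    return e1 + (1.0 - e1) * e2 * e3   # assumed form of the higher-level cob map

def sample_many(alpha, depth, n_samples, seed=0):
    """Independent samples; requires alpha > 4 so that the middle fixed point exists."""
    rng = random.Random(seed)
    z_mid = 0.5 - math.sqrt(0.25 - 1.0 / alpha)
    return [sample_eta(alpha, depth, z_mid, rng) for _ in range(n_samples)]
```

Since the leaf value is a fixed point of the mean-field equation, the sample mean should stay close to $\nu_{\rm mid}(\{1\})$ at every depth, which gives a simple sanity check. The cost per sample grows geometrically in the depth, so only moderate depths are practical with this naive approach.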

1.8 Lower and upper solutions

In this and the next subsection we collect a few further results on endogeny and the uniqueness of solutions to RDEs. In the present subsection, we show that the endogeny of the RTPs corresponding to $\nu_{\rm low}$ and $\nu_{\rm upp}$ follows from a general principle, discovered in [AB05], that says that RDEs that are defined by monotone maps always have a minimal and maximal solution with respect to the stochastic order, and that the RTPs corresponding to these solutions are always endogenous.

Let $S$ be a compact metrizable space that is equipped with a partial order $\leq$ that is closed in the sense that

[TABLE]

is a closed subset of $S^{2}$ , equipped with the product topology. Recall that a function $f$ from one partially ordered space into another is monotone if $x\leq y$ implies $f(x)\leq f(y)$ , and a subset $A$ of a partially ordered space is increasing if $A\ni x\leq y$ implies $y\in A$ . It is known that for two probability measures $\mu_{1},\mu_{2}\in{\cal P}(S)$ , the following statements are equivalent:

(i) $\mu_{1}(A)\leq\mu_{2}(A)$ for all closed increasing $A\subset S$ .

(ii) $\displaystyle\int f\,\mathrm{d}\mu_{1}\leq\int f\,\mathrm{d}\mu_{2}$ for all bounded continuous monotone $f:S\to{\mathbb{R}}$ .

(iii) Two random variables $X_{1},X_{2}$ with laws $\mu_{1},\mu_{2}$ can be coupled such that $X_{1}\leq X_{2}$ a.s.

The equivalence of (ii) and (iii) is proved in [Lig85, Thm II.2.4]. The equivalence of (i) and (iii) holds more generally for Polish spaces, see [KKO77, Thm 1 (ii) and (vi)]. In the general setting of Polish spaces, the implications (iii) $\Rightarrow$ (i) and (iii) $\Rightarrow$ (ii) are trivial, but the implication (ii) $\Rightarrow$ (i) needs the additional assumption of monotone normality, see [HLL18, Prop. 3.6 and 3.11].

If $\mu_{1},\mu_{2}$ satisfy the above conditions, then one says that they are stochastically ordered, denoted as $\mu_{1}\leq\mu_{2}$ . This defines a partial order on ${\cal P}(S)$ ; in particular, by Lemma 50 below, $\mu_{1}\leq\mu_{2}\leq\mu_{1}$ implies $\mu_{1}=\mu_{2}$ .
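For a totally ordered finite set, the equivalence of (i) and (iii) can be made fully explicit. The following sketch treats only the special case $S=\{0,1,\ldots,m\}$ with its usual order, where the increasing sets are the upper tails $\{a,\ldots,m\}$ and a monotone coupling is given by the quantile transform; for a general partial order one would have to check all increasing sets. Measures are represented as lists of point masses, and the function names are ours.

```python
from itertools import accumulate

def stochastically_below(mu1, mu2, tol=1e-12):
    """Check mu1 <= mu2 on the chain {0, ..., m} by comparing all upper tails."""
    tails1 = list(accumulate(reversed(mu1)))   # masses of {m}, {m-1, m}, ...
    tails2 = list(accumulate(reversed(mu2)))
    return all(t1 <= t2 + tol for t1, t2 in zip(tails1, tails2))

def quantile(mu, u):
    """Smallest x with mu({0, ..., x}) > u (generalized inverse of the CDF)."""
    cdf = 0.0
    for x, mass in enumerate(mu):
        cdf += mass
        if u < cdf:
            return x
    return len(mu) - 1   # guard against rounding when u is close to 1

def monotone_coupling(mu1, mu2, u):
    """Couple X1 ~ mu1 and X2 ~ mu2 through a single uniform u in [0, 1)."""
    return quantile(mu1, u), quantile(mu2, u)
```

If `stochastically_below(mu1, mu2)` holds, then feeding the same uniform variable into both quantile functions produces a pair with $X_{1}\leq X_{2}$, which is condition (iii) in this special case.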

The proposition below is a variant of [AB05, Lemma 15]. As in our usual setting, we assume that $S$ and $\Omega$ are Polish spaces, ${\mathbf{r}}$ is a nonzero finite measure on $\Omega$ , and $\gamma:\Omega\times S^{{\mathbb{N}}_{+}}\to S$ and $\kappa:\Omega\to{\mathbb{N}}$ are measurable functions such that (1.3) and (1.4) hold. If $S$ is equipped with a partial order, then we equip $S^{k}$ with the product partial order. Recall that Proposition 3 gives sufficient conditions for $T$ to be continuous w.r.t. the topology of weak convergence.

Proposition 19 (Lower and upper solutions to RDE)

Assume that $S$ is compact and equipped with a closed partial order. Assume that $S$ has minimal and maximal elements, denoted by $0$ and $1$ . Assume $\gamma[\omega]$ is monotone for each $\omega\in\Omega$ and that the operator $T$ in (1.1) is continuous w.r.t. the topology of weak convergence. Then there exist solutions $\nu_{\rm low},\nu_{\rm upp}$ to the RDE (1.54) that are minimal and maximal with respect to the stochastic order, in the sense that any solution $\nu$ to the RDE (1.54) must satisfy

[TABLE]

where $\Rightarrow$ denotes weak convergence. Moreover, if $(\mu^{\rm low}_{t})_{t\geq 0}$ and $(\mu^{\rm upp}_{t})_{t\geq 0}$ denote the solutions to the mean-field equation (1.2) with initial states $\mu^{\rm low}_{0}=\delta_{0}$ and $\mu^{\rm upp}_{0}=\delta_{1}$ , then

[TABLE]

Finally, the RTPs corresponding to $\nu_{\rm low}$ and $\nu_{\rm upp}$ are endogenous.

We can view the solutions $\nu_{\rm low}$ and $\nu_{\rm upp}$ to the RDE (1.54) as mean-field versions of the lower and upper invariant laws of monotone particle systems; compare [Lig85, Thm III.2.3].

In our example of a system with cooperative branching, the maps cob and dth are monotone, so Proposition 19 is applicable. Since the measures we called $\nu_{\rm low}$ and $\nu_{\rm upp}$ before are the $t\to\infty$ limits of the solutions of the mean-field equation started in $\delta_{0}$ and $\delta_{1}$ , our earlier notation agrees with the more general notation of Proposition 19. The endogeny of the RTPs corresponding to $\nu_{\rm low}$ and $\nu_{\rm upp}$ , which before we proved based on an analysis of the bivariate equation, using Proposition 12 and Theorem 10, alternatively follows from Proposition 19.
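The convergence of the solutions started in $\delta_{0}$ and $\delta_{1}$ is easy to observe numerically in this example. The sketch below uses an explicit Euler scheme for the ODE satisfied by $p_{t}=\mu_{t}(\{1\})$, which we assume to be $\frac{\partial}{\partial t}p_{t}=\alpha p_{t}^{2}(1-p_{t})-p_{t}$; the step size and time horizon are arbitrary illustrative choices.

```python
def drift(p, alpha):
    # Assumed right-hand side of the ODE for p_t = mu_t({1}).
    return alpha * p * p * (1.0 - p) - p

def integrate(p0, alpha, t_end=50.0, dt=0.01):
    """Explicit Euler approximation of p at time t_end, started from p0."""
    p = p0
    for _ in range(int(round(t_end / dt))):
        p += dt * drift(p, alpha)
    return p
```

Under these assumptions, `integrate(0.0, alpha)` stays at $0$ for every $\alpha$, while `integrate(1.0, alpha)` approaches the largest fixed point, which is $\frac{1}{2}+\sqrt{\frac{1}{4}-\frac{1}{\alpha}}$ for $\alpha>4$ and $0$ for $\alpha<4$.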

1.9 Conditions for uniqueness

In the present subsection, we prove some results of varying generality that allow one to conclude that a given RDE has a unique solution. In our example of a system with cooperative branching and deaths, this happens if and only if $\alpha<4$ . We will see that there are some general results that can be applied to prove uniqueness in the whole regime $\alpha<4$ . We also make a connection with a general duality for monotone particle systems described in [SS18]. Although duality plays only a minor role in our paper, the original motivation for the work that led to it was to understand this duality in the mean-field limit.

We return to our usual set-up from Subsection 1.1 with $S$ and $\Omega$ Polish spaces and $\gamma,\kappa$ and ${\mathbf{r}}$ satisfying (1.3) and (1.4). We also recall the random subtrees ${\mathbb{S}}_{t}\subset{\mathbb{S}}\subset{\mathbb{T}}$ defined in (1.43) as well as the fact that ${\mathbb{S}}_{t}$ is a.s. finite for any $t\geq 0$ by (1.4). The tree ${\mathbb{S}}$ is the family tree of the branching process $(\nabla{\mathbb{S}}_{t})_{t\geq 0}$ . In view of this, by well-known facts about branching processes, ${\mathbb{S}}$ is a.s. finite if and only if

[TABLE]

Recall that $G_{t}=G_{{\mathbb{S}}_{t}}$ , where for any finite subtree ${\mathbb{U}}\subset{\mathbb{S}}$ that contains the root, $G_{\mathbb{U}}:S^{\nabla{\mathbb{U}}}\to S$ is the map defined in (1.46). If ${\mathbb{S}}$ is a.s. finite, then $\nabla{\mathbb{S}}_{t}=\emptyset$ for $t$ sufficiently large and hence $G_{t}:S^{\nabla{\mathbb{S}}_{t}}\to S$ is eventually constant.
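For a system with finitely many maps, the finiteness criterion reduces to a one-line computation. The sketch below assumes that the criterion says that the mean drift $\sum_{g}r_{g}\,(\kappa_{g}-1)$ of the number of particles is nonpositive, where $r_{g}$ is the rate and $\kappa_{g}$ the number of arguments of the map $g$, and that at least one map with $\kappa_{g}=0$ has positive rate (otherwise the branching process cannot die out). The representation of a system as a list of (rate, arity) pairs is a hypothetical choice of ours.

```python
def mean_drift(rates_and_arities):
    """Sum over maps of rate * (kappa - 1): mean rate of change of |boundary|."""
    return sum(rate * (kappa - 1) for rate, kappa in rates_and_arities)

def family_tree_is_finite(rates_and_arities):
    """(Sub)criticality check; assumes some map with kappa = 0 has positive rate."""
    assert any(kappa == 0 and rate > 0 for rate, kappa in rates_and_arities)
    return mean_drift(rates_and_arities) <= 0
```

For cooperative branching with rate $\alpha$ (three arguments) and deaths with rate $1$ (no arguments), the drift is $2\alpha-1$, so under this assumption the family tree is a.s. finite precisely when $\alpha\leq\frac{1}{2}$.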

More generally, if ${\mathbb{U}}$ is a finite subtree of ${\mathbb{S}}$ that contains the root $\varnothing$ , then we say that ${\mathbb{U}}$ is a root determining subtree if the map $G_{\mathbb{U}}:S^{\nabla{\mathbb{U}}}\to S$ is constant. Note that this can happen even if $\nabla{\mathbb{U}}\neq\emptyset$ . It is easy to see that if ${\mathbb{V}}\subset{\mathbb{U}}$ and ${\mathbb{V}}$ is root determining, then the same is true for ${\mathbb{U}}$ . We say that ${\mathbb{U}}$ is a minimal root determining subtree if ${\mathbb{U}}$ is root determining but there exists no ${\mathbb{V}}\subset{\mathbb{U}}$ with ${\mathbb{V}}\neq{\mathbb{U}}$ that is root determining. By our previous remark, it suffices to check this for such ${\mathbb{V}}$ that differ from ${\mathbb{U}}$ by a single element.
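For $S=\{0,1\}$ , whether a given finite subtree is root determining can be checked by brute force over all boundary configurations. In the sketch below, vertices are tuples of integers with the root equal to the empty tuple, and a realisation of the maps is stored as a dictionary from vertices to pairs (map, number of arguments). This encoding, the helper names, and the assumed form ${\mbox{\tt cob}}(x_{1},x_{2},x_{3})=x_{1}\vee(x_{2}\wedge x_{3})$ are ours.

```python
from itertools import product

def cob(x1, x2, x3):
    # Assumed form of the cooperative branching map.
    return x1 | (x2 & x3)

def dth():
    return 0

def boundary(tree, U):
    """Children of vertices in U that are not themselves in U."""
    return sorted(i + (j,) for i in U
                  for j in range(1, tree[i][1] + 1) if i + (j,) not in U)

def G(tree, U, boundary_values):
    """Value at the root, given the values on the boundary of U."""
    def value(i):
        if i not in U:
            return boundary_values[i]
        g, k = tree[i]
        return g(*(value(i + (j,)) for j in range(1, k + 1)))
    return value(())

def is_root_determining(tree, U):
    b = boundary(tree, U)
    outputs = {G(tree, U, dict(zip(b, xs)))
               for xs in product((0, 1), repeat=len(b))}
    return len(outputs) == 1

# Hypothetical realisation: cob at the root, dth at children 1 and 2, cob at child 3.
tree = {(): (cob, 3), (1,): (dth, 0), (2,): (dth, 0), (3,): (cob, 3)}
U = {(), (1,), (2,)}
```

In this hypothetical realisation, `U` is root determining although its boundary consists of the vertex `(3,)`, because the root value $0\vee(0\wedge x_{3})$ does not depend on $x_{3}$. Removing either of the vertices `(1,)` or `(2,)` from `U` destroys this property, so `U` is also minimal.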

Lemma 20 (Root determining subtrees)

The following conditions are equivalent:

(i) There a.s. exists a $t<\infty$ such that $G_{s}$ is constant for all $s\geq t$ .

(ii) ${\mathbb{S}}$ a.s. contains a root determining subtree.

(iii) ${\mathbb{S}}$ a.s. contains a minimal root determining subtree.

If ${\mathbb{U}}$ is a subtree of ${\mathbb{S}}$ , then we denote by $\Xi_{\mathbb{U}}$ the set of all $x=(x_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{U}}\cup\nabla{\mathbb{U}}}$ that satisfy (1.45). We say that ${\mathbb{U}}$ is uniquely determined if $x,y\in\Xi_{\mathbb{U}}$ imply $x_{\mathbf{i}}=y_{\mathbf{i}}$ $(\mathbf{i}\in{\mathbb{U}})$ . The following lemma is inspired by [AB05, Lemma 14], where it is shown that (i) implies that the RDE (1.54) has a unique solution and the corresponding RTP is endogenous.

Lemma 21 (Uniquely determined subtrees)

Between the following five conditions, one has the implications (i) $\Rightarrow$ (ii) $\Rightarrow$ (iii) $\Rightarrow$ (iv) and (ii) $\Rightarrow$ (v). If $S$ is finite, then moreover (iii) $\Rightarrow$ (ii), and if $S=\{0,1\}$ , then (ii) $\Rightarrow$ (i).

(i) ${\mathbb{S}}$ a.s. contains a finite, uniquely determined subtree that contains the root $\varnothing$ .

(ii) The equivalent conditions of Lemma 20 are satisfied.

(iii) ${\mathbb{S}}$ is a.s. uniquely determined.

(iv) The RDE (1.54) has at most one solution and any corresponding RTP is endogenous.

(v) The RDE (1.54) has a solution $\nu$ that is globally attractive in the sense that any solution $(\mu_{t})_{t\geq 0}$ to (1.2) satisfies $\|\mu_{t}-\nu\|\underset{{t}\to\infty}{\longrightarrow}0$ , where $\|\,\cdot\,\|$ denotes the total variation norm.

The following lemma illustrates these ideas on our example of a system with cooperative branching and deaths. Below, $|{\mathbb{U}}\cap\{\mathbf{i}2,\mathbf{i}3\}|$ denotes the cardinality of ${\mathbb{U}}\cap\{\mathbf{i}2,\mathbf{i}3\}$ . See Figure 3 for an example.

Lemma 22 (The uniqueness regime)

Let $S=\{0,1\}$ and ${\cal G}:=\{{\mbox{\tt cob}},{\mbox{\tt dth}}\}$ , and let $\pi$ be as in (1.14). Then (1.93) is satisfied if and only if $\alpha\leq{\textstyle\frac{{1}}{{2}}}$ , while conditions (i)–(iii) of Lemma 21 are satisfied if and only if $\alpha<4$ . Moreover, a finite subtree ${\mathbb{U}}\subset{\mathbb{S}}$ is a minimal root determining subtree if and only if

[TABLE]

Lemma 22 shows that in our example of a system with cooperative branching and deaths, the conditions of Lemma 20 are in fact equivalent to uniqueness of solutions to the RDE. As the next lemma shows, this is a consequence of monotonicity.

Lemma 23 (Uniqueness for monotone systems)

Assume that $S$ is a finite partially ordered set that contains a minimal and maximal element, and assume that $\gamma[\omega]$ is monotone for each $\omega\in\Omega$ . Then the RDE (1.54) has a unique solution if and only if the equivalent conditions of Lemma 20 are satisfied.

In the remainder of this subsection, we focus on the case that $S=\{0,1\}$ and $\gamma[\omega]$ is monotone for all $\omega\in\Omega$ , which allows us to make a connection to a general duality for monotone particle systems described in [SS18]. Recall that a set $A\subset\{0,1\}^{k}$ is increasing if $A\ni x\leq y$ implies $y\in A$ . A minimal element of $A$ is an $y\in A$ such that $A\ni x\leq y$ implies $x=y$ . If $K$ is a nonempty finite set and $G:\{0,1\}^{K}\to\{0,1\}$ is a monotone map, then the inverse image $G^{-1}(\{1\})$ is an increasing set. We set

[TABLE]

Then

[TABLE]

These formulas remain true when $K=\emptyset$ , provided we define $\{0,1\}^{\emptyset}:=\{\varnothing\}$ and we let $Y_{G}:=\{\varnothing\}$ if $G(\varnothing)=1$ and $Y_{G}:=\emptyset$ if $G(\varnothing)=0$ .
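The correspondence between a monotone map and the minimal elements of the inverse image of $1$ can be sketched in a few lines. We assume here that $Y_{G}$ denotes the set of minimal elements of $G^{-1}(\{1\})$ and that $G(x)=1$ if and only if $x\geq y$ for some $y\in Y_{G}$; the function names and the assumed form of cob are ours.

```python
from itertools import product

def minimal_ones(G, k):
    """Minimal elements of G^{-1}({1}) for a monotone G: {0,1}^k -> {0,1}."""
    ones = [x for x in product((0, 1), repeat=k) if G(*x) == 1]

    def below(x, y):
        return all(a <= b for a, b in zip(x, y))

    return {y for y in ones if not any(below(x, y) and x != y for x in ones)}

def from_minimal_ones(Y, x):
    """Recover G(x) from Y_G: the value is 1 iff x dominates some y in Y_G."""
    return int(any(all(a <= b for a, b in zip(y, x)) for y in Y))

def cob(x1, x2, x3):
    # Assumed form of the cooperative branching map.
    return x1 | (x2 & x3)
```

For cob this yields the two minimal elements $(1,0,0)$ and $(0,1,1)$ . For $k=0$ the only configuration is the empty tuple, so the result is either the set containing the empty tuple or the empty set, in line with the convention for $K=\emptyset$ .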

Recall from Section 1.4 that $(\nabla{\mathbb{S}}_{t},G_{t})_{t\geq 0}$ is a Markov process. If $S=\{0,1\}$ and $\gamma[\omega]$ is monotone for all $\omega\in\Omega$ , then the random map $G_{t}:\{0,1\}^{\nabla{\mathbb{S}}_{t}}\to\{0,1\}$ is monotone for each $t\geq 0$ . In view of this, by (1.96), $G_{t}$ is uniquely characterized by $Y_{G_{t}}$ and hence $(\nabla{\mathbb{S}}_{t},Y_{G_{t}})_{t\geq 0}$ is a Markov process too. For a system with cooperative branching and deaths, this process has been defined before in [Mac17, Section I.2.1.2]. As explained in more detail there, it can be seen as the mean-field limit of a general dual for monotone particle systems described in [SS18, Section 5.2].

Let $S=\{0,1\}$ , let $\gamma[\omega]$ be monotone for all $\omega\in\Omega$ , and let ${\mathbb{U}}$ be a subtree of ${\mathbb{S}}$ that contains the root $\varnothing$ . Borrowing terminology from percolation theory, we say that ${\mathbb{O}}$ is an open subtree of ${\mathbb{U}}$ if $\varnothing\in{\mathbb{O}}\subset{\mathbb{U}}\cup\nabla{\mathbb{U}}$ and

[TABLE]

where we use the convention that $1_{A_{\mathbf{i}}}:=\varnothing$ if $\kappa(\bm{\omega}_{\mathbf{i}})=0$ .

Lemma 24 (Open subtrees)

Assume that $S=\{0,1\}$ and $\gamma[\omega]$ is monotone for all $\omega\in\Omega$ . Then

[TABLE]

If moreover $\gamma[\omega](0,\ldots,0)=0$ for each $\omega\in\Omega$ , then

[TABLE]

We note that formula (1.98) can be generalized to more general finite partially ordered sets $S$ , see Lemma 64 below. Again, it will be useful to illustrate our definitions on the concrete example of a system with cooperative branching and deaths. To make the example more interesting, we add a birth map ${\mbox{\tt bth}}:S^{0}\to S$ , which is defined similarly to the death map as

[TABLE]

The following lemma describes open subtrees for a system described by the maps ${\mbox{\tt cob}},{\mbox{\tt dth}},{\mbox{\tt bth}}$ ; see Figure 4 for an illustration.

Lemma 25 (Systems with cooperative branching, deaths, and births)

Let $S=\{0,1\}$ , ${\cal G}:=\{{\mbox{\tt cob}},{\mbox{\tt dth}},{\mbox{\tt bth}}\}$ , with

[TABLE]

Let ${\mathbb{U}}$ be a subtree of ${\mathbb{S}}$ that contains the root and let ${\mathbb{O}}\subset{\mathbb{U}}\cup\nabla{\mathbb{U}}$ satisfy $\varnothing\in{\mathbb{O}}$ . Then ${\mathbb{O}}$ is an open subtree of ${\mathbb{U}}$ if and only if for all $\mathbf{i}\in{\mathbb{O}}\cap{\mathbb{U}}$ ,

[TABLE]

We can think of open subtrees as a generalization of the open paths from oriented percolation. Outside of a mean-field setting, using ideas from [SS18, Section 5.2], one can characterize the upper invariant law of quite general monotone particle systems in terms of “open structures” that in general are neither paths nor trees.

2 Discussion

This section is divided into four subsections. In Subsection 2.1, we discuss the relation of our work to [BCH18], who in parallel to our work have studied Moran models that generalize our running example of a system with cooperative branching and deaths. In Subsection 2.2, we compare our results and methods with the existing literature on mean-field limits. In Subsection 2.3, we state open problems and we conclude in Subsection 2.4 with an outline of the proofs.

2.1 A Moran model with frequency-dependent selection

Let ${\mbox{\tt bra}}:\{0,1\}^{2}\to\{0,1\}$ be the branching map defined as

[TABLE]

Consider a system with $S=\{0,1\}$ , ${\cal G}:=\{{\mbox{\tt cob}},{\mbox{\tt bra}},{\mbox{\tt dth}},{\mbox{\tt bth}}\}$ , with rates

[TABLE]

$\gamma\geq 0$ , $s>0$ , $\nu_{0},\nu_{1}\geq 0$ with $\nu_{0}+\nu_{1}=1$ , and $u>0$ . If $(\mu_{t})_{t\geq 0}$ solves the corresponding mean-field equation (1.11), then $p_{t}:=\mu_{t}(\{1\})$ solves the ODE (compare (1.36))

[TABLE]

This equation has an interpretation in terms of a Moran model describing a fixed population of $N$ individuals which can be of two types, 0 and 1, where type 1 is fitter than type 0. The parameter $\gamma$ is the frequency dependent selection rate, $s$ is the selection rate, $u$ is the mutation rate, and $\nu_{0},\nu_{1}$ are mutation probabilities. The frequency dependent selection is of a type that is especially appropriate to describe an advantageous, (partially) recessive gene in a diploid population.

In parallel to our work, Moran models of this form have been studied by Ellen Baake, Fernando Cordero, and Sebastian Hummel in [BCH18]. A notational difference between their work and the discussion here is that they denote the fitter type by 0, so their [BCH18, formula (2.1)] is our (2.3) rewritten in terms of $y(t)=1-p_{t}$ and with the roles of $\nu_{0}$ and $\nu_{1}$ reversed. They prove that (2.3) describes the mean-field limit of a class of Moran models [BCH18, Prop. 4.1] and that in the limit $N\to\infty$ , the genealogy of a single individual is described by an Ancestral Selection Graph (ASG) ${\cal A}_{t}$ , which in our notation corresponds to

[TABLE]

i.e., this is the random tree with maps attached to its branch points depicted in Figure 1.

The authors of [BCH18] define a duality function $H({\cal A}_{t},p)$ which corresponds to the duality function in (1.52) after the identification $\mu(\{1\})=p$ . (Here we have slightly rephrased things compared to the different conventions in [BCH18], where 0 denotes the fitter type and $y$ is the frequency of the unfit type.) In [BCH18, Lemma 4.4], they show that $H({\cal A}_{t},p)$ can be calculated by concatenating the higher-level maps $\check{\gamma}[\bm{\omega}_{\mathbf{i}}]$ with $\mathbf{i}\in{\mathbb{S}}_{t}$ . For example, the equation $y=y_{1}[y_{2}+y_{3}-y_{2}y_{3}]$ in [BCH18, Lemma 4.4 (4)] can be rewritten in terms of $p=1-y$ as $p=\widehat{\mbox{\tt cob}}(p_{1},p_{2},p_{3})$ with $\widehat{\mbox{\tt cob}}$ as in (1.84).
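The change of variables in this example can be verified directly. Writing $y_{i}=1-p_{i}$ and assuming that (1.84) gives $\widehat{\mbox{\tt cob}}(p_{1},p_{2},p_{3})=p_{1}+(1-p_{1})p_{2}p_{3}$ , one has

$$
\begin{aligned}
y_{2}+y_{3}-y_{2}y_{3}&=1-(1-y_{2})(1-y_{3})=1-p_{2}p_{3},\\
p=1-y&=1-(1-p_{1})(1-p_{2}p_{3})=p_{1}+(1-p_{1})p_{2}p_{3}.
\end{aligned}
$$

In words, the unfit type wins at a cooperative branching event if the first individual is unfit and at least one of the other two is unfit, which is the complement of the event that ${\mbox{\tt cob}}$ returns $1$ .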

In [BCH18, Section 5], it is shown that the ASG ${\cal A}_{t}$ can be simplified a lot, while retaining all information necessary to calculate the duality function $H({\cal A}_{t},p)$ . This is done in three steps, I, IIa, and IIb.

In step I, the ASG is pruned. This is a process in which parts of the tree that are irrelevant for the map $G_{t}$ are cut off. In particular, if the function $G_{t}$ is constant, then the pruned ASG consists of a single edge ending in one of the maps dth or bth. In the remaining case, the pruned ASG is a finite tree where each branch point is marked with one of the maps cob and bra.

In steps IIa and IIb, the pruned ASG is stratified. In step IIa, the tree structure is changed in such a way that starting at the root, one first sees a ternary tree containing only the map cob, and then at the leaves of this ternary tree, there are attached binary trees containing only the map bra. In step IIb, each binary tree is replaced by an integer $n\geq 0$ which records the number of leaves of the binary tree.

The result of this is a simplified process, the stratified ASG ${\cal T}_{t}$ , which contains all necessary information about the ASG ${\cal A}_{t}$ in the sense that there exists a function ${\cal H}({\cal T}_{t},p)$ such that ${\cal H}({\cal T}_{t},p)=H({\cal A}_{t},p)$ [BCH18, Thm 5.13]. In particular, solutions of (2.3) can be represented as $p_{t}={\mathbb{E}}[{\cal H}({\cal T}_{t},p_{0})]$ [BCH18, Thm 6.2] (compare (1.51)).

One can now check (compare Lemma 48 below) that $\rho_{t}:={\mathbb{P}}[{\cal H}({\cal T}_{t},p)\in\,\cdot\,]$ solves the higher-level mean-field equation with initial state $\rho_{0}=\delta_{p}$ , where we use the identification ${\cal P}(\{0,1\})\cong[0,1]$ . In [BCH18, Thm 6.5], it is observed that $M_{t}:={\cal H}({\cal T}_{t},p_{0})$ is a bounded sub- or supermartingale for each $p_{0}\in[0,1]$ and hence converges to an a.s. limit ${\cal H}_{\infty}(p_{0})$ . In [BCH18, Prop. 6.6], it is proved that if $p_{0}$ is not an unstable fixed point of (2.3), then ${\cal H}_{\infty}(p_{0})$ is a Bernoulli random variable with parameter $\lim_{t\to\infty}p_{t}$ .

Our Propositions 15 and 16 imply that if $p_{0}$ is a fixed point of (2.3), then ${\cal H}_{\infty}(p_{0})$ is a Bernoulli random variable if and only if the RTP corresponding to $p_{0}$ is endogenous. Thus, [BCH18, Prop. 6.6] implies that for the model in (2.2), RTPs corresponding to stable fixed points are always endogenous. Since all stable fixed points of (2.3) are in fact lower or upper solutions, this alternatively also follows from our Proposition 19.

In the special case $s=0$ and $\nu_{1}=0$ , [BCH18, Prop. 6.6] follows alternatively from our Theorem 17, which completely describes the long-time behaviour of solutions to the higher-level mean-field equation not just for initial states of the form $\rho_{0}=\delta_{p}$ , but for general initial states.

2.2 Mean-field limits

If $N$ Markov processes interact in a way that is symmetric under permutations of the $N$ coordinates, then it is frequently possible to obtain a nontrivial limit as $N\to\infty$ . Such limits are generally called mean-field limits. In the mean-field limit, the individual processes behave asymptotically independently, but with transition probabilities that depend on the average behavior of all processes. For systems of interacting diffusions, this principle was demonstrated by McKean in his analysis of the Vlasov equation [McK66]. Consequently, mean-field limits are also called McKean-Vlasov limits. There exists an extensive literature on the topic. Most work has focused on interacting diffusions, but jump processes have also been studied [ST85, ADF18]. An elementary introduction to mean-field limits for interacting particle systems is given in [Swa17, Chapter 3].

In a biological setting, well-mixing populations converge in the mean-field limit to the solution of a deterministic ODE. Similarly, spatial populations with strong local mixing can be expected to converge, after an appropriate rescaling, to the solution of a deterministic PDE. For interacting particle systems whose dynamics have an exclusion process component with a large rate, this intuition was made rigorous by De Masi, Ferrari and Lebowitz [DFL86, Thm 2]. They state their theorem only for processes whose state space $S$ consists of two points, and only prove the theorem for one particular one-dimensional example, but sketch how the proof should be adapted to the general case. In [DN94, Thm 1], a version of the theorem is stated where $S$ can be any finite set; it is claimed that the proof is again the same.

In our running example of a particle system with cooperative branching and deaths, the limiting PDE takes the form

[TABLE]

This PDE was used in [Nob92] to derive asymptotic properties of the associated spatial particle system with strong mixing. We can view (2.5) as a spatial version of the ODE (1.36); in particular, if $p_{0}(x)=p_{0}$ does not depend on $x$ , then $p_{t}=p_{t}(x)$ solves (1.36).

The intuition behind (2.5), and more general PDEs of this type, is easily explained. In the strong mixing limit, the genealogy of a single site should be described by a branching process as in Figure 1 where in addition, each particle has a position in ${\mathbb{R}}$ , which moves according to an independent Brownian motion. Convergence to the PDE should then follow from, on the one hand, convergence of the genealogies to a system of branching Brownian motions with random maps attached to their branching events, and, on the other hand, a representation in the spirit of Theorem 6 of solutions of the PDE (2.5) in terms of such a system of branching Brownian motions.

The proof of [DFL86, Thm 2] is indeed based on this sort of dual approach, although one would wish that they had given a more explicit statement of the stochastic representation of solutions of their general PDE. Our proof of Theorem 5 follows the same strategy, i.e., we first prove the stochastic representation of solutions to the mean-field equation (Theorem 6) and then use this to prove our convergence result (Theorem 5).

2.3 Open problems

In the present paper, we have adapted results from [AB05, MSS18] about discrete-time Recursive Tree Processes and endogeny to the continuous-time setting, and applied our general results to a concrete system with cooperative branching and deaths. Among other things, we proved that for $\alpha>4$ , the RTPs corresponding to $\nu_{\rm low}$ and $\nu_{\rm upp}$ are endogenous but the RTP corresponding to $\nu_{\rm mid}$ is not. The proof was based on an analysis of the bivariate mean-field equation. Here, it was convenient to be able to analyse a differential equation, as an analysis of the associated discrete-time bivariate evolution would have been possible, but messier.

Our work leaves a number of questions unanswered, both in the general setting and more specifically for our running example with ${\cal G}:=\{{\mbox{\tt cob}},{\mbox{\tt dth}}\}$ and $\pi$ as in (1.14). Concerning the latter, we pose the following questions.

Open Problem 1 Not every measure $\mu^{(n)}\in{\cal P}_{\rm sym}\big{(}\{0,1\}^{n}\big{)}$ is the $n$ -th moment measure of a measure $\rho\in{\cal P}\big{(}{\cal P}(\{0,1\})\big{)}$ . Determine all symmetric solutions of the $n$ -variate RDE, for general $n\geq 3$ , and their domains of attraction.
Open Problem 2 Same as Open Problem 1 but without the symmetry assumption and for general $n\geq 2$ .
Open Problem 3 Prove that apart from the atom at zero, the law $\underline{\nu}_{\rm mid}$ , viewed as a probability law on $[0,1]$ , has a smooth density with respect to the Lebesgue measure.
Open Problem 4 Determine the asymptotics of the distribution function $F$ of $\underline{\nu}_{\rm mid}$ near $0$ and $1$ .
Open Problem 5 For the more general model in (2.2), is it true that unstable fixed points of the mean-field equation that separate the domains of attraction of two stable fixed points correspond to nonendogenous RTPs? Is the picture for the higher-level RDE the same?

Partly inspired by our concrete example, we ask the following problems in the general setting.

Open Problem 6 Can (1.4) be relaxed to allow for branching processes $(\nabla{\mathbb{S}}_{t})_{t\geq 0}$ that are nonexplosive but have infinite mean?
Question 7 Are there general results linking the (in)stability of fixed points of the mean-field equation to (non)endogeny of the related RTP?
Question 8 In our example, the higher-level RDE has two solutions $\underline{\nu}_{\rm mid}$ and $\overline{\nu}_{\rm mid}$ with mean $\nu_{\rm mid}$ , of which the former is stable and the latter is unstable. Is this a general phenomenon in the nonendogenous case? Can one prove nonendogeny of an RTP corresponding to a solution $\nu$ of the RDE by showing that $\overline{\nu}$ is unstable?
Question 9 Are there examples of higher-level RDEs that have solutions $\rho\not\in\{\underline{\nu},\overline{\nu}\}$ ?
Open Problem 10 Is the higher-level RTP $(\bm{\omega}_{\mathbf{i}},\xi_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{S}}}$ from Proposition 16 always endogenous?

Finally, we mention the problem of proving nonendogeny for the frozen percolation of [Ald00], which to our knowledge is still open. Although we did not attempt to solve this problem here, one might hope that the methods of the present paper can provide a useful new point of view on this old problem.

2.4 Outline of the proofs

In the remainder of the paper, we prove all results stated so far, except for Theorems 10 and 14 as well as Proposition 16, which we cite from [MSS18, Thm 1, Thm 13, and Prop 4].

In Section 3 we prove Theorem 1, Propositions 2 and 3, and Lemma 4, which state elementary properties of solutions of the mean-field equations (1.22) and (1.2), as well as Theorem 6, which gives a stochastic representation of solutions of the mean-field equation in terms of finite recursive tree processes. In Section 4, we use this stochastic representation to prove Theorem 5 about convergence of finite systems to a solution of the mean-field equation.

In Section 5, we prove our main results about RTPs with continuous time, which are largely analogous to known results from the discrete-time setting. Basic results are Lemma 7 and Proposition 9, as well as Lemma 8 which deals with discrete time and is a slight reformulation of known results. Following [AB05], Theorem 11 links the $n$ -variate equation to endogeny, while Propositions 13 and 15 are concerned with the higher-level equation, and closely follow ideas from [MSS18].

In Section 6 we prove some additional results about RTPs, first Proposition 19, which generalizes [AB05, Lemma 15] and shows that upper and lower solutions of a monotone RDE are always endogenous, and then Lemmas 20, 21, 23, and 24 which give conditions for uniqueness in a general setting and then more specifically for monotone systems.

In Section 7, finally, we have collected all proofs that deal specifically with our running example of a system with cooperative branching and deaths. The first such result is Proposition 12 about the bivariate equation, which is a two dimensional ODE for which by elementary means we find all fixed points and their domains of attraction. By combining Proposition 12 with ideas involving the convex order we then prove the much stronger Theorem 17 which gives all fixed points and domains of attraction for the higher-level equation. The picture is then completed by the proofs of Lemma 18, which gives some properties of the nontrivial fixed point of the higher-level equation, as well as Lemmas 22 and 25 which illustrate ideas from Section 6 in the concrete set-up of our example.

3 The mean-field equation

In this section, we prove Theorems 1 and 6, which state that the mean-field equation (1.2) has a unique solution and can be represented in terms of a random tree generated by a branching process, with random maps attached to its vertices. In addition, we also prove Propositions 2 and 3, as well as Lemma 4.

In Subsection 3.1, we start with some preliminaries, showing, in particular, that the integral in (1.16) is well-defined, and Lemma 4, which says that mean-field equations of the form (1.22) can be rewritten in the simpler form (1.2).

Next, in Subsection 3.2, we prove uniqueness of solutions of (1.2), which yields the uniqueness parts of Theorem 1. To prove existence, in Subsection 3.3, we show that the right-hand side of (1.49) solves (1.2), which not only completes the proof of Theorem 1 but also yields the stochastic representation that is Theorem 6.

The proofs of Propositions 2 and 3, finally, can be found in Subsection 3.4.

3.1 Preliminaries

Recall that we interpret the mean-field equation (1.2) as in (1.16), where, by (1.12),

[TABLE]

Since by assumption, $\gamma[\omega](x_{1},\ldots,x_{k})$ is jointly measurable in $\omega$ and $x_{1},\ldots,x_{k}$ , the right-hand side of (3.1) is measurable as a function of $\omega$ and hence the integral in (1.16) is well-defined.

Proof of Lemma 4 Recall from Subsection 1.3 that the basic ingredients that go into the equation (1.22) are the measure space $(\Omega^{\prime},{\mathbf{q}})$ and function $\lambda$ , as well as, for each $\omega\in\Omega^{\prime}$ and $1\leq i\leq\lambda(\omega)$ , the function $\gamma_{i}[\omega]$ and set $K_{i}(\omega)$ . Also, $\kappa_{i}(\omega):=|K_{i}(\omega)|$ . In terms of these basic ingredients we need to define $\Omega,{\mathbf{r}},\kappa$ , and $\gamma$ as in Subsection 1.1 so that (1.22) takes the simpler form (1.2).

Since we want to replace the integral and sum in (1.22) by a single integral, we put

[TABLE]

where as before $\Omega^{\prime}_{l}:=\{\omega\in\Omega^{\prime}:\lambda(\omega)=l\}$ and $[l]:=\{1,\ldots,l\}$ , and we equip $\Omega$ with the measure

[TABLE]

In general, $\Omega$ need not be a Polish space, as required in Subsection 1.1. We will fix this problem at the end of our proof, but for the sake of the presentation we neglect it for the time being. We define $\kappa:\Omega\to{\mathbb{N}}$ as in Subsection 1.1 by $\kappa(\omega,i):=\kappa_{i}(\omega)$ , where the right-hand side is the function from Subsection 1.3. We write

[TABLE]

Since $\gamma_{i}[\omega](x_{1},\ldots,x_{\lambda(\omega)})$ depends only on coordinates in $K_{i}(\omega)$ , there exists a function $\gamma[\omega,i]:S^{\kappa(\omega,i)}\to S$ such that

[TABLE]

Note that $T_{\gamma_{i}[\omega]}=T_{\gamma[\omega,i]}$ by (1.12). As in (1.3), we can associate $\gamma[\omega,i]$ with a function that is defined on $S^{{\mathbb{N}}_{+}}$ but depends only on the first $\kappa(\omega,i)$ coordinates. We take this as our definition of the function $\gamma:\Omega\times S^{{\mathbb{N}}_{+}}\to S$ from Subsection 1.1. It follows from (1.20) that $\gamma[\omega,i](x)$ is jointly measurable as a function of $(\omega,i)$ and $x$ .

Replacing the integral and sum in (1.22) by a single integral over ${\mathbf{r}}$ as defined in (3.3), using the fact that $T_{\gamma_{i}[\omega]}=T_{\gamma[\omega,i]}$ we see that (1.22) can be rewritten as

[TABLE]

which coincides with (1.2). The condition that ${\mathbf{r}}$ should be a finite measure translates to (1.23) (i), while the condition (1.4), written in terms of ${\mathbf{q}}$ , becomes (1.23) (ii). Moreover, if ${\mathbf{q}}$ satisfies (1.24), then ${\mathbf{r}}$ satisfies (1.19).
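As a concrete illustration of the bookkeeping in this proof, the following sketch treats the special case where $\Omega^{\prime}$ is a finite set, so that ${\mathbf{q}}$ is just a dictionary of weights. The names `flatten`, `q` and `lam` are hypothetical, and the rule that each pair $(\omega,i)$ inherits the weight of $\omega$ is our reading of the construction, not a quotation of (3.3).

```python
def flatten(q, lam):
    """Build a measure r on pairs (omega, i) from a finite measure q.

    q   : dict mapping each omega to its weight q({omega})
    lam : dict mapping each omega to lambda(omega), a positive integer
    Assumption (our reading of the construction): every pair (omega, i)
    with 1 <= i <= lambda(omega) receives the weight q({omega}).
    """
    r = {}
    for omega, weight in q.items():
        for i in range(1, lam[omega] + 1):
            r[(omega, i)] = weight
    return r


# Toy ingredients: map "a" acts on 3 coordinates, map "b" on 1 coordinate.
q = {"a": 0.5, "b": 2.0}
lam = {"a": 3, "b": 1}
r = flatten(q, lam)

total_r = sum(r.values())                    # total mass |r|
weighted_q = sum(q[w] * lam[w] for w in q)   # integral of lambda against q
print(total_r, weighted_q)
```

In this toy case the total mass of `r` equals the integral of $\lambda$ against ${\mathbf{q}}$, which is one way to see why finiteness of ${\mathbf{r}}$ becomes a summability condition on ${\mathbf{q}}$.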

We still have to fix the problem that $\Omega$ , as defined in (3.2), is in general not a Polish space. There are several possible ways to fix this. (For example, we can strengthen our assumptions on $\lambda$ in the sense that $\{\omega:\lambda(\omega)=l\}$ is a $G_{\delta}$ -set for each $l\in{\mathbb{N}}_{+}$ , or we can relax our assumptions on $\Omega$ allowing it to be a Lusin space, instead of just a Polish space, throughout.) The solution we will choose is to replace $\Omega$ by the Polish space

[TABLE]

where $\overline{\Omega^{\prime}_{l}}$ denotes the closure of $\Omega^{\prime}_{l}$ in $\Omega^{\prime}$ . We view ${\mathbf{r}}$ as a measure on $\overline{\Omega}$ that is concentrated on $\Omega$ and extend $\kappa$ and $\gamma$ in a measurable way to the larger space, which is possible since $\Omega$ is a measurable subset of $\overline{\Omega}$ . Since ${\mathbf{r}}$ is concentrated on $\Omega$ , it does not matter how we extend $\kappa$ and $\gamma$ as this has no effect on (3.6).

3.2 Uniqueness

In the present section, we prove that under the assumption (1.4), solutions to (1.2) are unique, which settles the uniqueness part of Theorem 1.

Below, we let ${\cal M}(S)$ denote the space of all finite signed measures on $S$ . The total variation norm has already been mentioned several times. There are two conventional definitions, which differ by a factor 2. We will use the definition

[TABLE]

where the supremum runs over all measurable functions $f:S\to[-1,1]$ . If $X,Y$ are $S$ -valued random variables with laws $\mu,\nu$ , then it is easy to see that $\|\mu-\nu\|\leq{\mathbb{P}}[X\neq Y]$ . Conversely, it is well-known [Lin92, page 19] that if $\mu,\nu\in{\cal P}(S)$ , then it is possible to couple $S$ -valued random variables $X,Y$ with laws $\mu,\nu$ in such a way that

$\displaystyle\|\mu-\nu\|={\mathbb{P}}[X\neq Y].\qquad\mbox{(3.9)}$
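On a finite state space the optimal coupling can be written down explicitly. The sketch below is only an illustration under that assumption: it uses the normalisation in which $\|\mu-\nu\|\leq{\mathbb{P}}[X\neq Y]$, which for probability vectors is half the $\ell^{1}$ distance, and the helper names `tv_distance`, `sample` and `maximal_coupling` are ours.

```python
import random


def tv_distance(mu, nu):
    """Total variation distance, normalised as half the l1 distance."""
    keys = set(mu) | set(nu)
    return 0.5 * sum(abs(mu.get(x, 0.0) - nu.get(x, 0.0)) for x in keys)


def sample(weights, rng):
    """Sample a key with probability proportional to its (nonnegative) weight."""
    keys = list(weights)
    return rng.choices(keys, weights=[weights[k] for k in keys])[0]


def maximal_coupling(mu, nu, rng):
    """Draw (X, Y) with marginals mu and nu and P[X != Y] = tv_distance(mu, nu)."""
    keys = set(mu) | set(nu)
    overlap = {x: min(mu.get(x, 0.0), nu.get(x, 0.0)) for x in keys}
    mass = sum(overlap.values())
    if rng.random() < mass:
        x = sample(overlap, rng)      # common part: X and Y coincide
        return x, x
    # The residual parts have disjoint supports, so X != Y on this event.
    res_mu = {x: mu.get(x, 0.0) - overlap[x] for x in keys}
    res_nu = {x: nu.get(x, 0.0) - overlap[x] for x in keys}
    return sample(res_mu, rng), sample(res_nu, rng)


mu = {0: 0.5, 1: 0.3, 2: 0.2}
nu = {0: 0.2, 1: 0.3, 2: 0.5}
rng = random.Random(0)
n = 20000
pairs = [maximal_coupling(mu, nu, rng) for _ in range(n)]
mismatch_rate = sum(x != y for x, y in pairs) / n
print(tv_distance(mu, nu), mismatch_rate)
```

The empirical frequency of $\{X\neq Y\}$ should be close to the computed distance, up to Monte Carlo error.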

Lemma 26 (Lipschitz continuity)

Let $g:S^{k}\to S$ be measurable and let $T_{g}$ be defined as in (1.12). Then

[TABLE]

Moreover, if $T$ is defined as in (1.1), then

[TABLE]

**Proof** By (3.9) we can find an $S^{2}$ -valued random variable $(X,Y)$ such that $\|\mu-\nu\|={\mathbb{P}}[X\neq Y]$ . Let $(X_{1},Y_{1}),\ldots,(X_{k},Y_{k})$ be i.i.d. copies of $(X,Y)$ . Then, by (1.12),

[TABLE]

This proves (3.10). Formula (3.11) follows by integrating over $\omega$ .
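The Lipschitz bound can be checked numerically on a small example. The sketch below assumes $S=\{0,1\}$ and uses the map $(x_{1},x_{2},x_{3})\mapsto x_{1}\vee(x_{2}\wedge x_{3})$ with $k=3$ as an illustrative choice of $g$. It computes $T_{g}$ exactly by enumeration and compares $\|T_{g}(\mu)-T_{g}(\nu)\|$ with $k\|\mu-\nu\|$ over a grid of Bernoulli laws. The helper names are hypothetical.

```python
from itertools import product


def tv_distance(mu, nu):
    """Total variation distance, normalised as half the l1 distance."""
    keys = set(mu) | set(nu)
    return 0.5 * sum(abs(mu.get(x, 0.0) - nu.get(x, 0.0)) for x in keys)


def push_forward(g, mu, k):
    """Law of g(X_1, ..., X_k) for i.i.d. X_i with law mu on a finite set."""
    out = {}
    for xs in product(mu, repeat=k):
        prob = 1.0
        for x in xs:
            prob *= mu[x]
        y = g(*xs)
        out[y] = out.get(y, 0.0) + prob
    return out


def cob(x1, x2, x3):
    """Illustrative map on {0, 1}^3: x1 OR (x2 AND x3)."""
    return x1 | (x2 & x3)


k = 3
grid = [i / 20 for i in range(21)]
worst_ratio = 0.0
for p in grid:
    for q in grid:
        if p == q:
            continue
        mu = {0: 1 - p, 1: p}
        nu = {0: 1 - q, 1: q}
        lhs = tv_distance(push_forward(cob, mu, k), push_forward(cob, nu, k))
        worst_ratio = max(worst_ratio, lhs / tv_distance(mu, nu))
print(worst_ratio)
```

The largest observed ratio stays below $k$, as the lemma requires; the bound is far from sharp for this particular map.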

Our next lemma gives equivalent formulations of the mean-field equation (1.2), that will also be useful in the next subsection where we prove existence of solutions. Below, we interpret an integral of a measure-valued integrand in the usual way, i.e., $\int_{0}^{t}\!\mathrm{d}s\,\mu_{s}$ denotes the measure defined by

[TABLE]

for any bounded measurable $\phi:S\to{\mathbb{R}}$ .

Lemma 27 (Equivalent formulations of the mean-field equation)

Assume (1.4). Let ${[0,\infty)}\ni t\mapsto\mu_{t}\in{\cal P}(S)$ be measurable. Then of the following conditions, (i) implies (ii) and (iii). If ${[0,\infty)}\ni t\mapsto\mu_{t}\in{\cal P}(S)$ is continuous with respect to the total variation norm, then all three conditions are equivalent.

(i) For each bounded measurable $\phi:S\to{\mathbb{R}}$ , the function $t\mapsto\langle\mu_{t},\phi\rangle$ is continuously differentiable and $\displaystyle{\textstyle\frac{{\partial}}{{\partial{t}}}}\mu_{t}=|{\mathbf{r}}|\big{\{}T(\mu_{t})-\mu_{t}\big{\}}$ $(t\geq 0)$ .

(ii) $\displaystyle\mu_{t}=\mu_{0}+|{\mathbf{r}}|\int_{0}^{t}\!\mathrm{d}s\,\big{\{}T(\mu_{s})-\mu_{s}\big{\}}$ $(t\geq 0)$ .

(iii) $\displaystyle\mu_{t}=e^{-|{\mathbf{r}}|t}\mu_{0}+|{\mathbf{r}}|\int_{0}^{t}\!\mathrm{d}s\,e^{-|{\mathbf{r}}|s}\,T(\mu_{t-s})$ $(t\geq 0)$ .

**Proof** Integrating the equation in (i) from time 0 until time $t$ , we see that (i) implies (ii). Also, we can equivalently write the equation in (i) as

[TABLE]

Integrating from time 0 until time $t$ now yields

[TABLE]

Multiplying by $e^{-|{\mathbf{r}}|t}$ and substituting $s\mapsto t-s$ in the integral then yields the equation in (iii).

If $t\mapsto\mu_{t}\in{\cal P}(S)$ is continuous with respect to the total variation norm, then Lemma 26 together with (1.4) imply that also $t\mapsto T(\mu_{t})\in{\cal P}(S)$ is continuous with respect to the total variation norm. It follows that $t\mapsto\langle\mu_{t},\phi\rangle$ and $t\mapsto\langle T(\mu_{t}),\phi\rangle$ are continuous for each bounded measurable $\phi:S\to{\mathbb{R}}$ . As a result, the right-hand side of (ii), integrated against any bounded measurable $\phi$ , is continuously differentiable as a function of $t$ , and (ii) implies (i). By the same argument, rewriting (iii) as (3.15) and differentiating, we see that (iii) implies (i).
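The equivalence of (i) and (iii) can be sanity-checked on a toy case where the differential form has a closed-form solution. The sketch below assumes $S=\{0,1\}$, encodes a law by $p={\mathbb{P}}[X=1]$, and takes $T(p)=p^{2}$ (the law of $X_{1}\wedge X_{2}$), so that (i) reads $\dot{p}=-|{\mathbf{r}}|\,p(1-p)$. All names and parameter values are illustrative choices of ours.

```python
import math

RATE = 1.0  # plays the role of |r|; a toy choice


def T(p):
    """Toy operator on S = {0, 1}: law of X_1 AND X_2, encoded by P[X = 1]."""
    return p * p


def exact(p0, t):
    """Closed-form solution of dp/dt = RATE * (T(p) - p) = -RATE * p * (1 - p)."""
    decay = math.exp(-RATE * t)
    return p0 * decay / ((1.0 - p0) + p0 * decay)


def mild_rhs(p0, t, steps=2000):
    """Right-hand side of formulation (iii), integrated by the trapezoid rule."""
    h = t / steps
    total = 0.0
    for j in range(steps + 1):
        s = j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * math.exp(-RATE * s) * T(exact(p0, t - s))
    return math.exp(-RATE * t) * p0 + RATE * h * total


p0, t = 0.7, 1.5
gap = abs(exact(p0, t) - mild_rhs(p0, t))
print(gap)
```

The printed gap is the quadrature error only, so it shrinks as `steps` grows.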

We now prove the promised uniqueness of solutions to (1.2). Proposition 2, which will be proved in Subsection 3.4 below, shows that the constant $L$ from (3.17) is not optimal and can be replaced by the constant $K$ from (1.18).

Lemma 28 (Uniqueness)

Let $(\mu_{t})_{t\geq 0}$ and $(\nu_{t})_{t\geq 0}$ be solutions of the mean-field equation (1.2). Then

[TABLE]

where

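$\displaystyle L:=|{\mathbf{r}}|+\int_{\Omega}{\mathbf{r}}(\mathrm{d}\omega)\,\kappa(\omega).\qquad\mbox{(3.17)}$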

**Proof** Equation (ii) of Lemma 27 implies that

[TABLE]

where $L=|{\mathbf{r}}|+\int_{\Omega}{\mathbf{r}}(\mathrm{d}\omega)\,\kappa(\omega)$ using (3.11) of Lemma 26. The claim now follows from Gronwall’s lemma [EK86, Thm A.5.1].

3.3 The stochastic representation

In this section, we prove the following proposition, that settles the existence part of Theorem 1. Together with Lemma 28, this completes the proof of Theorem 1 and at the same time also proves Theorem 6.

We work in our usual set-up where $S$ and $\Omega$ are Polish spaces, $\kappa:\Omega\to{\mathbb{N}}$ is measurable, $\gamma$ is as in Subsection 1.1, and ${\mathbf{r}}$ is a nonzero finite measure on $\Omega$ satisfying (1.4). We fix ${\mathbb{T}}$ as in Section 1.4 and let $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ be i.i.d. with common law $|{\mathbf{r}}|^{-1}{\mathbf{r}}$ . We let $(\sigma_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ be an independent i.i.d. collection of exponentially distributed random variables with mean $|{\mathbf{r}}|^{-1}$ and define ${\mathbb{S}}$ , ${\mathbb{S}}_{t}$ , $\nabla{\mathbb{S}}_{t}$ , and $G_{t}$ as in (1.43), (1.44), and (1.47).

Proposition 29 (Recursive tree representation)

For any $\mu_{0}\in{\cal P}(S)$ , setting

[TABLE]

defines a solution $(\mu_{t})_{t\geq 0}$ to the mean-field equation (1.2). Moreover, ${[0,\infty)}\ni t\mapsto\mu_{t}$ is continuous with respect to the total variation norm.

To prepare for the proof of Proposition 29, we need one lemma. Recall that $|\mathbf{i}|$ denotes the length of a word $\mathbf{i}$ , i.e., $|i_{1}\cdots i_{n}|:=n$ . Let

[TABLE]

Fix $\mu_{0}\in{\cal P}(S)$ and using notation as in (1.46), set

[TABLE]

and set $\mu_{t,(0)}:=\mu_{0}$ $(t\geq 0)$ . The following lemma is a “cut-off” version of Proposition 29.

Lemma 30 (Representation with cut-off)

The measures $(\mu_{t,(n)})_{t\geq 0}$ defined in (3.21) satisfy

[TABLE]

**Proof** Let $(Y_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ be i.i.d. with common law $\mu_{0}$ , independent of $(\bm{\omega}_{\mathbf{i}},\sigma_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ . Set

[TABLE]

and define $(X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{S}}_{t,(n)}}$ inductively by

[TABLE]

Then $X_{\varnothing}=G_{t,(n)}\big{(}(X_{\mathbf{i}})_{\mathbf{i}\in\nabla{\mathbb{S}}_{t,(n)}}\big{)}$ and hence, in the same way as (1.49) is equivalent to (1.51),

[TABLE]

Conditioning on $\sigma_{\varnothing}$ and then also on $\bm{\omega}_{\varnothing}$ , we see that

[TABLE]

where we have used that $\bm{\omega}_{\varnothing}$ is independent of $\sigma_{\varnothing}$ with law $|{\mathbf{r}}|^{-1}{\mathbf{r}}$ . We see from this that

[TABLE]

where we have used (1.1) and the observation that conditional on $\sigma_{\varnothing}=s$ and $\bm{\omega}_{\varnothing}=\omega$ , the random variables $X_{1},\ldots,X_{\kappa(\omega)}$ are i.i.d. with common law $\mu_{t-s,(n-1)}$ .

Proof of Proposition 29 The condition (1.4) guarantees that $(\nabla{\mathbb{S}}_{t})_{t\geq 0}$ is a finite mean branching process; more precisely, by standard theory,

[TABLE]

Fix $\mu_{0}\in{\cal P}(S)$ and define $\mu_{t}$ and $\mu_{t,(n)}$ as in (3.19) and (3.22). Then the total variation norm distance between these measures can be bounded by

[TABLE]

which tends to zero as $n\to\infty$ since ${\mathbb{S}}_{t}$ is a.s. finite by (3.28). In fact, since ${\mathbb{P}}[{\mathbb{S}}_{s,(n)}\neq{\mathbb{S}}_{s}]\leq{\mathbb{P}}[{\mathbb{S}}_{t,(n)}\neq{\mathbb{S}}_{t}]$ for all $s\leq t$ , we have that

[TABLE]

Using this and the Lipschitz continuity of $T$ with respect to the total variation norm (Lemma 26), we can let $n\to\infty$ in (3.22) to obtain

[TABLE]

Since

[TABLE]

using the fact that the branching process $(\nabla{\mathbb{S}}_{t})_{t\geq 0}$ a.s. does not jump at deterministic times, we see that ${[0,\infty)}\ni t\mapsto\mu_{t}$ is continuous with respect to the total variation norm. Using this and (3.31), we see from Lemma 27 that $(\mu_{t})_{t\geq 0}$ solves the mean-field equation (1.2).
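The recursive tree representation lends itself to direct simulation. The following Monte Carlo sketch assumes a toy setting of our own: $S=\{0,1\}$, a single map $(x_{1},x_{2})\mapsto x_{1}\wedge x_{2}$ with $\kappa=2$, and $|{\mathbf{r}}|=1$. The root waits an exponential time; if that time exceeds $t$ it is a leaf and receives an independent sample from $\mu_{0}$, otherwise the map is applied to two independent subtrees run for the remaining time. The estimate is compared with the closed-form solution of $\dot{p}=p^{2}-p$.

```python
import math
import random

RATE = 1.0  # plays the role of |r|; a toy choice


def sample_root(t, p0, rng):
    """Sample the state at the root of the tree run for time t."""
    sigma = rng.expovariate(RATE)
    if sigma > t:
        # No map is applied before time t: the root is a leaf with law mu_0.
        return 1 if rng.random() < p0 else 0
    # Otherwise apply x1 AND x2 to two independent subtrees with time t - sigma.
    left = sample_root(t - sigma, p0, rng)
    right = sample_root(t - sigma, p0, rng)
    return left & right


def exact(p0, t):
    """Closed-form solution of dp/dt = RATE * (p**2 - p)."""
    decay = math.exp(-RATE * t)
    return p0 * decay / ((1.0 - p0) + p0 * decay)


rng = random.Random(12345)
p0, t, n = 0.7, 1.0, 20000
estimate = sum(sample_root(t, p0, rng) for _ in range(n)) / n
target = exact(p0, t)
print(estimate, target)
```

The two printed numbers should agree up to Monte Carlo error of order $n^{-1/2}$.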

3.4 Continuity in the initial state

In this subsection, we prove Propositions 2 and 3.

Proof of Proposition 2 It follows from Theorem 6 and Lemma 26 that

[TABLE]

where ${\cal F}_{t}$ is the filtration defined in (1.48).

Proposition 3 follows from the following two lemmas.

Lemma 31 (Continuity of $T$ )

Under the condition (1.19), the operator $T$ in (1.1) is continuous w.r.t. the topology of weak convergence.

**Proof** If $\mu_{n}\in{\cal P}(S)$ converge weakly to a limit $\mu_{\infty}$ , then by Skorohod’s representation theorem there exist random variables $X^{n}$ with laws $\mu_{n}$ that converge a.s. to a limit $X^{\infty}$ with law $\mu_{\infty}$ . Let $\big{(}(X^{n}_{i})_{n\in{\mathbb{N}}\cup\{\infty\}}\big{)}_{i\geq 1}$ be i.i.d. copies of such a sequence $(X^{n})_{n\in{\mathbb{N}}\cup\{\infty\}}$ and let $\bm{\omega}$ be an independent random variable with law $|{\mathbf{r}}|^{-1}{\mathbf{r}}$ . Then by (1.19),

[TABLE]

and hence $T(\mu_{n})$ converges weakly to $T(\mu_{\infty})$ by (1.1).

Lemma 32 (Continuity in the initial state)

Assume that the operator $T$ in (1.1) is continuous w.r.t. the topology of weak convergence. Then the same is true for the operators $T_{t}$ $(t\geq 0)$ defined in (1.6).

**Proof** We need to show that solutions of the mean-field equation (1.2) are continuous in their initial state, in the sense that if $(\mu^{k}_{t})_{t\geq 0}$ $(k\in{\mathbb{N}}\cup\{\infty\})$ are started in initial states such that $\mu^{k}_{0}\Rightarrow\mu^{\infty}_{0}$ , then $\mu^{k}_{t}\Rightarrow\mu^{\infty}_{t}$ for all $t\geq 0$ .

To see this, inductively define $\mu^{k}_{t,(n)}$ as in (3.22) with $\mu_{0}$ replaced by $\mu^{k}_{0}$ . Using the continuity of $T$ , by induction, we see that $\mu^{k}_{t,(n)}\Rightarrow\mu^{\infty}_{t,(n)}$ as $k\to\infty$ for all $n\geq 1$ and $t\geq 0$ . By (3.29), for each bounded continuous $\phi:S\to{\mathbb{R}}$ , the quantity $\langle\mu^{k}_{t,(n)},\phi\rangle$ converges to $\langle\mu^{k}_{t},\phi\rangle$ uniformly in $k\in{\mathbb{N}}\cup\{\infty\}$ , which allows us to conclude that $\langle\mu^{k}_{t},\phi\rangle\to\langle\mu^{\infty}_{t},\phi\rangle$ as $k\to\infty$ for all $t\geq 0$ .

4 Approximation by finite systems

4.1 Main line of the proof

In this section, we prove Theorem 5. The basic idea, which already goes back to [DFL86], is that in the mean-field limit, the genealogy of a site converges to a branching process, and sites are independent in the limit. More precisely, consider $n$ sites, sampled uniformly at random from $[N]$ . To find out what their states are at time $t$ , we follow the sites back until the last time when a random map is applied that has the potential to change the state of one of our sites. At this point, we stop following that given site but replace it by the sites that are relevant for the outcome of the map at the given site, and we continue in this way. When $N$ is large, the new sites that are added in each step are with high probability sites we have not been following before, so that in the limit we obtain a branching process with random maps attached to its branch points. Making this idea precise yields the following proposition, that will be proved in Subsection 4.2 below.
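The heuristic can be illustrated on a toy finite system. The sketch below is not the construction from Subsection 1.3: it assumes $S=\{0,1\}$ and a simplified dynamics in which, at total rate $N$, a uniformly chosen site is overwritten by the AND of the states of two uniformly chosen sites (drawn with replacement). Under these assumptions the fraction of ones should stay close to the solution of $\dot{p}=p^{2}-p$ when $N$ is large.

```python
import math
import random


def simulate(N, ones0, t, rng):
    """Run the toy dynamics up to time t and return the final fraction of ones.

    Assumption: events occur at total rate N; at each event a uniformly
    chosen site i is replaced by state[j] AND state[k] for uniform j, k.
    """
    state = [1] * ones0 + [0] * (N - ones0)   # deterministic initial state
    ones = ones0
    clock = rng.expovariate(N)
    while clock <= t:
        i = rng.randrange(N)
        j = rng.randrange(N)
        k = rng.randrange(N)
        new = state[j] & state[k]
        ones += new - state[i]
        state[i] = new
        clock += rng.expovariate(N)
    return ones / N


def exact(p0, t):
    """Closed-form solution of dp/dt = p**2 - p."""
    decay = math.exp(-t)
    return p0 * decay / ((1.0 - p0) + p0 * decay)


rng = random.Random(2020)
N, ones0, t = 10000, 7000, 1.0
empirical = simulate(N, ones0, t, rng)
limit = exact(ones0 / N, t)
print(empirical, limit)
```

The discrepancy between the two printed values reflects both random fluctuations and the finite-size bias, which are expected to vanish as $N\to\infty$.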

Proposition 33 (State at sampled sites)

For each $N\in{\mathbb{N}}_{+}$ let $(X^{(N)}(t))_{t\geq 0}$ be a process as in Theorem 5 started in a deterministic initial state $X^{(N)}(0)$ . Fix $t\geq 0$ and let $T_{t}$ be defined as in (1.6) but with the mean-field equation (1.2) replaced by (1.22). Fix $n\geq 1$ and let $I_{1},\ldots,I_{n}$ be i.i.d. uniformly distributed on $[N]$ and independent of $X^{(N)}(t)$ . Then

[TABLE]

where $\|\,\cdot\,\|$ denotes the total variation norm, and the convergence in (4.1) is uniform w.r.t. the initial state $X^{(N)}(0)$ .

Proposition 33 allows us to control the mean and variance of $\mu^{N}_{Nt}$ , which is enough to prove the convergence of $\mu^{N}_{Nt}$ to $\mu_{t}$ for fixed times $t$ . To boost this up to pathwise convergence, we use the following lemma, that will be proved in Subsection 4.3 below.

Lemma 34 (Tightness in total variation)

For each $N\in{\mathbb{N}}_{+}$ let $(X^{(N)}(t))_{t\geq 0}$ be a process as in Theorem 5 started in a deterministic initial state $X^{(N)}(0)$ , and let $\mu^{N}_{t}:=\mu\big{\{}X^{(N)}(t)\big{\}}$ denote the empirical measure of $X^{(N)}(t)$ . Then there exist random processes $(\tau^{N}_{t})_{t\geq 0}$ such that $\tau^{N}:{\mathbb{R}}\to{\mathbb{R}}$ is a.s. nondecreasing with $\tau^{N}_{0}=0$ and

(i) $\displaystyle{\mathbb{P}}\big{[}\sup_{0\leq t\leq T}|\tau^{N}_{t}-t|\geq\varepsilon\big{]}\underset{{N}\to\infty}{\longrightarrow}0\quad(\varepsilon>0,\ T<\infty)$ ,

(ii) $\displaystyle\|\mu^{N}_{Nt}-\mu^{N}_{Ns}\|\leq L(\tau^{N}_{t}-\tau^{N}_{s})\quad(0\leq s\leq t)$ a.s.,

where $\|\,\cdot\,\|$ denotes the total variation norm and $\displaystyle L:=\int_{\Omega}\!{\mathbf{q}}(\mathrm{d}\omega)\,\lambda(\omega)$ .

In Subsection 4.4, we will derive Theorem 5 from Proposition 33, Lemma 34, and some abstract considerations.

4.2 The state at sampled sites

In this subsection we prove Proposition 33. We start with two preparatory lemmas.

Let $({\mathbf{X}}^{N}_{s,t})_{s\leq t}$ be the stochastic flow defined in (1.29), where we have made the dependence on $N$ explicit. Let $I$ be uniformly distributed on $[N]$ and independent of $({\mathbf{X}}^{N}_{s,t})_{s\leq t}$ . For each $t\geq 0$ , let $\tilde{M}^{N}_{t}:S^{[N]}\to S$ be defined as (note the factor $N$ rescaling the speed of time):

[TABLE]

Let $G_{t}:S^{\nabla{\mathbb{S}}_{t}}\to S$ be the random map defined in (1.47), where $\Omega,{\mathbf{r}},\gamma$ and $\kappa$ from Subsection 1.1 are defined in terms of the “ingredients” $\Omega^{\prime},{\mathbf{q}}$ , $\gamma_{i}[\omega]$ and $K_{i}(\omega)$ from Subsection 1.3, see the proof of Lemma 4 in Subsection 3.1. Let $(I_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ be i.i.d. uniformly distributed on $[N]$ and independent of $(\nabla{\mathbb{S}}_{t},G_{t})_{t\geq 0}$ . For each $t\geq 0$ , let $M^{N}_{t}:S^{[N]}\to S$ be defined as

[TABLE]

The following lemma says that for large $N$ , the map in (4.2) can be approximated by the map in (4.3).

Lemma 35 (Coupling of maps)

For each $t\geq 0$ , it is possible to couple the random maps $\tilde{M}^{N}_{t}$ and $M^{N}_{t}$ with $N\in{\mathbb{N}}_{+}$ in such a way that

[TABLE]

**Proof** The essence of the proof can be summarized as follows: since for large $N$ , sampling with or without replacement from $[N]$ is almost the same, the genealogy of a given site is approximately given by a branching process. In spite of this simple idea, the proof is quite long, mainly because we have to take care of a lot of definitions, such as the way $\Omega,{\mathbf{r}},\gamma$ and $\kappa$ are defined in terms of $\Omega^{\prime},{\mathbf{q}}$ , $\gamma_{i}[\omega]$ and $K_{i}(\omega)$ in the proof of Lemma 4.

We start by recalling that the random map $G_{t}$ from (1.47) can be seen as the concatenation of random maps assigned to the branch points of a branching process. We then embed this branching process in the set $[N]$ and prove that what we obtain is a good approximation for the genealogy of a given site.

We observe that in order to construct the map $G_{t}:S^{\nabla{\mathbb{S}}_{t}}\to S$ from (1.47), it suffices to know

[TABLE]

where ${\mathbb{S}}_{t}$ is defined in (1.44). Indeed, from the information in (4.5) we can determine $\nabla{\mathbb{S}}_{t}$ , since

[TABLE]

and the map $G_{t}:S^{\nabla{\mathbb{S}}_{t}}\to S$ is obtained by concatenating the maps $\gamma[\bm{\omega}_{\mathbf{i}}]$ with $\mathbf{i}\in{\mathbb{S}}_{t}$ according to the tree structure of ${\mathbb{S}}_{t}$ .

The object in (4.5) is in fact a Markov chain as a function of $t$ . Starting from the initial state ${\mathbb{S}}_{0}=\emptyset$ and $\nabla{\mathbb{S}}_{0}=\{\varnothing\}$ , its evolution is as follows: Independently for each $\mathbf{i}\in\nabla{\mathbb{S}}_{t}$ , with rate $|{\mathbf{r}}|$ , we add $\mathbf{i}$ to ${\mathbb{S}}_{t}$ and assign to it a value $\bm{\omega}_{\mathbf{i}}$ chosen according to the probability law $|{\mathbf{r}}|^{-1}{\mathbf{r}}$ .
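The Markovian evolution just described can be sketched in code by storing words as tuples. The example below assumes a toy offspring law of our own choosing: each vertex branches at rate $|{\mathbf{r}}|=2$ and then receives 3 children or 0 children with probability $1/2$ each. It returns the set of vertices that have branched by time $t$ together with the boundary. As a check it uses the standard branching-process identity ${\mathbb{E}}|\nabla{\mathbb{S}}_{t}|=\exp\big(t\int{\mathbf{r}}(\mathrm{d}\omega)(\kappa(\omega)-1)\big)$, which equals $e^{t}$ for these parameters.

```python
import math
import random

RATE = 2.0  # plays the role of |r|; a toy choice


def grow_tree(t, rng):
    """Return (branched, boundary) at time t.

    branched : dict mapping each word that has branched by time t to its
               number of children (the analogue of S_t with attached data)
    boundary : list of words that are alive at time t (the analogue of the
               boundary set)
    Words are tuples of positive integers; the empty tuple is the root.
    """
    branched = {}
    boundary = []
    stack = [((), 0.0)]  # pairs (word, birth time)
    while stack:
        word, born = stack.pop()
        death = born + rng.expovariate(RATE)
        if death > t:
            boundary.append(word)
            continue
        kappa = 3 if rng.random() < 0.5 else 0  # toy offspring law
        branched[word] = kappa
        for child in range(1, kappa + 1):
            stack.append((word + (child,), death))
    return branched, boundary


rng = random.Random(7)
t, n_trees = 1.0, 20000
mean_boundary = sum(len(grow_tree(t, rng)[1]) for _ in range(n_trees)) / n_trees
print(mean_boundary, math.exp(t))
```

By construction, every boundary word other than the root has a parent that has already branched, which mirrors how the boundary is determined by the branched vertices and their attached data.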

We will be interested in the process in (4.5) in the special case when $\Omega,{\mathbf{r}},\kappa$ , and $\gamma$ are defined in terms of $\Omega^{\prime},{\mathbf{q}},\lambda$ , $K_{i}(\omega)$ , and $\gamma_{i}[\omega]$ as in the proof of Lemma 4. In this case, elements of $\Omega$ are pairs $(\omega,n)$ where $\omega\in\Omega^{\prime}$ and $1\leq n\leq\lambda(\omega)$ , so we denote the process in (4.5) as

[TABLE]

where $\bm{\omega}_{\mathbf{i}}\in\Omega^{\prime}$ and $1\leq n_{\mathbf{i}}\leq\lambda(\bm{\omega}_{\mathbf{i}})$ . The set $\nabla{\mathbb{S}}_{t}$ is now given by

[TABLE]

Defining ${\mathbf{r}}$ as in (3.3), the process in (4.5) now evolves in such a way that independently for each $\mathbf{i}\in\nabla{\mathbb{S}}_{t}$ , with rate $|{\mathbf{r}}|$ , we add $\mathbf{i}$ to ${\mathbb{S}}_{t}$ and assign values $(\bm{\omega}_{\mathbf{i}},n_{\mathbf{i}})$ to it that are chosen according to the probability law $|{\mathbf{r}}|^{-1}{\mathbf{r}}$ .

Let $\alpha\in[N]$ be fixed. Our next aim is to “embed” the process from (4.7) in the set $[N]$ , in such a way that it approximates the genealogy of the site $\alpha$ . To this aim, we define, for each time, a random function $\psi^{N}_{t}:{\mathbb{S}}_{t}\cup\nabla{\mathbb{S}}_{t}\to[N]$ . Initially, we set $\psi^{N}_{0}(\varnothing):=\alpha$ . We let the function $\psi^{N}_{t}$ evolve in a Markovian way together with the process in (4.7) in the following way. Recall that when we add an element $\mathbf{i}$ to ${\mathbb{S}}_{t}$ and assign values $(\bm{\omega}_{\mathbf{i}},n_{\mathbf{i}})$ to it, this element is at the same time removed from $\nabla{\mathbb{S}}_{t}$ and replaced by new elements $\mathbf{i}1,\ldots,\mathbf{i}\kappa_{n_{\mathbf{i}}}(\bm{\omega}_{\mathbf{i}})$ . We assign labels $\psi^{N}_{t}(\mathbf{i}k)$ $(k=1,\ldots,\kappa_{n_{\mathbf{i}}}(\bm{\omega}_{\mathbf{i}}))$ to these new elements as follows. First, we choose $(I_{l})_{l=1,\ldots,\lambda(\bm{\omega}_{\mathbf{i}})}$ in such a way that $I_{n_{\mathbf{i}}}:=\psi^{N}_{t}(\mathbf{i})$ and

[TABLE]

and next, we set $\psi^{N}_{t}(\mathbf{i}k):=I_{j_{k}}$ , where as in (3.4), we order the elements of $K_{n_{\mathbf{i}}}(\bm{\omega}_{\mathbf{i}})\subset\{1,\ldots,\lambda(\bm{\omega}_{\mathbf{i}})\}$ as

[TABLE]

Note that this has the effect that if $n_{\mathbf{i}}$ is an element of $K_{n_{\mathbf{i}}}(\bm{\omega}_{\mathbf{i}})$ , say $n_{\mathbf{i}}=j_{k}$ , then the corresponding element $\mathbf{i}k$ gets the same label as $\mathbf{i}$ , i.e., $\psi^{N}_{t}(\mathbf{i}k)=\psi^{N}_{t}(\mathbf{i})$ . Otherwise, we assign new i.i.d. labels to all new elements of $\nabla{\mathbb{S}}_{t}$ .

Using the function $\psi^{N}_{t}$ that embeds the process in (4.7) in the set $[N]$ , we define a function $\phi^{N}_{t}:S^{N}\to S^{\nabla{\mathbb{S}}_{t}}$ by

[TABLE]

We now consider the maps

[TABLE]

where $\alpha=\psi^{N}_{0}(\varnothing)\in[N]$ is the label initially assigned to the root. We claim that

[TABLE]

where $\|\,\cdot\,\|$ denotes the total variation norm. In particular, if $\alpha$ is chosen uniformly distributed in $[N]$ and independent of everything else, then $(\psi^{N}_{t}(\mathbf{i}))_{\mathbf{i}\in\nabla{\mathbb{S}}_{t}}$ are i.i.d. uniformly distributed in $[N]$ and independent of the map $G_{t}$ , so (4.13) implies (4.4).

To prove (4.13), we construct a process similar to the process in (4.7), together with an embedding in $[N]$ , that describes the true genealogy of the site $\alpha$ , and show that the error we make by replacing this true genealogy by the process we had before is small. We denote this process as

[TABLE]

At each time, $\nabla\tilde{\mathbb{S}}_{t}$ is defined in terms of this process in the same way as $\nabla{\mathbb{S}}_{t}$ is defined in (4.8). We also define $\tilde{G}_{t}$ and $\tilde{\phi}^{N}_{t}:S^{N}\to S^{\nabla\tilde{\mathbb{S}}_{t}}$ as before, i.e., $\tilde{G}_{t}$ is the concatenation of the random maps $\gamma_{n_{\mathbf{i}}}[\omega_{\mathbf{i}}]$ with $\mathbf{i}\in\tilde{\mathbb{S}}_{t}$ according to the tree structure of $\tilde{\mathbb{S}}_{t}$ , and $\tilde{\phi}^{N}_{t}:S^{N}\to S^{\nabla\tilde{\mathbb{S}}_{t}}$ is defined in terms of $\tilde{\psi}^{N}_{t}$ as in (4.11).

Recall that $\tilde{M}^{N}_{t}(x)={\mathbf{X}}^{N}_{-Nt,0}(x)_{\alpha}$ . As for our previous process we start with $\tilde{\mathbb{S}}_{0}=\emptyset$ , $\nabla\tilde{\mathbb{S}}_{0}=\{\varnothing\}$ , and $\tilde{\psi}^{N}(\varnothing)=\alpha$ . In Subsection 1.3, the stochastic flow $({\mathbf{X}}^{N}_{s,t})_{s\leq t}$ is constructed from a Poisson point set $\Pi$ . We will construct the process in (4.14) in terms of $\Pi$ in such a way that

[TABLE]

which expresses the fact that the process in (4.14) describes the “true genealogy” of the site $\alpha$ .

The Poisson set $\Pi$ consists of triples $(\omega,\mathbf{i},t)$ which express the fact that at time $t$ the random map $\vec{\gamma}[\omega]$ should be applied to the coordinates $\mathbf{i}=(i_{1},\ldots,i_{\lambda(\omega)})$ . Note that we are interested in ${\mathbf{X}}^{N}_{-Nt,0}(x)_{\alpha}$ , which means that we look at negative times and need to rescale time by a factor $N$ . For each $(\omega,\mathbf{i},-Nt)\in\Pi$ and $\mathbf{j}\in\nabla\tilde{\mathbb{S}}_{t}$ such that $\tilde{\psi}^{N}_{t}(\mathbf{j})=i_{l}$ for some $1\leq l\leq\lambda(\omega)$ , we update the process in (4.14) as follows:

(i) We remove $\mathbf{j}$ from $\nabla\tilde{\mathbb{S}}_{t}$ and add it to $\tilde{\mathbb{S}}_{t}$ .

(ii) We set $\omega_{\mathbf{j}}:=\omega$ and $n_{\mathbf{j}}:=l$ .

(iii) We add $\mathbf{j}1,\ldots,\mathbf{j}\kappa_{l}(\omega)$ to $\nabla\tilde{\mathbb{S}}_{t}$ .

(iv) We define $\tilde{\psi}^{N}_{t}(\mathbf{j}k):=i_{j_{k}}$ $(k=1,\ldots,\kappa_{l}(\omega))$ , where $K_{l}(\omega)=\{j_{1},\ldots,j_{\kappa_{l}(\omega)}\}$ as in (3.4).

It is straightforward to check that these rules guarantee that (4.15) holds and hence the process in (4.14) describes the true genealogy of the site $\alpha$ . As some more explanation, we can add the following: we follow a site $\beta$ back in time till the first time when a map is applied that has the possibility to change the value of $\beta$ . From that moment on, we follow back all sites that are relevant for the outcome of the map at $\beta$ , and we number them according to the convention in (3.4). This defines a family structure, i.e., $\mathbf{i}=i_{1}i_{2}i_{3}$ is the $i_{3}$ -th child of the $i_{2}$ -th child of the $i_{1}$ -th child of the original site $\alpha$ . The map $\tilde{\psi}^{N}_{t}$ applied to $\mathbf{i}$ tells us where this ancestor lives in the set $[N]$ . There may be some overlap, i.e., it is possible that $\tilde{\psi}^{N}_{t}(\mathbf{i})=\tilde{\psi}^{N}_{t}(\mathbf{j})$ for some $\mathbf{i},\mathbf{j}\in\tilde{\mathbb{S}}_{t}\cup\nabla\tilde{\mathbb{S}}_{t}$ . For $\mathbf{i},\mathbf{j}\in\nabla\tilde{\mathbb{S}}_{t}$ , however, the probability that two ancestors live at the same site in $[N]$ tends to zero as $N\to\infty$ , as we will see in a moment.

In view of (4.15), to prove (4.13), it suffices to prove that the Markov process in (4.14) is close in total variation distance to the process with $\tilde{\mathbb{S}}_{t}$ and $\tilde{\psi}^{N}_{t}$ replaced by ${\mathbb{S}}_{t}$ and $\psi^{N}_{t}$ . Since the latter process is nonexplosive by (3.28), it suffices to prove convergence for the processes stopped at the first time when the cardinality of $\nabla\tilde{\mathbb{S}}_{t}$ resp. $\nabla{\mathbb{S}}_{t}$ exceeds a certain value, and then at the end send this value to infinity. We will prove convergence of the stopped processes in a number of steps, by making small changes in the jump rates. Here we use the fact that if the transition kernels of two continuous-time Markov chains are close in total variation norm, uniformly in the starting point, then by standard arguments the two processes can be coupled so that their laws at a fixed time are close in total variation norm.

Let $\tilde{\Psi}^{N}_{t}:=\tilde{\psi}^{N}_{t}(\nabla\tilde{\mathbb{S}}_{t})$ denote the image of $\nabla\tilde{\mathbb{S}}_{t}$ under the map $\tilde{\psi}^{N}_{t}$ . As a first step, we change the dynamics of the (stopped) process from (4.14) in such a way that elements $(\omega,\mathbf{i},-Nt)\in\Pi$ have no effect if $\{i_{1},\ldots,i_{\lambda(\omega)}\}$ intersects $\tilde{\Psi}^{N}_{t}$ in more than one point. Then the modified process is still Markovian; we claim the change in jump rates compared to the original process is of order $N^{-1}$ . Indeed, for fixed $l$ , if $i_{1},\ldots,i_{l}$ are chosen uniformly without replacement from $[N]$ , then the probability that one, resp. two or more of them lie in a set $A$ of fixed cardinality is of order $N^{-1}$ resp. $N^{-2}$ as $N\to\infty$ . Taking into account the fact that we rescale time by a factor $N$ , as well as the summability condition (1.23) (i), this translates into a change in jump rates of order $N^{-1}$ for the modified process, stopped at the first time when the cardinality of $\nabla\tilde{\mathbb{S}}_{t}$ exceeds a fixed value.
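The orders $N^{-1}$ and $N^{-2}$ quoted above can be verified with exact hypergeometric probabilities. The sketch below fixes illustrative values $l=3$ and $|A|=5$ (our choice) and prints $N\cdot{\mathbb{P}}[\mbox{exactly one hit}]$ and $N^{2}\cdot{\mathbb{P}}[\mbox{two or more hits}]$ for increasing $N$. The function name `hit_probabilities` is hypothetical.

```python
from fractions import Fraction
from math import comb


def hit_probabilities(N, l, a):
    """Exact probabilities that exactly one, resp. at least two, of l sites
    drawn uniformly without replacement from N sites lie in a set of size a."""
    denom = comb(N, l)

    def exactly(j):
        return Fraction(comb(a, j) * comb(N - a, l - j), denom)

    one = exactly(1)
    two_plus = sum((exactly(j) for j in range(2, min(l, a) + 1)), Fraction(0))
    return one, two_plus


l, a = 3, 5
rescaled = {}
for N in (100, 1000, 10000):
    one, two_plus = hit_probabilities(N, l, a)
    rescaled[N] = (float(N * one), float(N * N * two_plus))
    print(N, rescaled[N])
```

Both rescaled quantities stabilise as $N$ grows, near $l\,|A|$ and $\binom{l}{2}|A|(|A|-1)$ respectively, consistent with the stated orders.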

Recall that by (4.15), ${\mathbf{X}}^{N}_{-Nt,0}(x)_{\alpha}$ is a function only of $(x_{\beta})_{\beta\in\tilde{\Psi}^{N}_{t}}$ . The modified process we have just constructed has the property that $\tilde{\psi}^{N}_{t}:\nabla\tilde{\mathbb{S}}_{t}\to\tilde{\Psi}^{N}_{t}$ is a bijection, i.e., each element $\beta\in\tilde{\Psi}^{N}_{t}$ corresponds only to a single place $(\tilde{\psi}^{N}_{t})^{-1}(\beta)$ in the family tree. The dynamics of the modified process can be described as follows:

(i) Independently for each $\beta\in\tilde{\Psi}^{N}_{t}$ , with rates described by the measure ${\mathbf{r}}$ from (3.3), we choose a pair $(\omega,n)$ with $1\leq n\leq\lambda(\omega)$ .

(ii) If $\lambda(\omega)>N$ , we do nothing.

(iii) Otherwise, we choose $(\beta^{\prime}_{k})_{k=1,\ldots,\lambda(\omega)}$ such that $\beta^{\prime}_{n}:=\beta$ and $(\beta^{\prime}_{k})_{k\neq n}$ are drawn from $[N]\backslash\{\beta\}$ without replacement.

(iv) If some of the $(\beta^{\prime}_{k})_{k\neq n}$ are elements of $\tilde{\Psi}^{N}_{t}$ , we do nothing.

(v) Otherwise, we remove $\beta$ from $\tilde{\Psi}^{N}_{t}$ and add $(\beta^{\prime}_{j_{k}})_{1\leq k\leq\kappa_{n}(\omega)}$ to $\tilde{\Psi}^{N}_{t}$ , where $K_{n}(\omega)=\{j_{1},\ldots,j_{\kappa_{n}(\omega)}\}$ with $j_{1}<\cdots<j_{\kappa_{n}(\omega)}$ .

(vi) If $\mathbf{j}=(\tilde{\psi}^{N}_{t-})^{-1}(\beta)$ is the place of $\beta$ in the family tree immediately prior to time $t$ , then we assign to each new element of $\tilde{\Psi}^{N}_{t}$ a place in the family tree by setting $(\tilde{\psi}^{N}_{t})^{-1}(\beta^{\prime}_{j_{k}}):=\mathbf{j}k$ .

Note that the measure ${\mathbf{r}}$ from (3.3) occurs naturally here, since each $\lambda(\omega)$ -tuple of sites in $[N]$ can contain a given site $\beta$ in $\lambda(\omega)$ different ways, as its 1st, 2nd,…, $\lambda(\omega)$ -th member.

Removing the restrictions in points (ii) and (iv) above, and sampling with replacement instead of without replacement in point (iii), we only make changes in the transition rates of order $N^{-1}$ , and arrive at a process whose family tree evolves as the process in (4.7) and in which new members of the family tree are assigned sites in $[N]$ chosen uniformly with replacement, as described by the process $\psi^{N}_{t}$ .
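The cost of switching between the two sampling schemes can be quantified exactly. For ordered samples of fixed size $l$ from $[N]$, the law without replacement is the law with replacement conditioned on having no repeated entry, so the total variation distance between the two equals the probability of a repeat. The sketch below computes this probability and checks that it is of order $N^{-1}$; the value $l=4$ and the function name `collision_probability` are illustrative choices of ours.

```python
from fractions import Fraction


def collision_probability(N, l):
    """Exact probability that l uniform draws with replacement from N sites
    contain at least one repeated site."""
    no_repeat = Fraction(1)
    for j in range(l):
        no_repeat *= Fraction(N - j, N)
    return 1 - no_repeat


l = 4
scaled = {}
for N in (100, 1000, 10000):
    scaled[N] = float(N * collision_probability(N, l))
    print(N, scaled[N])
```

The rescaled values approach $l(l-1)/2$, so for fixed $l$ the two sampling schemes differ by order $N^{-1}$ in total variation.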

In the proof of Lemma 35, we have seen that in the mean-field limit $N\to\infty$ , the genealogy of a single site can be approximated by a branching process with random maps attached to its branch points. Similarly, the genealogy of $n$ randomly chosen sites can be approximated by $n$ independent branching processes, which leads to the following extension of Lemma 35.

Lemma 36 (The genealogy of multiple sites)

Let $({\mathbf{X}}^{N}_{s,t})_{s\leq t}$ be the stochastic flow defined in (1.29) and let $I_{1},\ldots,I_{n}$ be i.i.d. uniformly distributed on $[N]$ , independent of $({\mathbf{X}}^{N}_{s,t})_{s\leq t}$ . Let $\Omega,{\mathbf{r}},\gamma$ and $\kappa$ be defined in terms of $\Omega^{\prime},{\mathbf{q}}$ , $\gamma_{i}[\omega]$ and $K_{i}(\omega)$ as in the proof of Lemma 4. Fix $t\geq 0$ and let $(\nabla{\mathbb{S}}^{i}_{t},G^{i}_{t})$ $(i=1,\ldots,n)$ be i.i.d. copies of the random set and map defined in (1.44) and (1.47). Conditional on $(\nabla{\mathbb{S}}^{i}_{t},G^{i}_{t})_{i=1,\ldots,n}$ , let $(I^{i}_{\mathbf{j}})^{i=1,\ldots,n}_{\mathbf{j}\in\nabla{\mathbb{S}}^{i}_{t}}$ be i.i.d. uniformly distributed on $[N]$ . Define $\tilde{M}^{N}_{t}:S^{N}\to S^{n}$ and $M^{N}_{t}:S^{N}\to S^{n}$ by

[TABLE]

Then $\tilde{M}^{N}_{t}$ and $M^{N}_{t}$ can be coupled such that $\displaystyle{\mathbb{P}}\big{[}\tilde{M}^{N}_{t}\neq M^{N}_{t}\big{]}\underset{{N}\to\infty}{\longrightarrow}0$ .

**Proof** The proof is the same as the proof of Lemma 35, except that instead of following back the genealogy of one site, one follows the genealogies of $n$ sites. By the same arguments as given in the proof of Lemma 35, when $N$ is large, with high probability, the genealogies do not intersect, and hence can be approximated by independent branching processes. Although writing down all objects involved is notationally complicated, no new ideas are needed, so we omit the details.

Proof of Proposition 33 Let $x:=X^{(N)}(0)$ be the (deterministic) initial state and using notation as in (1.33) let $\mu^{N}_{0}=\mu\{x\}$ denote its empirical measure. Define maps $\tilde{M}^{N}_{t}$ and $M^{N}_{t}$ as in Lemma 36. Then $\big{(}X^{(N)}_{I_{1}}(Nt),\ldots,X^{(N)}_{I_{n}}(Nt)\big{)}$ has law $\tilde{M}^{N}_{t}(x)$ while the coordinates of $M^{N}_{t}(x)$ are i.i.d. with a law that by Theorem 6 equals $T_{t}(\mu^{N}_{0})$ . In view of this, the claim follows from Lemma 36.

4.3 Tightness in total variation

In this subsection we prove Lemma 34.

Proof of Lemma 34 The process $(X^{(N)}(t))_{t\geq 0}$ is defined in (1.32) in terms of a stochastic flow which is in turn defined in terms of a Poisson set $\Pi$ . Elements of $\Pi$ are triples $(\omega,\mathbf{i},s)$ which tell us that at time $s$ the map $\vec{\gamma}[\omega]$ should be applied to the coordinates $\mathbf{i}=(i_{1},\ldots,i_{\lambda(\omega)})$ . We let

[TABLE]

where $L:=\int_{\Omega}\!{\mathbf{q}}(\mathrm{d}\omega)\,\lambda(\omega)$ , which is finite by (1.23). Then (i) follows from a functional law of large numbers. Since for any $s\leq t$ , the fraction of sites in $[N]$ that changes its type is bounded from above by $L(\tau^{N}_{t}-\tau^{N}_{s})$ , in view of (3.9), we obtain also (ii).

4.4 Convergence to the mean-field equation

In this subsection, we prove Theorem 5. The proof is split into a number of lemmas. We start by proving convergence at fixed times. This part of the proof is based on Proposition 33. At the end of the proof, we use Lemma 34 to obtain pathwise convergence.

Lemma 37 (Expectation of test functions)

Let $\Omega^{\prime},{\mathbf{q}},\lambda$ , and $\vec{\gamma}$ be as in Subsection 1.3, and assume (1.23). Let $(T_{t})_{t\geq 0}$ denote the semigroup defined as in (1.6) but with the mean-field equation (1.2) replaced by (1.22). For each $N\in{\mathbb{N}}_{+}$ , let $(X^{(N)}(t))_{t\geq 0}$ be Markov processes with state space $S^{N}$ as defined in (1.32), and let $\mu^{N}_{t}=\mu\{X^{(N)}(t)\}$ denote their associated empirical measures. Then

[TABLE]

where the supremum runs over all measurable functions $\phi:S\to[-1,1]$ .

**Proof** Fix $t\geq 0$ . Let $\phi:S\to[-1,1]$ be measurable. Let $I_{1}$ and $I_{2}$ be uniformly distributed on $[N]$ and independent of each other and of $X^{(N)}(t)$ . Since

[TABLE]

we see that

[TABLE]

Assume for the moment that $X^{(N)}(0)$ is deterministic. Then applying Proposition 33 with $n=1,2$ we find that

[TABLE]

where we take the supremum over all measurable $\phi:S\to[-1,1]$ . It follows that

[TABLE]

and hence (4.18) follows by Chebyshev’s inequality. To obtain (4.18) more generally when $X^{(N)}(0)$ is random, we condition on the initial state to get, for each $\varepsilon>0$ and measurable $\psi:S\to[-1,1]$ ,

[TABLE]

Since the integrand on the right-hand side does not depend on $\psi$ and tends to zero in a bounded pointwise way as a function of $x\in S^{N}$ , (4.18) follows.
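The role of the two independent indices $I_{1},I_{2}$ in this proof can be made explicit. The following display is a sketch reconstructed from the surrounding argument, not a verbatim copy of the displayed formulas of the proof. Since $I_{1}$ and $I_{2}$ are uniform on $[N]$ and independent of each other and of $X^{(N)}(t)$ ,

$$\langle\mu^{N}_{t},\phi\rangle={\mathbb{E}}\big[\phi\big(X^{(N)}_{I_{1}}(t)\big)\,\big|\,X^{(N)}(t)\big],\qquad\langle\mu^{N}_{t},\phi\rangle^{2}={\mathbb{E}}\big[\phi\big(X^{(N)}_{I_{1}}(t)\big)\phi\big(X^{(N)}_{I_{2}}(t)\big)\,\big|\,X^{(N)}(t)\big],$$

and therefore

$${\rm Var}\big(\langle\mu^{N}_{t},\phi\rangle\big)={\mathbb{E}}\big[\phi\big(X^{(N)}_{I_{1}}(t)\big)\phi\big(X^{(N)}_{I_{2}}(t)\big)\big]-{\mathbb{E}}\big[\phi\big(X^{(N)}_{I_{1}}(t)\big)\big]^{2}.$$

This is why Proposition 33 is needed for both $n=1$ and $n=2$ : asymptotic independence of two sampled sites makes the two terms on the right-hand side cancel in the limit, and Chebyshev’s inequality then turns the vanishing variance into convergence in probability.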

Our next aim is to prove that if in addition to the assumptions of Lemma 37, condition (i) or (ii) of Theorem 5 is satisfied, then

[TABLE]

where $d$ is any metric on ${\cal P}(S)$ that generates the topology of weak convergence. Applying the following well-known fact to the Polish space ${\cal P}(S)$ , we see that if (4.24) holds for one such metric, then it holds for all of them.

Lemma 38 (Convergence in probability)

Let $X_{n}$ be random variables taking values in a Polish space $S$ , let $x\in S$ be deterministic, and let $d$ be a metric generating the topology on $S$ . Then one has

[TABLE]

if and only if

[TABLE]

where $\Rightarrow$ denotes weak convergence of probability measures on $S$ .

**Proof** It is easy to see that (4.25) implies ${\mathbb{E}}[\phi(X_{n})]\to\phi(x)$ for all bounded continuous $\phi:S\to{\mathbb{R}}$ , so (4.25) implies (4.26). Conversely, if (4.26) holds, then by Skorohod’s representation theorem it is possible to couple the random variables $X_{n}$ such that $X_{n}\to x$ a.s., which implies (4.25).
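The step labelled "easy to see" can be spelled out as follows. This is a sketch under the assumption that (4.25) is the statement ${\mathbb{P}}[d(X_{n},x)\geq\varepsilon]\to 0$ for each $\varepsilon>0$ , as the title of the lemma suggests. For bounded continuous $\phi$ and any $\varepsilon>0$ ,

$$\big|{\mathbb{E}}[\phi(X_{n})]-\phi(x)\big|\leq\sup_{y:\,d(x,y)<\varepsilon}\big|\phi(y)-\phi(x)\big|+2\sup_{y\in S}|\phi(y)|\;{\mathbb{P}}\big[d(X_{n},x)\geq\varepsilon\big].$$

The second term tends to zero as $n\to\infty$ for fixed $\varepsilon$ , and the first term can be made arbitrarily small by choosing $\varepsilon$ small, by the continuity of $\phi$ at $x$ .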

The following lemma gives sufficient conditions for the type of convergence of (4.24).

Lemma 39 (Convergence to a deterministic measure)

Let $S$ be a Polish space, let $\mu\in{\cal P}(S)$ be deterministic, and let $\mu^{N}$ be random variables with values in ${\cal P}(S)$ . Let $d$ be a metric on ${\cal P}(S)$ generating the topology of weak convergence. Then the following conditions are equivalent.

(i) $\displaystyle{\mathbb{P}}\big{[}d(\mu^{N},\mu)\geq\varepsilon\big{]}\underset{{N}\to\infty}{\longrightarrow}0$ for all $\varepsilon>0$ .

(ii) $\displaystyle{\mathbb{P}}\big{[}\big{|}\langle\mu^{N},\phi\rangle-\langle\mu,\phi\rangle\big{|}\geq\varepsilon\big{]}\underset{{N}\to\infty}{\longrightarrow}0$ for all $\varepsilon>0$ and bounded continuous $\phi:S\to{\mathbb{R}}$ .

(iii) $\displaystyle{\mathbb{E}}\big{[}\prod_{i=1}^{n}\langle\mu^{N},\phi_{i}\rangle\big{]}\underset{{N}\to\infty}{\longrightarrow}\prod_{i=1}^{n}\langle\mu,\phi_{i}\rangle$ for all bounded continuous functions $\phi_{1},\ldots,\phi_{n}$ $(n\geq 1)$ .

**Proof** We equip ${\cal P}(S)$ with the topology of weak convergence, making it into a Polish space. Then by Lemma 38, condition (i) is equivalent to

(i)’ $\displaystyle{\mathbb{P}}[\mu^{N}\in\,\cdot\,]\underset{{N}\to\infty}{\Longrightarrow}\delta_{\mu}$ .

We will prove (i)’ $\Rightarrow$ (ii) $\Rightarrow$ (iii) $\Rightarrow$ (i)’.

(i)’ $\Rightarrow$ (ii). By Skorohod’s representation theorem, (i)’ implies that the $\mu^{N}$ can be coupled such that $\mu^{N}\underset{{N}\to\infty}{\Longrightarrow}\mu$ a.s., which implies (ii).

(ii) $\Rightarrow$ (iii). Without loss of generality we may assume that the $\phi_{i}$ ’s take values in $[-1,1]$ . Since the function $(x_{1},\ldots,x_{n})\mapsto\prod_{i=1}^{n}x_{i}$ is continuous, (ii) implies that

[TABLE]

for all $\varepsilon>0$ and bounded continuous functions $\phi_{1},\ldots,\phi_{n}$ . Since moreover $|\prod_{i=1}^{n}\langle\mu^{N},\phi_{i}\rangle|\leq 1$ , this implies (iii).

(iii) $\Rightarrow$ (i)’. Since $S$ is Polish, it has a metrizable compactification, i.e., there exists a compact metrizable space $\overline{S}$ such that $S$ is a dense subset of $\overline{S}$ and the topology on $S$ is the induced topology from $\overline{S}$ [Cho69, Theorem 6.3]. It is known that this implies that $S$ is a $G_{\delta}$ -subset of $\overline{S}$ [Bou58, §6 No. 1, Theorem 1]. In particular, $S$ is a Borel measurable subset of $\overline{S}$ and we can identify ${\cal P}(S)$ with the space of probability measures on $\overline{S}$ that are concentrated on $S$ . If we equip ${\cal P}(\overline{S})$ with the topology of weak convergence, then the induced topology on ${\cal P}(S)$ is also the topology of weak convergence (this follows, e.g., from [EK86, Thm 3.3.1]), and in fact ${\cal P}(\overline{S})$ (being compact by Prohorov’s theorem) is a metrizable compactification of ${\cal P}(S)$ .

We view $\mu^{N}$ and $\mu$ as probability measures on $\overline{S}$ . Since $\overline{S}$ is compact, so are ${\cal P}(\overline{S})$ and ${\cal P}({\cal P}(\overline{S}))$ , so by going to a subsequence if necessary, we can assume that the laws ${\mathbb{P}}[\mu^{N}\in\,\cdot\,]$ converge weakly to some limit $\rho\in{\cal P}({\cal P}(\overline{S}))$ . Since the restriction to $S$ of a continuous function $\phi:\overline{S}\to{\mathbb{R}}$ is a bounded continuous function on $S$ , condition (iii) implies that

[TABLE]

for general $n\geq 1$ and continuous functions $\phi_{i}:\overline{S}\to{\mathbb{R}}$ $(i=1,\ldots,n)$ . By the Stone-Weierstrass theorem, the linear span of functions of the form $\nu\mapsto\prod_{i=1}^{n}\langle\nu,\phi_{i}\rangle$ is dense in the space of continuous functions on ${\cal P}(\overline{S})$ , and hence (4.28) implies $\rho=\delta_{\mu}$ .

We now prove (4.24) under either of the conditions (i) and (ii) of Theorem 5.

Lemma 40 (Continuity argument)

In addition to the assumptions of Lemma 37, assume that condition (i) of Theorem 5 is satisfied. Then (4.24) holds.

**Proof** Fix $t\geq 0$ . In view of Lemma 39 (ii), it suffices to show that

[TABLE]

for any bounded continuous $\phi:S\to{\mathbb{R}}$ . By Lemma 37, it suffices to show that

[TABLE]

By the second part of condition (i), Lemma 4, and Proposition 3, the operator $T_{t}$ is continuous w.r.t. weak convergence. In view of this, (4.30) is implied by the first part of condition (i).

Lemma 41 (Moment argument)

In addition to the assumptions of Lemma 37, assume that condition (ii) of Theorem 5 is satisfied. Then (4.24) holds.

**Proof** Fix $t\geq 0$ . In view of Lemma 39 (iii), it suffices to show that

[TABLE]

for all $n\geq 1$ and bounded continuous functions $\phi_{i}:S\to{\mathbb{R}}$ , $i=1,\ldots,n$ . Without loss of generality we may assume that the $\phi_{i}$ ’s take values in $[-1,1]$ . Let $X^{(N)}(t)$ be as in Theorem 5 and let $I_{1},\ldots,I_{n}$ be i.i.d. uniformly distributed on $[N]$ and independent of $X^{(N)}(t)$ . Then

[TABLE]

By Proposition 33 applied to the process conditioned on $X^{(N)}(0)$ , there exist $\varepsilon_{N}\to 0$ such that

[TABLE]

In view of (3.8), it follows that

[TABLE]

Combining this with (4.32), taking the expectation, we obtain that

[TABLE]

In view of this, to prove (4.31), it suffices to show that

[TABLE]

If $\mu\in{\cal P}(S)$ is deterministic, then Theorem 6 tells us that

[TABLE]

where $\nabla{\mathbb{S}}_{t}$ and $G_{t}$ are as in (1.44) and (1.47) and $(X_{\mathbf{j}})_{\mathbf{j}\in{\mathbb{T}}}$ are i.i.d. with law $\mu$ . Conditional on $\mu^{N}_{0}$ , let $(X^{i}_{\mathbf{j}})^{i=1,\ldots,n}_{\mathbf{j}\in{\mathbb{T}}}$ be i.i.d. with common law $\mu^{N}_{0}$ . Let $(\nabla{\mathbb{S}}^{i}_{t},G^{i}_{t})$ $(i=1,\ldots,n)$ be i.i.d. and distributed as the random variables in (1.44) and (1.47), independent of $\mu^{N}_{0}$ and $(X^{i}_{\mathbf{j}})^{i=1,\ldots,n}_{\mathbf{j}\in{\mathbb{T}}}$ . Then (4.37) implies that

[TABLE]

If we replace the expectation on the right-hand side by a conditional expectation given $(\nabla{\mathbb{S}}^{i}_{t},G^{i}_{t})_{i=1,\ldots,n}$ , then this is the integral of a measurable $[-1,1]$ -valued function with respect to the expectation of a product measure of the form $(\mu^{N}_{0})^{\otimes m}$ , where $m=\sum_{i=1}^{n}|\nabla{\mathbb{S}}^{i}_{t}|$ . Condition (ii) of Theorem 5 allows us to replace the integral w.r.t. ${\mathbb{E}}[(\mu^{N}_{0})^{\otimes m}]$ by the integral w.r.t. $\mu_{0}^{\otimes m}$ at the cost of a small error. Thus,

[TABLE]

where the $(\tilde{X}^{i}_{\mathbf{j}})^{i=1,\ldots,n}_{\mathbf{j}\in{\mathbb{T}}}$ are i.i.d. with common law $\mu_{0}$ and independent of $(\nabla{\mathbb{S}}^{i}_{t},G^{i}_{t})_{i=1,\ldots,n}$ , and $R^{N}$ is a random error term that by condition (ii) can be estimated as

[TABLE]

where $\lim_{N\to\infty}\varepsilon^{N}_{m}=0$ for each $m$ . Note that moreover $|R^{N}|\leq 2$ since the $\phi_{i}$ ’s take values in $[-1,1]$ . Integrating over the randomness of $(\nabla{\mathbb{S}}^{i}_{t},G^{i}_{t})_{i=1,\ldots,n}$ , using bounded convergence, (4.37) and (4.38), (4.36) follows.

With Lemmas 40 and 41 proved, most of the work needed for proving Theorem 5 is done. The only remaining task is to improve the convergence at fixed times in (4.24) to pathwise convergence as in (1.34). Our first aim is to show that the condition (1.34) does not depend on the choice of the metric $d$ . This follows from the following lemma, applied to the Polish space ${\cal P}(S)$ .

Lemma 42 (Convergence in path space)

Let $S$ be a Polish space and let $d$ be a metric generating the topology on $S$ . Let ${\cal D}_{S}{[0,\infty)}$ be the space of cadlag functions $x:{[0,\infty)}\to S$ , equipped with the Skorohod topology. Let $X_{n}=(X_{n}(t))_{t\geq 0}$ be random variables with values in ${\cal D}_{S}{[0,\infty)}$ and let $x:{[0,\infty)}\to S$ be a continuous function. Then one has

[TABLE]

if and only if

[TABLE]

where $\Rightarrow$ denotes weak convergence of probability measures on ${\cal D}_{S}{[0,\infty)}$ .

**Proof** It is well known that ${\cal D}_{S}{[0,\infty)}$ is a Polish space [EK86, Sect. 3.5]. Let $d_{\rm S}$ be the metric generating the topology on ${\cal D}_{S}{[0,\infty)}$ defined in [EK86, (5.2) of Chapter 3]. Then it is easy to see that for all $\delta>0$ there exist $\varepsilon>0$ and $T<\infty$ such that

[TABLE]

In view of this, (4.41) implies

[TABLE]

which by Lemma 38 implies (4.42). Conversely, if (4.42) holds, then by Skorohod’s representation theorem it is possible to couple the random variables $X_{n}$ such that $d_{\rm S}(X_{n},x)\to 0$ a.s. By the continuity of $x$ and [EK86, Lemma 3.10.1], this implies that

[TABLE]

which implies (4.41).

Before the proof of Theorem 5 we need one more lemma.

Lemma 43 (Weak convergence and convergence in total variation norm)

Let $S$ be a Polish space. Then there exists a metric $d$ on ${\cal P}(S)$ such that $d$ generates the topology of weak convergence and $d(\mu,\nu)\leq\|\mu-\nu\|$ $(\mu,\nu\in{\cal P}(S))$ , where $\|\,\cdot\,\|$ denotes the total variation norm.

**Proof** Let $r$ be a metric generating the topology on $S$ . Replacing $r(x,y)$ by $r(x,y)\wedge 1$ if necessary we can assume without loss of generality that $r\leq 1$ . Let ${\cal L}$ be the space of all functions $\phi:S\to{\mathbb{R}}$ such that $|\phi(x)-\phi(y)|\leq r(x,y)$ $(x,y\in S)$ , i.e., these are Lipschitz continuous functions with Lipschitz constant $\leq 1$ . Then

[TABLE]

is the 1-Wasserstein metric on ${\cal P}(S)$ , which is known to generate the topology of weak convergence. Let ${\cal L}^{\prime}:=\{\phi\in{\cal L}:\sup_{x\in S}|\phi(x)|\leq 1\}$ . Since $r\leq 1$ , each function $\phi\in{\cal L}$ can be written as $\phi={\textstyle\frac{{1}}{{2}}}\phi^{\prime}+c$ with $\phi^{\prime}\in{\cal L}^{\prime}$ and $c\in{\mathbb{R}}$ . In view of this and (3.8),

[TABLE]
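The inequality of Lemma 43 can be checked numerically on a finite space. The Python sketch below is purely illustrative and uses several assumptions not taken from the text: a finite set of points on the real line, the truncated metric $r(x,y)=|x-y|\wedge 1$ , Lipschitz test functions of McShane type $\phi(x)=\min_{y}(a_{y}+r(x,y))$ , and the convention that the total variation distance is half the $\ell^{1}$ distance (the smaller of the two common normalizations, so the bound also holds for the larger one).

```python
import random


def mcshane_lipschitz(points, dist, offsets):
    """Values of phi(x) = min_y (a_y + dist(x, y)) on the given points.

    Each x -> a_y + dist(x, y) is 1-Lipschitz w.r.t. dist, and so is their minimum.
    """
    return [min(a + dist(x, y) for y, a in zip(points, offsets)) for x in points]


def normalise(weights):
    total = sum(weights)
    return [w / total for w in weights]


def worst_violation(num_points=6, trials=200, seed=0):
    """Largest observed value of |<mu,phi> - <nu,phi>| - TV(mu, nu).

    The argument in the proof predicts that this is never positive.
    """
    rng = random.Random(seed)
    points = [rng.uniform(0.0, 3.0) for _ in range(num_points)]
    dist = lambda x, y: min(abs(x - y), 1.0)  # truncated metric, r <= 1
    worst = float("-inf")
    for _ in range(trials):
        mu = normalise([rng.random() for _ in points])
        nu = normalise([rng.random() for _ in points])
        offsets = [rng.uniform(-5.0, 5.0) for _ in points]
        phi = mcshane_lipschitz(points, dist, offsets)
        gap = abs(sum(f * (m - n) for f, m, n in zip(phi, mu, nu)))
        tv = 0.5 * sum(abs(m - n) for m, n in zip(mu, nu))
        worst = max(worst, gap - tv)
    return worst


if __name__ == "__main__":
    print(worst_violation())
```

The check reflects the mechanism of the proof: because $r\leq 1$ , a function in ${\cal L}$ oscillates by at most $1$ , so after subtracting a constant it is bounded by ${\textstyle\frac{1}{2}}$ in absolute value, and integrating it against $\mu-\nu$ gives at most the total variation distance.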

Proof of Theorem 5 Lemmas 40 and 41 show that either of the conditions (i) and (ii) implies (4.24). We will use Lemma 34 to improve (4.24) to pathwise convergence as in (1.34). By Lemma 42 it suffices to prove (1.34) for one particular metric $d$ on ${\cal P}(S)$ that generates the topology of weak convergence. We will choose a metric $d$ as in Lemma 43.

Let $\mu_{t}:=T_{t}(\mu_{0})$ $(t\geq 0)$ denote the solution to the mean-field equation (1.22) with initial state $\mu_{0}$ . Lemma 34 implies that

[TABLE]

Taking the limit $N\to\infty$ , using the fact that $d(\mu,\nu)\leq\|\mu-\nu\|$ and (4.24), it follows that

[TABLE]

Since for any $s,t\geq 0$ ,

[TABLE]

using Lemma 34, (4.24), and (4.49), we see that for each $T>0$ and $t\in[0,T]$ ,

[TABLE]

Combining this with the fact that by (4.24), for any $n\geq 1$ ,

[TABLE]

we find that

[TABLE]

Since $\varepsilon$ and $n$ are arbitrary, this implies (1.34).
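The passage from fixed times to uniform control on a compact time interval is a standard grid argument. As a sketch, with grid points $t_{k}:=kT/n$ $(k=0,\ldots,n)$ introduced here only for illustration, and using that the metric of Lemma 43 satisfies $d(\mu,\nu)\leq\|\mu-\nu\|$ , one has for $t\in[t_{k-1},t_{k}]$

$$d(\mu^{N}_{t},\mu_{t})\leq\|\mu^{N}_{t}-\mu^{N}_{t_{k}}\|+d(\mu^{N}_{t_{k}},\mu_{t_{k}})+\|\mu_{t_{k}}-\mu_{t}\|.$$

The first term is controlled by the tightness in total variation from Lemma 34, the second by (4.24) applied at the finitely many grid points, and the third by the corresponding continuity of the deterministic limit $t\mapsto\mu_{t}$ .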

5 Recursive Tree Processes

In this section, we prove our main results about RTPs with continuous time. For completeness, we also prove Lemma 8 which deals with discrete time and says that each solution to the RDE (1.54) gives rise to an RTP. This is done in Subsection 5.1.

Our basic results about continuous-time RTPs are Lemma 7 and Proposition 9. Lemma 7 describes the evolution of the law of the process

[TABLE]

that is constructed by assigning independent values $X_{\mathbf{i}}$ to elements $\mathbf{i}\in\nabla{\mathbb{S}}_{t}$ and then calculating backwards. Proposition 9 says that adding exponential lifetimes to the elements of an RTP yields a stationary version of the process in (5.1). These results are proved in Subsection 5.2.

In Subsection 5.3, we prove continuous-time analogues of known discrete-time results related to endogeny. Following [AB05], Theorem 11 links the $n$ -variate mean-field equation to endogeny, while Propositions 13 and 15 are concerned with the higher-level mean-field equation, and closely follow ideas from [MSS18].

5.1 Construction of RTPs

Proof of Lemma 8 For each finite subtree ${\mathbb{U}}\subset{\mathbb{T}}$ that contains the root, we can construct random variables $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{U}}}$ and $(X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{U}}\cup\partial{\mathbb{U}}}$ such that the $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{U}}}$ are independent with common law $|{\mathbf{r}}|^{-1}{\mathbf{r}}$ , the $(X_{\mathbf{i}})_{\mathbf{i}\in\partial{\mathbb{U}}}$ are i.i.d. with common law $\nu$ and independent of the $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{U}}}$ , and the $(X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{U}}}$ are inductively defined by

[TABLE]

The joint law of $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{U}}}$ and $(X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{U}}\cup\partial{\mathbb{U}}}$ is a probability law ${\mathbb{P}}_{\mathbb{U}}$ on $\Omega^{\mathbb{U}}\times S^{{\mathbb{U}}\cup\partial{\mathbb{U}}}$ . Since $\Omega$ and $S$ are Polish spaces, we can apply Kolmogorov’s extension theorem. The statement of the lemma then follows provided we can show that the laws ${\mathbb{P}}_{\mathbb{U}}$ are consistent in the sense that if ${\mathbb{V}}\subset{\mathbb{U}}$ is another subtree that contains the root, then the projection of ${\mathbb{P}}_{\mathbb{U}}$ on $\Omega^{\mathbb{V}}\times S^{{\mathbb{V}}\cup\partial{\mathbb{V}}}$ equals ${\mathbb{P}}_{\mathbb{V}}$ . It suffices to prove this when ${\mathbb{U}}$ and ${\mathbb{V}}$ differ by one element only, say ${\mathbb{U}}={\mathbb{V}}\cup\{\mathbf{i}\}$ where $\mathbf{i}\in\nabla{\mathbb{V}}$ . It follows from (5.2) and the fact that $\nu$ solves the RDE (1.54) that $X_{\mathbf{i}}$ has law $\nu$ and is independent of $(X_{\mathbf{j}})_{\mathbf{j}\in\nabla{\mathbb{V}}\backslash\{\mathbf{i}\}}$ , and from this we see that the projection of ${\mathbb{P}}_{\mathbb{U}}$ is indeed ${\mathbb{P}}_{\mathbb{V}}$ .

It will be useful in what follows to have a somewhat stronger version of Lemma 8 that applies also to certain random subtrees ${\mathbb{U}}\subset{\mathbb{T}}$ . Let ${\cal T}$ denote the set of all finite subtrees ${\mathbb{U}}\subset{\mathbb{T}}$ such that either $\varnothing\in{\mathbb{U}}$ or ${\mathbb{U}}=\emptyset$ . Let us define a stopping tree to be a random variable ${\mathbb{U}}$ with values in ${\cal T}$ such that

[TABLE]

In the special case that $\kappa\equiv 1$ and ${\mathbb{T}}={\mathbb{N}}$ , a stopping tree is just a stopping time w.r.t. the filtration generated by $\bm{\omega}_{\varnothing},\bm{\omega}_{1},\bm{\omega}_{11},\ldots$ .

Lemma 44 (RTPs and stopping trees)

Let $(\bm{\omega}_{\mathbf{i}},X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ be an RTP corresponding to a map $\gamma$ and a solution $\nu$ to the RDE (1.54), and let ${\mathbb{U}}\subset{\mathbb{T}}$ be a stopping tree. Then conditional on ${\mathbb{U}}$ , the random variables $(X_{\mathbf{i}})_{\mathbf{i}\in\partial{\mathbb{U}}}$ are i.i.d. with common law $\nu$ and independent of $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{U}}}$ .

**Proof** For each fixed ${\mathbb{V}}\in{\cal T}$ , by Lemma 8, conditional on $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{V}}}$ , the random variables $(X_{\mathbf{i}})_{\mathbf{i}\in\partial{\mathbb{V}}}$ are i.i.d. with common law $\nu$ . By (5.3), it follows that conditional on the event $\{{\mathbb{U}}={\mathbb{V}}\}$ and $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{V}}}$ , the random variables $(X_{\mathbf{i}})_{\mathbf{i}\in\partial{\mathbb{V}}}$ are i.i.d. with common law $\nu$ . Since this holds for all ${\mathbb{V}}\in{\cal T}$ , and since ${\mathbb{U}}\in{\cal T}$ a.s., the claim follows.

5.2 Continuous-time RTPs

In this subsection, we prove Lemma 7 and Proposition 9. We work in our usual set-up as described above Proposition 29. We start with a preparatory lemma that says that if we condition on the $\sigma$ -field ${\cal F}_{t}$ defined in (1.48), then the subtrees of ${\mathbb{S}}$ rooted at $\mathbf{i}\in\nabla{\mathbb{S}}_{t}$ are i.i.d. with the same distribution as ${\mathbb{S}}$ . To formulate this properly, we need some notation.

We call the object

[TABLE]

a marked branching tree. For each $\mathbf{i}\in{\mathbb{S}}$ , let ${\mathbb{S}}^{\mathbf{i}}$ describe the subtree of ${\mathbb{S}}$ that is rooted at $\mathbf{i}$ , i.e.,

[TABLE]

We set $\omega^{\mathbf{i}}_{\mathbf{j}}:=\bm{\omega}_{\mathbf{i}\mathbf{j}}$ $(\mathbf{i},\mathbf{j}\in{\mathbb{T}})$ , so that $\omega^{\mathbf{i}}_{\mathbf{j}}$ is the random element of $\Omega$ that “belongs” to $\mathbf{j}\in{\mathbb{S}}^{\mathbf{i}}$ . Fix $t\geq 0$ . For each $\mathbf{i}\in\nabla{\mathbb{S}}_{t}$ , let $\sigma^{\mathbf{i},t}_{\mathbf{j}}$ describe the lifetime of an individual $\mathbf{j}\in{\mathbb{S}}^{\mathbf{i}}$ after time $t$ , i.e.,

[TABLE]

where $t-\tau^{\ast}_{\mathbf{i}}$ is the age of the individual $\mathbf{i}$ at time $t$ .

Lemma 45 (Memoryless property)

For each $t\geq 0$ , conditional on the $\sigma$ -field ${\cal F}_{t}$ , the marked branching trees

[TABLE]

are i.i.d. with the same distribution as the marked branching tree in (5.4).

**Proof** Let ${\cal T}$ be as defined above (5.3). Then, for each ${\mathbb{V}}\in{\cal T}$ , the event $\{{\mathbb{S}}_{t}={\mathbb{V}}\}$ is measurable w.r.t. the $\sigma$ -field generated by the random variables

[TABLE]

Note that here $\nabla{\mathbb{V}}=\{\mathbf{i}j:\mathbf{i}\in{\mathbb{V}},\ j\leq\kappa(\bm{\omega}_{\mathbf{i}})\}$ is measurable w.r.t. the $\sigma$ -field generated by $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{V}}}$ while for each $\mathbf{i}\in\nabla{\mathbb{V}}$ , the random variable $\tau^{\ast}_{\mathbf{i}}$ is measurable w.r.t. the $\sigma$ -field generated by $(\sigma_{\mathbf{j}})_{\mathbf{j}\in{\mathbb{V}}}$ .

Conditional on $\{{\mathbb{S}}_{t}={\mathbb{V}}\}$ and the random variables in (5.8), the random variables $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}\backslash{\mathbb{V}}}$ are still i.i.d. with their original law and independent of $(\sigma_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}\backslash{\mathbb{V}}}$ . The latter are also still independent of each other and the $(\sigma_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}\backslash({\mathbb{V}}\cup\nabla{\mathbb{V}})}$ still have their original law, but the laws of $(\sigma_{\mathbf{i}})_{\mathbf{i}\in\nabla{\mathbb{V}}}$ are changed since conditioning on $\{{\mathbb{S}}_{t}={\mathbb{V}}\}$ entails conditioning on $\sigma_{\mathbf{i}}>t-\tau^{\ast}_{\mathbf{i}}$ for each $\mathbf{i}\in\nabla{\mathbb{V}}$ .

Since this holds for each ${\mathbb{V}}\in{\cal T}$ , we see that if we condition on ${\cal F}_{t}$ as in (1.48), then under the conditional law the random variables $\bm{\omega}_{\mathbf{i}}$ and $\sigma_{\mathbf{i}}$ with $\mathbf{i}\in{\mathbb{T}}\backslash{\mathbb{S}}_{t}$ are still independent, and all of these random variables still have their original law, except the $\sigma_{\mathbf{i}}$ with $\mathbf{i}\in\nabla{\mathbb{S}}_{t}$ , whose laws are conditioned on the events $\sigma_{\mathbf{i}}>t-\tau^{\ast}_{\mathbf{i}}$ . From this observation, using the memoryless property of the exponential distribution, the claim of the lemma follows.
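For completeness, the memoryless property used in the last step reads as follows. If $\sigma$ is exponentially distributed with mean $|{\mathbf{r}}|^{-1}$ , then for all $a,s\geq 0$ ,

$${\mathbb{P}}\big[\sigma>a+s\,\big|\,\sigma>a\big]=\frac{e^{-|{\mathbf{r}}|(a+s)}}{e^{-|{\mathbf{r}}|a}}=e^{-|{\mathbf{r}}|s}={\mathbb{P}}[\sigma>s].$$

Applied with $a=t-\tau^{\ast}_{\mathbf{i}}$ , the age of the individual $\mathbf{i}\in\nabla{\mathbb{S}}_{t}$ at time $t$ , this says that conditional on $\sigma_{\mathbf{i}}>t-\tau^{\ast}_{\mathbf{i}}$ the remaining lifetime $\sigma_{\mathbf{i}}-(t-\tau^{\ast}_{\mathbf{i}})$ is again exponentially distributed with mean $|{\mathbf{r}}|^{-1}$ .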

For each $s\geq 0$ and $\mathbf{i}\in\nabla{\mathbb{S}}_{s}$ , within the marked branching tree $\big{(}{\mathbb{S}}^{\mathbf{i}},(\omega^{\mathbf{i}}_{\mathbf{j}},\sigma^{\mathbf{i},s}_{\mathbf{j}})_{\mathbf{j}\in{\mathbb{S}}^{\mathbf{i}}}\big{)}$ rooted at $\mathbf{i}$ , we define the birth and death times $\tau^{\mathbf{i},\ast}_{\mathbf{j}}$ and $\tau^{\mathbf{i},\dagger}_{\mathbf{j}}$ as in (1.41), with $\sigma_{\mathbf{j}}$ replaced by $\sigma^{\mathbf{i},s}_{\mathbf{j}}$ , and we use this to define ${\mathbb{S}}^{\mathbf{i},s}_{t}$ and $\nabla{\mathbb{S}}^{\mathbf{i},s}_{t}$ $(t\geq 0)$ as in (1.44). Finally, we define $G^{\mathbf{i},s}_{t}=G_{{\mathbb{S}}^{\mathbf{i},s}_{t}}$ as in (1.46) and (1.47).

Proof of Lemma 7 We fix a marked branching tree as in (5.4) and times $0\leq s\leq t$ . Conditional on ${\cal F}_{t}$ , we assign i.i.d. $(X_{\mathbf{i}})_{\mathbf{i}\in\nabla{\mathbb{S}}_{t}}$ with common law $\mu_{0}$ to the leaves of ${\mathbb{S}}_{t}$ and define $(X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{S}}_{t}}$ inductively as in (1.50).

We observe that $\nabla{\mathbb{S}}_{t}$ is given by the disjoint union

[TABLE]

Conditioning on ${\cal F}_{t}$ is the same as first conditioning on

[TABLE]

and then conditioning on

[TABLE]

which by Lemma 45 are conditionally independent given the random variable in (5.10). Set

[TABLE]

Then

[TABLE]

In view of this, by Theorem 6, conditional on the random variable in (5.10), i.e., conditional on ${\cal F}_{s}$ , the random variables $(X_{\mathbf{i}})_{\mathbf{i}\in\nabla{\mathbb{S}}_{s}}$ are i.i.d. with common law $\mu_{t-s}$ , where $(\mu_{s})_{s\geq 0}$ denotes the solution of the mean-field equation (1.2) with initial state $\mu_{0}$ .
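The construction used in this proof (grow the branching tree up to time $t$ , assign i.i.d. states to the leaves, and calculate backwards to the root) can be illustrated by a small simulation. The Python sketch below is not taken from the paper. It uses a toy example chosen here for checkability: $S=\{0,1\}$ , total rate $1$ , and a single map that takes the maximum of two inputs. In that toy case the mean-field equation for $p_{t}=\mu_{t}(\{1\})$ reduces to the logistic equation $\partial_{t}p_{t}=p_{t}(1-p_{t})$ . All function and parameter names are hypothetical.

```python
import math
import random


def root_state(t, rate, sample_omega, kappa, gamma, sample_leaf, rng):
    """State at the root of a branching tree run for time t,
    computed backwards from i.i.d. leaf states."""
    lifetime = rng.expovariate(rate)
    if lifetime > t:
        # Still alive at time t: this individual is a leaf.
        return sample_leaf(rng)
    omega = sample_omega(rng)
    children = [
        root_state(t - lifetime, rate, sample_omega, kappa, gamma, sample_leaf, rng)
        for _ in range(kappa(omega))
    ]
    # Apply the random map attached to this branch point.
    return gamma(omega, children)


def estimate_density(t, p0, samples=20000, seed=0):
    """Monte Carlo estimate of P[root state = 1] in the toy max-of-two example."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        hits += root_state(
            t,
            1.0,                                          # total rate (assumed 1 here)
            sample_omega=lambda r: "max",                 # only one map in this toy model
            kappa=lambda omega: 2,                        # it has two relevant inputs
            gamma=lambda omega, xs: max(xs),              # and returns their maximum
            sample_leaf=lambda r: 1 if r.random() < p0 else 0,
            rng=rng,
        )
    return hits / samples


def logistic(t, p0):
    """Solution of dp/dt = p (1 - p) started in p0."""
    return p0 * math.exp(t) / (1.0 - p0 + p0 * math.exp(t))


if __name__ == "__main__":
    print(estimate_density(1.0, 0.2), logistic(1.0, 0.2))
```

The two printed numbers should agree up to Monte Carlo error, in line with the statement that the root state has law $\mu_{t}$ when the leaves are i.i.d. with law $\mu_{0}$ .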

Proof of Proposition 9 Since $(\sigma_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ and $(\bm{\omega}_{\mathbf{i}},X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ are independent, the conditional law of $(\bm{\omega}_{\mathbf{i}},X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ given $(\sigma_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ is the same as the unconditional law. We claim that under the conditional law given $(\sigma_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ , the random finite subtree ${\mathbb{S}}_{t}$ is a stopping tree in the sense of (5.3). Indeed, ${\mathbb{S}}_{t}={\mathbb{V}}$ if and only if for each $\mathbf{i}\in{\mathbb{V}}$ and $j\in{\mathbb{N}}_{+}$ (resp. $j\in[d]$ , depending on how ${\mathbb{T}}$ is chosen), one has $\mathbf{i}j\in{\mathbb{V}}$ if and only if

[TABLE]

Here the event in (i) is clearly measurable w.r.t. $\sigma((\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{V}}})$ while under the conditional law given $(\sigma_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ , (ii) is just a deterministic condition. We can therefore apply Lemma 44 to conclude that conditional on $(\sigma_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ , ${\mathbb{S}}_{t}$ , and $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{S}}_{t}}$ , the random variables $(X_{\mathbf{i}})_{\mathbf{i}\in\partial{\mathbb{S}}_{t}}$ are i.i.d. with common law $\nu$ .

We observe that

[TABLE]

is a function of ${\mathbb{S}}_{t}$ and $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{S}}_{t}}$ . Therefore, if we condition on ${\cal F}_{t}=\sigma(\nabla{\mathbb{S}}_{t},(\bm{\omega}_{\mathbf{i}},\sigma_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{S}}_{t}})$ , the random variables $(X_{\mathbf{i}})_{\mathbf{i}\in\nabla{\mathbb{S}}_{t}}$ are i.i.d. with common law $\nu$ . This proves (1.59) (i). Condition (1.59) (ii) is also clearly fulfilled by the definition of an RTP.

5.3 Endogeny, bivariate uniqueness, and the higher-level equation

In this subsection, we prove Theorem 11 and Propositions 13 and 15.

Recall that an RTP $(\bm{\omega}_{\mathbf{i}},X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ is endogenous if $X_{\varnothing}$ is measurable with respect to the $\sigma$ -field generated by the random variables $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ . In general, if $X$ is a random variable taking values in a Polish space and ${\cal F}$ is a sub- $\sigma$ -field, then it is not hard to see that $X$ is a.s. equal to a ${\cal F}$ -measurable function if and only if the conditional law ${\mathbb{P}}[X\in\,\cdot\,|{\cal F}]$ is a.s. a delta-measure. In view of this, the following lemma implies that an RTP is endogenous if and only if $X_{\varnothing}$ is a.s. measurable w.r.t. the $\sigma$ -field generated by the random variables ${\mathbb{S}}$ and $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{S}}}$ .

Lemma 46 (Relevant randomness)

Let $(\bm{\omega}_{\mathbf{i}},X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ be an RTP corresponding to a solution $\nu$ of the RDE (1.54). Let $\overline{\cal F}$ be the $\sigma$ -field generated by the random variables $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ and let ${\cal F}$ be the $\sigma$ -field generated by the random variables ${\mathbb{S}}$ and $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{S}}}$ . Then

[TABLE]

**Proof** Since $\overline{\cal F}$ is generated by ${\cal F}$ and the random variables $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}\backslash{\mathbb{S}}}$ , formula (5.16) says that conditional on ${\cal F}$ , the random variables $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}\backslash{\mathbb{S}}}$ are independent of $X_{\varnothing}$ . Let ${\mathbb{U}}^{(n)}$ be deterministic finite rooted subtrees of ${\mathbb{T}}$ that increase to ${\mathbb{T}}$ . Let $\overline{\cal F}^{(n)}$ be the $\sigma$ -field generated by $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{U}}^{(n)}}$ and let ${\cal F}^{(n)}$ be the $\sigma$ -field generated by ${\mathbb{S}}\cap{\mathbb{U}}^{(n)}$ and $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{S}}\cap{\mathbb{U}}^{(n)}}$ . Conditional on ${\cal F}^{(n)}$ , the state at the root $X_{\varnothing}$ is a deterministic function of $(X_{\mathbf{i}})_{\mathbf{i}\in\nabla({\mathbb{S}}\cap{\mathbb{U}}^{(n)})}$ . Therefore, by point (ii) in the definition of an RTP in Lemma 8, $X_{\varnothing}$ is conditionally independent of $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in({\mathbb{T}}\backslash{\mathbb{S}})\cap{\mathbb{U}}^{(n)}}$ given ${\cal F}^{(n)}$ , or equivalently,

[TABLE]

for each measurable $A\subset S$ . Letting $n\to\infty$ , using martingale convergence, we arrive at (5.16).

The following lemma prepares for the proof of Theorem 11.

Lemma 47 (Successful coupling)

Let $(\bm{\omega}_{\mathbf{i}},X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ be an endogenous RTP corresponding to a solution $\nu$ of the RDE (1.54) and let $(\sigma_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ be an independent i.i.d. collection of exponential random variables with mean $|{\mathbf{r}}|^{-1}$ . Furthermore, let $(Y_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ be an i.i.d. collection of $S$ -valued random variables with common law $\nu$ , independent of $(\bm{\omega}_{\mathbf{i}},X_{\mathbf{i}},\sigma_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ . For each $t>0$ , define random variables $(X^{t}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{S}}_{t}\cup\nabla{\mathbb{S}}_{t}}$ by

[TABLE]

Then

[TABLE]

**Proof **The following argument is a continuous-time version of the proofs of [AB05, Thm 11 (c)] and [MSS18, Lemma 6]. Let ${\cal F}_{t}$ be the filtration defined in (1.48). We add a final element ${\cal F}_{\infty}:=\sigma(\bigcup_{t\geq 0}{\cal F}_{t})$ to the filtration, which is the $\sigma$ -algebra generated by the random tree ${\mathbb{S}}$ and the random variables $(\bm{\omega}_{\mathbf{i}},\sigma_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{S}}}$ . Let $f,g:S\to{\mathbb{R}}$ be bounded and measurable functions. Since $X_{\varnothing}$ and $X^{t}_{\varnothing}$ are conditionally independent and identically distributed given ${\cal F}_{t}$ , we have

[TABLE]

where we used martingale convergence and, in the last equality, also endogeny and Lemma 46. Since (5.20) holds in particular for any bounded continuous $f$ and $g$ , we conclude that the law of $(X_{\varnothing},X_{\varnothing}^{t})$ converges weakly to the law of $(X_{\varnothing},X_{\varnothing})$ , which implies (5.19).

Proof of Theorem 11 If (ii) holds, then $\overline{\nu}^{(2)}$ is the only fixed point in ${\cal P}(S^{2})_{\nu}$ of the bivariate mean-field equation. Since a measure is a fixed point of the bivariate mean-field equation if and only if it is a fixed point of the map $T^{(2)}$ , by Theorem 10, it follows that the RTP corresponding to $\nu$ is endogenous.

Assume, conversely, that the RTP corresponding to $\nu$ is endogenous. Let $(Y_{\mathbf{i}}^{1},\ldots,Y_{\mathbf{i}}^{n})_{\mathbf{i}\in{\mathbb{T}}}$ be a collection of i.i.d. $S^{n}$ -valued random variables with common law $\mu^{(n)}_{0}$ , independent of the RTP $(\bm{\omega}_{\mathbf{i}},X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ and the exponential lifetimes $(\sigma_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ . For each $1\leq m\leq n$ and $t>0$ , define random variables $(X^{m,t}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{S}}_{t}\cup\nabla{\mathbb{S}}_{t}}$ by

[TABLE]

Then, by Theorem 6 applied to the $n$ -variate map $\gamma^{(n)}$ , we see that $(X_{\varnothing}^{1,t},\ldots,X_{\varnothing}^{n,t})$ has law $\mu^{(n)}_{t}$ . By endogeny we get from Lemma 47 that

[TABLE]

This completes the proof since the right-hand side of (5.22) has law $\overline{\nu}^{(n)}$ as defined in (1.62).

Proof of Proposition 13 The fact that $(\rho_{t})_{t\geq 0}$ solves the higher-level mean-field equation (1.71) means that

[TABLE]

for any bounded measurable $\phi:{\cal P}(S)\to{\mathbb{R}}$ . In particular, we can apply this to functions of the form

[TABLE]

where $f:S^{n}\to{\mathbb{R}}$ is bounded and measurable. Then

[TABLE]

where $(T_{\check{\gamma}[\omega]}(\rho_{t}))^{(n)}$ denotes the $n$ -th moment measure of $T_{\check{\gamma}[\omega]}(\rho_{t})$ . By [MSS18, Lemma 2],

[TABLE]

Inserting this into (5.23), we see that $(\rho^{(n)}_{t})_{t\geq 0}$ solves the $n$ -variate mean-field equation.

The following lemma prepares for the proof of Proposition 15.

Lemma 48 (Conditional law of the root)

Let $(\bm{\omega}_{\mathbf{i}},X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ be an RTP corresponding to a solution $\nu$ of the RDE (1.54), let $(\sigma_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ be an independent i.i.d. collection of exponentially distributed random variables with mean $|{\mathbf{r}}|^{-1}$ , and let $({\cal F}_{t})_{t\geq 0}$ be the filtration defined in (1.48). Then the measures

[TABLE]

solve the higher-level mean-field equation (1.71) with initial state $\rho_{0}=\delta_{\nu}$ .

**Proof **Conditional on ${\cal F}_{t}$ , the map $G_{t}:S^{\nabla{\mathbb{S}}_{t}}\to S$ is a deterministic map, and $(X_{\mathbf{i}})_{\mathbf{i}\in\nabla{\mathbb{S}}_{t}}$ are i.i.d. with common law $\nu$ . Therefore, applying [MSS18, Lemma 8] to the case that the $\sigma$ -fields ${\cal H}_{k}$ there are all trivial and the probability measure ${\mathbb{P}}$ there is replaced by the conditional law given ${\cal F}_{t}$ , we see that

[TABLE]

Now by Theorem 6,

[TABLE]

solves the higher-level mean-field equation (1.71) with initial state $\rho_{0}=\delta_{\nu}$ .

Proof of Proposition 15 Let $(\rho^{i}_{t})_{t\geq 0}$ $(i=1,2)$ be solutions to the higher-level mean-field equation (1.71) such that $\rho^{1}_{0}\leq_{\rm cv}\rho^{2}_{0}$ . Define $\rho^{i}_{t,(n)}$ as in (3.22), with $T$ replaced by the higher-level map $\check{T}$ from (1.73). It has been shown in [MSS18, Prop 3] that $\check{T}$ is monotone w.r.t. the convex order, so by induction we obtain from (3.22) that $\rho^{1}_{t,(n)}\leq_{\rm cv}\rho^{2}_{t,(n)}$ for all $n\geq 1$ and $t\geq 0$ . Letting $n\to\infty$ , using (3.30), we see that $\rho^{1}_{t}\leq_{\rm cv}\rho^{2}_{t}$ for all $t\geq 0$ .

Let $\nu$ be a solution of the RDE (1.54). It has been shown in [MSS18, Prop. 3] that $\overline{\nu}$ solves the higher-level RDE (1.73) and there exists a (necessarily unique) solution $\underline{\nu}$ of (1.73) such that (1.76) holds. It has moreover been shown in [MSS18, Prop. 4] that $\underline{\nu}$ is given by (1.79). In view of this, to complete the proof, it suffices to show that the solution $(\rho_{t})_{t\geq 0}$ to the higher-level mean-field equation (1.71) with initial state $\rho_{0}=\delta_{\nu}$ converges to the measure in (1.79).

We apply Lemma 48. As in the proof of Lemma 47, we add a final element ${\cal F}_{\infty}:=\sigma(\bigcup_{t\geq 0}{\cal F}_{t})$ to the filtration, which is the $\sigma$ -algebra generated by the random tree ${\mathbb{S}}$ and the random variables $(\bm{\omega}_{\mathbf{i}},\sigma_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{S}}}$ . Then, by martingale convergence,

[TABLE]

and hence the measures $\rho_{t}$ in (5.27) satisfy

[TABLE]

where $\Rightarrow$ denotes weak convergence of probability measures on ${\cal P}(S)$ , which is in turn equipped with the topology of weak convergence of probability measures on $S$ . Since the exponentially distributed random variables $(\sigma_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ are independent of the RTP $(\bm{\omega}_{\mathbf{i}},X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ , we have

[TABLE]

where as in Lemma 46 ${\cal F}$ denotes the $\sigma$ -field generated by the random variables ${\mathbb{S}}$ and $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{S}}}$ and the last equality follows from that lemma. Inserting this into (5.31) we see that $\rho_{t}$ converges weakly to $\underline{\nu}$ as defined in (1.79).

6 Further results

In this section, we prove some additional results about RTPs. In Subsection 6.1, we prove Proposition 19 about the upper and lower solutions of a monotone RDE. In Subsection 6.2, we prove Lemmas 20, 21, and 23, which give conditions for uniqueness of solutions to an RDE. Subsection 6.3 is devoted to the proof of Lemma 24.

6.1 Monotonicity

In this subsection, we prove Proposition 19. We start with a number of simple lemmas.

Lemma 49 (A continuous monotone function)

Let $S$ be a compact metrizable space that is equipped with a closed partial order in the sense of (1.90), and let $d$ be a metric that generates the topology. Then

$$f(x,y):=\inf\big\{d(x^{\prime},y^{\prime}):x^{\prime}\leq x,\ y^{\prime}\geq y\big\}\tag{6.1}$$

defines a continuous function $f:S^{2}\to{[0,\infty)}$ such that $f(x,y)=0$ if and only if $x\geq y$ and moreover $f(x,y)$ is decreasing in $x$ and increasing in $y$ .

**Proof **Since for any $(x,y),(x^{\prime},y^{\prime})\in S^{2}$ ,

[TABLE]

the function $d:S^{2}\to{[0,\infty)}$ is continuous. Assume that $(x_{n},y_{n})\in S^{2}$ converge to a limit $(x,y)$ . Since the infimum of a family of continuous functions is upper semi-continuous, we have

[TABLE]

To prove that $f$ is actually continuous, assume the converse. Then there exists a sequence such that

[TABLE]

for some $\varepsilon>0$ . By the definition of $f$ , there exist $x^{\prime}_{n}\leq x_{n}$ and $y^{\prime}_{n}\geq y_{n}$ such that $d(x^{\prime}_{n},y^{\prime}_{n})\leq f(x_{n},y_{n})+\varepsilon/2$ . Since $S$ is compact, we can select a subsequence such that (6.4) still holds and the $(x^{\prime}_{n},y^{\prime}_{n})$ converge to a limit $(x^{\prime},y^{\prime})$ . Since the partial order is closed in the sense of (1.90), we have $x^{\prime}\leq x$ and $y^{\prime}\geq y$ , so

[TABLE]

which contradicts (6.4). We conclude that $f:S^{2}\to{[0,\infty)}$ is continuous.

If $x\geq y$ , then setting $(x^{\prime},y^{\prime})=(x,x)$ shows that $f(x,y)=0$ . Conversely, if $f(x,y)=0$ then there exist $x_{n}\leq x$ and $y_{n}\geq y$ such that $d(x_{n},y_{n})\to 0$ . Using the compactness of $S$ , by going to a subsequence, we can assume that the $(x_{n},y_{n})$ converge to a limit $(z,z)$ . Since the partial order is closed in the sense of (1.90), $y\leq z\leq x$ and hence $x\geq y$ .

If $x\leq x_{\ast}$ and $y\geq y_{\ast}$ , then

[TABLE]

since the second infimum is taken over a smaller set, showing that $f(x,y)$ is decreasing in $x$ and increasing in $y$ .
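As a sanity check of Lemma 49, the following sketch evaluates $f$ by brute force on a small finite example. The choice $S=\{0,1\}^{2}$ with the componentwise order and the Hamming metric is an assumption made purely for illustration, and $f$ is taken to be the infimum of $d(x^{\prime},y^{\prime})$ over $x^{\prime}\leq x$ and $y^{\prime}\geq y$ , as used in the proof above.

```python
from itertools import product

# Hypothetical finite example: S = {0,1}^2 with the componentwise order.
S = list(product((0, 1), repeat=2))


def leq(a, b):
    """Componentwise partial order on S."""
    return all(ai <= bi for ai, bi in zip(a, b))


def d(a, b):
    """Hamming metric on S (generates the discrete topology)."""
    return sum(ai != bi for ai, bi in zip(a, b))


def f(x, y):
    """Infimum of d(x', y') over all x' <= x and y' >= y."""
    return min(d(xp, yp) for xp in S for yp in S if leq(xp, x) and leq(y, yp))


# f(x, y) = 0 exactly when x >= y.
zero_iff_geq = all((f(x, y) == 0) == leq(y, x) for x in S for y in S)

# f is decreasing in its first argument and increasing in its second.
monotone = all(
    f(x2, y1) <= f(x1, y1) <= f(x1, y2)
    for x1 in S for x2 in S for y1 in S for y2 in S
    if leq(x1, x2) and leq(y1, y2)
)
```

On a finite space continuity is automatic, so the sketch only exercises the order-theoretic claims of the lemma.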

Lemma 50 (Comparison principle)

Let $S$ be a compact metrizable space that is equipped with a partial order that is closed in the sense of (1.90). Let $X,Y$ be $S$ -valued random variables such that $X\leq Y$ a.s. and ${\mathbb{P}}[X\in\,\cdot\,]\geq{\mathbb{P}}[Y\in\,\cdot\,]$ . Then $X=Y$ a.s.

Proof of Lemma 50 Set $f_{z}(x):=f(z,x)$ with $f$ as in Lemma 49. Then, for each $z\in S$ , $f_{z}:S\to{[0,\infty)}$ is continuous and monotone increasing, and $f_{z}(x)=0$ if and only if $x\leq z$ . Let

[TABLE]

We will prove the lemma by showing that if $X,Y$ are $S$ -valued random variables such that $X\leq Y$ a.s. and ${\mathbb{P}}[(X,Y)\in S^{2}_{<}]>0$ , then ${\mathbb{E}}[f_{z}(X)]<{\mathbb{E}}[f_{z}(Y)]$ for some $z\in S$ , contradicting ${\mathbb{P}}[X\in\,\cdot\,]\geq{\mathbb{P}}[Y\in\,\cdot\,]$ . For each $z\in S$ and $\delta>0$ , we define an open set $O_{z,\delta}\subset S^{2}$ by

[TABLE]

Since for each $(x,y)\in S^{2}_{<}$ , one has $f_{x}(x)=0$ but $f_{x}(y)>0$ , we see that

[TABLE]

We now use the inner regularity of measures on Polish spaces w.r.t. compacta, which follows from the regularity and tightness of any probability measure on a Polish space [Par05, Thm. 1.2 and 3.2]. Thus, we can find a compact set $K\subset S^{2}_{<}$ such that ${\mathbb{P}}[(X,Y)\in K]>0$ . Since $K$ is compact, it is covered by finitely many sets of the form (6.8), so there must exist a $z\in S$ and $\delta>0$ such that ${\mathbb{P}}[(X,Y)\in O_{z,\delta}]>0$ . Since $f_{z}$ is monotone increasing and $X\leq Y$ a.s., it follows that ${\mathbb{E}}[f_{z}(X)]<{\mathbb{E}}[f_{z}(Y)]$ .

Lemma 51 (Compatibility of the stochastic order)

Assume that $S$ is equipped with a partial order that is closed in the sense of (1.90). Then the stochastic order on ${\cal P}(S)$ is closed with respect to the topology of weak convergence.

**Proof **We need to show that if $\mu^{1}_{n}\leq\mu^{2}_{n}$ for all $n\in{\mathbb{N}}$ and the $\mu^{i}_{n}\in{\cal P}(S)$ converge weakly as $n\to\infty$ to a limit $\mu^{i}_{\infty}$ $(i=1,2)$ , then $\mu^{1}_{\infty}\leq\mu^{2}_{\infty}$ . Since $\mu^{1}_{n}\leq\mu^{2}_{n}$ , for each $n$ , we can couple $X^{i}_{n}$ with laws $\mu^{i}_{n}$ $(i=1,2)$ such that $X^{1}_{n}\leq X^{2}_{n}$ . Since $\mu^{1}_{n}$ and $\mu^{2}_{n}$ converge as $n\to\infty$ , the joint laws of $(X^{1}_{n},X^{2}_{n})$ are tight, so by going to a subsequence we may assume that they converge. Then, by Skorohod’s representation theorem, we can couple the random variables $(X^{1}_{n},X^{2}_{n})$ for different $n$ in such a way that they converge a.s. to a limit $(X^{1}_{\infty},X^{2}_{\infty})$ . Since the partial order on $S$ is closed, we have $X^{1}_{\infty}\leq X^{2}_{\infty}$ a.s., proving that $\mu^{1}_{\infty}\leq\mu^{2}_{\infty}$ .

Lemma 52 (Monotonicity of $T$ )

Assume that $S$ is equipped with a partial order that is closed and that $\gamma[\omega]$ is monotone for all $\omega\in\Omega$ . Then the operator $T$ in (1.1) is monotone w.r.t. the stochastic order.

**Proof **If $\mu_{1}\leq\mu_{2}$ , then we can couple random variables $X^{1}$ and $X^{2}$ with laws $\mu_{1},\mu_{2}$ such that $X^{1}\leq X^{2}$ . Let $(X^{1}_{i},X^{2}_{i})_{i\geq 1}$ be i.i.d. copies of $(X^{1},X^{2})$ . Then

[TABLE]

for all $\omega\in\Omega$ and hence $T(\mu_{1})\leq T(\mu_{2})$ by (1.1).

In practice, Lemma 52 is the usual way to prove monotonicity of a map of the form (1.1). Nevertheless, it is known that there are maps of the form (1.1), in particular, probability kernels, that are monotone yet cannot be represented in terms of monotone maps [FM01, Example 1.1].
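To illustrate Lemma 52 in the simplest setting, the sketch below takes $S=\{0,1\}$ and a single monotone map, so that the law of $\omega$ is a point mass. Both the map `cob` (a cooperative branching map $x_{1}\vee(x_{2}\wedge x_{3})$ ) and the helper `T` are assumptions chosen for illustration. On $\{0,1\}$ , laws are ordered stochastically by their mass at $1$ , so monotonicity of $T$ reduces to monotonicity of a real function of $p$ .

```python
from fractions import Fraction as F
from itertools import product


def cob(x1, x2, x3):
    """Illustrative monotone map on {0,1}^3: x1 or (x2 and x3)."""
    return x1 | (x2 & x3)


def T(gamma, arity, p):
    """Mass at 1 of the law of gamma(X_1, ..., X_arity) for i.i.d. Bernoulli(p) inputs."""
    total = F(0)
    for x in product((0, 1), repeat=arity):
        weight = F(1)
        for xi in x:
            weight *= p if xi else 1 - p
        total += weight * gamma(*x)
    return total


# Evaluate T on a grid of Bernoulli parameters; exact arithmetic avoids rounding issues.
grid = [F(k, 20) for k in range(21)]
images = [T(cob, 3, p) for p in grid]
```

For this particular map one finds $T(p)=p+(1-p)p^{2}$ , and the values in `images` are nondecreasing along the grid, as the lemma predicts.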

Lemma 53 (Monotonicity in the initial state)

Assume that $S$ is equipped with a partial order that is closed and that the operator $T$ in (1.1) is monotone w.r.t. the stochastic order. Then solutions $(\mu^{i}_{t})_{t\geq 0}$ $(i=1,2)$ of the mean-field equation (1.2) started in initial states $\mu^{1}_{0}\leq\mu^{2}_{0}$ satisfy $\mu^{1}_{t}\leq\mu^{2}_{t}$ $(t\geq 0)$ .

**Proof **Inductively define $\mu^{i}_{t,(n)}$ as in (3.22) with $\mu_{0}$ replaced by $\mu^{i}_{0}$ $(i=1,2)$ . Then $\mu^{1}_{t,(n)}\leq\mu^{2}_{t,(n)}$ for all $n\geq 1$ and $t\geq 0$ . Letting $n\to\infty$ , we see as in the proof of Proposition 29 that $\mu^{i}_{t,(n)}\Rightarrow\mu^{i}_{t}$ as $n\to\infty$ . By Lemma 51, we conclude that $\mu^{1}_{t}\leq\mu^{2}_{t}$ $(t\geq 0)$ .

In the next two lemmas we need to assume compactness of $S$ .

Lemma 54 (Increasing limits)

Assume that $S$ is a compact metrizable space equipped with a partial order that is closed. Then every increasing sequence in $S$ converges to a limit.

**Proof **Let $(x_{n})_{n\in{\mathbb{N}}}$ be a sequence in $S$ such that $x_{n}\leq x_{n+1}$ for all $n\in{\mathbb{N}}$ . By compactness, it suffices to prove that all subsequential limits are the same. Let $(x_{m})_{m\in M}$ and $(x_{k})_{k\in K}$ be subsequences that converge to limits $x$ and $x^{\prime}$ , respectively. For all $k\in K$ , let $k_{-}:=\sup\{m\in M:m\leq k\}$ . Then $x_{k_{-}}\leq x_{k}$ for all $k\in K$ and letting $k\to\infty$ , using the compatibility condition (1.90), we see that $x\leq x^{\prime}$ . The same argument gives $x^{\prime}\leq x$ and hence $x=x^{\prime}$ .

Lemma 55 (Increasing limits in the stochastic order)

Assume that $S$ is a compact metrizable space equipped with a partial order that is closed. Then every sequence in ${\cal P}(S)$ that is increasing in the stochastic order converges weakly to a limit.

**Proof **Let $(\mu_{n})_{n\geq 0}$ be increasing in the stochastic order. Then, for each $n\geq 1$ , we can couple random variables $X_{n-1}$ and $X_{n}$ with laws $\mu_{n-1}$ and $\mu_{n}$ such that ${\mathbb{P}}[X_{n-1}\leq X_{n}]=1$ . Let $K_{n}(x,\mathrm{d}y):={\mathbb{P}}[X_{n}\in\,\mathrm{d}y\,|\,X_{n-1}=x]$ and let $(Y_{n})_{n\geq 0}$ be a time-inhomogeneous Markov chain with initial law $\mu_{0}$ and transition kernels ${\mathbb{P}}[Y_{n}\in\,\mathrm{d}y\,|\,Y_{n-1}=x]=K_{n}(x,\mathrm{d}y)$ . Then $Y_{n-1}\leq Y_{n}$ a.s. for all $n\geq 1$ and hence the $Y_{n}$ a.s. increase to a limit $Y_{\infty}$ by Lemma 54. It follows that the $\mu_{n}$ converge weakly to the law of $Y_{\infty}$ .

We now turn to the proof of Proposition 19.

Lemma 56 (Lower and upper solutions)

All conclusions of Proposition 19 except for the statement about endogeny hold when the assumption that $\gamma[\omega]$ is monotone for all $\omega\in\Omega$ is replaced by the weaker condition that $T$ is monotone.

**Proof **The proof is similar to the proof of [AB05, Lemma 15], which in turn is based on well-known principles [Lig85, Thm III.2.3]. By symmetry, it suffices to prove the statement for $\nu_{\rm low}$ .

Since for each $s\geq 0$ , $(\mu^{\rm low}_{s+t})_{t\geq 0}$ solves (1.2) with initial state $\mu^{\rm low}_{s}\geq\delta_{0}$ , we conclude from Lemmas 52 and 53 that $\mu^{\rm low}_{s+t}\geq\mu^{\rm low}_{t}$ for each $s\geq 0$ and hence $t\mapsto\mu^{\rm low}_{t}$ is increasing w.r.t. the stochastic order. By Lemma 55, it follows that $\mu^{\rm low}_{t}\Rightarrow\nu_{\rm low}$ for some probability measure $\nu_{\rm low}$ on $S$ . Since $\mu^{\rm low}_{s+t}\Rightarrow\nu_{\rm low}$ for all $s\geq 0$ , using Lemma 32 and the continuity of $T$ , we see that $\nu_{\rm low}$ is a fixed point of the mean-field equation (1.2) and hence solves the RDE (1.54).

If $\nu$ is any solution of the RDE (1.54), then $\mu^{\rm low}_{t}\leq\nu$ for all $t\geq 0$ by Lemma 53 and the fact that $\nu$ is a fixed point of (1.2). Letting $t\to\infty$ , using Lemma 51, we see that $\nu_{\rm low}\leq\nu$ .

Lemma 57 (Random maps applied to extremal elements)

Under the assumptions of Proposition 19, if $(\omega_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ are i.i.d. with common law $|{\mathbf{r}}|^{-1}{\mathbf{r}}$ , then there exist random variables $X^{\rm upp}_{\varnothing}$ and $X^{\rm low}_{\varnothing}$ with laws $\nu_{\rm upp}$ and $\nu_{\rm low}$ that are given by the decreasing, resp. increasing limits

[TABLE]

where the limit does not depend on the choice of the sequence ${\mathbb{U}}^{(n)}\in{\cal T}$ such that ${\mathbb{U}}^{(n)}\uparrow{\mathbb{S}}$ . Here ${\cal T}$ denotes the set of all finite subtrees ${\mathbb{U}}\subset{\mathbb{T}}$ such that either $\varnothing\in{\mathbb{U}}$ or ${\mathbb{U}}=\emptyset$ , and for each ${\mathbb{U}}\in{\cal T}$ , the random map $G_{\mathbb{U}}:S^{\nabla{\mathbb{U}}}\to S$ is defined in (1.46).

**Proof **By symmetry, it suffices to prove the statement for $X^{\rm low}_{\varnothing}$ . Since $\gamma[\omega]$ is monotone for each $\omega\in\Omega$ , the map $G_{\mathbb{U}}$ is monotone for each ${\mathbb{U}}\in{\cal T}$ . Define

[TABLE]

Then $X^{\mathbb{U}}_{\varnothing}\leq X^{\mathbb{V}}_{\varnothing}$ for all ${\mathbb{U}}\subset{\mathbb{V}}$ and hence if ${\mathbb{U}}^{(n)}\in{\cal T}$ increase to ${\mathbb{S}}$ , then the $X^{{\mathbb{U}}^{(n)}}_{\varnothing}$ increase to a limit $X^{\rm low}_{\varnothing}$ that does not depend on the choice of the sequence ${\mathbb{U}}^{(n)}$ .

Let $(\sigma_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ be an independent i.i.d. collection of exponential random variables with mean $|{\mathbf{r}}|^{-1}$ . Define ${\mathbb{S}}_{t}\in{\cal T}$ as in (1.44). Then by Theorem 6, $X^{{\mathbb{S}}_{t}}_{\varnothing}$ has law $\mu^{\rm low}_{t}$ while by what we have already proved $X^{{\mathbb{S}}_{t}}_{\varnothing}$ increases to $X^{\rm low}_{\varnothing}$ . Since $\mu^{\rm low}_{t}\Rightarrow\nu_{\rm low}$ , it follows that $X^{\rm low}_{\varnothing}$ has law $\nu_{\rm low}$ .

Proof of Proposition 19 In view of Lemma 56, it only remains to prove the statement about endogeny. Let $(\bm{\omega}_{\mathbf{i}},X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ be an RTP corresponding to $\gamma$ and some solution $\nu$ to the RDE (1.54). Then

[TABLE]

with $X^{\mathbb{U}}_{\varnothing}$ as in (6.12). So letting ${\mathbb{U}}\uparrow{\mathbb{T}}$ , using the fact that the partial order is closed, we obtain that $X_{\varnothing}\geq X^{\rm low}_{\varnothing}$ . In particular, if $\nu=\nu_{\rm low}$ , then since $X^{\rm low}_{\varnothing}$ also has law $\nu_{\rm low}$ , Lemma 50 tells us that $X_{\varnothing}=X^{\rm low}_{\varnothing}$ a.s. Since the latter is measurable w.r.t. the $\sigma$ -field generated by the $(\bm{\omega}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ , this proves the endogeny of the RTP corresponding to $\gamma$ and $\nu_{\rm low}$ .

6.2 Conditions for uniqueness

In this subsection, we prove Lemmas 20, 21, and 23.

Proof of Lemma 20 If $G_{t}$ is constant then ${\mathbb{S}}_{t}$ is a root determining subtree, proving the implication (i) $\Rightarrow$ (ii). Conversely, if there a.s. exists a root determining subtree ${\mathbb{U}}$ , then, since ${\mathbb{S}}_{t}\uparrow{\mathbb{S}}$ , there a.s. exists a (random) $t<\infty$ such that ${\mathbb{S}}_{t}\supset{\mathbb{U}}$ and hence $G_{s}$ is constant for all $s\geq t$ . The implication (iii) $\Rightarrow$ (ii) is trivial. Conversely, if ${\mathbb{S}}$ contains a root determining subtree ${\mathbb{U}}$ , then by the finiteness of the latter we can keep removing elements from ${\mathbb{U}}$ as long as this is still possible while retaining the property that ${\mathbb{U}}$ is root determining.

Proof of Lemma 21 (i) $\Rightarrow$ (ii): This is clear, since a finite uniquely determined subtree is root determining.

(ii) $\Rightarrow$ (iii): For each $\mathbf{i}\in{\mathbb{S}}$ , let ${\mathbb{S}}^{\mathbf{i}}$ , defined in (5.5), denote the subtree of ${\mathbb{S}}$ that is rooted at $\mathbf{i}$ . Since ${\mathbb{S}}^{\mathbf{i}}$ is equally distributed with ${\mathbb{S}}$ , by (ii), for each $\mathbf{i}\in{\mathbb{S}}$ , there a.s. exists a root determining subtree ${\mathbb{U}}^{\mathbf{i}}\subset{\mathbb{S}}^{\mathbf{i}}$ . Since $x\in\Xi_{\mathbb{S}}$ implies $x_{\mathbf{i}}=G_{{\mathbb{U}}^{\mathbf{i}}}\big{(}(x_{\mathbf{i}\mathbf{j}})_{\mathbf{j}\in\nabla{\mathbb{U}}^{\mathbf{i}}}\big{)}$ and $G_{{\mathbb{U}}^{\mathbf{i}}}$ is constant, it follows that ${\mathbb{S}}$ is a.s. uniquely determined.

(ii) $\Rightarrow$ (v): Since $G_{{\mathbb{U}}^{\mathbf{i}}}$ is constant, we can define

[TABLE]

where the right-hand side does not depend on the choice of $(x_{\mathbf{i}\mathbf{j}})_{\mathbf{j}\in\nabla{\mathbb{U}}^{\mathbf{i}}}$ . It is straightforward to check that $(\bm{\omega}_{\mathbf{i}},X_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ satisfies conditions (i)–(iii) of Lemma 8 and hence is an RTP corresponding to $\gamma$ . It follows that $\nu:={\mathbb{P}}[X_{\varnothing}\in\,\cdot\,]$ solves the RDE (1.54).

Let $(Y_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ be an independent i.i.d. collection of $S$ -valued random variables with common law $\mu_{0}$ , let $(\sigma_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ be an independent i.i.d. collection of exponential random variables with mean $|{\mathbf{r}}|^{-1}$ , and define $X^{t}_{\varnothing}:=G_{t}\big{(}(Y_{\mathbf{i}})_{\mathbf{i}\in\nabla{\mathbb{S}}_{t}}\big{)}$ . Then $X^{t}_{\varnothing}$ has law $\mu_{t}$ by Theorem 6. Since $G_{t}=G_{{\mathbb{S}}_{t}}$ with ${\mathbb{S}}_{t}\uparrow{\mathbb{S}}$ , we see from (6.14) that ${\mathbb{P}}[X^{t}_{\varnothing}\neq X_{\varnothing}]\to 0$ as $t\to\infty$ , proving that $\|\mu_{t}-\nu\|\to 0$ .

(iii) $\Rightarrow$ (iv): We note the following general principle: if $S_{1},S_{2},S_{3}$ are Polish spaces and $(X_{1},X_{2})$ and $(X^{\prime}_{1},X_{3})$ are random variables taking values in $S_{1}\times S_{2}$ resp. $S_{1}\times S_{3}$ such that $X_{1}$ and $X^{\prime}_{1}$ are equal in law, then we can couple $(X_{1},X_{2})$ and $(X^{\prime}_{1},X_{3})$ such that $X_{1}=X^{\prime}_{1}$ . To see this, let $\mu$ denote the law of $X_{1}$ , let $K_{i}(x_{1},\mathrm{d}x_{i})$ denote a regular version of the conditional law of $X_{i}$ given $X_{1}$ resp. $X^{\prime}_{1}$ $(i=2,3)$ , and define the joint law of $X_{1},X_{2},X_{3}$ as

[TABLE]

i.e., make $X_{2}$ and $X_{3}$ conditionally independent given $X_{1}$ . Applying this general principle, we see that if $\nu_{1},\nu_{2}$ are solutions to the RDE (1.54), then we can couple the associated RTPs $(\bm{\omega}_{\mathbf{i}},X^{1}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ and $(\bm{\omega}^{\prime}_{\mathbf{i}},X^{2}_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{T}}}$ in such a way that $\bm{\omega}_{\mathbf{i}}=\bm{\omega}^{\prime}_{\mathbf{i}}$ for all $\mathbf{i}\in{\mathbb{T}}$ . Since ${\mathbb{S}}$ is a.s. uniquely determined, it follows that $X^{1}_{\varnothing}=X^{2}_{\varnothing}$ a.s. and hence $\nu_{1}=\nu_{2}$ . The same argument also shows that any solution to the bivariate RDE is concentrated on the diagonal, which by Theorem 10 implies endogeny.
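The general coupling principle used in the step (iii) $\Rightarrow$ (iv) can be checked by hand on finite spaces. The sketch below is a minimal illustration under that finiteness assumption; the function names `glue` and `marginal` and the two example laws are hypothetical. It builds the joint law in which $X_{2}$ and $X_{3}$ are conditionally independent given $X_{1}$ , and recovers both original pair laws as marginals.

```python
from collections import defaultdict
from fractions import Fraction as F


def glue(law12, law13):
    """Joint law of (X1, X2, X3) making X2 and X3 conditionally independent given X1.

    Both inputs are dicts mapping pairs to probabilities and are assumed to
    have the same marginal law for the first coordinate.
    """
    marg1 = defaultdict(F)
    for (x1, _), p in law12.items():
        marg1[x1] += p
    joint = defaultdict(F)
    for (x1, x2), p in law12.items():
        for (y1, x3), q in law13.items():
            if x1 == y1 and marg1[x1] > 0:
                # mu(x1) * K2(x1, x2) * K3(x1, x3) = p * q / mu(x1)
                joint[(x1, x2, x3)] += p * q / marg1[x1]
    return dict(joint)


def marginal(joint, keep):
    """Marginal law of the coordinates listed in `keep`."""
    out = defaultdict(F)
    for xs, p in joint.items():
        out[tuple(xs[k] for k in keep)] += p
    return dict(out)


# Two hypothetical pair laws whose first coordinates are both uniform on {0, 1}.
law12 = {(0, "a"): F(1, 4), (0, "b"): F(1, 4), (1, "a"): F(1, 2)}
law13 = {(0, "u"): F(1, 2), (1, "u"): F(1, 6), (1, "v"): F(1, 3)}
joint = glue(law12, law13)
```

Here `marginal(joint, (0, 1))` returns `law12` and `marginal(joint, (0, 2))` returns `law13`, so the two pairs are coupled with a common first coordinate.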

(iii) and $S$ finite imply (ii): Since ${\mathbb{V}}\subset{\mathbb{U}}$ and ${\mathbb{V}}$ root determining imply that ${\mathbb{U}}$ is root determining, we see that ${\mathbb{P}}[G_{t}\mbox{ not constant}]$ decreases to ${\mathbb{P}}[G_{t}\mbox{ not constant }\forall t\geq 0]$ . Assume that this event has positive probability and condition on it. Choose $t(n)\to\infty$ . Then there exist $x^{n},y^{n}\in\Xi_{{\mathbb{S}}_{t(n)}}$ such that $x^{n}_{\varnothing}\neq y^{n}_{\varnothing}$ . Since $S$ is finite, the sequences $x^{n}$ and $y^{n}$ have subsequences that converge pointwise for each $\mathbf{i}\in{\mathbb{S}}$ to limits $x^{\infty},y^{\infty}$ . It is easy to see that $x^{\infty},y^{\infty}\in\Xi_{\mathbb{S}}$ . Moreover, $x^{\infty}_{\varnothing}\neq y^{\infty}_{\varnothing}$ . This shows that on the event $\{G_{t}\mbox{ not constant }\forall t\geq 0\}$ , the tree ${\mathbb{S}}$ is not uniquely determined.

(ii) and $S=\{0,1\}$ imply (i): It suffices to show that each root determining subtree ${\mathbb{U}}$ of ${\mathbb{S}}$ contains a uniquely determined subtree. For any $\mathbf{i}\in{\mathbb{T}}$ , let ${\mathbb{T}}^{(\mathbf{i})}:=\{\mathbf{i}\mathbf{j}:\mathbf{j}\in{\mathbb{T}}\}$ denote $\mathbf{i}$ and its descendants, and let $\Xi_{{\mathbb{U}},\mathbf{i}}$ denote the set of all $(x_{\mathbf{j}})_{\mathbf{j}\in({\mathbb{U}}\cup\nabla{\mathbb{U}})\cap{\mathbb{T}}^{(\mathbf{i})}}$ that satisfy

[TABLE]

Define $(\chi_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{U}}\cup\nabla{\mathbb{U}}}$ by

[TABLE]

We claim that

[TABLE]

Indeed, this follows from the fact that the sets $({\mathbb{U}}\cup\nabla{\mathbb{U}})\cap{\mathbb{T}}^{(\mathbf{i})}$ for different $\mathbf{i}\in\nabla{\mathbb{V}}$ are mutually disjoint, which allows us to choose $x\in\Xi_{{\mathbb{U}},\mathbf{i}}$ independently for each $\mathbf{i}\in\nabla{\mathbb{V}}$ .

Note that $\chi_{\mathbf{i}}=S$ for $\mathbf{i}\in\nabla{\mathbb{U}}$ and $|\chi_{\varnothing}|=1$ if ${\mathbb{U}}$ is root determining. Let ${\mathbb{V}}$ be the connected component of $\{\mathbf{i}\in{\mathbb{U}}:|\chi_{\mathbf{i}}|=1\}$ that contains $\varnothing$ . Since $S$ has only two elements, $\chi_{\mathbf{i}}=S$ for $\mathbf{i}\in\nabla{\mathbb{V}}$ . Using (6.18), it follows that ${\mathbb{V}}$ is uniquely determined.

For completeness, we give three examples to show that the implications (ii) $\Rightarrow$ (i), (iii) $\Rightarrow$ (ii), and (v) $\Rightarrow$ (ii) do not hold in general. In all of these examples, $\kappa(\omega)=1$ for all $\omega\in\Omega$ , which means ${\mathbb{S}}=\{\varnothing,1,11,111,\ldots\}=\{1_{(n)}:n\geq 0\}$ , where $1_{(n)}$ denotes the word of length $n$ made from the alphabet $\{1\}$ . It follows that the operator $T$ from (1.1) is just the linear operator associated with the transition kernel of a Markov chain. In all our examples, we take $\gamma[\omega]=g$ for all $\omega\in\Omega$ , where $g:S\to S$ is a fixed map.

Example 58 ((ii) $\not\Rightarrow$ (i))

Let $S=\{0,1,2\}$ and $g(x):=(x-1)\vee 0$ . Then ${\mathbb{S}}$ a.s. contains a root determining subtree but ${\mathbb{S}}$ a.s. does not contain a uniquely determined subtree.

**Proof **The subtree ${\mathbb{U}}:=\{\varnothing,1\}$ is root determining, since $G_{\mathbb{U}}(x)=(x-2)\vee 0=0$ for all $x\in S^{\nabla{\mathbb{U}}}=S$ . On the other hand, if ${\mathbb{V}}=\{\varnothing,1,11,\ldots,1_{(n)}\}$ is a finite subtree of ${\mathbb{S}}$ that contains the root, then there exist $x,y\in\Xi_{\mathbb{V}}$ with $x_{1_{(n)}}=0$ and $y_{1_{(n)}}=1$ , which shows ${\mathbb{V}}$ is not uniquely determined.
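Example 58 is small enough to verify by enumeration. The sketch below is an illustration only; the helper names `iterate` and `configurations` are hypothetical. It checks that two applications of $g$ give a constant map, and that for every $n$ the value at the deepest vertex $1_{(n)}$ of ${\mathbb{V}}$ can still be either $0$ or $1$ .

```python
S = (0, 1, 2)


def g(x):
    """The map of Example 58: g(x) = (x - 1) v 0."""
    return max(x - 1, 0)


def iterate(h, n):
    """n-fold composition of h with itself."""
    def composed(x):
        for _ in range(n):
            x = h(x)
        return x
    return composed


# U = {root, 1} applies g twice to the single boundary value.
image_two_steps = {iterate(g, 2)(x) for x in S}


def configurations(n):
    """All (x_0, ..., x_{n+1}) with x_k = g(x_{k+1}); x_k is the value at the word 1_(k)."""
    out = []
    for boundary in S:
        xs = [boundary]
        for _ in range(n + 1):
            xs.append(g(xs[-1]))
        out.append(tuple(reversed(xs)))
    return out


# Values that can occur at the deepest vertex 1_(n) of V = {root, ..., 1_(n)}.
deepest_values = {n: {cfg[n] for cfg in configurations(n)} for n in range(6)}
```

The set `image_two_steps` is `{0}`, while each entry of `deepest_values` is `{0, 1}`, matching the two halves of the proof.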

Example 59 ((iii) $\not\Rightarrow$ (ii))

Let $S={\mathbb{N}}$ and

$$g(x):=\begin{cases}0&\text{if }x=0,\\ x+1&\text{if }x\geq 1.\end{cases}\tag{6.19}$$

Then ${\mathbb{S}}$ is a.s. uniquely determined but ${\mathbb{S}}$ a.s. contains no root determining subtree.

**Proof **If $x\in\Xi_{\mathbb{S}}$ satisfies $x_{1_{(n)}}=m\neq 0$ for some $n$ , then $x_{1_{(n+k)}}\neq 0$ and $x_{1_{(n+k)}}=m-k$ for all $k\geq 0$ , which leads to a contradiction. It follows that $\Xi_{\mathbb{S}}$ contains a single element, which is given by $x_{1_{(n)}}=0$ for all $n\geq 0$ . In particular, ${\mathbb{S}}$ is uniquely determined. On the other hand, for each finite subtree ${\mathbb{U}}=\{\varnothing,1,11,\ldots,1_{(n-1)}\}$ that contains the root, the function $G_{\mathbb{U}}$ is of the form $G_{\mathbb{U}}(0)=0$ and $G_{\mathbb{U}}(x)=x+n$ $(x\geq 1)$ , which is clearly not constant.
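The two halves of this argument can be traced numerically. In the sketch below, $g$ is taken in the form read off from the proof, namely $g(0)=0$ and $g(x)=x+1$ for $x\geq 1$ ; the helper names `preimages`, `backward_depth`, and `iterate` are hypothetical. A nonzero value at some vertex can only be traced back through finitely many generations, while no finite iterate of $g$ is constant.

```python
def g(x):
    """Map on S = {0, 1, 2, ...} as read off from the proof."""
    return 0 if x == 0 else x + 1


def preimages(m):
    """All x in S with g(x) = m."""
    if m == 0:
        return [0]
    return [m - 1] if m >= 2 else []


def backward_depth(m):
    """Number of generations a nonzero value m can be traced back before no preimage exists."""
    if m == 0:
        raise ValueError("the value 0 can be traced back forever")
    steps = 0
    while preimages(m):
        (m,) = preimages(m)
        steps += 1
    return steps


def iterate(n):
    """n-fold composition of g, i.e. G_U for a subtree U with n elements."""
    def composed(x):
        for _ in range(n):
            x = g(x)
        return x
    return composed
```

For instance, `backward_depth(5)` stops after four steps at the value $1$ , which has no preimage, whereas `iterate(n)` sends $0$ to $0$ and $1$ to $1+n$ .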

Example 60 ((v) $\not\Rightarrow$ (ii))

Let $S=\{0,1\}$ and $g(x):=1-x$ with $\pi(\{g\})>0$ . Then the RDE (1.54) has a solution $\nu$ that is globally attractive in the sense that any solution $(\mu_{t})_{t\geq 0}$ to (1.2) satisfies $\|\mu_{t}-\nu\|\underset{{t}\to\infty}{\longrightarrow}0$ , where $\|\,\cdot\,\|$ denotes the total variation norm. Nevertheless, ${\mathbb{S}}$ contains no root determining subtree.

**Proof **Since the continuous-time Markov chain that jumps from $x$ to $1-x$ with rate $\pi(\{g\})$ is ergodic, the RDE (1.54) has a solution $\nu$ that is globally attractive. On the other hand, if ${\mathbb{U}}=\{\varnothing,1,11,\ldots,1_{(n-1)}\}$ is a finite subtree that contains the root, then $G_{\mathbb{U}}(x)=x$ if $n$ is even and $G_{\mathbb{U}}(x)=1-x$ if $n$ is odd, so $G_{\mathbb{U}}$ is not constant.
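Both claims of Example 60 can be made concrete. The sketch below is illustrative; the function names and the symbol `rate` for the flip rate are assumptions. Writing $p_{t}:=\mu_{t}(\{1\})$ , the mean-field equation for the single map $g$ reduces to $\frac{\mathrm{d}}{\mathrm{d}t}p_{t}=\mathrm{rate}\cdot(1-2p_{t})$ , which has the explicit solution used in `p_one`.

```python
import math


def g(x):
    """The flip map of Example 60."""
    return 1 - x


def iterate(n):
    """n-fold composition of g, i.e. G_U for a subtree U with n elements."""
    def composed(x):
        for _ in range(n):
            x = g(x)
        return x
    return composed


def p_one(p0, rate, t):
    """Mass at 1 at time t: solves p' = rate * (1 - 2 p) with p(0) = p0."""
    return 0.5 + (p0 - 0.5) * math.exp(-2.0 * rate * t)


# Every finite iterate of g is a bijection of {0, 1}, hence never constant.
images = [{iterate(n)(x) for x in (0, 1)} for n in range(8)]
```

The mass at $1$ converges exponentially fast to $1/2$ from any initial value, so the uniform law is globally attractive, while each set in `images` has two elements.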

Proof of Lemma 23 By Lemma 21, it suffices to prove that if the RDE (1.54) has a unique solution, then $G_{t}$ is constant for $t$ large enough. By Proposition 19, the RDE (1.54) has a unique solution if and only if $\nu_{\rm low}=\nu_{\rm upp}$ . Let $0$ and $1$ denote the minimal and maximal elements of $S$ . By Lemma 57, $G_{t}(0,\ldots,0)$ and $G_{t}(1,\ldots,1)$ converge as $t\to\infty$ to a.s. limits with laws $\nu_{\rm low}$ and $\nu_{\rm upp}$ , respectively. Since $\gamma[\omega]$ is monotone for each $\omega$ , the maps $G_{t}$ are monotone, and hence

$$G_{t}(0,\ldots,0)\leq G_{t}(x)\leq G_{t}(1,\ldots,1)\tag{6.20}$$

for all $x\in S^{\nabla{\mathbb{S}}_{t}}$ . Since $S$ is finite, if the laws of the left- and right-hand sides of (6.20) converge to the same limit, then $\lim_{t\to\infty}{\mathbb{P}}[G_{t}(0,\ldots,0)=G_{t}(1,\ldots,1)]=1$ , proving that $G_{t}$ is constant for $t$ large enough.
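The sandwich argument in this proof says that a monotone composed map is constant as soon as it agrees on the all-zero and all-one boundary configurations. The sketch below checks this on two small hand-built trees with $S=\{0,1\}$ . The maps `cob` and `dth`, the nested-tuple tree encoding, and the two example trees are all assumptions made for illustration.

```python
from itertools import product


def cob(x1, x2, x3):
    """Illustrative monotone map: x1 or (x2 and x3)."""
    return x1 | (x2 & x3)


def dth():
    """Illustrative map of arity 0 that always returns 0."""
    return 0


LEAF = None  # marks a boundary point, whose value is an input of the composed map


def n_leaves(tree):
    """Number of boundary points of the tree."""
    if tree is LEAF:
        return 1
    _, children = tree
    return sum(n_leaves(c) for c in children)


def evaluate(tree, leaves):
    """Value at the root; `leaves` is an iterator over boundary values, read left to right."""
    if tree is LEAF:
        return next(leaves)
    gamma, children = tree
    return gamma(*(evaluate(c, leaves) for c in children))


def is_constant(tree):
    """Brute-force check over all boundary configurations."""
    n = n_leaves(tree)
    return len({evaluate(tree, iter(x)) for x in product((0, 1), repeat=n)}) == 1


def extremes_agree(tree):
    """Compare the root values for the all-zero and all-one boundary configurations."""
    n = n_leaves(tree)
    return evaluate(tree, iter([0] * n)) == evaluate(tree, iter([1] * n))


tree_a = (cob, [(dth, []), LEAF, (dth, [])])  # x -> 0 or (x and 0), constant
tree_b = (cob, [LEAF, (dth, []), LEAF])       # (x, y) -> x or (0 and y), not constant
```

For both trees, `is_constant` and `extremes_agree` return the same answer, as monotonicity guarantees.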

6.3 Duality

In this subsection, we prove Lemma 24. For a start, we will generalize quite a bit and assume that $S$ is a finite partially ordered set and that $\gamma[\omega]:S^{\kappa(\omega)}\to S$ is monotone for all $\omega\in\Omega$ , where $S^{\kappa(\omega)}$ is equipped with the product partial order. As in Subsection 5.1, we let ${\cal T}$ denote the set of all finite subtrees ${\mathbb{U}}\subset{\mathbb{T}}$ such that either $\varnothing\in{\mathbb{U}}$ or ${\mathbb{U}}=\emptyset$ . For each ${\mathbb{U}}\in{\cal T}$ , we define $G_{\mathbb{U}}:S^{\nabla{\mathbb{U}}}\to S$ as in (1.46), where $\nabla{\mathbb{U}}:=\{\varnothing\}$ if ${\mathbb{U}}=\emptyset$ .

For any ${\mathbb{U}}\in{\cal T}$ , we let $\Sigma_{\mathbb{U}}$ denote the set of all $(y_{\mathbf{i}})_{\mathbf{i}\in{\mathbb{U}}\cup\nabla{\mathbb{U}}}$ that satisfy

[TABLE]

Lemma 61 (Monotone duality)

For any ${\mathbb{U}}\in{\cal T}$ , $x\in S^{\nabla{\mathbb{U}}}$ , and $z\in S$ , one has $G_{\mathbb{U}}(x)\geq z$ if and only if there exists a $y\in\Sigma_{\mathbb{U}}$ such that $y_{\varnothing}=z$ and $x\geq y$ on $\nabla{\mathbb{U}}$ .

**Proof **Fix $z\in S$ . For each ${\mathbb{U}}\in{\cal T}$ , let us write

[TABLE]

Then we need to show that

[TABLE]

The proof is by induction on the number of elements of ${\mathbb{U}}$ . If ${\mathbb{U}}=\emptyset$ , then $G_{\mathbb{U}}$ is the identity map, $Y_{\mathbb{U}}=\{z\}$ , and the statement is trivial.

We will show that if the statement is true for ${\mathbb{U}}$ and if $\mathbf{j}\in\nabla{\mathbb{U}}$ , then the statement is also true for ${\mathbb{V}}:={\mathbb{U}}\cup\{\mathbf{j}\}$ . Let $x\in S^{\nabla{\mathbb{V}}}$ and inductively define $x_{\mathbf{i}}$ for $\mathbf{i}\in{\mathbb{V}}$ as in (1.45). By the induction hypothesis, $x_{\varnothing}\geq z$ if and only if

[TABLE]

Here $\nabla{\mathbb{V}}=(\nabla{\mathbb{U}}\backslash\{\mathbf{j}\})\cup\{\mathbf{j}1,\ldots,\mathbf{j}\kappa(\bm{\omega}_{\mathbf{j}})\}$ and

[TABLE]

It follows that (6.24) is equivalent to

[TABLE]

which completes the induction step of the proof.

Lemma 62 (Minimal elements)

Assume that for all $\omega\in\Omega$ , there do not exist $z,z^{\prime}\in S$ and minimal elements $y,y^{\prime}$ of $\{y:\gamma[\omega](y)\geq z\}$ resp. $\{y^{\prime}:\gamma[\omega](y^{\prime})\geq z^{\prime}\}$ such that $z\not\leq z^{\prime}$ but $y\leq y^{\prime}$ . Fix $z\in S$ . For any ${\mathbb{U}}\in{\cal T}$ , define $Y_{\mathbb{U}}$ as in (6.22) dependent on $z$ . Then

[TABLE]

**Proof** By Lemma 61,

[TABLE]

In view of this, it suffices to prove that

[TABLE]

The proof is by induction on the number of elements of ${\mathbb{U}}$ . If ${\mathbb{U}}=\emptyset$ , then $\nabla{\mathbb{U}}=\{\varnothing\}$ and $Y_{\mathbb{U}}$ consists of a single element that has $y_{\varnothing}=z$ , so (6.29) is satisfied. Assume that (6.29) holds for ${\mathbb{U}}$ and let ${\mathbb{V}}:={\mathbb{U}}\cup\{\mathbf{i}\}$ for some $\mathbf{i}\in\nabla{\mathbb{U}}$ . Then (6.25) and the assumption of the lemma imply that (6.29) holds for ${\mathbb{V}}$ .

Lemma 63 (Sets with two elements)

Assume that $S=\{0,1\}$ and that $\gamma[\omega](0,\ldots,0)=0$ for all $\omega\in\Omega$ . Then the assumption of Lemma 62 is satisfied.

**Proof** If $z\not\leq z^{\prime}$ then we must have $z=1$ and $z^{\prime}=0$ , so we must show that there do not exist minimal elements $y,y^{\prime}$ of $\{y:\gamma[\omega](y)\geq 1\}$ resp. $\{y^{\prime}:\gamma[\omega](y^{\prime})\geq 0\}$ such that $y\leq y^{\prime}$ . Clearly, $\{y^{\prime}:\gamma[\omega](y^{\prime})\geq 0\}=\{0,1\}^{\kappa(\omega)}$ has only one minimal element, which is the configuration $(0,\ldots,0)\in\{0,1\}^{\kappa(\omega)}$ , so we must show that there does not exist a minimal element $y$ of $\{y:\gamma[\omega](y)\geq 1\}$ such that $y\leq(0,\ldots,0)$ . Equivalently, this says that $\gamma[\omega](0,\ldots,0)\not\geq 1$ which is satisfied since $\gamma[\omega](0,\ldots,0)=0$ .
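The hypothesis of Lemma 62 can be checked by brute force for small maps on $\{0,1\}$ . The following is a minimal sketch, assuming that ${\mbox{\tt cob}}(x_{1},x_{2},x_{3})=x_{1}\vee(x_{2}\wedge x_{3})$ and that ${\mbox{\tt dth}}$ and ${\mbox{\tt bth}}$ are the constant maps $0$ and $1$ on zero arguments; these forms are assumptions taken to match the minimal configurations used in Subsection 7.3, and the helper names `minimal_elements` and `lemma62_hypothesis_holds` are hypothetical.

```python
from itertools import product

def leq(a, b):
    """Product partial order on tuples of equal length."""
    return all(s <= t for s, t in zip(a, b))

def minimal_elements(gamma, k, z):
    """Minimal elements of {y in {0,1}^k : gamma(y) >= z}."""
    up = [y for y in product((0, 1), repeat=k) if gamma(*y) >= z]
    return [y for y in up if not any(w != y and leq(w, y) for w in up)]

def lemma62_hypothesis_holds(gamma, k):
    """True if there are no z, z' with z not <= z' and minimal y <= y'."""
    for z, z_prime in product((0, 1), repeat=2):
        if z <= z_prime:
            continue
        for y in minimal_elements(gamma, k, z):
            for y_prime in minimal_elements(gamma, k, z_prime):
                if leq(y, y_prime):
                    return False
    return True

# Assumed forms of the maps (not spelled out in this section).
def cob(x1, x2, x3):
    return x1 | (x2 & x3)

def dth():
    return 0

def bth():
    return 1

print(minimal_elements(cob, 3, 1))
print(lemma62_hypothesis_holds(cob, 3),
      lemma62_hypothesis_holds(dth, 0),
      lemma62_hypothesis_holds(bth, 0))
```

Under these assumptions the check succeeds for ${\mbox{\tt cob}}$ and ${\mbox{\tt dth}}$ and fails for ${\mbox{\tt bth}}$ , which maps the (empty) all-zero configuration to $1$ ; this is the reason Lemma 63 requires $\gamma[\omega](0,\ldots,0)=0$ .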

Lemma 64 (Lower and upper solutions)

Assume that $S$ is a finite partially ordered set that contains minimal and maximal elements, denoted by 0 and 1. Assume that $\gamma[\omega]$ is monotone for all $\omega\in\Omega$ . Then, for all $z\in S$ ,

[TABLE]

**Proof** By Lemma 61, $G_{\mathbb{U}}(1,\ldots,1)\geq z$ if and only if $\Sigma^{z}_{\mathbb{U}}:=\{y\in\Sigma_{\mathbb{U}}:y_{\varnothing}=z\}$ is not empty. If ${\mathbb{U}}\subset{\mathbb{V}}$ , then $\Sigma^{z}_{\mathbb{V}}\neq\emptyset$ implies $\Sigma^{z}_{\mathbb{U}}\neq\emptyset$ , so the events $\{\Sigma^{z}_{{\mathbb{U}}^{(n)}}\neq\emptyset\}$ decrease to a limit. We claim that this is the event $\{\Sigma^{z}_{\mathbb{S}}\neq\emptyset\}$ . Since the restriction of an element $y\in\Sigma^{z}_{\mathbb{S}}$ to ${\mathbb{U}}$ yields an element of $\Sigma^{z}_{\mathbb{U}}$ , it is clear that

[TABLE]

Conversely, if for each $n$ there exists some $y(n)\in\Sigma^{z}_{{\mathbb{U}}^{(n)}}$ , then by the finiteness of $S$ we can select a subsequence of the $y(n)$ that converges pointwise to a limit $y$ . Since $y\in\Sigma^{z}_{\mathbb{S}}$ , this proves the other inclusion. By Lemma 57, it follows that

[TABLE]

By Lemma 61, $G_{\mathbb{U}}(0,\ldots,0)\geq z$ if and only if $\Sigma^{z}_{\mathbb{U}}$ contains an element $y$ such that $y_{\mathbf{i}}=0$ for all $\mathbf{i}\in\nabla{\mathbb{U}}$ . Since for each $\omega\in\Omega$ , the zero configuration $(0,\ldots,0)$ is the unique minimal element of $\{x\in S^{\kappa(\omega)}:\gamma[\omega](x)\geq 0\}$ , we observe that if $y\in\Sigma_{\mathbb{U}}$ satisfies $y_{\mathbf{i}}=0$ for all $\mathbf{i}\in\nabla{\mathbb{U}}$ , then $y$ can uniquely be extended to an element of $\Sigma_{\mathbb{V}}$ for any ${\mathbb{V}}\supset{\mathbb{U}}$ by putting $y_{\mathbf{i}}:=0$ for $\mathbf{i}\in({\mathbb{V}}\cup\nabla{\mathbb{V}})\backslash({\mathbb{U}}\cup\nabla{\mathbb{U}})$ . In view of this, by Lemma 57,

[TABLE]

Proof of Lemma 24 Recall the definition of $\Sigma_{\mathbb{U}}$ in (6.21). We observe that ${\mathbb{O}}$ is an open subtree of ${\mathbb{U}}$ if and only if its indicator function $1_{\mathbb{O}}$ satisfies $1_{\mathbb{O}}\in\Sigma_{\mathbb{U}}$ and $1_{\mathbb{O}}(\varnothing)=1$ . In view of this, (1.98) is just a special case of Lemma 64. Formula (1.99) follows from Lemmas 62 and 63 applied to ${\mathbb{U}}={\mathbb{S}}_{t}$ .

7 Cooperative branching

In this section we prove all results that deal specifically with our running example of a system with cooperative branching and deaths. In Subsection 7.1, we prove Proposition 12 about the bivariate mean-field equation. In Subsection 7.2, we prove Theorem 17 and Lemma 18 about the higher-level mean-field equation. In Subsection 7.3, finally, we prove Lemmas 22 and 25 which illustrate the concepts of minimal root determining subtrees and open subtrees in the concrete set-up of our example.

7.1 The bivariate mean-field equation

In this subsection we prove Proposition 12. We identify a measure $\mu^{(2)}$ on $\{0,1\}^{2}$ with the function $\mu^{(2)}:\{0,1\}^{2}\to{\mathbb{R}}$ defined as $\mu^{(2)}(0,0):=\mu^{(2)}(\{(0,0)\})$ , $\mu^{(2)}(0,1):=\mu^{(2)}(\{(0,1)\})$ , etc. We parametrize a measure $\mu^{(2)}\in{\cal P}_{\rm sym}(\{0,1\}^{2})$ by the parameters

$$p:=\mu^{(2)}(1,0)+\mu^{(2)}(1,1)\quad\mbox{and}\quad r:=1-\mu^{(2)}(0,0).\qquad(7.1)$$

We observe that $\mu^{(2)}(0,0)=1-r$ , $\mu^{(2)}(1,0)=\mu^{(2)}(0,1)=r-p$ , and hence $\mu^{(2)}(1,1)=1-(1-r)-2(r-p)=2p-r$ . It follows that $p$ and $r$ determine $\mu^{(2)}$ uniquely and indeed, the map

[TABLE]

is a bijection. Moreover, $\mu^{(2)}$ is concentrated on the diagonal $\{(0,0),(1,1)\}$ if and only if $p=r$ . Through (7.1), a function $(\mu^{(2)}_{t})_{t\geq 0}$ with values in ${\cal P}_{\rm sym}(\{0,1\}^{2})$ gives rise to a function $(p_{t},r_{t})_{t\geq 0}$ taking values in ${\cal D}$ .
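The change of parameters can be sketched in a few lines. This is only an illustration: the set ${\cal D}$ is taken here to be the set of all $(p,r)$ for which the four point masses are nonnegative, which is an assumption about (7.2), and the function names are hypothetical.

```python
def to_params(mu):
    """Map a symmetric law on {0,1}^2 (dict keyed by pairs) to (p, r)."""
    p = mu[(1, 0)] + mu[(1, 1)]   # mass of the first marginal at 1
    r = 1.0 - mu[(0, 0)]          # probability that the pair is not (0, 0)
    return p, r

def from_params(p, r):
    """Inverse map, using the identities stated in the text."""
    return {(0, 0): 1 - r, (1, 0): r - p, (0, 1): r - p, (1, 1): 2 * p - r}

def in_domain(p, r):
    """Assumed description of D: all four masses are nonnegative."""
    return 0 <= p <= r <= min(2 * p, 1)

mu = {(0, 0): 0.5, (1, 0): 0.1, (0, 1): 0.1, (1, 1): 0.3}
p, r = to_params(mu)
print(p, r, in_domain(p, r), from_params(p, r))
```

In this parametrization the off-diagonal mass is $r-p$ , so the law sits on the diagonal exactly when $p=r$ .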

Lemma 65 (Change of parameters)

A function $(\mu^{(2)}_{t})_{t\geq 0}$ with values in ${\cal P}_{\rm sym}(\{0,1\}^{2})$ solves (1.65) if and only if the associated function $(p_{t},r_{t})_{t\geq 0}$ solves

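$$\begin{array}{rl}{\rm(i)}&\displaystyle\frac{\partial}{\partial t}p_{t}=\alpha p_{t}^{2}(1-p_{t})-p_{t},\\[5pt]{\rm(ii)}&\displaystyle\frac{\partial}{\partial t}r_{t}=\alpha(1-r_{t})\big(r_{t}^{2}-2(r_{t}-p_{t})^{2}\big)-r_{t}.\end{array}\qquad(7.3)$$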

**Proof** As noted in Section 1.6, if $\mu^{(2)}_{t}$ solves the bivariate mean-field equation, then its one-dimensional marginals solve the mean-field equation (1.2). Since $\mu^{(2)}_{t}$ is symmetric, both marginals are the same. We denote these by $\mu_{t}$ . Then $p_{t}:=\mu_{t}(\{1\})$ and the equation we find for $p_{t}$ is the same as in (1.36).

We will now obtain the equation for the parameter $r_{t}$ . By definitions (1.12) and (1.13) we have for any $\mu^{(2)}\in{\cal P}(\{0,1\}^{2})$ that $T_{{\mbox{\tt cob}}^{(2)}}(\mu^{(2)})$ is the law of the random variable

[TABLE]

where $(X^{1}_{i},X^{2}_{i})$ $(i=1,2,3)$ are i.i.d. with law $\mu^{(2)}$ . It follows that

[TABLE]

Similar, but simpler considerations give

[TABLE]

Equation (1.65) in the point $(0,0)$ now gives

[TABLE]

which simplifies to the second equation in (7.3).
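The computation at the point $(0,0)$ can be cross-checked by enumerating all $4^{3}$ input configurations. The sketch below assumes ${\mbox{\tt cob}}(x_{1},x_{2},x_{3})=x_{1}\vee(x_{2}\wedge x_{3})$ , cooperative branching at rate $\alpha$ and deaths at rate $1$ ; the closed form $(1-r)\big(1-2p^{2}+(2p-r)^{2}\big)$ is a reconstruction under these assumptions, and all function names are hypothetical.

```python
from itertools import product

def cob(x1, x2, x3):
    # Assumed cooperative branching map: x1 OR (x2 AND x3).
    return x1 | (x2 & x3)

def law_from_params(p, r):
    """Symmetric law on {0,1}^2 with parameters (p, r)."""
    return {(0, 0): 1 - r, (1, 0): r - p, (0, 1): r - p, (1, 1): 2 * p - r}

def T_cob2_at_00(mu):
    """Mass that T_{cob^(2)}(mu) gives to (0,0), summing over all inputs."""
    total = 0.0
    for s1, s2, s3 in product(mu, repeat=3):
        image = (cob(s1[0], s2[0], s3[0]), cob(s1[1], s2[1], s3[1]))
        if image == (0, 0):
            total += mu[s1] * mu[s2] * mu[s3]
    return total

def closed_form_at_00(p, r):
    """Reconstructed closed form for the same quantity."""
    return (1 - r) * (1 - 2 * p**2 + (2 * p - r) ** 2)

def drift_r(alpha, p, r):
    """d/dt r_t = -(d/dt) mu_t(0,0), with cob at rate alpha and deaths at rate 1."""
    mu = law_from_params(p, r)
    return -(alpha * (T_cob2_at_00(mu) - mu[(0, 0)]) + (1.0 - mu[(0, 0)]))

p, r, alpha = 0.4, 0.5, 5.0
print(T_cob2_at_00(law_from_params(p, r)), closed_form_at_00(p, r))
print(drift_r(alpha, p, r), alpha * (1 - r) * (r**2 - 2 * (r - p) ** 2) - r)
```

Both printed pairs should agree up to rounding, which is consistent with a drift of the form $\alpha(1-r)\big(r^{2}-2(r-p)^{2}\big)-r$ for $r_{t}$ .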

In view of Lemma 65 and the remarks that precede it, Proposition 12 follows from the following proposition.

Proposition 66 (Bivariate differential equation)

For $\alpha>4$ , the equation (7.3) has four fixed points in the space ${\cal D}$ defined in (7.2), which are of the form

[TABLE]

with $z_{\rm low},z_{\rm mid},z_{\rm upp}$ as in (1.37) and $z_{\rm mid}<r_{\rm mid}$ . Solutions to (7.3) started in ${\cal D}$ converge to one of these fixed points, the domains of attraction being

[TABLE]

respectively. For $\alpha=4$ , the equation (7.3) has two fixed points in the space ${\cal D}$ , which are

[TABLE]

with domains of attraction

[TABLE]

For $\alpha<4$ , $(z_{\rm low},z_{\rm low})$ is the only fixed point in ${\cal D}$ and its domain of attraction is the whole space ${\cal D}$ .

**Proof** In Section 1.3 we have found all fixed points of (7.3) (i) and determined their domains of attraction. It is clear from (7.3) that if $z$ is a fixed point of (7.3) (i), then $(z,z)$ is a fixed point of (7.3), so $(z_{\rm low},z_{\rm low})$ and for $\alpha\geq 4$ also $(z_{\rm mid},z_{\rm mid})$ and $(z_{\rm upp},z_{\rm upp})$ are fixed points of (7.3).

If $\alpha\geq 4$ and $p_{0}<z_{\rm mid}$ or if $\alpha<4$ and $p_{0}$ is arbitrary, then we have seen in Section 1.3 that solutions to (7.3) (i) satisfy $p_{t}\to 0=z_{\rm low}$ as $t\to\infty$ . Since $0\leq r_{t}\leq 2p_{t}$ , it follows that also $r_{t}\to 0$ . This proves the statements of the proposition about the domain of attraction of $(z_{\rm low},z_{\rm low})$ for all values of $\alpha$ .

Let

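$$P_{\alpha}(p):=\alpha p^{2}(1-p)-p\quad\mbox{and}\quad R_{\alpha,p}(r):=\alpha(1-r)\big(r^{2}-2(r-p)^{2}\big)-r\qquad(7.12)$$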

denote the drift functions of $p_{t}$ and $r_{t}$ , respectively. We observe that $R_{\alpha,p}(r)\leq P_{\alpha}(r)$ $(p\in{\mathbb{R}},\ r\leq 1)$ and $P_{\alpha}(r)<0$ for all $z_{\rm upp}<r\leq 1$ , which implies that

[TABLE]

It follows that solutions of (7.3) satisfy

[TABLE]

If $\alpha>4$ and $p_{0}>z_{\rm mid}$ or if $\alpha=4$ and $p_{0}\geq z_{\rm mid}$ , we have seen in Section 1.3 that solutions to (7.3) (i) satisfy $p_{t}\to z_{\rm upp}$ as $t\to\infty$ . Combining this with (7.14) and the fact that $p_{t}\leq r_{t}$ , we see that $(p_{t},r_{t})\to(z_{\rm upp},z_{\rm upp})$ .

To complete the proof, we must investigate the long-time behavior of solutions of (7.3) when $\alpha>4$ and $p_{0}=z_{\rm mid}$ . In this case $p_{t}=z_{\rm mid}$ for all $t\geq 0$ and $r_{t}$ takes values in $[z_{\rm mid},2z_{\rm mid}]$ and solves the differential equation

$$\frac{\partial}{\partial t}r_{t}=R_{\alpha,z_{\rm mid}}(r_{t})\qquad(t\geq 0).\qquad(7.15)$$

It is clear that $r_{t}=z_{\rm mid}$ for all $t\geq 0$ is a solution. Since $z_{\rm mid}<1/2$ , in view of (7.2), we must prove that all solutions with $z_{\rm mid}<r_{0}\leq 2z_{\rm mid}$ converge to a nontrivial fixed point. We write

$$R_{\alpha,z_{\rm mid}}(r)=P_{\alpha}(r)-2\alpha(1-r)(r-z_{\rm mid})^{2}.\qquad(7.16)$$

Since the first term has a positive slope at $r=z_{\rm mid}$ while the second term has zero slope, we conclude that $R_{\alpha,z_{\rm mid}}$ has a positive slope at $r=z_{\rm mid}$ . Since solutions to (7.3) do not leave the domain ${\cal D}$ , we must have $R_{\alpha,z_{\rm mid}}(2z_{\rm mid})\leq 0$ . Since $R_{\alpha,z_{\rm mid}}(r)=\alpha r^{3}+O(r^{2})$ as $r\to\infty$ , we must have $R_{\alpha,z_{\rm mid}}(r)>0$ for $r$ sufficiently large. These observations imply that the cubic function $R_{\alpha,z_{\rm mid}}$ has three zeros $r_{\rm low}<r_{\rm mid}<r_{\rm upp}$ with

[TABLE]

and $R_{\alpha,z_{\rm mid}}>0$ on $(z_{\rm mid},r_{\rm mid})$ and $R_{\alpha,z_{\rm mid}}<0$ on $(r_{\rm mid},r_{\rm upp})$ . It follows that solutions to (7.15) started with $z_{\rm mid}<r_{0}\leq 2z_{\rm mid}$ satisfy $r_{t}\to r_{\rm mid}$ as $t\to\infty$ .
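The convergence $r_{t}\to r_{\rm mid}$ can be illustrated numerically. The sketch below is not part of the proof: it assumes the drift $R_{\alpha,p}(r)=\alpha(1-r)\big(r^{2}-2(r-p)^{2}\big)-r$ and the fixed points $z_{\rm mid},z_{\rm upp}=\frac{1}{2}\mp\sqrt{\frac{1}{4}-\frac{1}{\alpha}}$ , freezes $p_{t}$ at $z_{\rm mid}$ , and uses a plain explicit Euler scheme with a hypothetical step size and time horizon.

```python
import math

def fixed_points(alpha):
    """Nonzero fixed points of dp/dt = alpha p^2 (1-p) - p, assuming alpha > 4."""
    d = math.sqrt(0.25 - 1.0 / alpha)
    return 0.5 - d, 0.5 + d      # (z_mid, z_upp)

def R(alpha, p, r):
    """Assumed drift of r_t for a given value p of p_t."""
    return alpha * (1 - r) * (r**2 - 2 * (r - p) ** 2) - r

def integrate_r(alpha, r0, t_max=80.0, dt=0.001):
    """Explicit Euler scheme for dr/dt = R(alpha, z_mid, r), with p frozen at z_mid."""
    z_mid, _ = fixed_points(alpha)
    r = r0
    for _ in range(int(t_max / dt)):
        r += dt * R(alpha, z_mid, r)
    return r

alpha = 5.0
z_mid, z_upp = fixed_points(alpha)
r_from_above = integrate_r(alpha, r0=2 * z_mid)
r_from_below = integrate_r(alpha, r0=z_mid + 0.01)
print(z_mid, r_from_above, r_from_below, R(alpha, z_mid, r_from_above))
```

Both starting points should lead to the same limit strictly between $z_{\rm mid}$ and $2z_{\rm mid}$ , at which the drift vanishes up to rounding.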

7.2 The higher-level mean-field equation

In this subsection we prove Theorem 17 and Lemma 18. We start with two preparatory lemmas.

Lemma 67 (Convex order and second moments)

Let $S$ be a Polish space and let $\rho_{1},\rho_{2}\in{\cal P}({\cal P}(S))$ satisfy $\rho_{1}\leq_{\rm cv}\rho_{2}$ and $\rho^{(2)}_{1}=\rho^{(2)}_{2}$ . Then $\rho_{1}=\rho_{2}$ .

**Proof** This follows from [MSS18, Lemma 14].

In the next lemma, we use the notation $\overline{\mu}:={\mathbb{P}}[\delta_{X}\in\,\cdot\,]$ defined in Subsection 1.7.

Lemma 68 (Maximal measure in convex order)

Let $S$ be a Polish space and let $\mu\in{\cal P}(S)$ . Then a measure $\rho\in{\cal P}({\cal P}(S))$ satisfies $\rho=\overline{\mu}$ if and only if $\rho^{(2)}=\overline{\mu}^{(2)}$ .

**Proof** The condition $\rho^{(2)}=\overline{\mu}^{(2)}$ implies that the first moment measure of $\rho$ is $\mu$ . By (1.74), it follows that $\rho\leq_{\rm cv}\overline{\mu}$ , so the statement follows from Lemma 67.

Proof of Theorem 17 It follows from their definition that the measures $\overline{\nu}_{\rm low},\overline{\nu}_{\rm mid},\overline{\nu}_{\rm upp}$ and $\underline{\nu}_{\rm low},\underline{\nu}_{\rm mid},\underline{\nu}_{\rm upp}$ solve the higher-level RDE and their first moment measures are $\nu_{\rm low},\nu_{\rm mid},\nu_{\rm upp}$ , respectively.

By [MSS18, Thm 5], one has $\underline{\nu}=\overline{\nu}$ if and only if the RTP corresponding to $\nu$ is endogenous. By Theorem 10, endogeny is equivalent to bivariate uniqueness, so we obtain from Proposition 12 that $\underline{\nu}_{\rm low}=\overline{\nu}_{\rm low}$ , $\underline{\nu}_{\rm mid}\neq\overline{\nu}_{\rm mid}$ , and $\underline{\nu}_{\rm upp}=\overline{\nu}_{\rm upp}$ .

Since the second moment measures of $\overline{\nu}_{\rm low},\overline{\nu}_{\rm mid},\overline{\nu}_{\rm upp}$ are of the form (1.62), we see that the measures $\overline{\nu}^{(2)}_{\rm low},\overline{\nu}^{(2)}_{\rm mid},\overline{\nu}^{(2)}_{\rm upp}$ from Proposition 12 are indeed the second moment measures of $\overline{\nu}_{\rm low},\overline{\nu}_{\rm mid},\overline{\nu}_{\rm upp}$ .

By Proposition 13, the second moment measure of $\underline{\nu}_{\rm mid}$ solves the bivariate RDE. Since $\underline{\nu}_{\rm mid}\neq\overline{\nu}_{\rm mid}$ , Lemma 68 tells us that the second moment measure of $\underline{\nu}_{\rm mid}$ is different from $\overline{\nu}^{(2)}_{\rm mid}$ . It follows that the measure $\underline{\nu}^{(2)}_{\rm mid}$ from Proposition 12 is indeed the second moment measure of $\underline{\nu}_{\rm mid}$ .

Let $(\rho_{t})_{t\geq 0}$ be a solution to the higher-level mean-field equation. Assume that $\alpha>4$ . Then Propositions 12 and 13 tell us that $\rho^{(2)}_{t}$ converges to one of the fixed points $\overline{\nu}^{(2)}_{\rm low},\underline{\nu}^{(2)}_{\rm mid},\overline{\nu}^{(2)}_{\rm mid},\overline{\nu}^{(2)}_{\rm upp}$ , depending on whether

[TABLE]

By Lemma 68, these four cases correspond exactly to the four domains of attraction in (1.82). To prove that in fact $\rho_{t}$ converges to $\overline{\nu}_{\rm low},\underline{\nu}_{\rm mid},\overline{\nu}_{\rm mid}$ , or $\overline{\nu}_{\rm upp}$ , respectively, in each of these cases, by the compactness of ${\cal P}({\cal P}(\{0,1\}))$ , it suffices to prove that if $\rho_{t_{n}}\Rightarrow\rho_{\ast}$ along a sequence of times $t_{n}\to\infty$ , then $\rho_{\ast}$ is the right limit point. In the cases (i), (iii) and (iv) this is clear from Lemma 68.

To prove the statement also in case (ii), let $(\rho^{\prime}_{t})_{t\geq 0}$ be the solution to the higher-level mean-field equation started in $\rho^{\prime}_{0}=\delta_{\nu_{\rm mid}}$ . Then (1.74) and Proposition 15 tell us that $\rho^{\prime}_{t}\leq_{\rm cv}\rho_{t}$ for all $t\geq 0$ and $\rho^{\prime}_{t}\Rightarrow\underline{\nu}_{\rm mid}$ . Taking the limit $t_{n}\to\infty$ , using condition (i) of Theorem 14, we conclude that $\underline{\nu}_{\rm mid}\leq_{\rm cv}\rho_{\ast}$ . Since moreover $\underline{\nu}^{(2)}_{\rm mid}=\rho^{(2)}_{\ast}$ , we can apply Lemma 67 to conclude that $\underline{\nu}_{\rm mid}=\rho_{\ast}$ .

This completes the proof for $\alpha>4$ . The cases $\alpha=4$ and $\alpha<4$ are similar, but simpler.

Proof of Lemma 18 We note that if $\eta_{1},\eta_{2},\eta_{3}\in[0,1]$ , then

[TABLE]

Combining this with (1.13) and Proposition 13, we see that if $\rho$ solves the higher-level RDE (1.73), then

[TABLE]

must all solve the RDE (1.54). Applying this to $\underline{\nu}_{\rm mid}$ which has $\int\underline{\nu}_{\rm mid}(\mathrm{d}\eta)\eta=z_{\rm mid}$ , we see that

[TABLE]

We first observe that since $\int\underline{\nu}_{\rm mid}(\mathrm{d}\eta)\eta=z_{\rm mid}$ , we can have $\underline{\nu}_{\rm mid}(\{1\})\geq z_{\rm mid}$ only if $\underline{\nu}_{\rm mid}=\overline{\nu}_{\rm mid}$ , which we know is not the case, so we conclude that $\underline{\nu}_{\rm mid}(\{1\})=0$ . If $\underline{\nu}_{\rm mid}((0,1])\leq z_{\rm mid}$ then $\int\underline{\nu}_{\rm mid}(\mathrm{d}\eta)\eta=z_{\rm mid}$ forces $\underline{\nu}_{\rm mid}(\{1\})=z_{\rm mid}$ , which we know is not the case, so we conclude that $\underline{\nu}_{\rm mid}((0,1])=z_{\rm upp}$ and hence $\underline{\nu}_{\rm mid}(\{0\})=1-z_{\rm upp}=z_{\rm mid}$ , where the last equality follows from (1.37).

To calculate $\int\!\underline{\nu}_{\rm mid}(\mathrm{d}\eta)\,\eta^{2}$ , we use that $1-\underline{\nu}^{(2)}_{\rm mid}(0,0)=r_{\rm mid}$ , where $r_{\rm mid}$ is the second largest solution $r$ of the equation $R_{\alpha,z_{\rm mid}}(r)=0$ , with $R_{\alpha,z_{\rm mid}}$ defined as in (7.12). The smallest solution of the cubic equation $R_{\alpha,z_{\rm mid}}(r)=0$ is $r=z_{\rm mid}$ . Dividing by $(r-z_{\rm mid})$ yields a quadratic equation of which $r_{\rm mid}$ is the smallest solution. Since these are straightforward, but tedious calculations, we omit them.
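The omitted calculation can be sketched as follows. This is a reconstruction, assuming that the drift function has the form $R_{\alpha,p}(r)=\alpha(1-r)\big(r^{2}-2(r-p)^{2}\big)-r$ and that $z:=z_{\rm mid}$ satisfies $\alpha z(1-z)=1$ , which is consistent with the properties of $R_{\alpha,z_{\rm mid}}$ used in Subsection 7.1.

```latex
% Write z := z_mid and use \alpha z (1 - z) = 1 in the last step.
\begin{align*}
R_{\alpha,z}(r)
 &= \alpha(1-r)\bigl(r^{2}-2(r-z)^{2}\bigr)-r \\
 &= \alpha r^{3}-\alpha(1+4z)\,r^{2}+\bigl(\alpha(4z+2z^{2})-1\bigr)\,r-2\alpha z^{2} \\
 &= \alpha\,(r-z)\bigl(r^{2}-(1+3z)\,r+2z\bigr).
\end{align*}
% The two remaining zeros solve the quadratic equation:
\[
 r_{\pm}=\tfrac{1}{2}\Bigl(1+3z\pm\sqrt{1-2z+9z^{2}}\Bigr),
 \qquad r_{\rm mid}=r_{-}.
\]
```

Under the identification of $\eta$ with $\eta(\{1\})$ , and with the parametrization of Subsection 7.1, this would give $\int\!\underline{\nu}_{\rm mid}(\mathrm{d}\eta)\,\eta^{2}=\underline{\nu}^{(2)}_{\rm mid}(1,1)=2z_{\rm mid}-r_{\rm mid}$ .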

7.3 Root-determining and open subtrees

In this subsection we prove Lemmas 22 and 25.

Proof of Lemma 22 Since $\kappa(\omega)=3$ if $\gamma[\omega]={\mbox{\tt cob}}$ and $\kappa(\omega)=0$ if $\gamma[\omega]={\mbox{\tt dth}}$ , we see that

[TABLE]

which is $\leq 1$ if and only if $\alpha\leq{\textstyle\frac{{1}}{{2}}}$ . At the end of Subsection 1.3, we have seen that in our example the RDE (1.54) has a unique solution if and only if $\alpha<4$ . By Lemma 23 this is equivalent to condition (ii) of Lemma 21. Since $S=\{0,1\}$ , Lemma 21 tells us that in our example, conditions (i)–(iii) are equivalent.
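The threshold $\alpha\leq\frac{1}{2}$ can be checked directly. The line below assumes that the quantity in question is the mean of $\kappa(\omega)$ under the normalized rate measure, with cooperative branching at rate $\alpha$ and deaths at rate $1$ ; both the normalization and the rates are assumptions here.

```latex
\[
 \frac{3\cdot\alpha+0\cdot 1}{\alpha+1}\leq 1
 \quad\Longleftrightarrow\quad 3\alpha\leq\alpha+1
 \quad\Longleftrightarrow\quad \alpha\leq\tfrac{1}{2}.
\]
```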

We claim that a finite subtree ${\mathbb{U}}\subset{\mathbb{S}}$ satisfying (1.94) is uniquely determined and in fact $x\in\Xi_{\mathbb{U}}$ implies $x_{\mathbf{i}}=0$ for all $\mathbf{i}\in{\mathbb{U}}$ . To prove this, let $A=\{\mathbf{i}\in{\mathbb{U}}:x_{\mathbf{i}}=0\ \forall x\in\Xi_{\mathbb{U}}\}$ . Since ${\mathbb{U}}$ is finite, if ${\mathbb{U}}\backslash A$ is not empty then we can find some $\mathbf{i}\in{\mathbb{U}}\backslash A$ such that $\mathbf{i}j\not\in{\mathbb{U}}\backslash A$ for $j=1,2,3$ . (Here we take ${\mathbb{T}}$ to be the set of all words made from the alphabet $\{1,2,3\}$ .) If $\gamma[\bm{\omega}_{\mathbf{i}}]={\mbox{\tt dth}}$ , then $x_{\mathbf{i}}=0$ for all $x\in\Xi_{\mathbb{U}}$ which contradicts the fact that $\mathbf{i}\in{\mathbb{U}}\backslash A$ . But if $\gamma[\bm{\omega}_{\mathbf{i}}]={\mbox{\tt cob}}$ , then (1.94) and the fact that $\mathbf{i}j\not\in{\mathbb{U}}\backslash A$ for $j=1,2,3$ again imply $x_{\mathbf{i}}=0$ for all $x\in\Xi_{\mathbb{U}}$ , so we see that ${\mathbb{U}}\backslash A$ must be empty. In particular, this shows that ${\mathbb{U}}$ is root determining.

To see that ${\mathbb{U}}$ is a minimal root determining subtree, assume that ${\mathbb{V}}\subset{\mathbb{U}}$ is a smaller one. Then there must be some $\mathbf{i}\in{\mathbb{V}}$ such that $\gamma[\bm{\omega}_{\mathbf{i}}]={\mbox{\tt cob}}$ and either $\mathbf{i}1\not\in{\mathbb{V}}$ or ${\mathbb{V}}\cap\{\mathbf{i}2,\mathbf{i}3\}=\emptyset$ . (Here we use that by definition, minimal root determining subtrees contain the root, so ${\mathbb{V}}$ is not empty.) But then either $\mathbf{i}1\in\nabla{\mathbb{V}}$ or $\{\mathbf{i}2,\mathbf{i}3\}\subset\nabla{\mathbb{V}}$ . Define $x\in\Xi_{\mathbb{V}}$ inductively by (1.45) with $x_{\mathbf{j}}=1$ for all $\mathbf{j}\in\nabla{\mathbb{V}}$ . Then $x_{\mathbf{i}}=1$ . Either $\mathbf{i}$ is the root or its predecessor ${\accentset{\leftarrow}{\mathbf{i}}}$ satisfies $x_{\,{\accentset{\leftarrow}{\mathbf{i}}}}=1$ by (1.94), so by induction we see that $x_{\varnothing}=1$ . Since the all-zero configuration is also an element of $\Xi_{\mathbb{V}}$ , this proves that ${\mathbb{V}}$ is not root determining and hence ${\mathbb{U}}$ is minimal.

Proof of Lemma 25 We observe that

[TABLE]

Since the set ${\mbox{\tt dth}}^{-1}(\{1\})=\{y\in\{0,1\}^{0}:{\mbox{\tt dth}}(y)=1\}$ is empty, it has no minimal elements, and hence $Y_{\mbox{\tt dth}}=\emptyset$ . On the other hand, ${\mbox{\tt bth}}(y)=1$ for all $y\in\{0,1\}^{0}$ . In fact, $\{0,1\}^{0}=\{\varnothing\}$ is a set with only one element, the empty word, so ${\mbox{\tt bth}}^{-1}(\{1\})=\{\varnothing\}$ and hence the set of its minimal elements is $Y_{{\mbox{\tt bth}}}=\{\varnothing\}$ . Now (1.97) with the convention that $1_{A_{\mathbf{i}}}:=\varnothing$ if $\kappa(\bm{\omega}_{\mathbf{i}})=0$ says that ${\mathbb{O}}$ is an open subtree of ${\mathbb{U}}$ if and only if:

(i) $\{j\in\{1,2,3\}:\mathbf{i}j\in{\mathbb{O}}\}=\{1\}$ or $=\{2,3\}$ for each $\mathbf{i}\in{\mathbb{O}}\cap{\mathbb{U}}$ such that $\gamma[\bm{\omega}_{\mathbf{i}}]={\mbox{\tt cob}}$ ,

(ii) $\varnothing\in\emptyset$ for each $\mathbf{i}\in{\mathbb{O}}\cap{\mathbb{U}}$ such that $\gamma[\bm{\omega}_{\mathbf{i}}]={\mbox{\tt dth}}$ ,

(iii) $\varnothing\in\{\varnothing\}$ for each $\mathbf{i}\in{\mathbb{O}}\cap{\mathbb{U}}$ such that $\gamma[\bm{\omega}_{\mathbf{i}}]={\mbox{\tt bth}}$ ,

which corresponds to the condition in (1.102).

Bibliography

[AB05] D.J. Aldous and A. Bandyopadhyay. A survey of max-type recursive distributional equations. Ann. Appl. Probab. 15(2) (2005), 1047–1110.
[ADF18] L. Andreis, P. Dai Pra, and M. Fischer. McKean–Vlasov limit for interacting systems with simultaneous jumps. Stoch. Anal. Appl. (2018), doi: 10.1080/07362994.2018.1486202.
[Ald00] D.J. Aldous. The percolation process on a tree where infinite clusters are frozen. Math. Proc. Cambridge Philos. Soc. 128 (2000), 465–477.
[Als12] G. Alsmeyer. Random recursive equations and their distributional fixed points. Unpublished manuscript (2012), available from https://www.uni-muenster.de/Stochastik/lehre/WS1112/StochRekGleichungenII/book.pdf
[BCH18] E. Baake, F. Cordero, S. Hummel. Lines of descent in the deterministic mutation-selection model with pairwise interaction. Preprint (2018), 41 pages, arXiv:1812.00872.
[Bou58] N. Bourbaki. Éléments de Mathématique. VIII. Part. 1: Les Structures Fondamentales de l’Analyse. Livre III: Topologie Générale. Chap. 9: Utilisation des Nombres Réels en Topologie Générale. 2iéme éd. Actualités Scientifiques et Industrielles 1045. Hermann & Cie, Paris, 1958.
[BW97] E. Baake, T. Wiehe. Bifurcations in haploid and diploid sequence space models. J. Math. Biol. 35 (1997) 321–343.
[Cho69] G. Choquet. Lectures on Analysis. Volume I. Integration and Topological Vector Spaces. Benjamin, London, 1969.
