Recursive tree processes and the mean-field limit of stochastic flows
Tibor Mach, Anja Sturm, Jan M. Swart

Tibor Mach, The Czech Academy of Sciences, Institute of Information Theory and Automation, Pod vodárenskou věží 4, 18208 Praha 8, Czech Republic; [email protected]
Anja Sturm, Institute for Mathematical Stochastics, Georg-August-Universität Göttingen, Goldschmidtstr. 7, 37077 Göttingen, Germany; [email protected]
Jan M. Swart
Abstract
Interacting particle systems can often be constructed from a graphical representation, by applying local maps at the times of associated Poisson processes. This leads to a natural coupling of systems started in different initial states. We consider interacting particle systems on the complete graph in the mean-field limit, i.e., as the number of vertices tends to infinity. We are not only interested in the mean-field limit of a single process, but mainly in how several coupled processes behave in the limit. This turns out to be closely related to recursive tree processes as studied by Aldous and Bandyopadhyay in discrete time. We here develop an analogous theory for recursive tree processes in continuous time. We illustrate the abstract theory on an example of a particle system with cooperative branching. This yields an interesting new example of a recursive tree process that is not endogenous.
MSC 2010. Primary: 82C22, Secondary: 60J25, 60J80, 60K35.
Keywords. Mean-field limit, recursive tree process, recursive distributional equation, endogeny, interacting particle systems, cooperative branching.
Acknowledgement. Work sponsored by grant 16-15238S of the Czech Science Foundation (GA CR) and by grant STU 527/1-2 of the German Research Foundation (DFG) within the Priority Programme 1590 “Probabilistic Structures in Evolution”.
1 Introduction and main results
1.1 Introduction
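Let $S$ and $\Omega$ be Polish spaces, let $r$ be a finite measure on $\Omega$ with total mass $|r| := r(\Omega) > 0$, and let $\gamma : \Omega \times S^{\mathbb{N}_+} \to S$ be measurable, where $\mathbb{N}_+ := \{1, 2, \ldots\}$. Let $T$ be the operator acting on probability measures on $S$ defined as
\[ T(\mu) := \text{the law of } \gamma[\omega](X_1, X_2, \ldots), \tag{1.1} \]
where $\omega$ is an $\Omega$-valued random variable with law $|r|^{-1} r$ and $(X_i)_{i \geq 1}$ are i.i.d. with law $\mu$. In this paper, we will be interested in the differential equation
\[ \frac{\partial}{\partial t} \mu_t = |r| \big\{ T(\mu_t) - \mu_t \big\} \qquad (t \geq 0). \tag{1.2} \]
In Theorem 1 below, we will prove existence and uniqueness of solutions to (1.2) under the assumption that there exists a measurable function $\kappa : \Omega \to \mathbb{N}$ such that
\[ \gamma[\omega](x_1, x_2, \ldots) = \gamma[\omega](x_1, \ldots, x_{\kappa(\omega)}) \qquad (\omega \in \Omega,\ x \in S^{\mathbb{N}_+}) \tag{1.3} \]
depends only on the first $\kappa(\omega)$ coordinates, and
\[ \int_\Omega r(\mathrm{d}\omega)\, \kappa(\omega) < \infty. \tag{1.4} \]
Our interest in equation (1.2) stems from the fact that, as we will prove in Theorem 5 below, the mean-field limits of a large class of interacting particle systems are described by equations of the form (1.2). In view of this, we call (1.2) a mean-field equation. The analysis of equations of this kind is commonly the first step towards understanding a given interacting particle system. Some illustrative examples of mean-field equations in the literature are [DN97, (1.1)], [NP99, (1.2)], and [FL17, (4)].
In the special case that $\kappa(\omega) = 1$ for all $\omega \in \Omega$, we observe that $T(\mu) = \mu P$, where $P$ is the probability kernel on $S$ defined as
\[ P(x, A) := \mathbb{P}\big[ \gamma[\omega](x) \in A \big] \qquad (x \in S,\ A \subset S \text{ measurable}), \tag{1.5} \]
with $\omega$ distributed according to $|r|^{-1} r$. In view of this, if $(\omega_k)_{k \geq 1}$ are i.i.d. with law $|r|^{-1} r$ and $X_0$ has law $\mu$, then setting inductively $X_k := \gamma[\omega_k](X_{k-1})$ defines a Markov chain $(X_k)_{k \geq 0}$ with transition kernel $P$, such that $X_n$ has law $T^n(\mu)$, where $T^n$ denotes the $n$-th iterate of the map $T$. Also, (1.2) describes the forward evolution of a continuous-time Markov chain where random maps $\gamma[\omega]$ are applied with Poisson rate $|r|$. A representation of a probability kernel in terms of a random map as in (1.5) is called a random mapping representation.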
More generally, when the function is not identically one, Aldous and Bandyopadhyay [AB05] have shown that the iterates of the map from (1.1) can be represented in terms of a Finite Recursive Tree Process (FRTP), which is a generalization of a discrete-time Markov chain where time has a tree-like structure. More precisely, they construct a finite tree of depth where the state of each internal vertex is a random function of the states of its offspring. If the states of the leaves are i.i.d. with law , they show that the state at the root has law . They are especially interested in fixed points of , which generalize the concept of an invariant law of a Markov chain. They show that each such fixed point gives rise to a Recursive Tree Process (RTP), which is a process on an infinite tree where the state of each vertex has law . One can think of such an RTP as a generalization of a stationary backward Markov chain . A fixed point equation of the form is called a Recursive Distributional Equation (RDE). Studying RDEs and their solutions is of independent interest as they appear naturally in many applications, see for example [AB05, Als12].
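In the present paper, we develop an analogous theory in continuous time, generalizing the concept of a continuous-time Markov chain to chains where time has a tree-like structure. Let $(T_t)_{t \geq 0}$ be the semigroup defined by
\[ T_t(\mu) := \mu_t \quad \text{where } (\mu_t)_{t \geq 0} \text{ solves (1.2) with } \mu_0 = \mu \qquad (t \geq 0). \tag{1.6} \]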
In Theorem 6, we show that has a representation similar to (1.1), namely
[TABLE]
where is a countable set, is a random map, and the are i.i.d. with law and independent of . Similar to what we have in (1.3), the map does not depend on all coordinates in but only on a finite subcollection . Here turns out to be a branching process and condition (1.4) (which is not needed in the discrete-time theory) guarantees that the offspring distribution of this branching process has finite mean. Similarly to (1.5), we can view (1.7) as a random mapping representation of the operator in (1.6).
As we have already mentioned, in Theorem 5 below, we prove that the mean-field limits of a large class of interacting particle systems are described by equations of the form (1.2). These interacting particle systems are constructed by applying local maps at the times of associated Poisson processes, which are introduced in detail in Section 1.3.
We are not only interested in the mean-field limit of a single process, but mainly in the mean-field limit of coupled processes that are constructed from the same Poisson processes. For each , a measurable map gives rise to -variate map defined as
[TABLE]
where we denote an element of as with and . Let denote the space of probability measures on a space . Letting denote the -variate map associated with then, in analogy to (1.1),
[TABLE]
defines an -variate map , which as in (1.2) gives rise to an -variate mean-field equation, which describes the mean-field limit of coupled processes.
If is an -valued random variable whose law is a fixed point of , then is a fixed point of that describes perfectly coupled processes. We will be interested in the stability (or instability) of under the -variate mean-field equation. In other words, for our mean-field interacting particle systems, we fix the Poisson processes used in the construction and want to know if small changes in the initial state lead to small (or large) changes in the final state. Aldous and Bandyopadhyay [AB05] define an RTP to be endogenous if the state at the root is a measurable function of the random maps attached to all vertices of the tree. They showed, in some precise sense (see Theorem 10 below), that endogeny is equivalent to stability of . In Theorem 11, we generalize their result to the continuous-time setting.
The -variate map is well-defined even for , and maps the space of all exchangeable probability laws on into itself. Let be a -valued random variable with law , and conditional on , let be i.i.d. with common law . Then the unconditional law of is exchangeable, and by De Finetti, each exchangeable law on is of this form. In view of this, naturally gives rise to a map which is the higher-level map defined in [MSS18], and which analogously to (1.2) gives rise to a higher-level mean-field equation. For any , let denote the set of all with mean . In [MSS18] it is shown that if is a fixed point of , then the corresponding higher-level map has two fixed points and in that are minimal and maximal with respect to the convex order, defined in Theorem 14 below. Moreover, if and only if the RTP corresponding to is endogenous.
We will apply the theory developed here as well as in [MSS18] to the higher-level mean-field equation for a particular interacting particle system with cooperative branching and deaths; see also [SS15, Mac17, BCH18] for several different variants of the model. To formulate this properly, it is useful to introduce some more general notation. Recall that for each , is a map from into . We let
[TABLE]
denote the set of all maps that can be obtained by varying . Here, elements of are measurable maps where may depend on . If , then is defined to be a set with just one element, which we denote by (the empty sequence, which we distinguish notationally from the empty set ). We equip with the final -field for the map and let denote the image of the measure under this map. Then the mean-field equation (1.2) can be rewritten as
[TABLE]
where for any measurable map ,
[TABLE]
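In the concrete example that we are interested in, the state space $S = \{0, 1\}$ and the set of maps $\{\mathrm{cob}, \mathrm{dth}\}$ each have just two elements. Here $\mathrm{cob} : S^3 \to S$ and $\mathrm{dth} : S^0 \to S$ are maps defined as
\[ \mathrm{cob}(x_1, x_2, x_3) := x_1 \vee (x_2 \wedge x_3) \quad \text{and} \quad \mathrm{dth}(\varnothing) := 0. \tag{1.13} \]
We choose
\[ \pi(\{\mathrm{cob}\}) := \alpha \geq 0 \quad \text{and} \quad \pi(\{\mathrm{dth}\}) := 1, \tag{1.14} \]
where $\pi$ is the measure on the set of maps appearing in (1.11). Then the mean-field equation (1.11) takes the form
\[ \frac{\partial}{\partial t} \mu_t = \alpha \big\{ T_{\mathrm{cob}}(\mu_t) - \mu_t \big\} + \big\{ T_{\mathrm{dth}}(\mu_t) - \mu_t \big\} \qquad (t \geq 0), \tag{1.15} \]
which describes the mean-field limit of a particle system with cooperative branching (with rate $\alpha$) and deaths (with rate 1). We will see that for $\alpha > 4$, (1.11) has two stable fixed points $\nu_{\mathrm{low}}, \nu_{\mathrm{upp}}$, and an unstable fixed point $\nu_{\mathrm{mid}}$ that separates the domains of attraction of the stable fixed points.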
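As a hedged worked example of how the two maps in this model act on a law, identify a probability measure $\mu$ on $\{0,1\}$ with its mass $p := \mu(\{1\})$ (notation assumed here), write $T_g(\mu)$ for the law of $g$ applied to i.i.d. inputs with law $\mu$ as in (1.12), and assume that cooperative branching is the map $\mathrm{cob}(x_1,x_2,x_3) = x_1 \vee (x_2 \wedge x_3)$ while death is the constant map with value 0. With $X_1, X_2, X_3$ i.i.d. with law $\mu$, a direct computation gives:

```latex
\begin{align*}
T_{\mathrm{cob}}(\mu)(\{1\})
  &= \mathbb{P}[X_1 = 1] + \mathbb{P}[X_1 = 0,\ X_2 = X_3 = 1]
   = p + (1-p)\,p^2, \\
T_{\mathrm{dth}}(\mu)(\{1\}) &= 0, \\
\alpha\big\{T_{\mathrm{cob}}(\mu)(\{1\}) - p\big\}
  + \big\{T_{\mathrm{dth}}(\mu)(\{1\}) - p\big\}
  &= \alpha\,p^2(1-p) - p .
\end{align*}
```

Under these assumptions, the right-hand side of the mean-field equation for this example reduces to a cubic polynomial in the density $p$.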
In Theorem 17 below, we find all fixed points of the corresponding higher-level mean-field equation, and determine their domains of attraction. Note that solutions of the higher-level mean-field equation take values in the probability measures on . As mentioned before, each fixed point of the original mean-field equation gives rise to two fixed points of the higher-level mean-field equation, which are minimal and maximal in with respect to the convex order. Moreover, if and only if the RTP corresponding to is endogenous. In our example, we find that the stable fixed points give rise to endogenous RTPs, but the RTP associated with is not endogenous. The higher-level equation has no other fixed points in except for and , of which the former is stable and the latter unstable. Numerical data for the nontrivial fixed point (viewed as a probability measure on ) are plotted in Figure 2.
1.2 The mean-field equation
In this subsection, we collect some basic results about the mean-field equation (1.2) that form the basis for all that follows. We interpret (1.2) in the following sense: letting , we say that a process solves (1.2) if for each bounded measurable function , the function is continuously differentiable and
[TABLE]
Our first result gives sufficient conditions for existence and uniqueness of solutions to (1.2).
Theorem 1 (Mean-field equation)
Let and be Polish spaces, let be a nonzero finite measure on , and let be measurable. Assume that there exists a measurable function such that (1.3) and (1.4) hold. Then the mean-field equation (1.2) has a unique solution for each initial state .
Theorem 1 allows us to define a semigroup of operators as in (1.6). It is often useful to know that solutions to (1.2) are continuous as a function of their initial state. The following proposition gives continuity w.r.t. the total variation norm and moreover shows that if the constant from (1.18) is negative, then the operators form a contraction semigroup.
Proposition 2 (Continuity in total variation norm)
Under the assumptions of Theorem 1, one has
[TABLE]
where
[TABLE]
Continuity w.r.t. weak convergence needs an additional assumption.
Proposition 3 (Continuity w.r.t. weak convergence)
Assume that
[TABLE]
Then the operator in (1.1) and the operators in (1.6) are continuous w.r.t. the topology of weak convergence.
The condition (1.19) is considerably weaker than the condition that is continuous for all . A simple example is , the Lebesgue measure, , and .
1.3 The mean-field limit
In this subsection, we show that equations of the form (1.2) arise as the mean-field limits of a large class of interacting particle systems. In order to be reasonably general, and in particular to allow for systems in which more than one site can change its value at the same time, we will introduce quite a bit of notation that will not be needed anywhere else in Section 1, so impatient readers can just glance at Theorem 5 and the discussion surrounding (1.36) and skip the rest of this subsection.
Let be a Polish space as before, and let . We will be interested in continuous-time Markov processes taking values in , where is large. Denoting an element of by , we will focus on processes with a high degree of symmetry, in the sense that their dynamics are invariant under a permutation of the coordinates. It is instructive, though not necessary for what follows, to view as the vertex set of a complete graph, where all vertices are neighbors of each other. The basic ingredients we will use to describe our processes are:
(i) a Polish space equipped with a finite nonzero measure ,
(ii) a measurable function $\lambda$,

as well as, for each $\omega$ and ,

(iii) a function $\gamma_i[\omega]$,
(iv) a finite set $K_i(\omega)$ such that $\gamma_{i}[\omega](x_{1},\ldots,x_{\lambda(\omega)})=\gamma_{i}[\omega]\big((x_{i})_{i\in K_{i}(\omega)}\big)$ depends only on the coordinates in $K_i(\omega)$.
Setting , we assume that the functions
[TABLE]
for each . We let denote the function
[TABLE]
and let denote the cardinality of the set .
The space , measure , and functions and play roles similar, but not quite identical to , and from Subsection 1.1. We can use , and to define the following mean-field equation:
[TABLE]
The following lemma says that (1.22) is really a mean-field equation of the form we have already seen in (1.2). This is why in subsequent sections we will only work with equations of this form.
Lemma 4 (Simplified equation)
Assume that
[TABLE]
Then equation (1.22) can be cast in the simpler form (1.2) for a suitable choice of , and , where (1.23) (i) guarantees that is a finite measure and (1.23) (ii) implies that (1.4) holds. If
[TABLE]
then moreover (1.19) can be satisfied.
We now use the ingredients , and to define the class of Markov processes we are interested in. We construct these processes by applying local maps, that affect only finitely many coordinates, at the times of associated Poisson processes. In the context of interacting particle systems, such constructions are called graphical representations.
For any we set . We let denote the set of all sequences for which are all different. Note that has elements. We will consider Markov processes with values in that evolve in the following way:
(i) At the times of a Poisson process with intensity , an element is chosen according to the probability law .
(ii) If , nothing happens.
(iii) Otherwise, an element is selected according to the uniform distribution on , and the previous values $\big(X_{i_{1}}(t-),\ldots,X_{i_{\lambda(\omega)}}(t-)\big)$ of at the coordinates are replaced by $\big(X_{i_{1}}(t),\ldots,X_{i_{\lambda(\omega)}}(t)\big)=\vec{\gamma}[\omega]\big(X_{i_{1}}(t-),\ldots,X_{i_{\lambda(\omega)}}(t-)\big)$.
More formally, we can construct our Markov process as follows. For each with , and for each , define a map by
[TABLE]
Let be a Poisson point set on
[TABLE]
with intensity
[TABLE]
Since is a finite measure, the set is a.s. finite for each , so we can order its elements as
[TABLE]
and use this to define
[TABLE]
In words, is a list of triples . Here represents some external input that tells us that we need to apply the map . The coordinates where and the time when this map needs to be applied are given by and , respectively. It is easy to see that the random maps form a stochastic flow, i.e.,
[TABLE]
where denotes the identity map. Moreover has independent increments in the sense that
[TABLE]
for each . It is well-known (see, e.g., [SS18, Lemma 1]) that if is an -valued random variable, independent of the Poisson set , then setting
[TABLE]
defines a Markov process with values in . Note that has piecewise constant sample paths, which are right-continuous because of the way we have defined .
We now formulate our result about the mean-field limit of Markov processes as defined in (1.32). For any , we define an empirical measure on by
[TABLE]
Below, denotes the product measure of copies of . The expectation of a random measure on a Polish space is defined in the usual way, i.e., is the deterministic measure defined by for any bounded measurable .
Theorem 5 (Mean-field limit)
Let be a Polish space, let , and be as above, and assume (1.23). For each , let be Markov processes with state space as defined in (1.32), and let denote their associated empirical measures. Let be any metric on that generates the topology of weak convergence. Fix some (deterministic) and assume that (at least) one of the following two conditions is satisfied.
(i) $\displaystyle\mathbb{P}\big[d(\mu^{N}_{0},\mu_{0})\geq\varepsilon\big]\underset{N\to\infty}{\longrightarrow}0$ for all $\varepsilon>0$, and (1.24) holds.
(ii) $\big\|\mathbb{E}[(\mu^{N}_{0})^{\otimes n}]-\mu_{0}^{\otimes n}\big\|\underset{N\to\infty}{\longrightarrow}0$ for all $n\geq 1$, where $\|\,\cdot\,\|$ denotes the total variation norm.
Then
[TABLE]
where is the unique solution to the mean-field equation (1.22) with initial state .
Condition (ii) is in particular satisfied if are i.i.d. with common law . Note that in (1.34), we rescale time by a factor .
It is instructive to demonstrate the general set-up on our concrete example of a particle system with cooperative branching and deaths. As before, we have . We choose for a set with just two elements, say , and we set and . We let , , and define and by
[TABLE]
Then the particle system in (1.32) has the following description. Let us say that a site $i$ is occupied at time $t$ if $X_i(t) = 1$. Then, with rate $\alpha$, three sites $(i_1, i_2, i_3)$ are selected at random. If the sites $i_2$ and $i_3$ are both occupied, then the particles at these sites cooperate to produce a third particle at $i_1$, provided this site is empty. In addition, with rate 1, a site is selected at random, and any particle that is present there dies.
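The verbal description above can be turned into a small simulation. The following is a minimal sketch, not the construction used in the paper: it assumes that cooperative branching events arrive at total rate `alpha * n_sites` and death events at total rate `n_sites` (so that each site is affected at a rate of order one), and that the three sites are sampled uniformly without replacement. The function name and all parameters are hypothetical.

```python
import random


def simulate_cooperative_branching(n_sites, alpha, p0, t_end, seed=0):
    """Simulate cooperative branching with deaths on the complete graph.

    Assumption (not taken from the text): branching events arrive at total
    rate alpha * n_sites and death events at total rate n_sites.
    Returns the fraction of occupied sites at time t_end.
    """
    rng = random.Random(seed)
    # Independent Bernoulli(p0) initial occupation of each site.
    x = [1 if rng.random() < p0 else 0 for _ in range(n_sites)]
    total_rate = (alpha + 1.0) * n_sites
    t = rng.expovariate(total_rate)
    while t < t_end:
        if rng.random() < alpha / (alpha + 1.0):
            # Cooperative branching: sites i2 and i3 together fill site i1.
            i1, i2, i3 = rng.sample(range(n_sites), 3)
            if x[i2] and x[i3]:
                x[i1] = 1
        else:
            # Death: whatever sits at a uniformly chosen site is removed.
            x[rng.randrange(n_sites)] = 0
        t += rng.expovariate(total_rate)
    return sum(x) / n_sites
```

For large `n_sites`, the returned density can be compared with the solution of the corresponding mean-field equation started from density `p0`.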
It is not hard to see that for our choice of , and , the mean-field equation (1.22) simplifies to (1.15). Note that since and are the identity map, they drop out of (1.22), so only and remain. Since regardless of the value of , we can choose for the empty set and view as a function .
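Solutions of (1.15) take values in the probability measures on $\{0,1\}$, which are uniquely characterized by their value at 1. Rewriting (1.15) in terms of $p_t := \mu_t(\{1\})$ yields the equation
\[ \frac{\partial}{\partial t} p_t = \alpha\, p_t^2 (1 - p_t) - p_t \qquad (t \geq 0). \tag{1.36} \]
This equation can also be found in [Nob92, (1.11)], [Neu94, (1.2)], [BW97, (3.1)], [FL17, (4)], and [BCH18, (2.1)]. It is not hard to check that for $\alpha < 4$, the only fixed point of (1.36) is $z_{\mathrm{low}} := 0$, while for $\alpha \geq 4$, there are additional fixed points
\[ z_{\mathrm{mid}} := \tfrac{1}{2} - \sqrt{\tfrac{1}{4} - \tfrac{1}{\alpha}} \quad \text{and} \quad z_{\mathrm{upp}} := \tfrac{1}{2} + \sqrt{\tfrac{1}{4} - \tfrac{1}{\alpha}}. \tag{1.37} \]
If $\alpha < 4$, then solutions to (1.36) converge to $z_{\mathrm{low}}$ regardless of the initial state. On the other hand, for $\alpha \geq 4$, solutions to (1.36) with $p_0 > z_{\mathrm{mid}}$ converge to the upper fixed point $z_{\mathrm{upp}}$ while solutions to (1.36) with $p_0 < z_{\mathrm{mid}}$ converge to the lower fixed point $z_{\mathrm{low}}$. In particular, if $\alpha > 4$, then $z_{\mathrm{low}}$ and $z_{\mathrm{upp}}$ are stable fixed points while $z_{\mathrm{mid}}$ is an unstable fixed point separating the domains of attraction of $z_{\mathrm{low}}$ and $z_{\mathrm{upp}}$.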
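The fixed points and their domains of attraction can be checked numerically. The sketch below assumes that the density of occupied sites solves dp/dt = alpha * p^2 * (1 - p) - p, which is what the verbal description of the dynamics suggests; the nonzero fixed points then solve alpha * p * (1 - p) = 1. The function names and the choice of a classical Runge-Kutta integrator are illustrative only.

```python
import math


def drift(p, alpha):
    """Right-hand side of the assumed density equation."""
    return alpha * p * p * (1.0 - p) - p


def fixed_points(alpha):
    """Return the fixed points in [0, 1]: always 0, plus the two roots of
    alpha * p * (1 - p) = 1 when they are real (alpha >= 4)."""
    points = [0.0]
    disc = 0.25 - 1.0 / alpha
    if disc >= 0.0:
        root = math.sqrt(disc)
        points += [0.5 - root, 0.5 + root]
    return points


def solve(p0, alpha, t_end, dt=0.01):
    """Integrate the density equation with the classical RK4 scheme."""
    p = p0
    for _ in range(int(round(t_end / dt))):
        k1 = drift(p, alpha)
        k2 = drift(p + 0.5 * dt * k1, alpha)
        k3 = drift(p + 0.5 * dt * k2, alpha)
        k4 = drift(p + dt * k3, alpha)
        p += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return p
```

Starting `solve` slightly above or below the middle fixed point illustrates how that point separates the two domains of attraction.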
1.4 A recursive tree representation
In this subsection we formally introduce Finite Recursive Tree Processes (FRTPs) and state the random mapping representation of solutions to the mean-field equation (1.2) anticipated in (1.7).
For , let denote the space of all finite words made up from the alphabet , and define similarly, using the alphabet . If with and , then we define the concatenation by . We denote the length of a word by and let denote the word of length zero. We view as a tree with root , where each vertex has children , and each vertex except the root has precisely one ancestor . For each rooted subtree of , i.e., a subtree that contains , we let denote the boundary of relative to . We write
[TABLE]
and use the convention , so that (1.38) holds also for .
We return to the set-up of Subsection 1.1, i.e., and are Polish spaces, is a nonzero finite measure on , and and are measurable functions such that (1.3) holds. We fix some such that for all and set . Let be an i.i.d. collection of -valued r.v.’s with common law . Fix and assume that
[TABLE]
Then it is easy to see that the law of is given by , where is the -th iterate of the operator in (1.1). We call the collection of random variables
[TABLE]
a Finite Recursive Tree Process (FRTP). We can think of as a generalization of a Markov chain, where time has a tree-like structure.
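To make the FRTP concrete, here is a minimal sketch for the running example of cooperative branching with deaths. It assumes that each internal vertex independently carries the map cob(x1, x2, x3) = x1 OR (x2 AND x3) with probability alpha / (alpha + 1) and the constant map 0 otherwise, and that the leaves at depth `depth` are i.i.d. Bernoulli. The helper `iterate_T` is a hypothetical name for the induced map on the Bernoulli parameter; children of a death vertex are not sampled since they do not influence the root.

```python
import random


def frtp_root(depth, p_leaf, alpha, rng):
    """Sample the root state of a finite recursive tree process.

    Leaves (depth 0) are Bernoulli(p_leaf); each internal vertex applies an
    independently chosen map to the states of its children.
    """
    if depth == 0:
        return 1 if rng.random() < p_leaf else 0
    if rng.random() >= alpha / (alpha + 1.0):
        return 0  # death map: ignores all children
    x1, x2, x3 = (frtp_root(depth - 1, p_leaf, alpha, rng) for _ in range(3))
    return x1 | (x2 & x3)  # cooperative branching map


def iterate_T(p, alpha, n):
    """Apply n times the map p -> alpha/(alpha+1) * (p + (1-p) p^2), the
    action of the one-step operator on Bernoulli laws under the assumptions
    above."""
    for _ in range(n):
        p = alpha / (alpha + 1.0) * (p + (1.0 - p) * p * p)
    return p
```

Averaging many samples of `frtp_root(n, p, alpha, rng)` gives a Monte Carlo estimate that can be compared with `iterate_T(p, alpha, n)`.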
We now aim to give a similar representation of the semigroup from (1.6). To do this, we let be i.i.d. exponentially distributed random variables with mean . We interpret as the lifetime of the individual with index and let
[TABLE]
denote the times when the individual is born and dies, respectively. Then
[TABLE]
are the (random) subtrees of consisting of all individuals that have died before time , resp. are alive at time . If the function from (1.3) is bounded, then we can choose with . Now it is easy to check that is a continuous-time branching process where each particle is with rate replaced by new particles. In particular, is a.s. finite for each . On the other hand, when is unbounded, we need to choose , and this has the consequence that is a.s. infinite for each . Nevertheless, under the assumption (1.4), it turns out that only a finite subtree of is relevant for the state at the root , as we explain now.
Let be the random subtree of defined as
[TABLE]
and for each subtree , let denote the outer boundary of relative to , where again we use the convention that if is the empty set. Then, under condition (1.4),
[TABLE]
are a.s. finite for all . Indeed, is a branching process where for each individual , with Poisson rate , an element is selected and is replaced by new individuals . The condition on the rates (1.4) guarantees that this branching process has finite mean and in particular does not explode, so that is a.s. a finite subtree of .
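The role of the rate condition can be illustrated by a standard first-moment computation, given here as a sketch under stated assumptions rather than as the argument used in the proofs. Write $r$ for the rate measure and $\kappa(\omega)$ for the number of new individuals that replace an individual when the element $\omega$ is selected (symbol names assumed here), and let $N_t$ denote the number of individuals alive at time $t$, starting from $N_0 = 1$.

```latex
\begin{align*}
\frac{\partial}{\partial t}\,\mathbb{E}[N_t]
  &= \mathbb{E}[N_t] \int r(\mathrm{d}\omega)\,\big(\kappa(\omega) - 1\big),
\qquad\text{so}\qquad
\mathbb{E}[N_t] = \exp\Big( t \int r(\mathrm{d}\omega)\,\big(\kappa(\omega) - 1\big) \Big).
\end{align*}
% Cooperative branching with deaths: rate alpha with 3 new individuals,
% rate 1 with 0 new individuals.
\begin{align*}
\int r(\mathrm{d}\omega)\,\big(\kappa(\omega) - 1\big)
  = \alpha\,(3 - 1) + 1\cdot(0 - 1) = 2\alpha - 1 .
\end{align*}
```

Since $r$ is a finite measure, the exponent is finite exactly when $\int r(\mathrm{d}\omega)\,\kappa(\omega) < \infty$, which is the content of (1.4).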
Let be i.i.d. with common law , independent of the lifetimes . For any finite rooted subtree and for each , we can inductively define for by
[TABLE]
Then the value we obtain at the root is a function of . Let us denote this function by , i.e.,
[TABLE]
We can think of as the “concatenation” of the maps . We will in particular be interested in the random maps
[TABLE]
with as in (1.44). For our running example of a system with cooperative branching and deaths, these definitions are illustrated in Figure 1.
Let , defined as
[TABLE]
be the natural filtration associated with our evolving marked tree, that contains information about which individuals are alive at time , as well as the random elements and lifetimes associated with all individuals that have died by time . In particular, is measurable w.r.t. . The following theorem is a precise formulation of the random mapping representation of solutions of the mean-field equation (1.2), anticipated in (1.7).
Theorem 6 (Recursive tree representation)
Let and be Polish spaces, let be a nonzero finite measure on , and let and be measurable functions satisfying (1.3) and (1.4). Let be i.i.d. with common law and let be an independent i.i.d. collection of exponentially distributed random variables with mean . Fix and let and be defined as in (1.47) and (1.48). Conditional on , let be i.i.d. -valued random variables with common law . Then
[TABLE]
where is defined in (1.6).
Recalling the definition of , we can also formulate Theorem 6 as follows. With as above, fix and let be random variables such that
[TABLE]
Then (1.49) says that the state at the root has law . This is a continuous-time analogue of the FRTP (1.39).
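The continuous-time representation can be sampled recursively. The sketch below treats the running example under the following assumptions: lifetimes are exponential with rate alpha + 1, a dying individual is replaced by three children and applies cob(x1, x2, x3) = x1 OR (x2 AND x3) with probability alpha / (alpha + 1), or has no children and takes the value 0 otherwise, and individuals still alive at the time horizon receive i.i.d. Bernoulli(p0) states. The comparison function `density_ode` integrates the assumed density equation dp/dt = alpha * p^2 * (1 - p) - p by explicit Euler steps. All names are hypothetical.

```python
import random


def root_state(t, alpha, p0, rng):
    """Sample the state of an individual whose subtree is observed for a
    further time t, using i.i.d. Bernoulli(p0) states at the time horizon."""
    lifetime = rng.expovariate(alpha + 1.0)
    if lifetime > t:
        # Still alive at the horizon: state drawn from the initial law.
        return 1 if rng.random() < p0 else 0
    if rng.random() >= alpha / (alpha + 1.0):
        return 0  # death map, no children needed
    remaining = t - lifetime
    x1 = root_state(remaining, alpha, p0, rng)
    x2 = root_state(remaining, alpha, p0, rng)
    x3 = root_state(remaining, alpha, p0, rng)
    return x1 | (x2 & x3)


def density_ode(p0, alpha, t_end, dt=1e-4):
    """Explicit Euler integration of the assumed density equation."""
    p = p0
    for _ in range(int(round(t_end / dt))):
        p += dt * (alpha * p * p * (1.0 - p) - p)
    return p
```

The empirical mean of `root_state(t, alpha, p0, rng)` over many independent samples should be close to `density_ode(p0, alpha, t)`; the time horizon is best kept small, since the expected tree size grows exponentially in `t`.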
In our proofs, we will first prove Theorem 6 and then use this to prove Theorem 5 about the mean-field limit of interacting particle systems. Recall that these particle systems are constructed from a stochastic flow as in (1.32). To find the empirical measure of , we pick a site at random and ask for its type which via is a function of the initial state . When is large, does not depend on all coordinates but only on a random subset of them, and indeed one can show that the map that gives as a function of these coordinates approximates the map from Theorem 6, in an appropriate sense. The heuristics behind this are explained in some more detail in Subsection 4.1 below.
Remark Another way to write (1.49) is
[TABLE]
where is defined as in (1.12) for the random map and is a solution to (1.2). One can check that is a Markov process. Let us informally denote this process by and its state space by . Then equation (1.49) can be understood as a (generalized) duality relationship between and with (generalized) duality function given by
[TABLE]
With this definition, using the fact that is the identity map, (1.51) reads
[TABLE]
and we can obtain a family of usual (real-valued) dualities by integrating against a test function .
1.5 Recursive tree processes
Recall the definition of the operator in (1.1) and the semigroup in (1.6). It is clear from (1.2) that for a measure , the following two conditions are equivalent:
[TABLE]
We call such a measure a fixed point of the mean-field equation (1.2). Condition (ii) is equivalent to saying that a random variable with law satisfies
[TABLE]
where denotes equality in distribution, are i.i.d. copies of , and is an independent -valued random variable with law . Equations of this type are called Recursive Distributional Equations (RDEs).
FRTPs as in (1.39) are consistent in the sense that if are as in (1.39), then for any ,
[TABLE]
The following lemma states a similar consistency property in the continuous-time setting.
Lemma 7 (Consistency)
Fix and let be as in (1.50). Then, for each ,
[TABLE]
where is defined in (1.6).
Using the consistency relation (1.56) and Kolmogorov’s extension theorem, it is not hard to see that if solves the RDE (1.54), then it is possible to define a stationary recursive process on an infinite tree such that each vertex has law . This was already observed in [AB05]. The following lemma is a slight reformulation of their observation.
Lemma 8 (Recursive Tree Process)
Let be a solution to the RDE (1.54). Then there exists a collection of random variables whose joint law is uniquely characterized by the following requirements.
[TABLE]
We call a collection of random variables as in Lemma 8 the Recursive Tree Process (RTP) corresponding to the map and the solution of the RDE (1.54). We can view such an RTP as a generalization of a stationary backward Markov chain. For most purposes, we will only need the random variables with , the random subtree defined in (1.43). The following proposition shows that by adding independent exponential lifetimes to an RTP, we obtain a stationary version of (1.57).
Proposition 9 (Continuous-time RTP)
Let be an RTP corresponding to a solution of the RDE (1.54), and let be an independent i.i.d. collection of exponentially distributed random variables with mean . Then, for each ,
[TABLE]
At the end of Subsection 1.3 we have seen that in our example of a system with cooperative branching, the RDE (1.54) has three solutions when the branching rate satisfies $\alpha > 4$, two solutions for $\alpha = 4$, and only one solution for $\alpha < 4$. For $\alpha > 4$, the solutions to the RDE are $\nu_{\mathrm{low}}$, $\nu_{\mathrm{mid}}$ and $\nu_{\mathrm{upp}}$, where we let $\nu_{\ldots}$ denote the probability measure on $\{0,1\}$ with mean $z_{\ldots}$ as defined around (1.37). By Lemma 8, each of these solutions to the RDE defines an RTP.
1.6 Endogeny and bivariate uniqueness
In [AB05, Def 7], an RTP corresponding to a solution of the RDE (1.54) is called endogenous if is a.s. measurable w.r.t. the -field generated by the random variables . In Lemma 46 below, we will show that this is equivalent to being a.s. measurable w.r.t. the -field generated by the random variables and , where is the random tree defined in (1.43). Aldous and Bandyopadhyay have shown that endogeny is equivalent to bivariate uniqueness, which we now explain.
Let denote the space of probability measures on that are symmetric with respect to permutations of the coordinates. Let denote the projection on the -th coordinate, i.e., , and let denote the -th marginal of a measure . For any , we define
[TABLE]
to be the set of probability measures on whose one-dimensional marginals are all equal to , and we denote . Finally, we define a “diagonal” set
[TABLE]
and given a measure , we let denote the unique element of , i.e.,
[TABLE]
Recall the definition of the -variate map in (1.9). The following theorem has been proved in [MSS18, Thm 1], and in a slightly weaker form in [AB05, Thm 11]. Below, denotes weak convergence of probability measures.
Theorem 10 (Endogeny and -variate uniqueness)
Let be a solution of the RDE (1.54). Then the following statements are equivalent.
(i) The RTP corresponding to is endogenous.
(ii) for all and .
(iii) is the only fixed point of in the space .
We remark that bivariate uniqueness as introduced in [AB05] refers to being the only fixed point of in the space . The equivalences in the above theorem tell us that bivariate uniqueness already follows from the weaker condition (iii) since it implies (ii), which implies $n$-variate uniqueness for any .
We will prove a continuous-time extension of Theorem 10, relating endogeny to solutions of the -variate mean-field equation
[TABLE]
where we have replaced in (1.2) by and we write to remind ourselves that this is a measure on , rather than on .
This equation has the following interpretation. As in Subsection 1.3, let be a stochastic flow on constructed from a Poisson point set . Let be a random variable with values in , independent of . Then setting
[TABLE]
defines a Markov process that consists of Markov processes with initial states that are coupled in such a way that they are constructed using the same stochastic flow. Applying Theorem 5 to this -variate Markov process, we see that the mean-field equation for the -variate process takes the form (1.63).
We note that if solves the -variate mean-field equation, then any -dimensional marginal of solves the -variate mean-field equation. Also, solutions to (1.63) started in an initial condition satisfy for all . Finally, it is easy to see that implies for all .
We now formulate a continuous-time extension of Theorem 10. Note that in view of (1.54), a measure is a fixed point of the bivariate mean-field equation (i.e., (1.63) with ) if and only if it is a fixed point of . Therefore, the equivalence of points (i) and (iii) from Theorem 10 immediately implies an analogous statement in the continuous-time setting.
Theorem 11 (Endogeny and the n-variate mean-field equation)
Under the assumptions of Theorem 10, the following conditions are equivalent.
- (i) The RTP corresponding to is endogenous.
- (ii) For any and , the solution to the -variate equation (1.63) started in satisfies .
Theorem 11 motivates us to study the bivariate mean-field equation in our example of a particle system with cooperative branching. Recall that in this example, with cob and dth as in (1.13), and is defined in (1.14). In line with (1.15) we write the bivariate mean-field equation as
[TABLE]
For simplicity, we restrict ourselves to symmetric solutions, i.e., solutions that take values in . For any probability measure , we let denote its one-dimensional marginals, which are equal by symmetry. We let denote the probability measures on with mean as defined around (1.37).
Proposition 12 (Bivariate equation for cooperative branching)
For , the bivariate mean-field equation (1.65) has precisely four fixed points in , namely
[TABLE]
which are uniquely characterized by their respective marginals , as well as the fact that , and are concentrated on , but is not.
For any , the solution to (1.65) started in converges as to one of the fixed points in (1.66), the respective domains of attraction being
[TABLE]
For , there are two fixed points and with respective domains of attraction
[TABLE]
while for all solutions converge to .
Combining Proposition 12 with Theorem 11, we see that the RTPs corresponding to and are endogenous, but for , the RTP corresponding to is not. As is clear from [AB05, Table 1], few examples of nonendogenous RTPs were known at the time. Contrary to what is stated in [AB05, Table 1], frozen percolation is now generally conjectured to be nonendogenous, but until recently few “natural” examples of nonendogenous RTPs have appeared in the literature. In fact, the RTP corresponding to seems to be one of the simplest nontrivial examples of a nonendogenous RTP discovered so far. Another nice class of nonendogenous RTPs has recently been described in [MS18].
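The three regimes for the one-dimensional marginals can be made concrete with a short computation. The sketch below assumes that the mean-field ODE of the running example has the form dp/dt = alpha p^2 (1 - p) - p, so that nonzero fixed points solve alpha p (1 - p) = 1; this explicit form and the function names are our assumptions for illustration.

```python
import math

def drift(p, alpha):
    # Assumed right-hand side of the mean-field ODE: alpha * p^2 * (1 - p) - p.
    return alpha * p * p * (1.0 - p) - p

def fixed_points(alpha):
    """Fixed points of the assumed ODE in [0, 1], in increasing order."""
    points = [0.0]
    disc = 0.25 - 1.0 / alpha  # discriminant of alpha * p * (1 - p) = 1
    if disc >= 0.0:
        root = math.sqrt(disc)
        # A set merges the two roots when the discriminant vanishes.
        points += sorted({0.5 - root, 0.5 + root})
    return points
```

Under this assumption, `fixed_points` returns three values when alpha exceeds 4, two at alpha equal to 4, and only zero below 4. The sign of `drift` between consecutive fixed points shows that the middle fixed point repels while the outer two attract.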
1.7 The higher-level mean-field equation
Following [MSS18, formula (1.1)], if is a Polish space and is a measurable map, then we define a measurable map by
[TABLE]
Note that in this notation, the map from (1.12) is given by . As in [MSS18, formula (4.2)], we define a higher-level map by
[TABLE]
where is an -valued random variable with law and are i.i.d. -valued random variables with law . Iterates of the map have been studied in [MSS18, Section 4]. We will be interested in the higher-level mean-field equation
[TABLE]
A measure is the law of a random probability measure on . We denote the -th moment measure of such a random measure by
[TABLE]
(Here denotes the expectation of a random measure; see the remark above Theorem 5.) Our notation for moment measures is on purpose similar to our earlier notation for solutions to the -variate equation, because of the following proposition.
Proposition 13 (Moment measures)
If solves the higher-level mean-field equation (1.71), then its -th moment measures solve the -variate equation (1.63).
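For the two-point space {0,1}, moment measures are easy to compute by hand, which may help to fix ideas. The sketch below assumes that a probability measure on {0,1} is identified with the mass omega it gives to 1, that a law rho on such measures has finite support, and that the n-th moment measure assigns to x in {0,1}^n the expectation of the product of omega over coordinates equal to 1 and (1 - omega) over coordinates equal to 0. The function name and the dictionary encoding are ours.

```python
import itertools

def moment_measure(rho, n):
    """n-th moment measure of a finitely supported law on P({0,1}).

    rho is a dict {omega: weight}, where omega in [0, 1] is the mass at 1
    of a probability measure on {0,1} and the weights sum to one.
    Returns a dict {x: probability} indexed by x in {0,1}^n.
    """
    out = {}
    for x in itertools.product((0, 1), repeat=n):
        ones = sum(x)
        out[x] = sum(
            weight * omega ** ones * (1.0 - omega) ** (n - ones)
            for omega, weight in rho.items()
        )
    return out
```

For instance, `moment_measure({0.0: 0.7, 1.0: 0.3}, 2)` is concentrated on the diagonal, while `moment_measure({0.3: 1.0}, 2)` is a product measure; both have the same first moment measure.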
Similarly to Proposition 13, it has been shown in [MSS18, Lemma 2] that , and this formula holds even for . In view of this, as discussed in Subsection 1.1, the higher-level map is effectively equivalent to the -variate map . It follows from Proposition 13 that if solves the higher-level RDE
[TABLE]
then its -th moment measures solve the -variate RDE , with as in (1.9).
If is an -valued random variable defined on some probability space and is a sub-σ-field, then is a random probability measure on . (Here we use that since is Polish, regular versions of conditional expectations exist.) As a consequence, the law of is an element of . In the following theorem, which is based on [Str65, Thm 2] and which in its present form we cite from [MSS18, Thm 13], we use the fact that each Polish space has a metrizable compactification [Bou58, §6 No. 1, Theorem 1]. Moreover, we naturally identify with the space of all probability measures on that are concentrated on .
Theorem 14 (The convex order for laws of random probability measures)
Let be a Polish space, let be a metrizable compactification of , and let $\mathcal{C}_{\rm cv}\big(\mathcal{P}(\overline{S})\big)$ denote the space of all convex continuous functions . Then, for , the following statements are equivalent.
- (i) for all $\phi\in\mathcal{C}_{\rm cv}\big(\mathcal{P}(\overline{S})\big)$.
- (ii) There exists an -valued random variable defined on some probability space and sub-σ-fields such that $\rho_{i}=\mathbb{P}\big[\mathbb{P}[X\in\,\cdot\,|\mathcal{H}_{i}]\in\,\cdot\,\big]$ .
If satisfy the equivalent conditions of Theorem 14, then we say that they are ordered in the convex order and denote this as . It follows from [MSS18, Lemma 15] that is a partial order; in particular, and imply .
Recall that in Subsection 1.1, we defined , which is $\big\{\rho\in\mathcal{P}(\mathcal{P}(S)):\rho^{(1)}=\mu\big\}$. We define by , where has law . It is easy to see that the -th moment measures of are given by (1.62), so our present notation is consistent with earlier notation introduced there. By [MSS18, formula (4.7)], the measures are the extremal elements of w.r.t. the convex order, i.e.,
[TABLE]
The following proposition is a continuous-time version of [MSS18, Prop 3].
Proposition 15 (Extremal solutions in the convex order)
If are solutions to the higher-level mean-field equation (1.71) such that , then for all . If solves the RDE (1.54), then solves the higher-level RDE (1.73) and there exists a solution of (1.73) such that
[TABLE]
Here denotes weak convergence of measures on , equipped with the topology of weak convergence. Any solution to the higher-level RDE (1.73) satisfies
[TABLE]
The following result, which we cite from [MSS18, Prop. 4], describes the higher-level RTPs associated with the solutions and of the higher-level RDE.
Proposition 16 (Higher-level RTPs)
Let be a solution of the RDE (1.54) and let and as in (1.76) be the corresponding minimal and maximal solutions to the higher-level RDE, with respect to the convex order. Let be an RTP corresponding to and and set
[TABLE]
Then is an RTP corresponding to and . Also, is an RTP corresponding to and .
Proposition 16 gives a more concrete interpretation of the solutions and to the higher-level RDE from (1.76). Indeed, if is an RTP corresponding to , then
[TABLE]
which corresponds to “perfect knowledge” about the state of the root, while
[TABLE]
corresponds to the knowledge about that is contained in the random variables . Since is a measurable function of if and only if its conditional law given equals , it follows from (1.78) and (1.79) that the RTP corresponding to is endogenous if and only if .
It is instructive to demonstrate the general theory on our concrete example of a system with cooperative branching and deaths. Recall that for , the mean-field equation (1.15) has three fixed points . We denote the corresponding minimal and maximal solutions to the higher-level RDE in the sense of (1.76) by and . The following theorem lifts the results from Proposition 12 about the bivariate equation to a higher level. Indeed, using the theorem below, it is easy to see that the measures and from Proposition 12 are in fact the second moment measures of the measures and .
Theorem 17 (Higher-level equation for cooperative branching)
Let , and denote the fixed points of the mean-field equation (1.15) defined above Proposition 12. Then we have for the corresponding minimal and maximal solutions to the higher-level RDE that
[TABLE]
For , the higher-level RDE (1.73) has four solutions, namely
[TABLE]
Any solution to the higher-level mean-field equation (1.71) converges as to one of the fixed points in (1.81), the respective domains of attraction being
[TABLE]
For , there are two fixed points and with respective domains of attraction
[TABLE]
while for all solutions converge to .
Since a probability measure is uniquely characterized by , there is a natural identification . Let and denote the higher-level maps corresponding to , which using the identification we view as maps and . One can check that
[TABLE]
Identifying , we can identify the measures , and with probability laws on . Letting denote a random variable with law , the higher-level RDE, written in the form (1.55), then reads
[TABLE]
where are independent copies of and is an independent Bernoulli random variable with . Theorem 17 says that for , this equation has four solutions. Three “trivial” solutions that correspond to Bernoulli with parameters
[TABLE]
and a “nontrivial” solution for which . In view of Proposition 16, we can interpret this nontrivial solution (viewed as a probability law on ) as
[TABLE]
where is the RTP corresponding to . The following lemma summarizes some elementary facts about the law . We note that by solving the -variate RDE for , one should in principle be able to calculate higher moments of , although the formulas quickly become unwieldy.
Lemma 18 (Nontrivial solution of the higher-level RDE)
Let and let be a random variable with law . Then
[TABLE]
Moreover,
[TABLE]
It is not too hard to obtain numerical data for , see Figure 2. These data suggest that apart from the atom at $0$, the measure has a smooth density with respect to the Lebesgue measure, but we have no proof for this. We have tried to find an explicit formula for the density but have not been successful.
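One possible way to generate such numerical data is sketched below; we do not claim that Figure 2 was produced in this way. The sketch rests on several assumptions that are ours: that the higher-level RDE takes the form omega = chi (omega_1 + (1 - omega_1) omega_2 omega_3) in distribution, with chi Bernoulli of parameter alpha/(alpha + 1); that the middle fixed point of the mean-field equation is 1/2 - sqrt(1/4 - 1/alpha); and that iterating the recursion a finite number of times, starting from the constant equal to this fixed point, approximates the nontrivial solution as the depth grows.

```python
import random

def z_mid(alpha):
    # Assumed middle fixed point of the mean-field equation (needs alpha >= 4).
    return 0.5 - (0.25 - 1.0 / alpha) ** 0.5

def cob_check(w1, w2, w3):
    # Assumed higher-level map of cob: the probability that
    # x1 OR (x2 AND x3) equals 1 for independent Bernoulli(w_i) inputs.
    return w1 + (1.0 - w1) * w2 * w3

def sample_omega(alpha, depth, rng, leaf):
    """Draw one sample from the depth-fold iterate of the assumed recursion.

    At depth 0 the deterministic value `leaf` is returned. Otherwise, with
    probability 1 / (alpha + 1) a death occurs and the value is 0, and with
    the remaining probability three independent samples are combined.
    """
    if depth == 0:
        return leaf
    if rng.random() >= alpha / (alpha + 1.0):
        return 0.0
    w1, w2, w3 = (sample_omega(alpha, depth - 1, rng, leaf) for _ in range(3))
    return cob_check(w1, w2, w3)
```

A histogram of `[sample_omega(4.5, 10, random.Random(1), z_mid(4.5)) for _ in range(10000)]` gives a rough empirical picture of the law. Under the stated assumptions, the mean of the samples stays equal to the middle fixed point at every depth, which provides a simple sanity check.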
1.8 Lower and upper solutions
In this and the next subsection we collect a few further results on endogeny and the uniqueness of solutions to RDEs. In the present subsection, we show that the endogeny of the RTPs corresponding to and follows from a general principle, discovered in [AB05], that says that RDEs that are defined by monotone maps always have a minimal and maximal solution with respect to the stochastic order, and that the RTPs corresponding to these solutions are always endogenous.
Let be a compact metrizable space that is equipped with a partial order that is closed in the sense that
[TABLE]
is a closed subset of , equipped with the product topology. Recall that a function from one partially ordered space into another is monotone if implies , and a subset of a partially ordered space is increasing if implies . It is known that for two probability measures , the following statements are equivalent:
- (i) for all closed increasing .
- (ii) for all bounded continuous monotone .
- (iii) Two random variables with laws can be coupled such that a.s.
The equivalence of (ii) and (iii) is proved in [Lig85, Thm II.2.4]. The equivalence of (i) and (iii) holds more generally for Polish spaces, see [KKO77, Thm 1 (ii) and (vi)]. In the general setting of Polish spaces, the implications (iii)⇒(i) and (iii)⇒(ii) are trivial, but the implication (ii)⇒(i) needs the additional assumption of monotone normality, see [HLL18, Prop. 3.6 and 3.11].
If satisfy the above conditions, then one says that they are stochastically ordered, denoted as . This defines a partial order on ; in particular, by Lemma 50 below, implies .
The proposition below is a variant of [AB05, Lemma 15]. As in our usual setting, we assume that and are Polish spaces, is a nonzero finite measure on , and and are measurable functions such that (1.3) and (1.4) hold. If is equipped with a partial order, then we equip with the product partial order. Recall that Proposition 3 gives sufficient conditions for to be continuous w.r.t. the topology of weak convergence.
Proposition 19 (Lower and upper solutions to RDE)
Assume that is compact and equipped with a closed partial order. Assume that has minimal and maximal elements, denoted by [math] and . Assume is monotone for each and that the operator in (1.1) is continuous w.r.t. the topology of weak convergence. Then there exist solutions to the RDE (1.54) that are minimal and maximal with respect to the stochastic order, in the sense that any solution to the RDE (1.54) must satisfy
[TABLE]
where denotes weak convergence. Moreover, if and denote the solutions to the mean-field equation (1.2) with initial states and , then
[TABLE]
Finally, the RTPs corresponding to and are endogenous.
We can view the solutions and to the RDE (1.54) as mean-field versions of the lower and upper invariant laws of monotone particle systems; compare [Lig85, Thm III.2.3].
In our example of a system with cooperative branching, the maps cob and dth are monotone, so Proposition 19 is applicable. Since the measures we called and before are the limits of the solutions of the mean-field equation started in and , our earlier notation agrees with the more general notation of Proposition 19. The endogeny of the RTPs corresponding to and , which before we proved based on an analysis of the bivariate equation, using Proposition 12 and Theorem 10, alternatively follows from Proposition 19.
1.9 Conditions for uniqueness
In the present subsection, we prove some results of varying generality that allow one to conclude that a given RDE has a unique solution. In our example of a system with cooperative branching and deaths, this happens if and only if . We will see that there are some general results that can be applied to prove uniqueness in the whole regime . We also make a connection with a general duality for monotone particle systems described in [SS18]. Although duality plays only a minor role in our paper, the original motivation for the work that led to it was to understand this duality in the mean-field limit.
We return to our usual set-up from Subsection 1.1 with and Polish spaces and and satisfying (1.3) and (1.4). We also recall the random subtrees defined in (1.43) as well as the fact that for any are a.s. finite by (1.4). The tree is the family tree of the branching process . In view of this, by well-known facts about branching processes, is a.s. finite if and only if
[TABLE]
Recall that , where for any finite subtree that contains the root, is the map defined in (1.46). If is a.s. finite, then for sufficiently large and hence is eventually constant.
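The finiteness criterion can be evaluated directly on the running example. The sketch below assumes that (1.93) is the classical (sub)criticality condition for a nondegenerate branching process, namely that the sum over event types of rate times (number of offspring minus one) is at most zero, and that in the example cooperative branching has rate alpha with three offspring while deaths have rate 1 with no offspring. Both the explicit form of the criterion and the function names are our assumptions.

```python
def tree_is_as_finite(events):
    """Check the assumed criterion: sum of rate * (offspring - 1) <= 0.

    events is a list of (rate, offspring) pairs, one per type of map.
    """
    return sum(rate * (offspring - 1) for rate, offspring in events) <= 0

def cooperative_branching(alpha):
    # Assumed rates and arities: cob has rate alpha and three inputs,
    # dth has rate 1 and no inputs.
    return [(alpha, 3), (1.0, 0)]
```

Under these assumptions, the criterion reduces to 2 alpha - 1 <= 0, so `tree_is_as_finite(cooperative_branching(alpha))` holds exactly when alpha is at most 1/2.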
More generally, if is a finite subtree of that contains the root , then we say that is a root determining subtree if the map is constant. Note that this can happen even if . It is easy to see that if and is root determining, then the same is true for . We say that is a minimal root determining subtree if is root determining but there exists no with that is root determining. By our previous remark, it suffices to check this for such that differ from by a single element.
Lemma 20 (Root determining subtrees)
The following conditions are equivalent:
- (i) There a.s. exists a such that is constant for all .
- (ii) a.s. contains a root determining subtree.
- (iii) a.s. contains a minimal root determining subtree.
If is a subtree of , then we denote by the set of all that satisfy (1.45). We say that is uniquely determined if imply . The following lemma is inspired by [AB05, Lemma 14] who showed that (i) implies that the RDE (1.54) has a unique solution and the corresponding RTP is endogenous.
Lemma 21 (Uniquely determined subtrees)
Between the following five conditions, one has the implications (i)⇒(ii)⇒(iii)⇒(iv) and (ii)⇒(v). If is finite, then moreover (iii)⇒(ii), and if , then (ii)⇒(i).
- (i) a.s. contains a finite, uniquely determined subtree that contains the root .
- (ii) The equivalent conditions of Lemma 20 are satisfied.
- (iii) is a.s. uniquely determined.
- (iv) The RDE (1.54) has at most one solution and any corresponding RTP is endogenous.
- (v) The RDE (1.54) has a solution that is globally attractive in the sense that any solution to (1.2) satisfies , where denotes the total variation norm.
The following lemma illustrates these ideas on our example of a system with cooperative branching and deaths. Below, denotes the cardinality of . See Figure 3 for an example.
Lemma 22 (The uniqueness regime)
Let and , and let be as in (1.14). Then (1.93) is satisfied if and only if , while conditions (i)–(iii) of Lemma 21 are satisfied if and only if . Moreover, a finite subtree is a minimal root determining subtree if and only if
[TABLE]
Lemma 22 shows that in our example of a system with cooperative branching and deaths, the conditions of Lemma 20 are in fact equivalent to uniqueness of solutions to the RDE. As the next lemma shows, this is a consequence of monotonicity.
Lemma 23 (Uniqueness for monotone systems)
Assume that is a finite partially ordered set that contains a minimal and maximal element, and assume that is monotone for each . Then the RDE (1.54) has a unique solution if and only if the equivalent conditions of Lemma 20 are satisfied.
In the remainder of this subsection, we focus on the case that and is monotone for all , which allows us to make a connection to a general duality for monotone particle systems described in [SS18]. Recall that a set is increasing if implies . A minimal element of is an such that implies . If is a nonempty finite set and is a monotone map, then the inverse image is an increasing set. We set
[TABLE]
Then
[TABLE]
These formulas remain true when , provided we define and we let if and if .
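For a monotone map on a finite product of copies of {0,1}, the minimal elements of the inverse image of 1 can be found by enumeration. The sketch below does this for the cooperative branching map, assumed here to be cob(x1, x2, x3) = x1 OR (x2 AND x3), and checks that the map takes the value 1 exactly on the configurations that dominate some minimal element. The helper names are ours, and the code only covers maps with at least one input.

```python
import itertools

def cob(x):
    # Assumed form of the cooperative branching map on {0,1}^3.
    return x[0] | (x[1] & x[2])

def leq(y, x):
    # Coordinatewise (product) partial order on {0,1}^k.
    return all(a <= b for a, b in zip(y, x))

def minimal_ones(f, k):
    """Minimal elements of the increasing set {x in {0,1}^k : f(x) = 1}."""
    ones = [x for x in itertools.product((0, 1), repeat=k) if f(x) == 1]
    return {x for x in ones if not any(y != x and leq(y, x) for y in ones)}

def dominates_some(x, minimal):
    # True if x lies above at least one of the given minimal elements.
    return any(leq(y, x) for y in minimal)
```

For the assumed map, `minimal_ones(cob, 3)` consists of (1, 0, 0) and (0, 1, 1), which reflects that either the first input alone or the second and third inputs together suffice to produce a 1.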
Recall from Section 1.4 that is a Markov process. If and is monotone for all , then the random map is monotone for each . In view of this, by (1.96), is uniquely characterized by and hence is a Markov process too. For a system with cooperative branching and deaths, this process has been defined before in [Mac17, Section I.2.1.2]. As explained in more detail there, it can be seen as the mean-field limit of a general dual for monotone particle systems described in [SS18, Section 5.2].
Let , let be monotone for all , and let be a subtree of that contains the root . Borrowing terminology from percolation theory, we say that is an open subtree of if and
[TABLE]
where we use the convention that if .
Lemma 24 (Open subtrees)
Assume that and is monotone for all . Then
[TABLE]
If moreover for each , then
[TABLE]
We note that formula (1.98) can be generalized to more general finite partially ordered sets , see Lemma 64 below. Again, it will be useful to illustrate our definitions on the concrete example of a system with cooperative branching and death. To make the example more interesting, we add a birth map , which is defined similarly to the death map as
[TABLE]
The following lemma describes open subtrees for a system described by the maps ; see Figure 4 for an illustration.
Lemma 25 (Systems with cooperative branching, deaths, and births)
Let , , with
[TABLE]
Let be a subtree of that contains the root and let satisfy . Then is an open subtree of if and only if for all ,
[TABLE]
We can think of open subtrees as a generalization of the open paths from oriented percolation. Outside of a mean-field setting, using ideas from [SS18, Section 5.2], one can characterize the upper invariant law of quite general monotone particle systems in terms of “open structures” that in general are neither paths nor trees.
2 Discussion
This section is divided into four subsections. In Subsection 2.1, we discuss the relation of our work to [BCH18], who in parallel to our work have studied Moran models that generalize our running example of a system with cooperative branching and deaths. In Subsection 2.2, we compare our results and methods with the existing literature on mean-field limits. In Subsection 2.3, we state open problems and we conclude in Subsection 2.4 with an outline of the proofs.
2.1 A Moran model with frequency-dependent selection
Let be the branching map defined as
[TABLE]
Consider a system with , , with rates
[TABLE]
, , with , and . If solves the corresponding mean-field equation (1.11), then solves the ODE (compare (1.36))
[TABLE]
This equation has an interpretation in terms of a Moran model describing a fixed population of individuals which can be of two types, 0 and 1, where type 1 is fitter than type 0. The parameter is the frequency dependent selection rate, is the selection rate, is the mutation rate, and are mutation probabilities. The frequency dependent selection is of a type that is especially appropriate to describe an advantageous, (partially) recessive gene in a diploid population.
In parallel to our work, Moran models of this form have been studied by Ellen Baake, Fernando Cordero, and Sebastian Hummel in [BCH18]. A notational difference between their work and the discussion here is that they denote the fitter type by 0, so their [BCH18, formula (2.1)] is our (2.3) rewritten in terms of and with the roles of and reversed. They prove that (2.3) describes the mean-field limit of a class of Moran models [BCH18, Prop. 4.1] and that in the limit , the genealogy of a single individual is described by an Ancestral Selection Graph (ASG) , which in our notation corresponds to
[TABLE]
i.e., this is the random tree with maps attached to its branch points depicted in Figure 1.
The authors of [BCH18] define a duality function which corresponds to the duality function in (1.52) after the identification . (Here we have slightly rephrased things compared to the different conventions in [BCH18], where 0 denotes the fitter type and is the frequency of the unfit type.) In [BCH18, Lemma 4.4], they show that can be calculated by concatenating the higher-level maps with . For example, the equation in [BCH18, Lemma 4.4 (4)] can be rewritten in terms of as with as in (1.84).
In [BCH18, Section 5], it is shown that the ASG can be simplified a lot, while retaining all information necessary to calculate the duality function . This is done in three steps, I, IIa, and IIb.
In step I, the ASG is pruned. This is a process in which parts of the tree that are irrelevant for the map are cut off. In particular, if the function is constant, then the pruned ASG consists of a single edge ending in one of the maps dth or bth. In the remaining case, the pruned ASG is a finite tree where each branch point is marked with one of the maps cob and bra.
In steps IIa and IIb, the pruned ASG is stratified. In step IIa, the tree structure is changed in such a way that starting at the root, one first sees a ternary tree containing only the map cob, and then at the leaves of this ternary tree, there are attached binary trees containing only the map bra. In step IIb, each binary tree is replaced by an integer which records the number of leaves of the binary tree.
The result of this is a simplified process, the stratified ASG , which contains all necessary information about the ASG in the sense that there exists a function such that [BCH18, Thm 5.13]. In particular, solutions of (2.3) can be represented as ([BCH18, Thm 6.2]; compare (1.51)).
One can now check (compare Lemma 48 below) that solves the higher-level mean-field equation with initial state , where we use the identification . In [BCH18, Thm 6.5], it is observed that is a bounded sub- or supermartingale for each and hence converges to an a.s. limit . In [BCH18, Prop. 6.6], it is proved that if is not an unstable fixed point of (2.3), then is a Bernoulli random variable with parameter .
Our Propositions 15 and 16 imply that if is a fixed point of (2.3), then is a Bernoulli random variable if and only if the RTP corresponding to is endogenous. Thus, [BCH18, Prop. 6.6] implies that for the model in (2.2), RTPs corresponding to stable fixed points are always endogenous. Since all stable fixed points of (2.3) are in fact lower or upper solutions, this alternatively also follows from our Proposition 19.
In the special case and , [BCH18, Prop. 6.6] follows alternatively from our Theorem 17, which completely describes the long-time behaviour of solutions to the higher-level mean-field equation not just for initial states of the form , but for general initial states.
2.2 Mean-field limits
If Markov processes interact in a way that is symmetric under permutations of the coordinates, then it is frequently possible to obtain a nontrivial limit as . Such limits are generally called mean-field limits. In the mean-field limit, the individual processes behave asymptotically independently, but with transition probabilities that depend on the average behavior of all processes. For systems of interacting diffusions, this principle was demonstrated by McKean in his analysis of the Vlasov equation [McK66]. Consequently, mean-field limits are also called McKean-Vlasov limits. There exists an extensive literature on the topic. Most work has focused on interacting diffusions, but jump processes have also been studied [ST85, ADF18]. An elementary introduction to mean-field limits for interacting particle systems is given in [Swa17, Chapter 3].
In a biological setting, well-mixing populations converge in the mean-field limit to the solution of a deterministic ODE. Similarly, spatial populations with strong local mixing can be expected to converge, after an appropriate rescaling, to the solution of a deterministic PDE. For interacting particle systems whose dynamics have an exclusion process component with a large rate, this intuition was made rigorous by De Masi, Ferrari and Lebowitz [DFL86, Thm 2]. They state their theorem only for processes whose state space consists of two points, and only prove the theorem for one particular one-dimensional example, but sketch how the proof should be adapted to the general case. In [DN94, Thm 1], a version of the theorem is stated where can be any finite set; it is claimed that the proof is again the same.
In our running example of a particle system with cooperative branching and deaths, the limiting PDE takes the form
[TABLE]
This PDE was used in [Nob92] to derive asymptotic properties of the associated spatial particle system with strong mixing. We can view (2.5) as a spatial version of the ODE (1.36); in particular, if does not depend on , then solves (1.36).
The intuition behind (2.5), and more general PDEs of this type, is easily explained. In the strong mixing limit, the genealogy of a single site should be described by a branching process as in Figure 1 where in addition, each particle has a position in , which moves according to an independent Brownian motion. Convergence to the PDE should then follow from, on the one hand, convergence of the genealogies to a system of branching Brownian motions with random maps attached to their branching events, and, on the other hand, a representation in the spirit of Theorem 6 of solutions of the PDE (2.5) in terms of such a system of branching Brownian motions.
The proof of [DFL86, Thm 2] is indeed based on this sort of dual approach, although one would wish that they had given a more explicit statement of the stochastic representation of solutions of their general PDE. Our proof of Theorem 5 follows the same strategy, i.e., we first prove the stochastic representation of solutions to the mean-field equation (Theorem 6) and then use this to prove our convergence result (Theorem 5).
2.3 Open problems
In the present paper, we have adapted results from [AB05, MSS18] about discrete-time Recursive Tree Processes and endogeny to the continuous-time setting, and applied our general results to a concrete system with cooperative branching and deaths. Among other things, we proved that for , the RTPs corresponding to and are endogenous but the RTP corresponding to is not. The proof was based on an analysis of the bivariate mean-field equation. Here, it was convenient to be able to analyse a differential equation, as an analysis of the associated discrete-time bivariate evolution would have been possible, but messier.
Our work leaves a number of questions unanswered, both in the general setting and more specifically for our running example with and as in (1.14). Concerning the latter, we pose the following questions.
- Open Problem 1 Not every measure $\mu^{(n)}\in\mathcal{P}_{\rm sym}\big(\{0,1\}^{n}\big)$ is the -th moment measure of a measure $\rho\in\mathcal{P}\big(\mathcal{P}(\{0,1\})\big)$. Determine all symmetric solutions of the -variate RDE, for general , and their domains of attraction.
- Open Problem 2 Same as Open Problem 1 but without the symmetry assumption and for general .
- Open Problem 3 Prove that apart from the atom at zero, the law , viewed as a probability law on , has a smooth density with respect to the Lebesgue measure.
- Open Problem 4 Determine the asymptotics of the distribution function of near [math] and .
- Open Problem 5 For the more general model in (2.2), is it true that unstable fixed points of the mean-field equation that separate the domains of attraction of two stable fixed points correspond to nonendogenous RTPs? Is the picture for the higher-level RDE the same?
Partly inspired by our concrete example, we ask the following problems in the general setting.
- Open Problem 6 Can (1.4) be relaxed to allow for branching processes that are nonexplosive but have infinite mean?
- Question 7 Are there general results linking the (in)stability of fixed points of the mean-field equation to (non)endogeny of the related RTP?
- Question 8 In our example, the higher-level RDE has two solutions and with mean , of which the former is stable and the latter is unstable. Is this a general phenomenon in the nonendogenous case? Can one prove nonendogeny of an RTP corresponding to a solution of the RDE by showing that is unstable?
- Question 9 Are there examples of higher-level RDEs that have solutions ?
- Open Problem 10 Is the higher-level RTP from Proposition 16 always endogenous?
Finally, we mention the problem of proving nonendogeny for the frozen percolation of [Ald00], which to our knowledge is still open. Although we did not attempt to solve this problem here, one might hope that the methods of the present paper can provide a useful new point of view on this old problem.
2.4 Outline of the proofs
In the remainder of the paper, we prove all results stated so far, except for Theorems 10 and 14 as well as Proposition 16, which we cite from [MSS18, Thm 1, Thm 13, and Prop 4].
In Section 3 we prove Theorem 1, Propositions 2 and 3, and Lemma 4, which state elementary properties of solutions of the mean-field equations (1.22) and (1.2), as well as Theorem 6, which gives a stochastic representation of solutions of the mean-field equation in terms of finite recursive tree processes. In Section 4, we use this stochastic representation to prove Theorem 5 about convergence of finite systems to a solution of the mean-field equation.
In Section 5, we prove our main results about RTPs with continuous time, which are largely analogous to known results from the discrete-time setting. Basic results are Lemma 7 and Proposition 9, as well as Lemma 8 which deals with discrete time and is a slight reformulation of known results. Following [AB05], Theorem 11 links the -variate equation to endogeny, while Propositions 13 and 15 are concerned with the higher-level equation, and closely follow ideas from [MSS18].
In Section 6 we prove some additional results about RTPs, first Proposition 19, which generalizes [AB05, Lemma 15] and shows that upper and lower solutions of a monotone RDE are always endogenous, and then Lemmas 20, 21, 23, and 24 which give conditions for uniqueness in a general setting and then more specifically for monotone systems.
In Section 7, finally, we have collected all proofs that deal specifically with our running example of a system with cooperative branching and deaths. The first such result is Proposition 12 about the bivariate equation, which is a two-dimensional ODE for which by elementary means we find all fixed points and their domains of attraction. By combining Proposition 12 with ideas involving the convex order we then prove the much stronger Theorem 17 which gives all fixed points and domains of attraction for the higher-level equation. The picture is then completed by the proofs of Lemma 18, which gives some properties of the nontrivial fixed point of the higher-level equation, as well as Lemmas 22 and 25 which illustrate ideas from Section 6 in the concrete set-up of our example.
3 The mean-field equation
In this section, we prove Theorems 1 and 6, which state that the mean-field equation (1.2) has a unique solution and can be represented in terms of a random tree generated by a branching process, with random maps attached to its vertices. In addition, we also prove Propositions 2 and 3, as well as Lemma 4.
In Subsection 3.1, we start with some preliminaries, showing, in particular, that the integral in (1.16) is well-defined, and Lemma 4, which says that mean-field equations of the form (1.22) can be rewritten in the simpler form (1.2).
Next, in Subsection 3.2, we prove uniqueness of solutions of (1.2), which yields the uniqueness part of Theorem 1. To prove existence, in Subsection 3.3, we show that the right-hand side of (1.49) solves (1.2), which not only completes the proof of Theorem 1 but also yields the stochastic representation that is Theorem 6.
The proofs of Propositions 2 and 3, finally, can be found in Subsection 3.4.
3.1 Preliminaries
Recall that we interpret the mean-field equation (1.2) as in (1.16), where, by (1.12),
[TABLE]
Since by assumption, is jointly measurable in and , the right-hand side of (3.1) is measurable as a function of and hence the integral in (1.16) is well-defined.
Proof of Lemma 4 Recall from Subsection 1.3 that the basic ingredients that go into the equation (1.22) are the measure space and function , as well as, for each and , the function and set . Also, . In terms of these basic ingredients we need to define , and as in Subsection 1.1 so that (1.22) takes the simpler form (1.2).
Since we want to replace the integral and sum in (1.22) by a single integral, we put
[TABLE]
where as before and , and we equip with the measure
[TABLE]
In general, need not be a Polish space, as required in Subsection 1.1. We will fix this problem at the end of our proof, but for the sake of the presentation we neglect it for the moment. We define as in Subsection 1.1 by , where the right-hand side is the function from Subsection 1.3. We write
[TABLE]
Since depends only on coordinates in , there exists a function such that
[TABLE]
Note that by (1.12). As in (1.3), we can associate with a function that is defined on but depends only on the first coordinates. We take this as our definition of the function from Subsection 1.1. It follows from (1.20) that is jointly measurable as a function of and .
Replacing the integral and sum in (1.22) by a single integral over as defined in (3.3), using the fact that we see that (1.22) can be rewritten as
[TABLE]
which coincides with (1.2). The condition that should be a finite measure translates to (1.23) (i), while the condition (1.4), written in terms of , becomes (1.23) (ii). Moreover, if satisfies (1.24), then satisfies (1.19).
We still have to fix the problem that , as defined in (3.2), is in general not a Polish space. There are several possible ways to fix this. (For example, we can strengthen our assumptions on in the sense that is a -set for each , or we can relax our assumptions on allowing it to be a Lusin space, instead of just a Polish space, throughout.) The solution we will choose is to replace by the Polish space
[TABLE]
where denotes the closure of in . We view as a measure on that is concentrated on and extend and in a measurable way to the larger space, which is possible since is a measurable subset of . Since is concentrated on , it does not matter how we extend and as this has no effect on (3.6).
3.2 Uniqueness
In the present subsection, we prove that under the assumption (1.4), solutions to (1.2) are unique, which settles the uniqueness part of Theorem 1.
Below, we let denote the space of all finite signed measures on . The total variation norm has already been mentioned several times. There are two conventional definitions, which differ by a factor 2. We will use the definition
[TABLE]
where the supremum runs over all measurable functions . If are -valued random variables, then it is easy to see that . Conversely, it is well-known [Lin92, page 19] that if , then it is possible to couple -valued random variables in such a way that
[TABLE]
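The coupling statement above can be illustrated on a finite state space. The following is a minimal sketch, not part of the paper: the helper names `tv_half` and `maximal_coupling` are hypothetical, and the code works with the normalisation $\frac12\sum_x|p(x)-q(x)|$. It does not assume which of the two conventional normalisations the definition above corresponds to; the other one differs by the factor 2 mentioned in the text.

```python
from fractions import Fraction


def tv_half(p, q):
    """Half the l1-distance between two finite distributions (dicts)."""
    keys = set(p) | set(q)
    return sum(abs(p.get(k, 0) - q.get(k, 0)) for k in keys) / 2


def maximal_coupling(p, q):
    """Joint law of (X, Y) with marginals p and q that maximises P[X = Y].

    The common mass min(p, q) is placed on the diagonal; the remaining
    mass of p and of q is paired off as a normalised product measure.
    """
    keys = sorted(set(p) | set(q))
    common = {k: min(p.get(k, 0), q.get(k, 0)) for k in keys}
    delta = 1 - sum(common.values())  # mass that cannot be matched
    joint = {(k, k): common[k] for k in keys if common[k] > 0}
    if delta > 0:
        for x in keys:
            px = p.get(x, 0) - common[x]
            if px == 0:
                continue
            for y in keys:
                qy = q.get(y, 0) - common[y]
                if qy > 0:
                    joint[(x, y)] = joint.get((x, y), 0) + px * qy / delta
    return joint


# Illustrative distributions (assumptions, chosen only for the example).
p = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
q = {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 2)}
joint = maximal_coupling(p, q)
mismatch = sum(w for (x, y), w in joint.items() if x != y)
```

In this example the probability that the coupled variables differ equals `tv_half(p, q)`, which is the finite-state version of the fact cited from [Lin92].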
Lemma 26 (Lipschitz continuity)
Let be measurable and let be defined as in (1.12). Then
[TABLE]
Moreover, if is defined as in (1.1), then
[TABLE]
**Proof** By (3.9) we can find an -valued random variable such that . Let be i.i.d. copies of . Then, by (1.12),
[TABLE]
This proves (3.10). Formula (3.11) follows by integrating over .
Our next lemma gives equivalent formulations of the mean-field equation (1.2), that will also be useful in the next subsection where we prove existence of solutions. Below, we interpret an integral of a measure-valued integrand in the usual way, i.e., denotes the measure defined by
[TABLE]
for any bounded measurable .
Lemma 27 (Equivalent formulations of the mean-field equation)
Assume (1.4). Let be measurable. Then of the following conditions, (i) implies (ii) and (iii). If is continuous with respect to the total variation norm, then all three conditions are equivalent.
- (i) For each bounded measurable , the function is continuously differentiable and $\frac{\partial}{\partial t}\mu_{t}=|\mathbf{r}|\big\{T(\mu_{t})-\mu_{t}\big\}$ .
- (ii) $\mu_{t}=\mu_{0}+|\mathbf{r}|\int_{0}^{t}\mathrm{d}s\,\big\{T(\mu_{s})-\mu_{s}\big\}$ .
- (iii) .

**Proof** Integrating the equation in (i) from time 0 until time , we see that (i) implies (ii). Also, we can equivalently write the equation in (i) as
[TABLE]
Integrating from time 0 until time now yields
[TABLE]
Multiplying by and substituting in the integral then yields the equation in (iii).
If is continuous with respect to the total variation norm, then Lemma 26 together with (1.4) imply that also is continuous with respect to the total variation norm. It follows that and are continuous for each bounded measurable . As a result, the right-hand side of (ii), integrated against any bounded measurable , is continuously differentiable as a function of , and (ii) implies (i). By the same argument, rewriting (iii) as (3.15) and differentiating, we see that (iii) implies (i).
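The integrating-factor computation used in this proof can be sketched as follows. This is a reconstruction from the steps named in the text (multiply by $e^{|\mathbf{r}|t}$, integrate, multiply by $e^{-|\mathbf{r}|t}$, substitute $s\mapsto t-s$), to be read after integrating against a bounded measurable test function; the final line is what these steps produce and is offered as a plausible form of condition (iii), not as a verbatim quotation.

```latex
\begin{align*}
\frac{\partial}{\partial t}\big(e^{|\mathbf{r}|t}\mu_t\big)
  &= |\mathbf{r}|\,e^{|\mathbf{r}|t}\mu_t
     + e^{|\mathbf{r}|t}\,|\mathbf{r}|\big\{T(\mu_t)-\mu_t\big\}
   = |\mathbf{r}|\,e^{|\mathbf{r}|t}\,T(\mu_t),\\
e^{|\mathbf{r}|t}\mu_t
  &= \mu_0 + |\mathbf{r}|\int_0^t \mathrm{d}s\; e^{|\mathbf{r}|s}\,T(\mu_s),\\
\mu_t
  &= e^{-|\mathbf{r}|t}\mu_0
     + |\mathbf{r}|\int_0^t \mathrm{d}s\; e^{-|\mathbf{r}|(t-s)}\,T(\mu_s)
   = e^{-|\mathbf{r}|t}\mu_0
     + |\mathbf{r}|\int_0^t \mathrm{d}s\; e^{-|\mathbf{r}|s}\,T(\mu_{t-s}).
\end{align*}
```

The last expression has a probabilistic reading: with probability $e^{-|\mathbf{r}|t}$ no map has been applied up to time $t$, and otherwise the most recent map was applied a time $s$ ago, with $s$ having exponential density $|\mathbf{r}|e^{-|\mathbf{r}|s}$.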
We now prove the promised uniqueness of solutions to (1.2). Proposition 2, which will be proved in Subsection 3.4 below, shows that the constant from (3.17) is not optimal and can be replaced by the constant from (1.18).
Lemma 28 (Uniqueness)
Let and be solutions of the mean-field equation (1.2). Then
[TABLE]
where
[TABLE]
**Proof** Equation (ii) of Lemma 27 implies that
[TABLE]
where using (3.11) of Lemma 26. The claim now follows from Gronwall’s lemma [EK86, Thm A.5.1].
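The Gronwall step can be sketched as follows. The symbols $f$, $L$ and $K$ are introduced here only for illustration: $f(t)$ is the total variation distance between the two solutions, $L$ stands for a Lipschitz constant of $T$ of the kind provided by Lemma 26 together with (1.4), and $K$ is the resulting growth constant, which need not be written in the same form as the constant in (3.17).

```latex
\begin{align*}
f(t) &:= \|\mu_t-\mu'_t\|,\qquad
\|T(\mu)-T(\mu')\|\le L\,\|\mu-\mu'\|,\\
f(t) &\le f(0)
   + |\mathbf{r}|\int_0^t \mathrm{d}s\,\big\|T(\mu_s)-T(\mu'_s)\big\|
   + |\mathbf{r}|\int_0^t \mathrm{d}s\, f(s)
 \;\le\; f(0) + K\int_0^t \mathrm{d}s\, f(s),
 \qquad K:=|\mathbf{r}|(L+1),\\
f(t) &\le f(0)\,e^{Kt}\qquad(t\ge 0).
\end{align*}
```

In particular, $f(0)=0$ forces $f(t)=0$ for all $t\ge 0$, which is the uniqueness statement.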
3.3 The stochastic representation
In this subsection, we prove the following proposition, which settles the existence part of Theorem 1. Together with Lemma 28, this completes the proof of Theorem 1 and at the same time also proves Theorem 6.
We work in our usual set-up where and are Polish spaces, is measurable, is as in Subsection 1.1, and is a nonzero finite measure on satisfying (1.4). We fix as in Section 1.4 and let be i.i.d. with common law . We let be an independent i.i.d. collection of exponentially distributed random variables with mean and define , , , and as in (1.43), (1.44), and (1.47).
Proposition 29 (Recursive tree representation)
For any , setting
[TABLE]
defines a solution to the mean-field equation (1.2). Moreover, is continuous with respect to the total variation norm.
To prepare for the proof of Proposition 29, we need one lemma. Recall that denotes the length of a word , i.e., . Let
[TABLE]
Fix and using notation as in (1.46), set
[TABLE]
and set . The following lemma is a “cut-off” version of Proposition 29.
Lemma 30 (Representation with cut-off)
The measures defined in (3.21) satisfy
[TABLE]
**Proof** Let be i.i.d. with common law , independent of . Set
[TABLE]
and define inductively by
[TABLE]
Then $X_{\varnothing}=G_{t,(n)}\big((X_{\mathbf{i}})_{\mathbf{i}\in\nabla\mathbb{S}_{t,(n)}}\big)$ and hence, in the same way as (1.49) is equivalent to (1.51),
[TABLE]
Conditioning on and then also on , we see that
[TABLE]
where we have used that is independent of with law . We see from this that
[TABLE]
where we have used (1.1) and the observation that conditional on and , the random variables are i.i.d. with common law .
Proof of Proposition 29 The condition (1.4) guarantees that is a finite mean branching process; more precisely, by standard theory,
[TABLE]
Fix and define and as in (3.19) and (3.22). Then the total variation norm distance between these measures can be bounded by
[TABLE]
which tends to zero as since is a.s. finite by (3.28). In fact, since for all , we have that
[TABLE]
Using this and the Lipschitz continuity of with respect to the total variation norm (Lemma 26), we can let in (3.22) to obtain
[TABLE]
Since
[TABLE]
using the fact that the branching process a.s. does not jump at deterministic times, we see that is continuous with respect to the total variation norm. Using this and (3.31), we see from Lemma 27 that solves the mean-field equation (1.2).
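The recursive tree representation lends itself to direct simulation. The following is a minimal sketch under stated assumptions, not the paper's construction in full generality: the state space is taken to be $\{0,1\}$, and the only maps are a cooperative branching map $(x_1,x_2,x_3)\mapsto x_1\vee(x_2\wedge x_3)$ applied at rate `ALPHA` and a death map $x\mapsto 0$ applied at rate `DEATH`. The numerical values and all function names are hypothetical. Under these assumptions the mean-field equation reduces to the scalar ODE $\dot p=\alpha p^2(1-p)-p$ for the density $p$ of ones, which the Monte Carlo estimate can be compared against.

```python
import random

ALPHA = 3.0            # assumed rate of the cooperative branching map
DEATH = 1.0            # assumed rate of the death map
TOTAL = ALPHA + DEATH  # total rate |r| of the measure r


def sample_root(t, p0, rng):
    """Sample the state of the root at time t by recursing down the tree.

    Each individual carries an exponential lifetime with mean 1/|r|.
    If the lifetime exceeds the remaining time, the individual is a leaf
    and receives an independent initial state with P[state = 1] = p0.
    """
    sigma = rng.expovariate(TOTAL)
    if sigma > t:
        return 1 if rng.random() < p0 else 0
    if rng.random() < ALPHA / TOTAL:
        # Cooperative branching: three independent subtrees are relevant.
        x1 = sample_root(t - sigma, p0, rng)
        x2 = sample_root(t - sigma, p0, rng)
        x3 = sample_root(t - sigma, p0, rng)
        return x1 | (x2 & x3)
    # Death: the output is 0 regardless of the input, so no subtree is needed.
    return 0


def monte_carlo_density(t, p0, n, seed=0):
    """Fraction of ones among n independent samples of the root state."""
    rng = random.Random(seed)
    return sum(sample_root(t, p0, rng) for _ in range(n)) / n


def euler_density(t, p0, steps=5000):
    """Explicit Euler scheme for dp/dt = ALPHA * p^2 * (1 - p) - p."""
    p, dt = p0, t / steps
    for _ in range(steps):
        p += dt * (ALPHA * p * p * (1.0 - p) - p)
    return p
```

For moderate `t` the two functions should agree up to Monte Carlo error, e.g. when comparing `monte_carlo_density(0.5, 0.8, 20000)` with `euler_density(0.5, 0.8)`. The cost of one sample grows with the expected size of the tree, which is finite by the finite-mean property of the branching process used in the proof.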
3.4 Continuity in the initial state
In this subsection, we prove Propositions 2 and 3.
Proof of Proposition 2 It follows from Theorem 6 and Lemma 26 that
[TABLE]
where is the filtration defined in (1.48).
Proposition 3 follows from the following two lemmas.
Lemma 31 (Continuity of )
Under the condition (1.19), the operator in (1.1) is continuous w.r.t. the topology of weak convergence.
**Proof** If converge weakly to a limit , then by Skorohod’s representation theorem there exist random variables with laws that converge a.s. to a limit with law . Let $\big((X^{n}_{i})_{n\in\mathbb{N}\cup\{\infty\}}\big)_{i\geq 1}$ be i.i.d. copies of such a sequence and let be an independent random variable with law . Then by (1.19),
[TABLE]
and hence converges weakly to by (1.1).
Lemma 32 (Continuity in the initial state)
Assume that the operator in (1.1) is continuous w.r.t. the topology of weak convergence. Then the same is true for the operators defined in (1.6).
**Proof** We need to show that solutions of the mean-field equation (1.2) are continuous in their initial state, in the sense that if are started in initial states such that , then for all .
To see this, inductively define as in (3.22) with replaced by . Using the continuity of , by induction, we see that as for all and . By (3.29), for each bounded continuous , the quantity converges to uniformly in , which allows us to conclude that as for all .
4 Approximation by finite systems
4.1 Main line of the proof
In this section, we prove Theorem 5. The basic idea, which already goes back to [DFL86], is that in the mean-field limit, the genealogy of a site converges to a branching process, and sites are independent in the limit. More precisely, consider sites, sampled uniformly at random from . To find out what their states are at time , we follow the sites back until the last time when a random map is applied that has the potential to change the state of one of our sites. At this point, we stop following that given site but replace it by the sites that are relevant for the outcome of the map at the given site, and we continue in this way. When is large, the new sites that are added in each step are with high probability sites we have not been following before, so that in the limit we obtain a branching process with random maps attached to its branch points. Making this idea precise yields the following proposition, that will be proved in Subsection 4.2 below.
Proposition 33 (State at sampled sites)
For each let be a process as in Theorem 5 started in a deterministic initial state . Fix and let be defined as in (1.6) but with the mean-field equation (1.2) replaced by (1.22). Fix and let be i.i.d. uniformly distributed on and independent of . Then
[TABLE]
where denotes the total variation norm, and the convergence in (4.1) is uniform w.r.t. the initial state .
Proposition 33 allows us to control the mean and variance of , which is enough to prove the convergence of to for fixed times . To boost this up to pathwise convergence, we use the following lemma, that will be proved in Subsection 4.3 below.
Lemma 34 (Tightness in total variation)
For each let be a process as in Theorem 5 started in a deterministic initial state , and let $\mu^{N}_{t}:=\mu\big\{X^{(N)}(t)\big\}$ denote the empirical measure of . Then there exist random processes such that is a.s. nondecreasing with and
- (i) $\mathbb{P}\big[\sup_{0\leq t\leq T}|\tau^{N}_{t}-t|\geq\varepsilon\big]\underset{N\to\infty}{\longrightarrow}0\quad(\varepsilon>0,\ T<\infty)$,
- (ii) a.s.,
where denotes the total variation norm and .
In Subsection 4.4, we will derive Theorem 5 from Proposition 33, Lemma 34, and some abstract considerations.
4.2 The state at sampled sites
In this subsection we prove Proposition 33. We start with two preparatory lemmas.
Let be the stochastic flow defined in (1.29), where we have made the dependence on explicit. Let be uniformly distributed on and independent of . For each , let be defined as (note the factor rescaling the speed of time):
[TABLE]
Let be the random map defined in (1.47), where and from Subsection 1.1 are defined in terms of the “ingredients” , and from Subsection 1.3, see the proof of Lemma 4 in Subsection 3.1. Let be i.i.d. uniformly distributed on and independent of . For each , let be defined as
[TABLE]
The following lemma says that for large , the map in (4.2) can be approximated by the map in (4.3).
Lemma 35 (Coupling of maps)
For each , it is possible to couple the random maps and with in such a way that
[TABLE]
**Proof** The essence of the proof can be summarized as follows: since for large , sampling with or without replacement from is almost the same, the genealogy of a given site is approximately given by a branching process. In spite of this simple idea, the proof is quite long, mainly because we have to take care of a lot of definitions, such as the way and are defined in terms of , and in the proof of Lemma 4.
We start by recalling that the random map from (1.47) can be seen as the concatenation of random maps assigned to the branch points of a branching process. We then embed this branching process in the set and prove that what we obtain is a good approximation for the genealogy of a given site.
We observe that in order to construct the map from (1.47), it suffices to know
[TABLE]
where is defined in (1.44). Indeed, from the information in (4.5) we can determine , since
[TABLE]
and the map is obtained by concatenating the maps with according to the tree structure of .
The object in (4.5) is in fact a Markov chain as a function of . Starting from the initial state and , its evolution is as follows: Independently for each , with rate , we add to and assign to it a value chosen according to the probability law .
We will be interested in the process in (4.5) in the special case when , and are defined in terms of , , and as in the proof of Lemma 4. In this case, elements of are pairs where and , so we denote the process in (4.5) as
[TABLE]
where and . The set is now given by
[TABLE]
Defining as in (3.3), the process in (4.5) now evolves in such a way that independently for each , with rate , we add to and assign values to it that are chosen according to the probability law .
Let be fixed. Our next aim is to “embed” the process from (4.7) in the set , in such a way that it approximates the genealogy of the site . To this aim, we define, for each time, a random function . Initially, we set . We let the function evolve in a Markovian way together with the process in (4.7) in the following way. Recall that when we add an element to and assign values to it, this element is at the same time removed from and replaced by new elements . We assign labels to these new elements as follows. First, we choose in such a way that and
[TABLE]
and next, we set , where as in (3.4), we order the elements of as
[TABLE]
Note that this has the effect that if is an element of , say , then the corresponding element gets the same label as , i.e., . Otherwise, we assign new i.i.d. labels to all new elements of .
Using the function that embeds the process in (4.7) in the set , we define a function by
[TABLE]
We now consider the maps
[TABLE]
where is the label initially assigned to the root. We claim that
[TABLE]
where denotes the total variation norm. In particular, if is chosen uniformly distributed in and independent of everything else, then are i.i.d. uniformly distributed in and independent of the map , so (4.13) implies (4.4).
To prove (4.13), we construct a process similar to the process in (4.7), together with an embedding in , that describes the true genealogy of the site , and show that the error we make by replacing this true genealogy by the process we had before is small. We denote this process as
[TABLE]
At each time, is defined in terms of this process in the same way as is defined in (4.8). We also define and as before, i.e., is the concatenation of the random maps with according to the tree structure of , and is defined in terms of as in (4.11).
Recall that . As for our previous process we start with , , and . In Subsection 1.3, the stochastic flow is constructed from a Poisson point set . We will construct the process in (4.14) in terms of in such a way that
[TABLE]
which expresses the fact that the process in (4.14) describes the “true genealogy” of the site .
The Poisson set consists of triples which express the fact that at time the random map should be applied to the coordinates . Note that we are interested in , which means that we look at negative times and need to rescale time by a factor . For each and such that for some , we update the process in (4.14) as follows:
- (i) We remove from and add it to .
- (ii) We set and .
- (iii) We add to .
- (iv) We define , where as in (3.4).
It is straightforward to check that these rules guarantee that (4.15) holds and hence the process in (4.14) describes the true genealogy of the site . As some more explanation, we can add the following: we follow a site back in time till the first time when a map is applied that has the possibility to change the value of . From that moment on, we follow back all sites that are relevant for the outcome of the map at , and we number them according to the convention in (3.4). This defines a family structure, i.e., is the -th child of the -th child of the -th child of the original site . The map applied to tells us where this ancestor lives in the set . There may be some overlap, i.e., it is possible that for some . For , however, the probability that two ancestors live at the same site in tends to zero as , as we will see in a moment.
In view of (4.15), to prove (4.13), it suffices to prove that the Markov process in (4.14) is close in total variation distance to the process with and replaced by and . Since the latter process is nonexplosive by (3.28), it suffices to prove convergence for the processes stopped at the first time when the cardinality of resp. exceeds a certain value, and then at the end send this value to infinity. We will prove convergence of the stopped processes in a number of steps, by making small changes in the jump rates. Here we use the fact that if the transition kernels of two continuous-time Markov chains are close in total variation norm, uniformly in the starting point, then by standard arguments the two processes can be coupled so that their laws at fixed time are close in total variation norm.
Let denote the image of under the map . As a first step, we change the dynamics of the (stopped) process from (4.14) in such a way that elements have no effect if intersects in more than one point. Then the modified process is still Markovian; we claim the change in jump rates compared to the original process is of order . Indeed, for fixed , if are chosen uniformly without replacement from , then the probability that one, resp. two or more of them lie in a set of fixed cardinality is of order resp. as . Taking into account the fact that we rescale time by a factor , as well as the summability condition (1.23) (i), this translates into a change in jump rates of order for the modified process, stopped at the first time when the cardinality of exceeds a fixed value.
Recall that by (4.15), is a function only of . The modified process we have just constructed has the property that is a bijection, i.e., each element corresponds only to a single place in the family tree. The dynamics of the modified process can be described as follows:
- (i) Independently for each , with rates described by the measure from (3.3), we choose a pair with .
- (ii) If , we do nothing.
- (iii) Otherwise, we choose such that and are drawn from without replacement.
- (iv) If some of the are elements of , we do nothing.
- (v) Otherwise, we remove from and add to , where with .
- (vi) If is the place of in the family tree immediately prior to time , then we assign to each new element of a place in the family tree by setting .
Note that the measure from (3.3) occurs naturally here, since each -tuple of sites in can contain a given site in different ways, as its 1st, 2nd,…, -th member.
Removing the restrictions in points (ii) and (iv) above, and performing sampling with replacement instead of sampling without replacement in point (iii), we only make changes in the transition rates of order , and arrive at a process whose family tree evolves as the process in (4.7) and in which new members of the family tree are assigned sites in chosen uniformly with replacement, as described by the process .
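The claim that sampling with and without replacement are nearly indistinguishable for large systems rests on an elementary collision estimate, which can be checked exactly. In the sketch below, `collision_probability` is a hypothetical helper, `k` plays the role of the (fixed) number of sites drawn and `n` the role of the number of sites in the system.

```python
from fractions import Fraction


def collision_probability(k, n):
    """Probability that k i.i.d. uniform draws from n sites are not all distinct."""
    no_collision = Fraction(1)
    for i in range(k):
        # The (i+1)-th draw must avoid the i sites already chosen.
        no_collision *= Fraction(n - i, n)
    return 1 - no_collision


# Union bound: at most k(k-1)/2 pairs, each colliding with probability 1/n.
bound = Fraction(5 * 4 // 2, 1000)
prob = collision_probability(5, 1000)
```

For fixed `k` the collision probability is of order $1/n$, so on the event of no collision a sample drawn with replacement has the same law as one drawn without replacement.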
In the proof of Lemma 35, we have seen that in the mean-field limit , the genealogy of a single site can be approximated by a branching process with random maps attached to its branch points. Similarly, the genealogy of randomly chosen sites can be approximated by independent branching processes, which leads to the following extension of Lemma 35.
Lemma 36 (The genealogy of multiple sites)
Let be the stochastic flow defined in (1.29) and let be i.i.d. uniformly distributed on , independent of . Let and be defined in terms of , and as in the proof of Lemma 4. Fix and let be i.i.d. copies of the random set and map defined in (1.44) and (1.47). Conditional on , let be i.i.d. uniformly distributed on . Define and by
[TABLE]
Then and can be coupled such that $\mathbb{P}\big[\tilde{M}^{N}_{t}\neq M^{N}_{t}\big]\underset{N\to\infty}{\longrightarrow}0$.
**Proof** The proof is the same as the proof of Lemma 35, except that instead of following back the genealogy of one site, one follows the genealogies of sites. By the same arguments as given in the proof of Lemma 35, when is large, with high probability, the genealogies do not intersect, and hence can be approximated by independent branching processes. Although writing down all objects involved is notationally complicated, no new ideas are needed so we omit the details.
Proof of Proposition 33 Let be the (deterministic) initial state and using notation as in (1.33) let denote its empirical measure. Define maps and as in Lemma 36. Then $\big(X^{(N)}_{I_{1}}(Nt),\ldots,X^{(N)}_{I_{n}}(Nt)\big)$ has law while the coordinates of are i.i.d. with a law that by Theorem 6 equals . In view of this, the claim follows from Lemma 36.
4.3 Tightness in total variation
In this subsection we prove Lemma 34.
Proof of Lemma 34 The process is defined in (1.32) in terms of a stochastic flow which is in turn defined in terms of a Poisson set . Elements of are triples which tell us that at time the map should be applied to the coordinates . We let
[TABLE]
where , which is finite by (1.23). Then (i) follows from a functional law of large numbers. Since for any , the fraction of sites in that changes its type is bounded from above by , in view of (3.9), we obtain also (ii).
4.4 Convergence to the mean-field equation
In this subsection, we prove Theorem 5. The proof is split into a number of lemmas. We start by proving convergence at fixed times. This part of the proof is based on Proposition 33. At the end of the proof, we use Lemma 34 to obtain pathwise convergence.
Lemma 37 (Expectation of test functions)
Let , and be as in Subsection 1.3, and assume (1.23). Let denote the semigroup defined as in (1.6) but with the mean-field equation (1.2) replaced by (1.22). For each , let be Markov processes with state space as defined in (1.32), and let denote their associated empirical measures. Then
[TABLE]
where the supremum runs over all measurable functions .
**Proof** Fix . Let be measurable. Let and be uniformly distributed on and independent of each other and of . Since
[TABLE]
we see that
[TABLE]
Assume for the moment that is deterministic. Then applying Proposition 33 with we find that
[TABLE]
where we take the supremum over all measurable . It follows that
[TABLE]
and hence (4.18) follows by Chebyshev’s inequality. To obtain (4.18) more generally when is random, we condition on the initial state to get, for each and measurable .
[TABLE]
Since the integrand on the right-hand side does not depend on and tends to zero in a bounded pointwise way as a function of , (4.18) follows.
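The second-moment argument in this proof can be sketched as follows for a deterministic initial state. The symbols $Z_N$, $m$, $\nu$, $\delta_N$ and $C$ are introduced here only for illustration: $Z_N$ is the empirical measure at the time under consideration integrated against a test function $\phi$ with $|\phi|\le 1$, $\nu$ is the corresponding solution of the mean-field equation, and $\delta_N$ is the total variation distance between the law of the states at two independently sampled sites $I_1,I_2$ and $\nu\otimes\nu$, which tends to zero by Proposition 33 with $n=2$.

```latex
\begin{align*}
Z_N &:= \langle\mu^N_t,\phi\rangle,\qquad m:=\langle\nu,\phi\rangle,\\
\mathbb{E}[Z_N] &= \mathbb{E}\big[\phi(X_{I_1})\big],\qquad
\mathbb{E}[Z_N^2] = \mathbb{E}\big[\phi(X_{I_1})\phi(X_{I_2})\big],\\
\mathbb{E}\big[(Z_N-m)^2\big]
 &= \big(\mathbb{E}[\phi(X_{I_1})\phi(X_{I_2})]-m^2\big)
    -2m\big(\mathbb{E}[\phi(X_{I_1})]-m\big)
 \;\le\; C\,\delta_N,\\
\mathbb{P}\big[|Z_N-m|\ge\varepsilon\big]
 &\le \varepsilon^{-2}\,\mathbb{E}\big[(Z_N-m)^2\big]
 \;\le\; C\,\delta_N\,\varepsilon^{-2}
 \underset{N\to\infty}{\longrightarrow} 0 .
\end{align*}
```

Since $\delta_N$ does not depend on $\phi$, the bound is uniform over all measurable test functions with $|\phi|\le 1$.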
Our next aim is to prove that if in addition to the assumptions of Lemma 37, condition (i) or (ii) of Theorem 5 is satisfied, then
[TABLE]
where is any metric on that generates the topology of weak convergence. Applying the following well-known fact to the Polish space , we see that if (4.24) holds for one such metric, then it holds for all of them.
Lemma 38 (Convergence in probability)
Let be random variables taking values in a Polish space , let be deterministic, and let be a metric generating the topology on . Then one has
[TABLE]
if and only if
[TABLE]
where denotes weak convergence of probability measures on .
**Proof** It is easy to see that (4.25) implies for all bounded continuous , so (4.25) implies (4.26). Conversely, if (4.26) holds, then by Skorohod’s representation theorem it is possible to couple the random variables such that a.s., which implies (4.25).
The following lemma gives sufficient conditions for the type of convergence of (4.24).
Lemma 39 (Convergence to a deterministic measure)
Let be a Polish space, let be deterministic, and let be random variables with values in . Let be a metric on generating the topology of weak convergence. Then the following conditions are equivalent.
- (i) $\mathbb{P}\big[d(\mu^{N},\mu)\geq\varepsilon\big]\underset{N\to\infty}{\longrightarrow}0$ for all $\varepsilon>0$.
- (ii) $\mathbb{P}\big[\big|\langle\mu^{N},\phi\rangle-\langle\mu,\phi\rangle\big|\geq\varepsilon\big]\underset{N\to\infty}{\longrightarrow}0$ for all $\varepsilon>0$ and bounded continuous $\phi$.
- (iii) $\mathbb{E}\big[\prod_{i=1}^{n}\langle\mu^{N},\phi_{i}\rangle\big]\underset{N\to\infty}{\longrightarrow}\prod_{i=1}^{n}\langle\mu,\phi_{i}\rangle$ for all bounded continuous functions $\phi_{1},\ldots,\phi_{n}$.
**Proof** We equip with the topology of weak convergence, making it into a Polish space. Then by Lemma 38, condition (i) is equivalent to
- (i)’ .

We will prove (i)’ $\Rightarrow$ (ii) $\Rightarrow$ (iii) $\Rightarrow$ (i)’.
(i)’ $\Rightarrow$ (ii). By Skorohod’s representation theorem, (i)’ implies that the can be coupled such that a.s., which implies (ii).
(ii) $\Rightarrow$ (iii). Without loss of generality we may assume that the ’s take values in . Since the function is continuous, (ii) implies that
[TABLE]
for all and bounded continuous functions . Since moreover , this implies (iii).
(iii) $\Rightarrow$ (i)’. Since is Polish, it has a metrizable compactification, i.e., there exists a compact metrizable space such that is a dense subset of and the topology on is the induced topology from [Cho69, Theorem 6.3]. It is known that this implies that is a -subset of [Bou58, §6 No. 1, Theorem 1]. In particular, is a Borel measurable subset of and we can identify with the space of probability measures on that are concentrated on . If we equip with the topology of weak convergence, then the induced topology on is also the topology of weak convergence (this follows, e.g., from [EK86, Thm 3.3.1]), and in fact (being compact by Prohorov’s theorem) is a metrizable compactification of .
We view and as probability measures on . Since is compact, so are and , so by going to a subsequence if necessary, we can assume that the laws converge weakly to some limit . Since the restriction to of a continuous function is a bounded continuous function on , condition (iii) implies that
[TABLE]
for general and continuous functions . By the Stone-Weierstrass theorem, the linear span of functions of the form is dense in the space of continuous functions on , and hence (4.28) implies .
We now prove (4.24) under either of the conditions (i) and (ii) of Theorem 5.
Lemma 40 (Continuity argument)
In addition to the assumptions of Lemma 37, assume that condition (i) of Theorem 5 is satisfied. Then (4.24) holds.
**Proof** Fix . In view of Lemma 39 (ii), it suffices to show that
[TABLE]
for any bounded continuous . By Lemma 37, it suffices to show that
[TABLE]
By the second part of condition (i), Lemma 4, and Proposition 3, the operator is continuous w.r.t. weak convergence. In view of this, (4.30) is implied by the first part of condition (i).
Lemma 41 (Moment argument)
In addition to the assumptions of Lemma 37, assume that condition (ii) of Theorem 5 is satisfied. Then (4.24) holds.
**Proof** Fix . In view of Lemma 39 (iii), it suffices to show that
[TABLE]
for all and bounded continuous functions , . Without loss of generality we may assume that the ’s take values in . Let be as in Theorem 5 and let be i.i.d. uniformly distributed on and independent of . Then
[TABLE]
By Proposition 33 applied to the process conditioned on , there exist such that
[TABLE]
In view of (3.8), it follows that
[TABLE]
Combining this with (4.32), taking the expectation, we obtain that
[TABLE]
In view of this, to prove (4.31), it suffices to show that
[TABLE]
If is deterministic, then Theorem 6 tells us that
[TABLE]
where and are as in (1.44) and (1.47) and are i.i.d. with law . Conditional on , let be i.i.d. with common law . Let be i.i.d. and distributed as the random variables in (1.44) and (1.47), independent of and . Then (4.37) implies that
[TABLE]
If we replace the expectation on the right-hand side by a conditional expectation given , then this is the integral of a measurable -valued function with respect to the expectation of a product measure of the form , where . Condition (ii) of Theorem 5 allows us to replace the integral w.r.t. by the integral w.r.t. at the cost of a small error. Thus,
[TABLE]
where the are i.i.d. with common law and independent of , and is a random error term that by condition (ii) can be estimated as
[TABLE]
where for each . Note that moreover since the ’s take values in . Integrating over the randomness of , using bounded convergence, (4.37) and (4.38), (4.36) follows.
With Lemmas 40 and 41 proved, most of the work needed for proving Theorem 5 is done. The only remaining task is to improve the convergence at fixed times in (4.24) to pathwise convergence as in (1.34). Our first aim is to show that the condition (1.34) does not depend on the choice of the metric . This follows from the following lemma, applied to the Polish space .
Lemma 42 (Convergence in path space)
Let be a Polish space and let be a metric generating the topology on . Let be the space of cadlag functions , equipped with the Skorohod topology. Let be random variables with values in and let be a continuous function. Then one has
[TABLE]
if and only if
[TABLE]
where denotes weak convergence of probability measures on .
**Proof** It is well-known that is a Polish space [EK86, Sect. 3.5]. Let be the metric generating the topology on defined in [EK86, (5.2) of Chapter 3]. Then it is easy to see that for all there exist and such that
[TABLE]
In view of this, (4.41) implies
[TABLE]
which by Lemma 38 implies (4.42). Conversely, if (4.42) holds, then by Skorohod’s representation theorem it is possible to couple the random variables such that a.s. By the continuity of and [EK86, Lemma 3.10.1], this implies that
[TABLE]
which implies (4.41).
Before the proof of Theorem 5 we need one more lemma.
Lemma 43 (Weak convergence and convergence in total variation norm)
Let be a Polish space. Then there exists a metric on such that generates the topology of weak convergence and , where denotes the total variation norm.
**Proof** Let be a metric generating the topology on . Replacing by if necessary, we can assume without loss of generality that . Let be the space of all functions such that , i.e., these are Lipschitz continuous functions with Lipschitz constant . Then
[TABLE]
is the 1-Wasserstein metric on , which is known to generate the topology of weak convergence. Let . Since , each function can be written as with and . In view of this and (3.8),
[TABLE]
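The inequality behind Lemma 43 can be checked numerically in a simple special case. The sketch below is an illustration under assumptions that are not taken from the text: the space is a finite set of points in the unit interval, the metric is the Euclidean distance (which is bounded by 1 there), and the total variation distance is normalized as half the L1 distance. With a larger normalization of the total variation norm the inequality holds a fortiori. The names `wasserstein_1d`, `total_variation` and `random_law` are hypothetical helpers.

```python
import random

def wasserstein_1d(points, p, q):
    """1-Wasserstein distance between laws p and q on sorted points of the real line."""
    total, cdf_gap = 0.0, 0.0
    for k in range(len(points) - 1):
        cdf_gap += p[k] - q[k]                      # difference of the two distribution functions
        total += abs(cdf_gap) * (points[k + 1] - points[k])
    return total

def total_variation(p, q):
    """Total variation distance, normalized as half the L1 distance (an assumption)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def random_law(n, rng):
    """A random probability vector of length n."""
    weights = [rng.random() for _ in range(n)]
    norm = sum(weights)
    return [w / norm for w in weights]

rng = random.Random(0)
points = sorted(rng.random() for _ in range(20))    # points in [0, 1], so the metric is at most 1
pairs = [(random_law(20, rng), random_law(20, rng)) for _ in range(200)]
gaps = [total_variation(p, q) - wasserstein_1d(points, p, q) for p, q in pairs]
```

Every entry of `gaps` should be nonnegative, reflecting that a Wasserstein metric built from a metric bounded by 1 is dominated by the total variation distance.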
Proof of Theorem 5 Lemmas 40 and 41 show that either of the conditions (i) and (ii) implies (4.24). We will use Lemma 34 to improve (4.24) to pathwise convergence as in (1.34). By Lemma 42 it suffices to prove (1.34) for one particular metric on that generates the topology of weak convergence. We will choose a metric as in Lemma 43.
Let denote the solution to the mean-field equation (1.22) with initial state . Lemma 34 implies that
[TABLE]
Taking the limit , using the fact that and (4.24), it follows that
[TABLE]
Since for any ,
[TABLE]
using Lemma 34, (4.24), and (4.49), we see that for each and ,
[TABLE]
Combining this with the fact that by (4.24), for any ,
[TABLE]
we find that
[TABLE]
Since and are arbitrary, this implies (1.34).
5 Recursive Tree Processes
In this section, we prove our main results about RTPs with continuous time. For completeness, we also prove Lemma 8 which deals with discrete time and says that each solution to the RDE (1.54) gives rise to an RTP. This is done in Subsection 5.1.
Our basic results about continuous-time RTPs are Lemma 7 and Proposition 9. Lemma 7 describes the evolution of the law of the process
[TABLE]
that is constructed by assigning independent values to elements and then calculating backwards. Proposition 9 says that adding exponential lifetimes to the elements of an RTP yields a stationary version of the process in (5.1). These results are proved in Subsection 5.2.
In Subsection 5.3, we prove continuous-time analogues of known discrete-time results related to endogeny. Following [AB05], Theorem 11 links the -variate mean-field equation to endogeny, while Propositions 13 and 15 are concerned with the higher-level mean-field equation, and closely follow ideas from [MSS18].
5.1 Construction of RTPs
Proof of Lemma 8 For each finite subtree that contains the root, we can construct random variables and such that the are independent with common law , the are i.i.d. with common law and independent of the , and the are inductively defined by
[TABLE]
The joint law of and is a probability law on . Since and are Polish spaces, we can apply Kolmogorov’s extension theorem. The statement of the lemma then follows provided we can show that the laws are consistent in the sense that if is another subtree that contains the root, then the projection of on equals . It suffices to prove this when and differ by one element only, say where . It follows from (5.2) and the fact that solves the RDE (1.54) that has law and is independent of , and from this we see that the projection of is indeed .
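The consistency property used in the proof of Lemma 8 can be illustrated by a small simulation. The sketch below rests on assumptions that are not part of the text: the state space is {0, 1}, every node has two children, the random map is XOR with probability `q` and the first-coordinate projection otherwise, and the law assigned to the leaves is the uniform law, which is invariant under both maps. All names are hypothetical.

```python
import random

def sample_root(depth, rng, q=0.5):
    """Root value of a binary tree of the given depth, computed backwards from the leaves."""
    if depth == 0:
        return rng.randint(0, 1)              # leaf: i.i.d. draw from the assumed fixed law
    x1 = sample_root(depth - 1, rng, q)
    x2 = sample_root(depth - 1, rng, q)
    use_xor = rng.random() < q                # hypothetical mark attached to this node
    return x1 ^ x2 if use_xor else x1         # hypothetical random map applied to the children

rng = random.Random(1)
n = 20000
root_means = {d: sum(sample_root(d, rng) for _ in range(n)) / n for d in (0, 1, 3, 5)}
```

Because the leaf law solves the corresponding distributional fixed-point equation in this toy setting, the empirical mean at the root should stay close to 1/2 for every depth, which is the finite-tree version of the consistency of the laws.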
It will be useful in what follows to have a somewhat stronger version of Lemma 8 that applies also to certain random subtrees . Let denote the set of all finite subtrees such that either or . Let us define a stopping tree to be a random variable with values in such that
[TABLE]
In the special case that and , a stopping tree is just a stopping time w.r.t. the filtration generated by .
Lemma 44 (RTPs and stopping trees)
Let be an RTP corresponding to a map and a solution to the RDE (1.54), and let be a stopping tree. Then conditional on , the random variables are i.i.d. with common law and independent of .
**Proof** For each fixed , by Lemma 8, conditional on , the random variables are i.i.d. with common law . By (5.3), it follows that conditional on the event and , the random variables are i.i.d. with common law . Since this holds for all , and since a.s., the claim follows.
5.2 Continuous-time RTPs
In this subsection, we prove Lemma 7 and Proposition 9. We work in our usual set-up as described above Proposition 29. We start with a preparatory lemma that says that if we condition on the $\sigma$-field defined in (1.48), then the subtrees of rooted at are i.i.d. with the same distribution as . To formulate this properly, we need some notation.
We call the object
[TABLE]
a marked branching tree. For each , let describe the subtree of that is rooted at , i.e.,
[TABLE]
We set , so that is the random element of that “belongs” to . Fix . For each , let describe the lifetime of an individual after time , i.e.,
[TABLE]
where is the age of the individual at time .
Lemma 45 (Memoryless property)
For each , conditional on the $\sigma$-field , the marked branching trees
[TABLE]
are i.i.d. with the same distribution as the marked branching tree in (5.4).
**Proof** Let be as defined above (5.3). Then, for each , the event is measurable w.r.t. the $\sigma$-field generated by the random variables
[TABLE]
Note that here is measurable w.r.t. the $\sigma$-field generated by while for each , the random variable is measurable w.r.t. the $\sigma$-field generated by .
Conditional on and the random variables in (5.8), the random variables are still i.i.d. with their original law and independent of . The latter are also still independent of each other and the still have their original law, but the laws of are changed since conditioning on entails conditioning on for each .
Since this holds for each , we see that if we condition on as in (1.48), then under the conditional law the random variables and with are still independent, and all of these random variables still have their original law, except the with , whose laws are conditioned on the events . From this observation, using the memoryless property of the exponential distribution, the claim of the lemma follows.
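The last step of this proof relies on the memoryless property of the exponential distribution. As a minimal illustration, the following sketch samples exponential lifetimes, conditions on survival past a time `s`, and compares the mean of the residual lifetime with the unconditional mean. The rate 1 and the value of `s` are arbitrary choices made for the illustration and are not taken from the text.

```python
import random

rng = random.Random(2)
s = 0.7                                             # hypothetical conditioning time
lifetimes = [rng.expovariate(1.0) for _ in range(200000)]
residual = [x - s for x in lifetimes if x > s]      # remaining lifetime of the survivors

mean_all = sum(lifetimes) / len(lifetimes)
mean_residual = sum(residual) / len(residual)
```

Both means should be close to 1: conditional on having survived up to time `s`, the remaining lifetime is again exponentially distributed with the original rate.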
For each and , within the marked branching tree $\big(\mathbb{S}^{\mathbf{i}},(\omega^{\mathbf{i}}_{\mathbf{j}},\sigma^{\mathbf{i},s}_{\mathbf{j}})_{\mathbf{j}\in\mathbb{S}^{\mathbf{i}}}\big)$ rooted at , we define the birth and death times and as in (1.41), with replaced by , and we use this to define and as in (1.44). Finally, we define as in (1.46) and (1.47).
Proof of Lemma 7 We fix a marked branching tree as in (5.4) and times . Conditional on , we assign i.i.d. with common law to the leaves of and define inductively as in (1.50).
We observe that is given by the disjoint union
[TABLE]
Conditioning on is the same as first conditioning on
[TABLE]
and then conditioning on
[TABLE]
which by Lemma 45 are conditionally independent given the random variable in (5.10). Set
[TABLE]
Then
[TABLE]
In view of this, by Theorem 6, conditional on the random variable in (5.10), i.e., conditional on , the random variables are i.i.d. with common law , where denotes the solution of the mean-field equation (1.2) with initial state .
Proof of Proposition 9 Since and are independent, the conditional law of given is the same as the unconditional law. We claim that under the conditional law given , the random finite subtree is a stopping tree in the sense of (5.3). Indeed, if and only if for each and (resp. , depending on how is chosen), one has if and only if
[TABLE]
Here the event in (i) is clearly measurable w.r.t. while under the conditional law given , (ii) is just a deterministic condition. We can therefore apply Lemma 44 to conclude that conditional on , , and , the random variables are i.i.d. with common law .
We observe that
[TABLE]
is a function of , and . Therefore, if we condition on , the random variables are i.i.d. with common law . This proves (1.59) (i). Condition (1.59) (ii) is also clearly fulfilled by the definition of an RTP.
5.3 Endogeny, bivariate uniqueness, and the higher-level equation
In this subsection, we prove Theorem 11 and Propositions 13 and 15.
Recall that an RTP is endogenous if is measurable with respect to the $\sigma$-field generated by the random variables . In general, if is a random variable taking values in a Polish space and is a sub-$\sigma$-field, then it is not hard to see that is a.s. equal to a -measurable function if and only if the conditional law is a.s. a delta-measure. In view of this, the following lemma implies that an RTP is endogenous if and only if is a.s. measurable w.r.t. the $\sigma$-field generated by the random variables and .
Lemma 46 (Relevant randomness)
Let be an RTP corresponding to a solution of the RDE (1.54). Let be the $\sigma$-field generated by the random variables and let be the $\sigma$-field generated by the random variables and . Then
[TABLE]
**Proof** Since is generated by and the random variables , formula (5.16) says that conditional on , the random variables are independent of . Let be deterministic finite rooted subtrees of that increase to . Let be the $\sigma$-field generated by and let be the $\sigma$-field generated by and . Conditional on , the state at the root is a deterministic function of . Therefore, by point (ii) in the definition of an RTP in Lemma 8, is conditionally independent of given , or equivalently,
[TABLE]
for each measurable . Letting , using martingale convergence, we arrive at (5.16).
The following lemma prepares for the proof of Theorem 11.
Lemma 47 (Successful coupling)
Let be an endogenous RTP corresponding to a solution of the RDE (1.54) and let be an independent i.i.d. collection of exponential random variables with mean . Furthermore, let be an i.i.d. collection of -valued random variables with common law , independent of . For each , define random variables by
[TABLE]
Then
[TABLE]
**Proof** The following argument is a continuous-time version of the proofs of [AB05, Thm 11 (c)] and [MSS18, Lemma 6]. Let be the filtration defined in (1.48). We add a final element to the filtration, which is the $\sigma$-algebra generated by the random tree and the random variables . Let be bounded and measurable functions. Since and are conditionally independent and identically distributed given , we have
[TABLE]
where we used martingale convergence and, in the last equality, also endogeny and Lemma 46. Since (5.20) holds in particular for any bounded continuous and , we conclude that the law of converges weakly to the law of , which implies (5.19).
Proof of Theorem 11 If (ii) holds, then is the only fixed point in of the bivariate mean-field equation. Since a measure is a fixed point of the bivariate mean-field equation if and only if it is a fixed point of the map , by Theorem 10, it follows that the RTP corresponding to is endogenous.
Assume, conversely, that the RTP corresponding to is endogenous. Let be a collection of i.i.d. -valued random variables with common law , independent of the RTP and the exponential lifetimes . For each and , define random variables by
[TABLE]
Then, by Theorem 6 applied to the -variate map , we see that has law . By endogeny we get from Lemma 47 that
[TABLE]
This completes the proof since the right-hand side of (5.22) has law as defined in (1.62).
Proof of Proposition 13 The fact that solves the higher-level mean-field equation (1.71) means that
[TABLE]
for any bounded measurable . In particular, we can apply this to functions of the form
[TABLE]
where is bounded and measurable. Then
[TABLE]
where denotes the -th moment measure of . By [MSS18, Lemma 2],
[TABLE]
Inserting this into (5.23), we see that solves the -variate mean-field equation.
The following lemma prepares for the proof of Proposition 15.
Lemma 48 (Conditional law of the root)
Let be an RTP corresponding to a solution of the RDE (1.54), let be an independent i.i.d. collection of exponentially distributed random variables with mean , and let be the filtration defined in (1.48). Then the measures
[TABLE]
solve the higher-level mean-field equation (1.71) with initial state .
**Proof** Conditional on , the map is a deterministic map, and are i.i.d. with common law . Therefore, applying [MSS18, Lemma 8] to the case that the $\sigma$-fields there are all trivial and the probability measure there is replaced by the conditional law given , we see that
[TABLE]
Now by Theorem 6,
[TABLE]
solves the higher-level mean-field equation (1.71) with initial state .
Proof of Proposition 15 Let be solutions to the higher-level mean-field equation (1.71) such that . Define as in (3.22), with replaced by the higher-level map from (1.73). It has been shown in [MSS18, Prop 3] that is monotone w.r.t. the convex order, so by induction we obtain from (3.22) that for all and . Letting , using (3.30), we see that for all .
Let be a solution of the RDE (1.54). It has been shown in [MSS18, Prop. 3] that solves the higher-level RDE (1.73) and there exists a (necessarily unique) solution of (1.73) such that (1.76) holds. It has moreover been shown in [MSS18, Prop. 4] that is given by (1.79). In view of this, to complete the proof, it suffices to show that the solution to the higher-level mean-field equation (1.71) with initial state converges to the measure in (1.79).
We apply Lemma 48. As in the proof of Lemma 47, we add a final element to the filtration, which is the $\sigma$-algebra generated by the random tree and the random variables . Then, by martingale convergence,
[TABLE]
and hence the measures in (5.27) satisfy
[TABLE]
where denotes weak convergence of probability measures on , which is in turn equipped with the topology of weak convergence of probability measures on . Since the exponentially distributed random variables are independent of the RTP , we have
[TABLE]
where as in Lemma 46 denotes the $\sigma$-field generated by the random variables and and the last equality follows from that lemma. Inserting this into (5.31) we see that converges weakly to as defined in (1.79).
6 Further results
In this section, we prove some additional results about RTPs. In Subsection 6.1, we prove Proposition 19 about the upper and lower solutions of a monotone RDE. In Subsection 6.2, we prove Lemmas 20, 21, and 23, which give conditions for uniqueness of solutions to an RDE. Subsection 6.3 is devoted to the proof of Lemma 24.
6.1 Monotonicity
In this subsection, we prove Proposition 19. We start with a number of simple lemmas.
Lemma 49 (A continuous monotone function)
Let be a compact metrizable space that is equipped with a closed partial order in the sense of (1.90), and let be a metric that generates the topology. Then
[TABLE]
defines a continuous function such that if and only if and moreover is decreasing in and increasing in .
**Proof** Since for any ,
[TABLE]
the function is continuous. Assume that converge to a limit . Since the infimum of a family of continuous functions is upper semi-continuous, we have
[TABLE]
To prove that is actually continuous, assume the converse. Then there exists a sequence such that
[TABLE]
for some . By the definition of , there exist and such that . Since is compact, we can select a subsequence such that (6.4) still holds and the converge to a limit . Since the partial order is closed in the sense of (1.90), we have and , so
[TABLE]
which contradicts (6.4). We conclude that is continuous.
If , then setting shows that . Conversely, if then there exist and such that . Using the compactness of , by going to a subsequence, we can assume that the converge to a limit . Since the partial order is closed in the sense of (1.90), and hence .
If and , then
[TABLE]
since the second infimum is taken over a smaller set, showing that is decreasing in and increasing in .
Lemma 50 (Comparison principle)
Let be a compact metrizable space that is equipped with a partial order that is closed in the sense of (1.90). Let be -valued random variables such that a.s. and . Then a.s.
Proof of Lemma 50 Set with as in Lemma 49. Then, for each , is continuous and monotone increasing, and if and only if . Let
[TABLE]
We will prove the lemma by showing that if are -valued random variables such that , then for some contradicting . For each and , we define an open set by
[TABLE]
Since for each , one has but , we see that
[TABLE]
We now use the inner regularity of measures on Polish spaces w.r.t. compacta, which follows from the regularity and tightness of any probability measure on a Polish space [Par05, Thm. 1.2 and 3.2]. Thus, we can find a compact set such that . Since is compact, it is covered by finitely many sets of the form (6.8), so there must exist a and such that . Since is monotone increasing and it follows that .
Lemma 51 (Compatibility of the stochastic order)
Assume that is equipped with a partial order that is closed in the sense of (1.90). Then the stochastic order on is closed with respect to the topology of weak convergence.
**Proof** We need to show that if for all and the converge weakly as to a limit , then . Since , for each , we can couple with laws such that . Since and converge as , the joint laws of are tight, so by going to a subsequence we may assume that they converge. Then, by Skorohod’s representation theorem, we can couple the random variables for different in such a way that they converge a.s. to a limit . Since the partial order on is closed, we have a.s., proving that .
Lemma 52 (Monotonicity of )
Assume that is equipped with a partial order that is closed and that is monotone for all . Then the operator in (1.1) is monotone w.r.t. the stochastic order.
**Proof** If , then we can couple random variables and with laws such that . Let be i.i.d. copies of . Then
[TABLE]
for all and hence by (1.1).
In practice, Lemma 52 is the usual way to prove monotonicity of a map of the form (1.1). Nevertheless, it is known that there are maps of the form (1.1), in particular, probability kernels, that are monotone yet cannot be represented in terms of monotone maps [FM01, Example 1.1].
Lemma 53 (Monotonicity in the initial state)
Assume that is equipped with a partial order that is closed and that the operator in (1.1) is monotone w.r.t. the stochastic order. Then solutions of the mean-field equation (1.2) started in initial states satisfy .
**Proof** Inductively define as in (3.22) with replaced by . Then for all and . Letting , we see as in the proof of Proposition 29 that as . By Lemma 51, we conclude that .
In the next two lemmas we need to assume compactness of .
Lemma 54 (Increasing limits)
Assume that is a compact metrizable space equipped with a partial order that is closed. Then every increasing sequence in converges to a limit.
**Proof** Let be a sequence in such that for all . By compactness, it suffices to prove that all subsequential limits are the same. Let and be subsequences that converge to limits and , respectively. For all , let . Then for all and letting , using the compatibility condition (1.90), we see that . The same argument gives and hence .
Lemma 55 (Increasing limits in the stochastic order)
Assume that is a compact metrizable space equipped with a partial order that is closed. Then every sequence in that is increasing in the stochastic order converges weakly to a limit.
**Proof** Let be increasing in the stochastic order. Then, for each , we can couple random variables and with laws and such that . Let and let be a time-inhomogeneous Markov chain with initial law and transition kernels . Then a.s. for all and hence the a.s. increase to a limit by Lemma 54. It follows that the converge weakly to the law of .
We now turn to the proof of Proposition 19.
Lemma 56 (Lower and upper solutions)
All conclusions of Proposition 19 except for the statement about endogeny hold when the assumption that is monotone for all is replaced by the weaker condition that is monotone.
**Proof** The proof is similar to the proof of [AB05, Lemma 15], which in turn is based on well-known principles [Lig85, Thm III.2.3]. By symmetry, it suffices to prove the statement for .
Since for each , solves (1.2) with initial state , we conclude from Lemmas 52 and 53 that for each and hence is increasing w.r.t. the stochastic order. By Lemma 55, it follows that for some probability measure on . Since for all , using Lemma 32 and the continuity of , we see that is a fixed point of the mean-field equation (1.2) and hence solves the RDE (1.54).
If is any solution of the RDE (1.54), then for all by Lemma 53 and the fact that is a fixed point of (1.2). Letting , using Lemma 51, we see that .
Lemma 57 (Random maps applied to extremal elements)
Under the assumptions of Proposition 19, if are i.i.d. with common law , then there exist random variables and with laws and that are given by the decreasing, resp. increasing limits
[TABLE]
where the limit does not depend on the choice of the sequence such that . Here denotes the set of all finite subtrees such that either or , and for each , the random map is defined in (1.46).
**Proof** By symmetry, it suffices to prove the statement for . Since is monotone for each , the map is monotone for each . Define
[TABLE]
Then for all and hence if increase to , then the increase to a limit that does not depend on the choice of the sequence .
Let be an independent i.i.d. collection of exponential random variables with mean . Define as in (1.44). Then by Theorem 6, has law while by what we have already proved increases to . Since , it follows that has law .
Proof of Proposition 19 In view of Lemma 56, it only remains to prove the statement about endogeny. Let be an RTP corresponding to and some solution to the RDE (1.54). Then
[TABLE]
with as in (6.12). So letting , using the fact that the partial order is closed, we obtain that . In particular, if , then since also has law , Lemma 50 tells us that a.s. Since the latter is measurable w.r.t. the $\sigma$-field generated by the , this proves the endogeny of the RTP corresponding to and .
6.2 Conditions for uniqueness
In this subsection, we prove Lemmas 20, 21, and 23.
Proof of Lemma 20 If is constant then is a root determining subtree, proving the implication (i)$\Rightarrow$(ii). Conversely, if there a.s. exists a root determining subtree , then, since , there a.s. exists a (random) such that and hence is constant for all . The implication (iii)$\Rightarrow$(ii) is trivial. Conversely, if contains a root determining subtree , then by the finiteness of the latter we can keep removing elements from as long as this is still possible while retaining the property that is root determining.
Proof of Lemma 21 (i)$\Rightarrow$(ii): This is clear, since a finite uniquely determined subtree is root determining.
(ii)$\Rightarrow$(iii): For each , let , defined in (5.5), denote the subtree of that is rooted at . Since is equally distributed with , by (ii), for each , there a.s. exists a root determining subtree . Since implies $x_{\mathbf{i}}=G_{\mathbb{U}^{\mathbf{i}}}\big((x_{\mathbf{i}\mathbf{j}})_{\mathbf{j}\in\nabla\mathbb{U}^{\mathbf{i}}}\big)$ and is constant, it follows that is a.s. uniquely determined.
(ii)$\Rightarrow$(v): Since is constant, we can define
[TABLE]
where the right-hand side does not depend on the choice of . It is straightforward to check that satisfies conditions (i)–(iii) of Lemma 8 and hence is an RTP corresponding to . It follows that solves the RDE (1.54).
Let be an independent i.i.d. collection of -valued random variables with common law , let be an independent i.i.d. collection of exponential random variables with mean , and define $X^{t}_{\varnothing}:=G_{t}\big((Y_{\mathbf{i}})_{\mathbf{i}\in\nabla\mathbb{S}_{t}}\big)$. Then has law by Theorem 6. Since with , we see from (6.14) that as , proving that .
(iii)$\Rightarrow$(iv): We note the following general principle: if are Polish spaces and and are random variables taking values in resp. such that and are equal in law, then we can couple and such that . To see this, let denote the law of , let denote a regular version of the conditional law of given resp. , and define the joint law of as
[TABLE]
i.e., make and conditionally independent given . Applying this general principle, we see that if are solutions to the RDE (1.54), then we can couple the associated RTPs and in such a way that for all . Since is a.s. uniquely determined, it follows that a.s. and hence . The same argument also shows that any solution to the bivariate RDE is concentrated on the diagonal, which by Theorem 10 implies endogeny.
(iii) and finite imply (ii): Since and root determining imply that is root determining, we see that decreases to . Assume that this event has positive probability and condition on it. Choose . Then there exist such that . Since is finite, the sequences and have subsequences that converge pointwise for each to limits . It is easy to see that . Moreover, . This shows that on the event , the tree is not uniquely determined.
(ii) and imply (i): It suffices to show that each root determining subtree of contains a uniquely determined subtree. For any , let denote and its descendants, and let denote the set of all that satisfy
[TABLE]
Define by
[TABLE]
We claim that
[TABLE]
Indeed, this follows from the fact that the sets for different are mutually disjoint, which allows us to choose independently for each .
Note that for and if is root determining. Let be the connected component of $\big\{\mathbf{i}\in\mathbb{U}:|\chi_{\mathbf{i}}|=1\big\}$ that contains . Since has only two elements, for . Using (6.18), it follows that is uniquely determined.
For completeness, we give three examples to show that the implications (ii)$\Rightarrow$(i), (iii)$\Rightarrow$(ii), and (v)$\Rightarrow$(ii) do not hold in general. In all of these examples, for all , which means , where denotes the word of length made from the alphabet . It follows that the operator from (1.1) is just the linear operator associated with the transition kernel of a Markov chain. In all our examples, we take for all , where is a fixed map.
Example 58 ((ii)$\not\Rightarrow$(i))
Let and . Then a.s. contains a root determining subtree but a.s. does not contain a uniquely determined subtree.
**Proof** The subtree is root determining, since for all . On the other hand, if is a finite subtree of that contains the root, then there exist with and , which shows is not uniquely determined.
Example 59 ((iii)$\not\Rightarrow$(ii))
Let and
[TABLE]
Then is a.s. uniquely determined but a.s. contains no root determining subtree.
**Proof** If satisfies for some , then and for all , which leads to a contradiction. It follows that contains a single element, which is given by for all . In particular, is uniquely determined. On the other hand, for each finite subtree that contains the root, the function is of the form and , which is clearly not constant.
Example 60 ((v)$\not\Rightarrow$(ii))
Let and with . Then the RDE (1.54) has a solution that is globally attractive in the sense that any solution to (1.2) satisfies , where denotes the total variation norm. Nevertheless, contains no root determining subtree.
**Proof** Since the continuous-time Markov chain that jumps from to with rate is ergodic, the RDE (1.54) has a solution that is globally attractive. On the other hand, if is a finite subtree that contains the root, then if is even and if is odd, so is not constant.
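The two halves of this argument can be made concrete in a toy computation. The sketch below assumes, purely for illustration, that the state space is {0, 1}, that the fixed map flips the state, and that flips occur at rate 1; these specifics are not recoverable from the text above and should be read as hypothetical. Under these assumptions the probability of being in state 1 solves a linear ODE with an explicit solution, and the composition of any number of flips still depends on its input.

```python
import math

def flip(x):
    """Assumed fixed map on {0, 1}: exchange the two states."""
    return 1 - x

def compose_flips(n, x):
    """Apply the assumed map n times to the input x."""
    for _ in range(n):
        x = flip(x)
    return x

def prob_one(p0, t, rate=1.0):
    """P(X_t = 1) for the chain that flips at the given rate, started with P(X_0 = 1) = p0."""
    return 0.5 + (p0 - 0.5) * math.exp(-2.0 * rate * t)

# The n-fold composition is never constant: it is the identity or the flip.
depends_on_input = [compose_flips(n, 0) != compose_flips(n, 1) for n in range(1, 11)]
# Total variation distance (half the L1 distance) to the uniform law decays exponentially.
tv_to_uniform = [abs(prob_one(1.0, t) - 0.5) for t in (0.0, 1.0, 5.0, 10.0)]
```

In this sketch the law converges to the uniform law from any starting point, while no finite number of applications of the map forgets the initial value, which mirrors the contrast between global attractivity and the absence of a root determining subtree.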
Proof of Lemma 23 By Lemma 21, it suffices to prove that if the RDE (1.54) has a unique solution, then is constant for large enough. By Proposition 19, the RDE (1.54) has a unique solution if and only if . Let $0$ and $1$ denote the minimal and maximal elements of . By Lemma 57, and converge as to a.s. limits with laws and , respectively. Since is monotone for each , the maps are monotone, and hence
[TABLE]
for all . Since is finite, if the laws of the left- and right-hand sides of (6.20) converge to the same limit, then , proving that is constant for large enough.
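The sandwich argument in (6.20) can be illustrated on a toy example. The sketch below makes assumptions that are not in the text: the state space is {0, 1}, the tree is a complete binary tree of fixed depth, and each node carries one of three hypothetical monotone maps ("and", "or", or "kill", the constant map to the minimal element). It evaluates the same marked tree on the all-zero, the all-one, and a random leaf configuration.

```python
import random

def apply_map(mark, x1, x2):
    """Hypothetical monotone maps on {0, 1}."""
    if mark == "and":
        return x1 & x2
    if mark == "or":
        return x1 | x2
    return 0                                   # "kill": constant map to the minimal element

def evaluate(marks, leaves):
    """Apply the marked maps level by level, from the leaves down to the root."""
    values = list(leaves)
    for level in reversed(marks):              # marks[k][j] is the map at node j of level k
        values = [apply_map(m, values[2 * j], values[2 * j + 1]) for j, m in enumerate(level)]
    return values[0]

rng = random.Random(3)
depth = 6
results = []
for _ in range(500):
    marks = [[rng.choice(["and", "or", "kill"]) for _ in range(2 ** k)] for k in range(depth)]
    n_leaves = 2 ** depth
    lower = evaluate(marks, [0] * n_leaves)    # minimal configuration on the leaves
    upper = evaluate(marks, [1] * n_leaves)    # maximal configuration on the leaves
    middle = evaluate(marks, [rng.randint(0, 1) for _ in range(n_leaves)])
    results.append((lower, middle, upper))
```

By monotonicity, the value obtained from any leaf configuration lies between the values obtained from the two extremal configurations, so whenever `lower == upper` the root is determined regardless of the leaves.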
6.3 Duality
In this subsection, we prove Lemma 24. For a start, we will generalize quite a bit and assume that is a finite partially ordered set and that is monotone for all , where is equipped with the product partial order. As in Subsection 5.1, we let denote the set of all finite subtrees such that either or . For each , we define as in (1.46), where if .
For any , we let denote the set of all that satisfy
[TABLE]
Lemma 61 (Monotone duality)
For any , , and , one has if and only if there exists a such that and on .
**Proof** Fix . For each , let us write
[TABLE]
Then we need to show that
[TABLE]
The proof is by induction on the number of elements of . If , then is the identity map, , and the statement is trivial.
We will show that if the statement is true for and if , then the statement is also true for . Let and inductively define for as in (1.45). By the induction hypothesis, if and only if
[TABLE]
Here and
[TABLE]
It follows that (6.24) is equivalent to
[TABLE]
which completes the induction step of the proof.
Lemma 62 (Minimal elements)
Assume that for all , there do not exist and minimal elements of resp. such that but . Fix . For any , define as in (6.22) dependent on . Then
[TABLE]
**Proof** By Lemma 61,
[TABLE]
In view of this, it suffices to prove that
[TABLE]
The proof is by induction on the number of elements of . If , then and consists of a single element that has , so (6.29) is satisfied. Assume that (6.29) holds for and let for some . Then (6.25) and the assumption of the lemma imply that (6.29) holds for .
Lemma 63 (Sets with two elements)
Assume that and that for all . Then the assumption of Lemma 62 is satisfied.
**Proof** If then we must have and , so we must show that there do not exist minimal elements of resp. such that . Clearly, has only one minimal element, which is the configuration , so we must show that there does not exist a minimal element of such that . Equivalently, this says that which is satisfied since .
Lemma 64 (Lower and upper solutions)
Assume that is a finite partially ordered set that contains minimal and maximal elements, denoted by 0 and 1. Assume that is monotone for all . Then, for all ,
[TABLE]
**Proof** By Lemma 61, if and only if is not empty. If , then implies , so the events decrease to a limit. We claim that this is the event . Since the restriction of an element to yields an element of , it is clear that
[TABLE]
Conversely, if for each there exists some , then by the finiteness of we can select a subsequence of the that converges pointwise to a limit . Since , this proves the other inclusion. By Lemma 57, it follows that
[TABLE]
By Lemma 61, if and only if contains an element such that for all . Since for each , the zero configuration is the unique minimal element of , we observe that if satisfies for all , then can uniquely be extended to an element of for any by putting for . In view of this, by Lemma 57,
[TABLE]
Proof of Lemma 24 Recall the definition of in (6.21). We observe that is an open subtree of if and only if its indicator function satisfies and . In view of this, (1.98) is just a special case of Lemma 64. Formula (1.99) follows from Lemmas 62 and 63 applied to .
7 Cooperative branching
In this section we prove all results that deal specifically with our running example of a system with cooperative branching and deaths. In Subsection 7.1, we prove Proposition 12 about the bivariate mean-field equation. In Subsection 7.2, we prove Theorem 17 and Lemma 18 about the higher-level mean-field equation. In Subsection 7.3, finally, we prove Lemmas 22 and 25 which illustrate the concepts of minimal root determining subtrees and open subtrees in the concrete set-up of our example.
7.1 The bivariate mean-field equation
In this subsection we prove Proposition 12. We identify a measure on with the function defined as , , etc. We parametrize a measure by the parameters
[TABLE]
We observe that , , and hence . It follows that and determine uniquely and indeed, the map
[TABLE]
is a bijection. Moreover, is concentrated on the diagonal if and only if . A function with values in gives through (7.1) rise to a function taking values in .
Lemma 65 (Change of parameters)
A function with values in solves (1.65) if and only if the associated function solves
[TABLE]
**Proof** As noted in Section 1.6, if solves the bivariate mean-field equation, then its one-dimensional marginals solve the mean-field equation (1.2). Since is symmetric, both marginals are the same. We denote these by . Then and the equation we find for is the same as in (1.36).
We will now obtain the equation for the parameter . By definitions (1.12) and (1.13) we have for any that is the law of the random variable
[TABLE]
where are i.i.d. with law . It follows that
[TABLE]
Similar, but simpler considerations give
[TABLE]
Equation (1.65) in the point now gives
[TABLE]
which simplifies to the second equation in (7.3).
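The law computed in the proof above can be cross-checked by brute-force enumeration. The Python sketch below assumes that the cooperative branching map acts as cob(x1, x2, x3) = x1 OR (x2 AND x3) and is applied coordinatewise to three i.i.d. pairs. The names `cob` and `apply_cob_bivariate`, and the encoding of a measure on {0,1}^2 as a dictionary, are illustrative choices and not notation from the paper.

```python
from itertools import product

# All points of {0,1}^2, used as dictionary keys for a bivariate law.
STATES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def cob(x1, x2, x3):
    """Assumed cooperative branching map: x1 OR (x2 AND x3)."""
    return x1 | (x2 & x3)


def apply_cob_bivariate(mu):
    """Law of (cob(X1, X2, X3), cob(Y1, Y2, Y3)) for i.i.d. pairs (Xi, Yi) ~ mu.

    mu is a dict mapping each point of {0,1}^2 to its probability.
    The result is obtained by summing over all 4^3 joint configurations.
    """
    out = {s: 0.0 for s in STATES}
    for a, b, c in product(STATES, repeat=3):
        weight = mu[a] * mu[b] * mu[c]
        image = (cob(a[0], b[0], c[0]), cob(a[1], b[1], c[1]))
        out[image] += weight
    return out
```

Under this assumption, a symmetric input law yields a symmetric output law, and the marginal probability of a 1 after the map is 1 - (1 - p)(1 - p^2) when p is the marginal of the input. This is the kind of identity the enumeration lets one verify numerically.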
In view of Lemma 65 and the remarks that precede it, Proposition 12 follows from the following proposition.
Proposition 66 (Bivariate differential equation)
For , the equation (7.3) has four fixed points in the space defined in (7.2), which are of the form
[TABLE]
with as in (1.37) and . Solutions to (7.3) started in converge to one of these fixed points, the domains of attraction being
[TABLE]
respectively. For , the equation (7.3) has two fixed points in the space , which are
[TABLE]
with domains of attraction
[TABLE]
For , is the only fixed point in and its domain of attraction is the whole space .
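The three regimes in the proposition mirror those of the one-dimensional equation (7.3) (i). As a rough numerical illustration, the sketch below assumes that this equation has drift alpha * p^2 * (1 - p) - p, that is, cooperative branching at rate alpha and deaths at rate 1. The drift and the helper names `drift`, `fixed_points` and `limit` are assumptions made for illustration and should be checked against (1.36) and (1.37).

```python
import math


def drift(p, alpha):
    """Assumed one-dimensional mean-field drift (branching rate alpha, death rate 1)."""
    return alpha * p * p * (1.0 - p) - p


def fixed_points(alpha):
    """Zeros of the assumed drift in [0, 1], in increasing order.

    Besides p = 0, the zeros solve alpha * p * (1 - p) = 1,
    i.e. p = 1/2 -/+ sqrt(1/4 - 1/alpha) whenever the root is real.
    """
    disc = 0.25 - 1.0 / alpha
    if disc < 0:
        return [0.0]
    if disc == 0:
        return [0.0, 0.5]
    r = math.sqrt(disc)
    return [0.0, 0.5 - r, 0.5 + r]


def limit(p0, alpha, t_end=60.0, dt=1e-3):
    """Approximate long-time value of the solution started in p0 (explicit Euler)."""
    p = p0
    for _ in range(int(round(t_end / dt))):
        p += dt * drift(p, alpha)
    return p
```

With alpha = 4.5, for instance, the assumed drift has three zeros, and `limit` separates initial states below the middle zero (attracted to 0) from those above it (attracted to the largest zero).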
**Proof** In Section 1.3 we have found all fixed points of (7.3) (i) and determined their domains of attraction. It is clear from (7.3) that if is a fixed point of (7.3) (i), then is a fixed point of (7.3), so and for also and are fixed points of (7.3).
If and or if and is arbitrary, then we have seen in Section 1.3 that solutions to (7.3) (i) satisfy as . Since , it follows that also . This proves the statements of the proposition about the domain of attraction of for all values of .
Let
[TABLE]
denote the drift functions of and , respectively. We observe that and for all , which implies that
[TABLE]
It follows that solutions of (7.3) satisfy
[TABLE]
If and or if and , we have seen in Section 1.3 that solutions to (7.3) (i) satisfy as . Combining this with (7.14) and the fact that , we see that .
To complete the proof, we must investigate the long-time behavior of solutions of (7.3) when and . In this case for all and takes values in and solves the differential equation
[TABLE]
It is clear for all is a solution. Since , in view of (7.2), we must prove that all solutions with converge to a nontrivial fixed point. We write
[TABLE]
Since the first term has a positive slope at while the second term has zero slope, we conclude that has a positive slope at . Since solutions to (7.3) do not leave the domain , we must have . Since as , we must have for sufficiently large. These observations imply that the cubic function has three zeros with
[TABLE]
and on and on . It follows that solutions to (7.15) started with satisfy as .
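The final step of the proof, where the marginal sits at the middle fixed point and the bivariate law nevertheless moves away from the diagonal, can be explored numerically. The sketch below is a hedged illustration and not the paper's equation (7.15). It assumes the map cob(x1, x2, x3) = x1 OR (x2 AND x3) at rate alpha and deaths at rate 1, it parametrizes a symmetric law by its marginal p and the mass q at each of (1,0) and (0,1) (which may differ from the parametrization in (7.1)), and it assumes the middle fixed point is 1/2 - sqrt(1/4 - 1/alpha). All function names are hypothetical.

```python
import math
from itertools import product

STATES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def cob(x1, x2, x3):
    """Assumed cooperative branching map: x1 OR (x2 AND x3)."""
    return x1 | (x2 & x3)


def measure(p, q):
    """Symmetric law on {0,1}^2 with marginal p and mass q at (1,0) and at (0,1)."""
    return {(0, 0): 1.0 - p - q, (0, 1): q, (1, 0): q, (1, 1): p - q}


def off_diagonal_drift(p, q, alpha):
    """Assumed drift of q: alpha * (T(mu)(1,0) - q) - q, with T found by enumeration."""
    mu = measure(p, q)
    t10 = 0.0
    for a, b, c in product(STATES, repeat=3):
        if (cob(a[0], b[0], c[0]), cob(a[1], b[1], c[1])) == (1, 0):
            t10 += mu[a] * mu[b] * mu[c]
    return alpha * (t10 - q) - q


def off_diagonal_limit(q0, alpha, t_end=80.0, dt=0.01):
    """Euler integration of q with the marginal frozen at the assumed middle fixed point."""
    p = 0.5 - math.sqrt(0.25 - 1.0 / alpha)  # assumed middle fixed point, needs alpha > 4
    q = q0
    for _ in range(int(round(t_end / dt))):
        q += dt * off_diagonal_drift(p, q, alpha)
    return p, q
```

The marginal is held fixed rather than integrated, because the middle fixed point is unstable for the marginal equation and rounding errors would otherwise be amplified over long times. Under these assumptions, q = 0 (a law on the diagonal) is a zero of the drift, while solutions started with q > 0 settle at a strictly positive value, in line with the non-trivial fixed point described in the proof.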
7.2 The higher-level mean-field equation
In this subsection we prove Theorem 17 and Lemma 18. We start with two preparatory lemmas.
Lemma 67 (Convex order and second moments)
Let be a Polish space and let satisfy and . Then .
**Proof** This follows from [MSS18, Lemma 14].
In the next lemma, we use the notation defined in Subsection 1.7.
Lemma 68 (Maximal measure in convex order)
Let be a Polish space and let . Then a measure satisfies if and only if .
**Proof** The condition implies that the first moment measure of is . By (1.74), it follows that , so the statement follows from Lemma 67.
**Proof of Theorem 17** It follows from their definition that the measures and solve the higher-level RDE and their first moment measures are , respectively.
By [MSS18, Thm 5], one has if and only if the RTP corresponding to is endogenous. By Theorem 10, endogeny is equivalent to bivariate uniqueness, so we obtain from Proposition 12 that , , and .
Since the second moment measures of are of the form (1.62), we see that the measures from Proposition 12 are indeed the second moment measures of .
By Proposition 13, the second moment measure of solves the bivariate RDE. Since , Lemma 68 tells us that the second moment measure of is different from . It follows that the measure from Proposition 12 is indeed the second moment measure of .
Let be a solution to the higher-level mean-field equation. Assume that . Then Propositions 12 and 13 tell us that converges to one of the fixed points , depending on whether
[TABLE]
By Lemma 68, these four cases correspond exactly to the four domains of attraction in (1.82). To prove that in fact converges to , or , respectively, in each of these cases, by the compactness of , it suffices to prove that if along a sequence of times , then is the right limit point. In the cases (i), (iii) and (iv) this is clear from Lemma 68.
To prove the statement also in case (ii), let be the solution to the higher-level mean-field equation started in . Then (1.74) and Proposition 15 tell us that for all and . Taking the limit , using condition (i) of Theorem 14, we conclude that . Since moreover , we can apply Lemma 67 to conclude that .
This completes the proof for . The cases and are similar, but simpler.
**Proof of Lemma 18** We note that if , then
[TABLE]
Combining this with (1.13) and Proposition 13, we see that if solves the higher-level RDE (1.73), then
[TABLE]
must all solve the RDE (1.54). Applying this to which has , we see that
[TABLE]
We first observe that since , we can have only if , which we know is not the case, so we conclude that . If then forces , which we know is not the case, so we conclude that and hence , where the last equality follows from (1.37).
To calculate , we use that , where is the second largest solution of the equation , with defined as in (7.12). The smallest solution of the cubic equation is . Dividing by yields a quadratic equation of which is the smallest solution. Since these are straightforward, but tedious calculations, we omit them.
7.3 Root-determining and open subtrees
In this subsection we prove Lemmas 22 and 25.
**Proof of Lemma 22** Since if and if , we see that
[TABLE]
which is if and only if . At the end of Subsection 1.3, we have seen that in our example the RDE (1.54) has a unique solution if and only if . By Lemma 23 this is equivalent to condition (ii) of Lemma 21. Since , Lemma 21 tells us that in our example, conditions (i)–(iii) are equivalent.
We claim that a finite subtree satisfying (1.94) is uniquely determined and in fact implies for all . To prove this, let . Since is finite, if is not empty then we can find some such that for . (Here we take to be the set of all words made from the alphabet .) If , then for all which contradicts the fact that . But if , then (1.94) and the fact that for again imply for all , so we see that must be empty. In particular, this shows that is root determining.
To see that is a minimal root determining subtree, assume that is a smaller one. Then there must be some such that and either or . (Here we use that by definition, minimal root determining subtrees contain the root, so is not empty.) But then either or . Define inductively by (1.45) with for all . Then . Either is the root or its predecessor satisfies by (1.94), so by induction we see that . Since the all-zero configuration is also an element of , this proves that is not root determining and hence is minimal.
**Proof of Lemma 25** We observe that
[TABLE]
Since the set is empty, it has no minimal elements, and hence . On the other hand, for all . In fact, is a set with only one element, the empty word, so and hence the set of its minimal elements is . Now (1.97) with the convention that if says that is an open subtree of if and only if:
- (i) or for each such that ,
- (ii) for each such that ,
- (iii) for each such that ,
which corresponds to the condition in (1.102).
References
- [AB05] D.J. Aldous and A. Bandyopadhyay. A survey of max-type recursive distributional equations. Ann. Appl. Probab. 15(2) (2005), 1047–1110.
- [ADF18] L. Andreis, P. Dai Pra, and M. Fischer. McKean–Vlasov limit for interacting systems with simultaneous jumps. Stoch. Anal. Appl. (2018), doi: 10.1080/07362994.2018.1486202.
- [Ald00] D.J. Aldous. The percolation process on a tree where infinite clusters are frozen. Math. Proc. Cambridge Philos. Soc. 128 (2000), 465–477.
- [Als12] G. Alsmeyer. Random recursive equations and their distributional fixed points. Unpublished manuscript (2012), available from https://www.uni-muenster.de/Stochastik/lehre/WS1112/StochRekGleichungenII/book.pdf
- [BCH18] E. Baake, F. Cordero, and S. Hummel. Lines of descent in the deterministic mutation-selection model with pairwise interaction. Preprint (2018), 41 pages, arXiv:1812.00872.
- [Bou58] N. Bourbaki. Éléments de Mathématique. VIII. Part. 1: Les Structures Fondamentales de l’Analyse. Livre III: Topologie Générale. Chap. 9: Utilisation des Nombres Réels en Topologie Générale. 2iéme éd. Actualités Scientifiques et Industrielles 1045. Hermann & Cie, Paris, 1958.
- [BW97] E. Baake and T. Wiehe. Bifurcations in haploid and diploid sequence space models. J. Math. Biol. 35 (1997), 321–343.
- [Cho69] G. Choquet. Lectures on Analysis. Volume I. Integration and Topological Vector Spaces. Benjamin, London, 1969.
