The $k$-cut model in deterministic and random trees
Gabriel Berzunza, Xing Shi Cai, Cecilia Holmgren

TL;DR
This paper studies the k-cut number in various rooted trees, showing that after rescaling, it converges in distribution or probability, revealing universal behaviors across different tree models.
Contribution
It extends existing results by proving convergence of moments for the k-cut number in conditioned Galton-Watson trees and other tree types, regardless of offspring distribution.
Findings
Moments of k-cut number converge after rescaling in conditioned Galton-Watson trees.
k-cut number converges to a constant in various logarithmic height trees.
Results hold for both deterministic and random tree models.
Abstract
The $k$-cut number of rooted graphs was introduced by Cai et al. as a generalization of the classical cutting model by Meir and Moon. In this paper, we show that all moments of the $k$-cut number of conditioned Galton-Watson trees converge after proper rescaling, which implies convergence in distribution to the same limit law regardless of the offspring distribution of the trees. This extends the result of Janson. Using the same method, we also show that the $k$-cut number of various random or deterministic trees of logarithmic height, such as random split-trees, uniform random recursive trees, and scale-free random trees, converges in probability to a constant after rescaling.
The $k$-cut model in deterministic and random trees
Gabriel Berzunza, Xing Shi Cai and Cecilia Holmgren
Department of Mathematics, Uppsala University, Sweden
Abstract
The $k$-cut number of rooted graphs was introduced by Cai et al. [12] as a generalization of the classical cutting model by Meir and Moon [30]. In this paper, we show that all moments of the $k$-cut number of conditioned Galton-Watson trees converge after proper rescaling, which implies convergence in distribution to the same limit law regardless of the offspring distribution of the trees. This extends the result of Janson [25]. Using the same method, we also show that the $k$-cut number of various random or deterministic trees of logarithmic height, such as random split-trees, uniform random recursive trees, and scale-free random trees, converges in probability to a constant after rescaling.
Key words and phrases: $k$-cut, cutting, conditioned Galton-Watson trees, split trees, preferential attachment trees
1 Introduction and main result
In order to measure the difficulty of destroying a resilient network, Cai et al. [12] introduced a generalization of the cut model of Meir and Moon [30] where each vertex (or edge) needs to be cut $k$ times (instead of only once) before it is destroyed. More precisely, consider that the resilient network is a rooted tree $\mathbb{T}_n$ with $n$ vertices. We assume that sibling vertices in $\mathbb{T}_n$ are ordered. (Such trees are sometimes referred to as plane trees.) We destroy it by removing its vertices as follows: Step 1: Choose a vertex uniformly at random from the component that contains the root and cut the selected vertex once. Step 2: If this vertex has been cut $k$ times, remove the vertex together with the edges attached to it from the tree. Step 3: If the root has been removed, then stop. Otherwise, go to Step 1. We let denote the (random) total number of cuts needed to end this procedure, called the $k$-cut number, i.e., models how much effort it takes to destroy the network. (For simplicity, we will omit the subscript and write .) It should be clear that one can define analogously an edge deletion version of the previous algorithm, where one needs to cut an edge $k$ times before removing it from the root component. Then, one would be interested in the number of edge cuts needed to isolate the root of $\mathbb{T}_n$.
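The three steps above can be sketched as a direct simulation. This is only a minimal illustration, not code from the paper: the function name `k_cut_number` and the parent-array encoding of the tree (root is vertex 0 with parent -1) are assumptions made for the example.

```python
import random


def k_cut_number(parent, k, rng=None):
    """Simulate one run of the k-cut procedure and return the number of cuts.

    Hypothetical encoding: parent[v] is the parent of vertex v, and the
    root is vertex 0 with parent[0] == -1.
    """
    rng = rng or random.Random()
    n = len(parent)
    children = [[] for _ in range(n)]
    for v in range(1, n):
        children[parent[v]].append(v)

    cuts = [0] * n
    alive = list(range(n))                     # vertices in the root component
    pos = {v: i for i, v in enumerate(alive)}  # index of each vertex in `alive`
    total = 0
    while True:
        # Step 1: cut a uniformly chosen vertex of the root component.
        v = alive[rng.randrange(len(alive))]
        cuts[v] += 1
        total += 1
        if cuts[v] < k:
            continue
        # Step 3: stop as soon as the root has been cut k times.
        if v == 0:
            return total
        # Step 2: removing v disconnects its whole subtree from the root.
        stack = [v]
        while stack:
            u = stack.pop()
            if u not in pos:                   # already removed earlier
                continue
            i = pos.pop(u)
            last = alive.pop()
            if last != u:                      # swap-remove in O(1)
                alive[i] = last
                pos[last] = i
            stack.extend(children[u])
```

For instance, `k_cut_number([-1, 0, 1], k=2)` simulates the 2-cut number of a path with three vertices; the result always lies between $k$ and $kn$.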
The case $k=1$ (i.e., the traditional cutting model of Meir and Moon [30]) has been well-studied by several authors. More precisely, Meir and Moon estimated the first and second moment of the $1$-cut number in the cases when $\mathbb{T}_n$ is a Cayley tree [30] and a recursive tree [31]. Subsequently, several weak limit theorems for the $1$-cut number have been obtained for Cayley trees (Panholzer [33, 34]), complete binary trees (Janson [24]), conditioned Galton-Watson trees (Janson [25] and Addario-Berry et al. [1]), recursive trees (Drmota et al. [16], Iksanov and Möhle [23]), binary search trees (Holmgren [19]) and split trees (Holmgren [20]). For general $k$, the authors in [12] established first moment estimates of the $k$-cut number for families of deterministic and random trees, such as paths, complete binary trees, split trees, random recursive trees and conditioned Galton-Watson trees. In particular, the authors in [12] have proven a weak limit theorem for the $k$-cut number when $\mathbb{T}_n$ is a path consisting of $n$ vertices. More recently, Cai and Holmgren [11] also obtained a weak limit theorem in the case when $\mathbb{T}_n$ is a complete binary tree.
In this work, we continue the investigation of this general cutting-down procedure in conditioned Galton-Watson trees and show that , after a proper rescaling, converges in distribution to a non-degenerate random variable. More precisely, let be a non-negative integer-valued random variable such that
[TABLE]
We further assume that the distribution of is aperiodic. This last condition is to avoid unnecessary complications, but our results can be extended to the periodic case. We then consider a Galton-Watson process with (critical) offspring distribution . Let be the family tree conditioned on its number of vertices being , providing that this conditioning makes sense. The main result of this paper is the following. We write to denote convergence in distribution. (In the rest of the paper CRT stands for Continuum Random Tree.)
Theorem 1.
Let . Let be a Galton-Watson tree conditioned on its number of vertices being with offspring distribution satisfying (1). Then,
[TABLE]
where is a non-degenerate random variable whose law is determined entirely by its moments: , and for , with
[TABLE]
where and
[TABLE]
Furthermore, if for every , then for every , as .
In the case $k=1$, the limit in Theorem 1 has a Rayleigh distribution with density $xe^{-x^{2}/2}$, for $x\geq 0$. More precisely, one can verify that its $q$-th moment equals $2^{q/2}\Gamma(1+q/2)$, for integers $q\geq 0$, which are the moments of a random variable with the Rayleigh distribution; in this paper $\Gamma(\cdot)$ denotes the well-known gamma function. As we mentioned earlier, the case $k=1$ has been shown in [25, Theorem 1.6] (or Addario-Berry et al. [1]). We henceforth assume throughout this paper that $k\geq 2$.
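As a quick numerical sanity check of the Rayleigh moment formula, the following sketch compares a midpoint-rule approximation of $\int_0^\infty x^{q}\,xe^{-x^{2}/2}\,{\rm d}x$ with the closed form $2^{q/2}\Gamma(1+q/2)$. It assumes the standard Rayleigh density $xe^{-x^{2}/2}$; the function names, the truncation point and the number of steps are illustrative choices, not taken from the paper.

```python
import math


def rayleigh_moment_numeric(q, upper=12.0, steps=200_000):
    """Midpoint-rule approximation of the q-th moment of the density x*exp(-x^2/2).

    The integral is truncated at `upper`; the neglected tail is of order exp(-upper^2/2).
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x ** (q + 1) * math.exp(-x * x / 2.0)
    return h * total


def rayleigh_moment_closed(q):
    """Closed form 2^(q/2) * Gamma(1 + q/2) of the same moment."""
    return 2.0 ** (q / 2.0) * math.gamma(1.0 + q / 2.0)


if __name__ == "__main__":
    for q in range(1, 5):
        print(q, rayleigh_moment_numeric(q), rayleigh_moment_closed(q))
```

The two columns printed by the script should agree to several decimal places for small $q$.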
It is also important to mention that we could not find a simpler expression (in general) for the moments except for some particular instances. For , we have
[TABLE]
Then Theorem 1 provides a proof of [12, Lemma 4.10], where an estimation of the first moment of was first announced but whose proof was left to the reader. One can also compute with the help of Mathematica the second moment of or other particular examples. However, the expressions are too involved and we decided not to include them.
On the other hand, let be i.i.d. leaves of a Brownian CRT and define the vector where and is the total length of the minimal subtree of a Brownian CRT which connects its root and the leaves of ; see [3, Lemma 21] from where one can deduce explicitly the distribution of . From the proof of Theorem 1, we obtain, for , that
[TABLE]
where $\overleftarrow{\mathbf{x}}_{q}=(x_{q},\dots,x_{1})\in\mathbb{R}_{+}^{q}$. This suggests that it ought to be possible to build the random variable by some construction that can be interpreted as the $k$-cut model on the Brownian CRT defined by Aldous [2, 3]. The appearance of the Brownian CRT in this framework should not come as a surprise since it is well-known that if we assign length to each edge of the Galton-Watson tree , then the latter converges weakly to a Brownian CRT as . We believe that this connection can be exploited even more than the one used in this work in order to obtain the precise distribution of . For example, ideas from [6] and [1] could be useful to answer this question.
The approach used in this work consists of implementing an extension of the idea of Janson [25], which was used in [12], in order to study the $k$-cut model on deterministic and random trees. The authors in [12] introduced an equivalent model that allows them to define in terms of the number of records in when vertices are assigned random labels. More precisely, let be a sequence of independent exponential random variables with parameter ; for short. Let , for and . Clearly, has a gamma distribution with parameters , which we denote by Gamma. Imagine that each vertex has an alarm clock and ’s clock fires at times . If we cut a vertex when its alarm clock fires, then due to the memoryless property of exponential random variables, we are actually choosing a vertex uniformly at random to cut. However, this also means that we are cutting vertices that have already been removed from the tree. Thus, for a cut on vertex at time (for some ) to be counted in , none of its strict ancestors can already have been cut $k$ times, i.e.,
[TABLE]
When the previous event happens, we say that , or simply , is an -record and let
[TABLE]
where denotes the Iverson bracket, i.e., if the statement is true and otherwise. Let be the number of -records, i.e., . Then, it should be clear that
[TABLE]
where denotes equal in distribution.
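The alarm-clock description above can be turned into a short sketch that samples the clocks once and counts records. It is an illustration under stated assumptions: the clocks are taken to be sums of independent rate-one exponentials (any common rate gives the same record counts in law), vertices are numbered so that `parent[v] < v` with root 0, and the name `record_counts` is hypothetical.

```python
import math
import random


def record_counts(parent, k, rng=None):
    """Return a list whose entry r-1 is the number of vertices whose r-th cut is a record.

    Hypothetical encoding: parent[0] == -1 for the root and parent[v] < v otherwise.
    The r-th alarm time of v is the sum of r independent Exp(1) variables.
    """
    rng = rng or random.Random()
    n = len(parent)

    # alarm[v][r-1] = time at which the clock of v fires for the r-th time
    alarm = []
    for _ in range(n):
        t, times = 0.0, []
        for _ in range(k):
            t += rng.expovariate(1.0)
            times.append(t)
        alarm.append(times)

    counts = [0] * k
    # bound[v] = earliest time at which some strict ancestor of v is cut k times
    bound = [math.inf] * n
    for v in range(n):
        if v > 0:
            p = parent[v]
            bound[v] = min(bound[p], alarm[p][k - 1])
        for r in range(k):
            # the r-th cut of v counts only if no strict ancestor was removed before it
            if alarm[v][r] < bound[v]:
                counts[r] += 1
    return counts
```

Summing the returned list gives one sample with the same law as the $k$-cut number, by the identity in distribution stated above. Since the alarm times of a vertex are increasing, the counts are non-increasing in the record index.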
Loosely speaking, we then consider the well-known depth-first search walk or contour function of the (ordered) tree as depicted in Figure 1, that is, is “the depth of the -th vertex” visited in this walk; this will be made precise in the next section. As it is well-known (see Aldous [3, Theorem 23 with Remark 2] or [29, Theorem 1]), when is a conditioned Galton-Watson with offspring distribution satisfying (1), we have that
[TABLE]
in , with its usual topology, and where is a standard normalized Brownian excursion. It has been shown in [12, Lemma 2.1] that [Footnote 4: for two sequences of non-negative real numbers and such that , we write if as .] , for some (explicit) constant , where is the depth of the vertex . Let denote the root of . Thus, informally
[TABLE]
when is large. One then expects that
[TABLE]
which coincides with the right-hand side of (3) when . Note that this informal computation suggests that [Footnote 5: for two sequences of non-negative real numbers and such that , we write if .] , for . As a consequence, Markov’s inequality implies that in probability, as , for . As shown later, by the identity in (6), it would be enough to prove Theorem 1 for instead of .
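The heuristic behind this informal computation can be checked numerically for a single vertex. Assuming rate-one clocks, a vertex at depth $d$ has its first cut counted exactly when its first alarm precedes the $k$-th alarm of each of its $d$ strict ancestors, which gives the probability $\int_0^\infty e^{-x}\,\mathbb{P}(\mathrm{Gamma}(k)>x)^{d}\,{\rm d}x$. The sketch below evaluates this integral and compares it with $\Gamma(1+1/k)\,(k!/d)^{1/k}$; this constant is obtained here from the expansion $\mathbb{P}(\mathrm{Gamma}(k)>x)\approx 1-x^{k}/k!$ for small $x$ and is not quoted from the source. All function names are illustrative.

```python
import math


def survival_gamma(k, x):
    """P(Gamma(k, 1) > x) = exp(-x) * sum_{j<k} x^j / j! for an integer k >= 1."""
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= x / j
        total += term
    return math.exp(-x) * total


def prob_one_record(k, depth, upper=30.0, steps=300_000):
    """Midpoint-rule value of the integral of exp(-x) * P(Gamma(k) > x)^depth over [0, upper]."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += math.exp(-x) * survival_gamma(k, x) ** depth
    return h * total


def heuristic_asymptotic(k, depth):
    """Leading-order guess Gamma(1 + 1/k) * (k! / depth)^(1/k), derived for this sketch."""
    return math.gamma(1.0 + 1.0 / k) * (math.factorial(k) / depth) ** (1.0 / k)
```

For $k=1$ the integral equals $1/(d+1)$ exactly, which recovers the classical record probability; for $k\geq 2$ the ratio of the two functions approaches one as the depth grows, in line with the decay of order $d^{-1/k}$ described above.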
In the rest of the paper, Section 2 and Section 3 make the above argument precise and extend it to higher moments. This will allow us to use the method of moments for proving Theorem 1. In Section 4, we also apply the same idea to get all moments of the number of records in paths and several types of trees of logarithmic height, e.g., complete binary trees, split trees, uniform random recursive trees and scale-free trees.
2 Preliminary results
The purpose of this section is to establish a general convergence result for the number of $1$-records of a deterministic rooted ordered tree . The results of this section can also be viewed as a generalization of those in Janson [25] and in Cai et al. [12]. Furthermore, these results will allow us to study the convergence of not only for conditioned Galton-Watson trees, but also for other classes of random trees in Section 4. We start by defining a probability measure through a continuous function in the same spirit as in [25, Theorem 1.9]. Let be an interval. For a function and with , we define
[TABLE]
where are arranged in nondecreasing order. Notice that is symmetric in and that for . Define
[TABLE]
We also consider the functional
[TABLE]
for and . If , we further define, for ,
[TABLE]
where $\overleftarrow{\mathbf{x}}_{q}=(x_{q},\dots,x_{1})$ and $\overleftarrow{\mathbf{t}}_{q}=(t_{q},\dots,t_{1})$.
Theorem 2.
Let . Suppose that is such that . Then there exists a unique probability measure on with finite moments given by
[TABLE]
Proof.
We only prove uniqueness here. The proof for existence follows along the lines of [25, Proof of Theorem 1.9, Pages 18-19] and details are left to the interested reader. Informally speaking, the idea in [25] for the proof of existence is to build a sequence of functions that satisfy the conditions of Lemma 1 below. Define the function
[TABLE]
By changing the order of integration, we obtain that
[TABLE]
for and . By making the change of variables , we see that
[TABLE]
where . From the inequality , we observe that
[TABLE]
where for the last inequality we have used the fact that , for . The latter follows from the symmetry of ; see [25, Lemma 4.1] for a proof. Then, the previous inequality allows us to conclude that
[TABLE]
We conclude that there exists such that , for . Then a probability measure with moments has a finite moment generating function in a neighbourhood of $0$. It is well-known that this implies that the probability measure is unique; see, e.g., [18, Section 4.10]. ∎
Consider a rooted ordered tree with root and vertices. We now explain how can be encoded by a continuous function. We define the so-called depth-first search function [2, page 260], such that is the -th vertex visited in a depth-first walk on the tree starting from the root . Note that and are always neighbours, and thus, we extend to by letting, for , be the one of and that has the largest depth (recall that the depth of a vertex is its distance, i.e., the number of edges, to the root). Let be the depth of a vertex . We further define the depth-first walk of by
[TABLE]
and extend to by linear interpolation. Thus . See Figure 1 for an example of . Furthermore, we normalize the domain of to by defining
[TABLE]
for . Thus . Note that , for . Moreover,
[TABLE]
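To make the encoding concrete, the following sketch computes the sequence of depths visited by the depth-first walk of an ordered tree and evaluates its linear interpolation on a domain rescaled to $[0,1]$. The child-list encoding, the function names, and the particular normalization (time $t$ mapped to position $t$ times the number of steps) are assumptions made for the illustration.

```python
def depth_first_walk(children, root=0):
    """Depths of the vertices visited by the depth-first (contour) walk.

    children[v] is the ordered list of children of v (hypothetical encoding).
    For a tree with n vertices the result has 2(n-1)+1 entries.
    """
    walk = [0]
    stack = [(0, iter(children[root]))]     # (depth, remaining children)
    while stack:
        depth, remaining = stack[-1]
        child = next(remaining, None)
        if child is None:
            stack.pop()
            if stack:
                walk.append(depth - 1)      # step back up to the parent
        else:
            walk.append(depth + 1)          # step down to the next child
            stack.append((depth + 1, iter(children[child])))
    return walk


def normalized_walk(walk, t):
    """Linear interpolation of the walk with its domain rescaled to [0, 1]."""
    m = len(walk) - 1
    if m == 0:
        return 0.0
    s = t * m
    i = min(int(s), m - 1)
    return walk[i] + (s - i) * (walk[i + 1] - walk[i])
```

For a path with three vertices, `depth_first_walk([[1], [2], []])` returns `[0, 1, 2, 1, 0]`: every edge is traversed once downwards and once upwards.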
We now state the central result of this section, that is, a general limit theorem in distribution for the number of $1$-records of a deterministic rooted tree with $n$ vertices. It is important to notice that is a random variable since the $1$-records are random. From now on, we always assume that .
Lemma 1.
Suppose that is a sequence of ordered (deterministic) rooted trees, and denote the corresponding normalized depth-first walks by and . Suppose that there exists a sequence of non-negative real numbers with , and a function such that
- (a)
, in , as .
- (b)
, as .
Then, for each ,
[TABLE]
as , where is defined in (26). Moreover, , as , where is a random variable with distribution defined by Theorem 2.
Before proving Lemma 1, we need to establish some preliminary results and to introduce some further notation. For and vertices , let be the number of edges in the subtree of spanned by and its root (i.e., the minimal number of edges that are needed to connect and ). We write and for . We also consider the functional
[TABLE]
for and . We denote by the upper incomplete gamma function of parameter , i.e.,
[TABLE]
Remark 1.
Let be an ordered (deterministic) rooted tree with depth-first search walk and the corresponding function . It is not difficult to see that and are connected, in the sense that for ; see [25, Lemma 4.4] for a proof of this fact.
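The number of edges in the subtree spanned by the root and a set of vertices, introduced before Remark 1, can be computed by walking from each vertex towards the root and counting every edge once. A minimal sketch, assuming a parent-array encoding in which the root has parent -1 (the name `spanned_edges` is hypothetical):

```python
def spanned_edges(parent, vertices):
    """Number of edges in the minimal subtree containing the root and `vertices`.

    parent[v] is the parent of v, with parent[root] == -1 (hypothetical encoding).
    Each non-root vertex u of the spanned subtree stands for the edge (u, parent[u]).
    """
    seen = set()
    for v in vertices:
        # climb until the root or an already counted vertex is reached
        while parent[v] != -1 and v not in seen:
            seen.add(v)
            v = parent[v]
    return len(seen)
```

With a single vertex the function returns its depth, and repeated vertices do not change the count.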
Lemma 2.
Let be an ordered (deterministic) rooted tree with vertices. Suppose that there exists a sequence of non-negative real numbers such that and . Let and . Then, for and uniformly for all ,
[TABLE]
where the vertices .
Proof.
Our claim can be shown along the lines of [12, Proof of Lemma 5.1]. ∎
Recall that for two sequences of non-negative real numbers and such that , one writes if .
Lemma 3.
Let be an ordered (deterministic) rooted tree with vertices. Suppose that there exists a sequence of non-negative real numbers with , and . Then the moments of are given by
[TABLE]
where
[TABLE]
Proof.
For simplicity, we write for and note that . For , we observe that
[TABLE]
where . Recall that is the indicator that is a -record defined in (5). By the previous identity, we have that
[TABLE]
where $\mathcal{E}(v_{1},\dots,v_{q})\coloneqq\{E_{1,v_{q}}<\cdots<E_{1,v_{1}}\text{ and }v_{1},\dots,v_{q}\text{ are all 1-records}\}$; recall that are independent random variables with an distribution. To see the last identity, note that each product occurs $q!$ times with indices permuted and for exactly one of these permutations we have that .
Consider the simple case $q=2$. Conditioning on , we see that and are both $1$-records if and only if the following two events happen:
- (i)
the ancestors of are removed after time ;
- (ii)
the vertices which are ancestors of but not of are removed after time .
Since , we note that the event (i) implies that the vertices which are both the ancestors of and are removed after . Let for . Since the events (i) and (ii) are independent, we have
[TABLE]
Recall that we are assuming $k\geq 2$. Otherwise, when $k=1$, the above equality is not entirely correct since is impossible if is an ancestor of ; see [25, Lemma 4.3] for details in the case $k=1$.
By generalizing the previous argument to , we see that
[TABLE]
where $\overleftarrow{\mathbf{x}}_{q}=(x_{q},\dots,x_{1})\in\mathbb{R}_{+}^{q}$, and . On the one hand, Lemma 2 implies that
[TABLE]
we have used our assumption . On the other hand, Lemma 2 also implies that
[TABLE]
where and
[TABLE]
this estimation can be deduced similarly as the one for the integral . Therefore, the previous estimations and Remark 1 allow us to conclude that
[TABLE]
note that if we had not excluded the root, we would not be able to write the sum as an integral. By making the change of variables , for , we have that
[TABLE]
Finally, our claim follows by induction on and the assumption . ∎
We are now able to establish Lemma 1.
Proof of Lemma 1.
First note that by condition (a) of Lemma 1 and (38), we have . Thus the conditions for Lemma 2 and Lemma 3 are satisfied.
Recall the functions and defined in (49) and (35), respectively. Therefore, notice that we only need to show that
[TABLE]
The above convergence together with Lemma 3 implies that which clearly proves the first claim in Lemma 1. The second claim follows immediately from Theorem 2 and the method of moments.
We henceforth prove the claim in (115). Recall that a sequence of non-negative functions on a measure space with total mass , i.e., , is uniformly integrable if for all and
[TABLE]
We also recall the following useful result on uniformly integrable sequences of functions. Suppose further that almost everywhere as . By [27, Proposition 4.12], we know that
[TABLE]
Then in order to prove (115), it is enough to check the following:
- (i)
The sequence is uniformly integrable on , and
- (ii)
as .
We start by showing (i). Note that for . Then, the assumption (a) implies that and , for every , as . Moreover, the assumption (b) shows that is uniformly integrable on . More generally, for every fixed and , define the function . We then observe that
[TABLE]
as . Thus the result in (116) shows that the sequence is uniformly integrable on . Next notice that the inequality implies that , where is defined in (35). Then the inequality (36) implies that there exists a constant such that . Hence (i) follows by applying [18, Theorem 4.5].
Finally, we verify (ii). Recall that condition (a) implies that , for every , as . Hence, whenever , as . Thus, for , the equation (8), implies that uniformly for as . Then, for and ,
[TABLE]
Note that for there exists such that
[TABLE]
Moreover, note that condition (b) implies that the function on the right-hand side of the inequality is integrable on . Therefore, it should be clear that (ii) follows by the dominated convergence theorem. This finishes the proof. ∎
We can apply similar ideas as in the proofs of Lemma 1 and Lemma 3 to estimate the mean of the number of -records . It is important to mention that we have not tried to estimate higher moments of to obtain a limit theorem in distribution for this quantity. We believe that our methods can be used, but the computations would be more involved and we decided not to pursue them. Furthermore, the next results show that is of smaller order than and hence it will not contribute (in the limit) to the distribution of the $k$-cut number .
Lemma 4.
Let be an ordered (deterministic) rooted tree with vertices. Suppose that there exists a sequence of non-negative real numbers with , and . Then, for ,
[TABLE]
Proof.
Note that the case has been proven in Lemma 3. We follow a similar strategy to prove the case . Recall that is the indicator of the event that the vertex is an -record defined in (5). We observe that
[TABLE]
where and . On the one hand, Lemma 2, with , implies that
[TABLE]
On the other hand, Lemma 2, with , also implies that
[TABLE]
where
[TABLE]
this estimate can be deduced similarly as the one for the integral . By recalling that , we conclude from the previous estimations that
[TABLE]
Finally, our claim follows by making the change of variables . ∎
Lemma 5.
Suppose that is a sequence of ordered (deterministic) rooted trees. Suppose that there exists a sequence of non-negative real numbers with , , and a function such that satisfies the condition (a) in Lemma 1 and that for ,
[TABLE]
Then,
[TABLE]
Proof.
Notice that the case has been proved in Lemma 1. The proof of the general case follows by a simple adaptation of the argument used in the proof of Lemma 1 for with the use of Lemma 4. One only needs to note that
[TABLE]
3 Proof of Theorem 1
Let be a Galton-Watson tree conditioned on its number of vertices being with offspring distribution satisfying (1). Note that in this case both the -records and the tree are random. Then we study as a random variable conditioned on . More precisely, we first choose a random tree . Then we keep it fixed and consider the number of -records. This gives a random variable with distribution that depends on . We have the following lemma that corresponds to [25, Lemma 4.8].
Lemma 6.
Let be a Galton-Watson tree conditioned on its number of vertices being with offspring distribution satisfying (1). For , we have that .
Proof.
By an application of the proof of Lemma 4 with (in particular, the equality (2)), we see that
[TABLE]
where denotes the number of vertices at depth in . Notice that
[TABLE]
by the fact that . Since by our assumption (1), [25, Theorem 1.13] implies that for all , for some constant depending on only. Therefore,
[TABLE]
By taking expectation in (118), our claim follows by (119). ∎
We continue by studying the moments of the number of -records . We denote by the (random) probability distribution of given . Define the random variables
[TABLE]
Notice that the moments of are given by . We have the following lemma that corresponds to [25, Lemma 4.9].
Lemma 7.
Let be a Galton-Watson tree conditioned on its number of vertices being with offspring distribution satisfying (1). Furthermore, suppose that for every fixed we have that . Then .
Proof.
By an application of Lemma 3 with and (in particular, the equality (62) in its proof), we see that
[TABLE]
where . After a similar computation as in the proof of the inequality (36), one sees that there exists a constant such that
[TABLE]
where . Notice that
[TABLE]
where denotes the number of vertices at depth in . Since for , [25, Theorem 1.13] implies that for all , for some constant depending on and only. Therefore, Minkowski’s inequality implies that
[TABLE]
By taking expectation in (121), we deduce from (122) that
[TABLE]
and our claim follows by induction on . ∎
Let and be the normalized depth-first search walks associated with the conditioned Galton-Watson tree . Note that in this case becomes a random function on . Recall that a remarkable result due to Aldous [3, Theorem 23 with Remark 2] (see also [29, Theorem 1]) shows that
[TABLE]
in , with its usual topology, and where is a standard normalized Brownian excursion. Note that is a random element from ; see for example [8] or [36].
Lemma 8.
For , we have that almost surely.
Proof.
One only needs to show that . This follows by computing , for every , from the well-known density function of ; see [8, Chapter II, Equation (1.4)]. ∎
Therefore, Theorem 2 and Lemma 8 imply that there exists almost surely a (unique) measure with moments given by . The next result provides a generalization of [25, Theorem 1.10] and it will be used in the proof of Theorem 1.
Theorem 3.
Let be a Galton-Watson tree conditioned on its number of vertices being with offspring distribution satisfying (1). Then
[TABLE]
in the space of probability measures on . Moreover, we have that for every ,
[TABLE]
The convergences in (123), (124) and (125), for all , hold jointly. In particular, if for all , then for all and ,
[TABLE]
Proof.
A simple adaptation of the proof of [25, Lemma 4.7] easily shows that
[TABLE]
in , as . By the Skorohod coupling theorem (see e.g. [27, Theorem 4.30]), we can assume that the trees are defined on a common probability space such that the convergence in (127) holds almost surely. Therefore, the convergences (124) and (125) follow immediately from Lemma 1. It only remains to prove (126). Recall that we assume that for every . By Jensen’s inequality, we notice that for . Hence Lemma 7 implies that . This shows that every moment of the right-hand side of (125) stays bounded as which implies (126). ∎
We are now able to prove Theorem 1.
Proof of Theorem 1.
Lemma 6 establishes that for . As a consequence, Markov’s inequality implies that in probability, as , for . Then, by the identity in (6), it is enough to prove Theorem 1 for instead of . By the definition of and Theorem 3, for any bounded continuous function ,
[TABLE]
Taking expectations, the dominated convergence theorem implies that , as , where has distribution . Suppose that for every . Lemma 7 implies that every moment of stays bounded as which implies the moment convergence in Theorem 1. It remains to identify the moments of (or equivalently ). Notice that
[TABLE]
For , let be independent random variables with the uniform distribution on . Let be the first points in a Poisson process on with intensity , i.e., have joint density function on . It is well-known that , see, e.g., [25, Proof of Lemma 5.1]. Thus by recalling the definition of the function in (35), we see that
[TABLE]
where , and
[TABLE]
Finally, the expression for the moments in Theorem 1 follows by first changing the order of integration in (128) and then by making the change of variables for . ∎
Following the idea of the proof of Theorem 1, we obtain the following convergence of the first moment of the number of -records . This provides a proof of [12, Lemma 4.10].
Lemma 9.
Let be a Galton-Watson tree conditioned on its number of vertices being with offspring distribution satisfying (1). For , we have that
[TABLE]
Proof.
The proof follows by a simple adaptation of the argument used in the proof of Theorem 1 by using Lemma 5 (with ), Lemma 6 and Lemma 8. One only needs to note that
[TABLE]
which follows from the well-known density function of ; see [8, Chapter II, Equation (1.4)]. ∎
4 Further applications
In this section, we show that the results obtained in Section 2 can be used and extended to study the $k$-cut model in other families of trees. Throughout, let be a rooted tree (possibly random and not necessarily ordered) with vertices and root .
4.1 Paths
Lemma 10.
Let be a path with vertices labelled from the root to the leaf. For , we have that , as , where is a non-degenerate random variable whose law is determined entirely by its moments: for , where
[TABLE]
Proof.
By [12, Theorem 1.1], we know that , for , and . Then Markov’s inequality implies that in probability, as , for . Thus, by the identity (6), it is enough to prove our result for instead of . Note that the normalized depth-first search walks and of , defined in (37), are given by and that for . It should be clear that the conditions of Lemma 1 are fulfilled with . Therefore, our result follows from a simple application of Lemma 1. ∎
Remark 2.
The convergence in distribution and moments of the $k$-cut number of a path to has been proved in [12, Theorem 1.5] with a very different method. The contribution of Lemma 10 is the formula for computing the $q$-th moment of the limiting variable for all .
4.2 General trees
The next result establishes a limit in distribution for the number of -records of a general (random) rooted tree in the same spirit as in Lemma 1. For , let be a sequence of independent uniformly chosen vertices on . Recall that denotes the number of edges in the subtree of spanned by and its root (i.e., the minimal number of edges that are needed to connect and ). In particular, is the depth of the vertex in . In the sequel, we will often use the notation , where and are two sequences of non-negative real random variables such that , to indicate that .
Theorem 4.
Let be a sequence of rooted trees. Suppose that there exists a sequence of non-negative real numbers with , and such that
- (a)
**
- (b)
For every , , where is a sequence of i.i.d. random variables in with no atom at $0$.
- (c)
For every ,
Then , as , where is a random variable whose law is determined entirely by its moments: , and for ,
[TABLE]
Proof.
By the assumption (a) and Lemma 3 (in particular, the identity (62)), we see that
[TABLE]
where , and
[TABLE]
with defined in (39). Then we see that
[TABLE]
where . Suppose that we have proven that
[TABLE]
as . Then the result follows by induction on together with the previous convergence.
We henceforth prove the claim in (140). From the result in (116), it is enough to check the following:
- (i)
The sequence is uniformly integrable.
- (ii)
$\displaystyle a_{n}^{-q/k}\widehat{H}_{n,q}({\bf u}_{q})\mathds{1}_{\{{\bf u}_{q}\in({\mathbb{T}}_{n}\setminus\{\circ\})^{q}\}}\xrightarrow{d}\int_{0}^{\infty}\int_{0}^{x_{1}}\cdots\int_{0}^{x_{q-1}}\exp\left(-\frac{\zeta_{1}x^{k}_{1}+\cdots+\zeta_{q}x_{q}^{k}}{k!}\right)\,{\rm d}\overleftarrow{\mathbf{x}}_{q}$, as $n\to\infty$.
We start by showing (i). Since for , we have that
[TABLE]
Hence after a similar computation as in the proof of the inequality (36), one obtains that there exists a constant such that
[TABLE]
Notice that our hypotheses (b) and (c) together with the result in (116) show that the sequence
[TABLE]
is uniformly integrable. Hence (i) follows from [18, Theorem 5.4.5].
Finally, we verify (ii). By making the change of variables , for , we see that
[TABLE]
where , $\overleftarrow{\bf w}_{q}=(w_{q},\dots,w_{1})$, and
[TABLE]
with and for . Notice that , as . Thus, condition (b) implies that
[TABLE]
By the Skorohod coupling theorem (see e.g. [27, Theorem 4.30]), we can assume that the previous convergence holds almost surely together with the convergence in condition (b). Notice that for there exists such that
[TABLE]
By condition (c), notice also that the function on the right-hand side is integrable on . Therefore, it should be clear now that (ii) follows by the dominated convergence theorem. This concludes our proof. ∎
The next result establishes an estimate for the mean number of -records of a general (random) rooted tree in the same spirit as in Lemma 5. Furthermore, it shows that is of smaller order than and hence it will not contribute (in the limit) to the distribution of the -cut number . We also believe that our methods can be used to estimate higher moments and to obtain a result analogous to Theorem 4 for . We have not attempted to do so, since the estimate of the mean is enough for our purpose.
Lemma 11.
Let be a sequence of rooted trees. Suppose that there exists a sequence of non-negative real numbers with , and such that
- (a)
**
- (b)
, where is a random variable in with no atom at [math].
- (c)
For every ,
Then, for ,
[TABLE]
Proof.
By the assumption (a) and Lemma 4 (in particular, the identity (2)), we see that
[TABLE]
Hence
[TABLE]
Therefore, our result follows by proving that
[TABLE]
where the last integral is equal to the right-hand side of (143). Note that the case has been proved in Theorem 4. The proof of the general case follows by a simple adaptation of the argument used in the proof of Theorem 4 for and details are left to the reader. ∎
The next lemma provides a useful way to verify condition (c) in Theorem 4.
Lemma 12.
Let be a rooted tree. Suppose that there exists a sequence of non-negative real numbers with , and such that for every ,
[TABLE]
where is a sequence of i.i.d. random variables in with no atom at [math] such that . Furthermore, assume that for every there exists such that for all
[TABLE]
where denotes the number of vertices at depth in . Then the condition (c) in Theorem 4 is satisfied.
Proof.
For simplicity, we introduce the notation and , for . Consider such that for the property in (144) is satisfied. Define the function given by on , on , and linear on . Since we observe that
[TABLE]
Further, we note that , almost surely, as . In order to show that condition (c) in Theorem 4 is fulfilled, it is enough to check that
[TABLE]
Notice that
[TABLE]
Since , it is not difficult to see that
[TABLE]
where we have used Jensen’s inequality to obtain the second inequality. Finally, by our choice of (recall assumption (144)), we observe that
[TABLE]
This clearly implies (145) and concludes our proof. ∎
Similarly, we also provide a useful way to verify condition (c) in Lemma 11.
Lemma 13.
Let be a rooted tree. Suppose that there exists a sequence of non-negative real numbers with , and such that the condition (b) in Lemma 11 holds with a random variable satisfying for every . Furthermore, assume that for every there exists such that for all
[TABLE]
where denotes the number of vertices at depth in . Then the condition (c) in Lemma 11 is fulfilled.
Proof.
It should be clear that this can be shown along the lines of the proof of Lemma 12, and therefore, we omit its proof. ∎
4.3 Trees of logarithmic height
Natural examples of trees that fulfil the conditions of Theorem 4 are random trees with logarithmic height, i.e., trees such that . Examples include random split trees, uniform random recursive trees, scale-free random trees and mixtures of complete regular trees.
4.3.1 Complete binary trees
Let be a complete binary tree with vertices, i.e., its height is . Recall that has vertices at height , and vertices at height ; moreover, the vertices at height occupy the leftmost positions among the possible ones; see, e.g., [28, Page 401]. We use the notation for the logarithm with base of . It should be clear that condition (a) in Theorem 4 is satisfied with . Furthermore, one readily checks that , as . By a simple application of [5, Corollary 1], this implies that condition (b) in Theorem 4 is satisfied with . Notice that each vertex in has at most children. Then it should be clear that condition (c) of Theorem 4 follows from Lemma 12 since for . Therefore, Theorem 4 implies that , as , where is the random variable whose law is determined entirely by its moments: , and for ,
[TABLE]
It should be clear that Lemma 11 and Lemma 13 imply that for . Therefore, by the identity (6) and Markov’s inequality, , as . However, it follows from the next lemma that . Therefore, we actually have
[TABLE]
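The statement that the depth of a uniformly chosen vertex of the complete binary tree is of order $\log_2 n$ can be checked numerically. The following is a minimal Python sketch, assuming heap-style labelling (vertices $1,\dots,n$ in breadth-first order, root labelled $1$), under which vertex $i$ has depth $\lfloor \log_2 i \rfloor$. The function name is hypothetical and not part of the paper.

```python
import math


def mean_depth_complete_binary(n):
    """Average depth of a uniformly chosen vertex in the complete binary tree on n vertices.

    Assumes heap labelling: the root is vertex 1 and vertex i has depth floor(log2(i)).
    """
    # i.bit_length() - 1 equals floor(log2(i)) exactly for every integer i >= 1.
    return sum(i.bit_length() - 1 for i in range(1, n + 1)) / n


if __name__ == "__main__":
    for n in (2**10 - 1, 2**15 - 1, 2**20 - 1):
        ratio = mean_depth_complete_binary(n) / math.log2(n)
        print(n, ratio)
```

Under this labelling the ratio of the mean depth to $\log_2 n$ approaches $1$ only slowly, since the mean depth of a full tree is roughly the height minus one, so the relative correction is of order $1/\log_2 n$.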
Remark 3.
As Theorem 1.1 of [11] shows, , after proper shifting and rescaling, also converges to a non-degenerate limit distribution with an infinite mean. Thus it is not possible to derive the result in [11] with the method of moments which we use to derive Theorem 1 for conditioned Galton-Watson trees. The same is true for split trees, random recursive trees and scale-free trees.
Lemma 14.
For , we have that
[TABLE]
Proof.
By making the change of variables , for , we notice that the integral at the right-hand side of (155) is equal to
[TABLE]
To see the last identity, we notice that the integral at the left-hand side is simply the probability that , where are independent random variables, which is equal to since each order of is equally likely. ∎
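The symmetry argument in this proof can be illustrated numerically. The sketch below is written under an assumption: it takes the integrand of the limit in Theorem 4 with all $\zeta_i = 1$, that is, the product of $f(x) = \exp(-x^k/k!)$ over the ordered region $x_1 > x_2 > \cdots > x_q > 0$. For such a symmetric product integrand, the ordered integral equals $\left(\int_0^\infty f\right)^q / q!$, and the substitution $t = x^k/k!$ gives $\int_0^\infty f = (k!)^{1/k}\,\Gamma(1 + 1/k)$. The function names, the truncation point `upper` and the grid size `steps` are hypothetical choices, not taken from the paper.

```python
import math


def single_integral(k):
    """Closed form of the integral of exp(-x**k / k!) over [0, infinity)."""
    return math.factorial(k) ** (1.0 / k) * math.gamma(1.0 + 1.0 / k)


def ordered_integral(k, q, upper=12.0, steps=20000):
    """Numerically integrate prod_i exp(-x_i**k / k!) over x_1 > x_2 > ... > x_q > 0.

    Uses q passes of a cumulative trapezoid rule on [0, upper]; the tail beyond
    `upper` is ignored (it is negligible for the small k used here).
    """
    h = upper / steps
    xs = [i * h for i in range(steps + 1)]
    kfact = math.factorial(k)
    f = [math.exp(-(x ** k) / kfact) for x in xs]
    inner = [1.0] * (steps + 1)  # value of the integral over the inner variables
    for _ in range(q):
        g = [f[i] * inner[i] for i in range(steps + 1)]
        acc = [0.0] * (steps + 1)
        for i in range(1, steps + 1):
            acc[i] = acc[i - 1] + 0.5 * h * (g[i - 1] + g[i])
        inner = acc
    return inner[-1]


if __name__ == "__main__":
    k, q = 2, 3
    print(ordered_integral(k, q), single_integral(k) ** q / math.factorial(q))
```

The two printed numbers should agree up to discretisation and truncation error, reflecting the fact that each of the $q!$ orderings of the variables contributes equally.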
4.3.2 Split trees
The class of random split trees was first introduced by Devroye [13] to encompass many families of trees that are frequently used in algorithm analysis, e.g., binary search trees and tries. Its exact construction is somewhat lengthy and we refer readers to either the original algorithmic definition in [13, 21] or the more probabilistic version in [10, Section 2]. Informally speaking, a split tree is constructed by first distributing balls among the vertices of an infinite -ary tree () and then removing all subtrees without balls. Each vertex in the infinite -ary tree is given a random non-negative split vector such that and , drawn independently from the same distribution. These vectors affect how balls are distributed. In the study of split-trees, the following condition of is often assumed (see, e.g., Holmgren [21]):
Condition A. The split vector is permutation invariant. Moreover, , and that is non-lattice.
Set . Devroye [13] showed that , that is, condition (a) in Theorem 4 with . Berzunza et al. [7, Lemma 5 and Corollary 1] have shown that , as . By a simple application of [5, Corollary 1], this implies that condition (b) in Theorem 4 is satisfied with . Notice that each vertex in has at most children. Then it should be clear that condition (c) of Theorem 4 follows from Lemma 12 since for . Therefore, Theorem 4 implies that , as , where is the random variable whose law is determined entirely by its moments given in (155). Furthermore, Lemma 11 and Lemma 13 imply that for . Therefore, by the identity (6) and Markov’s inequality,
[TABLE]
4.3.3 Uniform random recursive trees
A uniform random recursive tree is a random tree of vertices constructed recursively as follows: let be the tree of a single vertex labelled ; given , choose a vertex in uniformly at random and attach a vertex labelled to the selected vertex as its child, which gives . Uniform random recursive trees are among the most studied random tree models. They appear, for instance, as simple epidemic models, or in computer science as data structures. We refer to [15, Chapter 6] for background. Theorem 6.32 in [15] shows that , that is, condition (a) in Theorem 4 is satisfied with . From the results of Dobrow [14] (see also [15, Section 2.5.5]), it is not difficult to see that , as . By a simple application of [5, Corollary 1], this implies that condition (b) in Theorem 4 is satisfied with . By [17, Equation (11)],
[TABLE]
uniformly for and , for all . Then it should be clear that condition (c) of Theorem 4 follows from Lemma 12. Therefore, Theorem 4 implies that , as , where is the random variable whose law is entirely determined by its moments given in (155). Furthermore, Lemma 11 and Lemma 13 imply that for . Therefore, by the identity (6) and Markov’s inequality,
[TABLE]
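The recursive construction of the uniform random recursive tree described in this subsection can be sketched in a few lines of Python. This is only an illustration, assuming vertices are stored in order of arrival (index 0 is the root) and that only depths are tracked; the function name and the `seed` parameter are hypothetical.

```python
import math
import random


def random_recursive_tree_depths(n, seed=None):
    """Depths of the n vertices of a uniform random recursive tree.

    Vertices are added one at a time; each new vertex picks its parent
    uniformly at random among the vertices already present.
    """
    rng = random.Random(seed)
    depths = [0]  # the root has depth 0
    for _ in range(1, n):
        parent = rng.randrange(len(depths))  # uniform choice among existing vertices
        depths.append(depths[parent] + 1)
    return depths


if __name__ == "__main__":
    n = 20000
    depths = random_recursive_tree_depths(n, seed=1)
    # Depth of a uniformly chosen vertex, compared with the natural logarithm of n.
    print(sum(depths) / n, math.log(n))
```

For large $n$ the average depth is expected to be comparable to $\ln n$, in line with the logarithmic scaling used in this subsection, although a single simulated tree fluctuates around that value.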
4.3.4 Scale-free random trees
Scale-free random trees form a family of random trees that grow following a preferential attachment algorithm, and are commonly used to model complex real-world networks; see Móri [32]. A scale-free random tree is a random tree of vertices constructed recursively as follows: Fix a parameter , and start from the tree that consists of a single edge connecting the vertices labelled and . Suppose that has been constructed for some , and for every , denote by the degree of the vertex in . Then conditionally given , is built by adding an edge between the new vertex and a vertex in chosen at random according to the law
[TABLE]
The standard preferential attachment tree (also known as plane-oriented recursive tree) was made popular by Barabási and Albert [4] and it corresponds to the choice of . On the other hand, if one lets , then the algorithm yields a uniform random recursive tree. Janson [26] showed that scale-free random trees can also be viewed as split trees with the branching factor .
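The preferential attachment construction can be sketched as follows. The sketch rests on assumptions that are not confirmed by the text above: the new vertex is attached to an existing vertex $i$ with probability proportional to its degree plus a parameter `alpha` (with `alpha > -1`), and the initial edge joins vertices labelled 0 and 1. The function name and return format are hypothetical. Under this assumed rule, `alpha = 0` gives purely degree-proportional attachment, while very large `alpha` makes the choice close to uniform, which matches the two special cases mentioned in the paragraph above.

```python
import random


def scale_free_tree(n, alpha, seed=None):
    """Grow a tree on vertices 0..n by preferential attachment.

    Assumption: vertex `new` attaches to an existing vertex i with probability
    proportional to degree(i) + alpha, where alpha > -1.
    Returns (parents, degrees); parents[0] is None because vertex 0 is the root.
    """
    rng = random.Random(seed)
    parents = [None, 0]  # initial tree: a single edge between vertices 0 and 1
    degrees = [1, 1]
    for new in range(2, n + 1):
        weights = [d + alpha for d in degrees]  # strictly positive since d >= 1 > -alpha
        target = rng.choices(range(new), weights=weights)[0]
        parents.append(target)
        degrees[target] += 1
        degrees.append(1)
    return parents, degrees


if __name__ == "__main__":
    parents, degrees = scale_free_tree(1000, alpha=0.0, seed=3)
    print(max(degrees))
```

Recomputing the weights at every step keeps the sketch short at the cost of quadratic running time, which is acceptable for small illustrative sizes.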
Pittel [35] showed that , that is, condition (a) in Theorem 4 is satisfied with , where . From the results of Borovkov and Vatutin [9] (see the bibliography therein for further references), it is not difficult to see that , as . By a simple application of [5, Corollary 1], this implies that condition (b) in Theorem 4 is satisfied with . Hwang [22, Equation 8] showed that, for , i.e., for the standard preferential attachment tree,
[TABLE]
uniformly for for all . Thus by an argument similar to that for uniform random recursive trees, we have for ,
[TABLE]
Open problem. To apply Theorem 4 to general scale-free trees, we need an estimate of for all , which is currently missing in the literature. Thus we leave it as an open problem whether an estimate similar to (157) holds for all . This would imply that the convergence in (158) holds for all scale-free trees.
Remark 4.
In all previous examples of Section 4.3, the limit distributions are degenerate. However, we conjecture that another normalization should yield non-degenerate limits. This is known to be the case, when , for complete binary trees (Janson [24]), recursive trees (Drmota et al. [16], Iksanov and Möhle [23]), binary search trees (Holmgren [19]) and split trees (Holmgren [20]). In the general case , Cai and Holmgren [11] also obtained a weak limit theorem in the case of complete binary trees, which supports our conjecture.
4.3.5 Mixture of regular trees
Our next example provides a method to build trees that fulfill the conditions of Theorem 4 where the random variables in the hypotheses are not constants. Basically, the procedure consists of gluing trees which satisfy the assumptions of Theorem 4. In this example, we consider a mixture of complete regular trees but one may consider other families of trees as well. For a fixed integer , let denote a positive sequence of integers. Next, for , let be a function with . Let be a complete -regular tree with height . Since there are vertices at distance from the root, its size is given by
[TABLE]
In particular, one can check that each tree fulfills the assumptions in Theorem 4 with and ; note that condition (c) in Theorem 4 follows from Lemma 12 and the fact that the number of descendants of each vertex is bounded. Now imagine that we merge all the regular trees into one common root. This leads us to a new tree of size . Assume further that , as . Then, we observe that the probability that a vertex of chosen uniformly at random belongs to the tree converges when to . Then, one readily checks that this new tree satisfies the hypotheses in Theorem 4 with and are i.i.d. random variables uniformly distributed in the set . To see this, note that the probability that a uniformly chosen vertex of belongs to converges to .
Acknowledgements.
This work is supported by the Knut and Alice Wallenberg Foundation, a grant from the Swedish Research Council and The Swedish Foundations’ starting grant from Ragnar Söderbergs Foundation.
- [1] L. Addario-Berry, N. Broutin, and C. Holmgren, Cutting down trees with a Markov chainsaw, Ann. Appl. Probab. 24 (2014), no. 6, 2297–2339. MR 3262504
- [2] D. Aldous, The continuum random tree. II. An overview, Stochastic analysis (Durham, 1990), London Math. Soc. Lecture Note Ser., vol. 167, Cambridge Univ. Press, Cambridge, 1991, pp. 23–70. MR 1166406
- [3] D. Aldous, The continuum random tree. III, Ann. Probab. 21 (1993), no. 1, 248–289. MR 1207226
- [4] A.-L. Barabási and R. Albert, Emergence of Scaling in Random Networks, Science 286 (1999), no. 5439, 509–512.
- [5] J. Bertoin, Almost giant clusters for percolation on large trees with logarithmic heights, J. Appl. Probab. 50 (2013), no. 3, 603–611.
- [6] J. Bertoin and G. Miermont, The cut-tree of large Galton-Watson trees and the Brownian CRT, Ann. Appl. Probab. 23 (2013), no. 4, 1469–1493. MR 3098439
- [7] G. Berzunza, X. Shi Cai, and C. Holmgren, The asymptotic non-normality of the giant cluster for percolation on random split trees, arXiv e-prints (2019), arXiv:1902.08109.
- [8] R. M. Blumenthal, Excursions of Markov processes, Probability and its Applications, Birkhäuser Boston, Inc., Boston, MA, 1992. MR 1138461
