This paper analyzes how rooted plane trees are affected by four different fringe reduction operations, providing exact and asymptotic results for node survival, including limit theorems.
Contribution
It introduces a generalized framework for tree pruning operations and derives precise asymptotic formulas for node counts after repeated reductions.
Findings
01
Exact expressions for expected surviving nodes
02
Asymptotic expansions for variance
03
Central limit theorems for node counts
Abstract
Rooted plane trees are reduced by four different operations on the fringe. The number of surviving nodes after reducing the tree repeatedly for a fixed number of times is asymptotically analyzed. The four different operations include cutting all or only the leftmost leaves or maximal paths. This generalizes the concept of pruning a tree. The results include exact expressions and asymptotic expansions for the expected value and the variance as well as central limit theorems.
Fringe Analysis of Plane Trees Related to Cutting and Pruning
Benjamin Hackl, Clemens Heuberger
Institut für Mathematik, Alpen-Adria-Universität Klagenfurt, Universitätsstraße 65–67, 9020 Klagenfurt, Austria
Key words and phrases:
Plane trees, pruning, tree reductions, central limit theorem, Narayana polynomials
2010 Mathematics Subject Classification:
05A16; 05C05, 05A15, 05A19, 60C05
B. Hackl and C. Heuberger are supported by the Austrian
Science Fund (FWF): P 24644-N26 and by the Karl Popper Kolleg
“Modeling-Simulation-Optimization” funded by the Alpen-Adria-Universität Klagenfurt
and by the Carinthian Economic Promotion Fund (KWF).
H. Prodinger is supported by an incentive grant of the National Research
Foundation of South Africa. Part of this author’s work was done while he visited
Academia Sinica. He thanks the Institute of Statistical Science for its hospitality.
This is the full version of the extended abstract [14].
1. Introduction
Plane trees are among the most interesting elementary combinatorial objects; they
appear in the literature under many different names such as ordered trees, planar trees,
planted plane trees, etc. They have been analyzed under various aspects, especially
due to their relevance in Computer Science. Two particularly well-known quantities are
the height, which is equivalent to the stack size needed to explore binary (expression) trees,
and the pruning number (pruning index), which is equivalent to the register function
(Horton-Strahler number) of binary trees. Results on the height of plane trees can be found
in [3, 9, 23]; for the register function, we refer to [4, 11, 19]; and the connection between
the register function and the pruning number is treated in [4, 27].
Reducing (cutting-down) trees has also been a popular research theme during the last decades
[17, 21, 22]: according to a certain probabilistic model, a given
tree is reduced until a certain condition is satisfied (usually, the root is isolated).
In the present paper, the point of view is slightly different, as we
reduce a tree in a
completely deterministic fashion at the leaves until the tree has no
more edges. All these reductions take place on the fringe, meaning that only (a subset of)
leaves (and some adjacent structures) are removed. We consider four different models:
–
In one round, all leaves together with the corresponding edges are
removed (see Section 2).
–
In one round, all maximal paths (linear graphs), with the leaves on one end, are
removed (see Section 3). This process is called pruning.
–
A leaf is called an old leaf if it is the leftmost child of its parent. This
concept was introduced in [2]. In one round,
only old leaves are removed (see Section 4).
–
The last model deals with pruning old paths. There might be several
interesting models related to this; the one we have chosen here is that in one round
maximal paths are removed, under the condition that each of their nodes is the leftmost
child of their parent node (see Section 5).
The four tree reductions are illustrated in
Figure 1. We describe these reductions
more formally in the corresponding sections.
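To make the informal description concrete, the following is a minimal sketch of the first and the third model (cutting all leaves and cutting only old leaves). The encoding of a plane tree as the tuple of its subtrees, as well as all function names, are assumptions made for illustration and do not appear in the paper; the two path-based models are omitted here.

```python
# Illustrative encoding (an assumption, not the paper's notation):
# a plane tree is the tuple of its subtrees, and a leaf is the empty tuple ().

def size(tree):
    """Number of nodes of the tree."""
    return 1 + sum(size(child) for child in tree)

def cut_leaves(tree):
    """First model: remove every leaf. Undefined for the single-node tree."""
    if tree == ():
        raise ValueError("the single-node tree cannot be reduced")
    return tuple(cut_leaves(child) for child in tree if child != ())

def cut_old_leaves(tree):
    """Third model: remove only leaves that are the leftmost child of their parent."""
    return tuple(
        cut_old_leaves(child)
        for position, child in enumerate(tree)
        if not (position == 0 and child == ())
    )

# Root with three children: a leaf, a node with two leaves, a node with one leaf.
example = ((), ((), ()), ((),))
print(size(example))            # 7
print(cut_leaves(example))      # ((), ())
print(cut_old_leaves(example))  # (((),), ())
```

In this example, cutting all leaves removes four nodes at once, whereas cutting old leaves removes only three, since the right leaf of the middle subtree is not a leftmost child.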
The first model is clearly related to the height of the plane tree, and the second one to
the Horton-Strahler number via the pruning index [27, 24]. While there are no surprises about the number of rounds
that the process takes here, we are interested in how the fringe develops. The number of leaves
and nodes altogether in the remaining tree after a fixed number of reduction rounds is
the main parameter analyzed in this paper.
For the sake of simplicity, we will use the same notation for each of the following
reduction analyses. In case we need to compare objects from two different sections, we
will distinguish them by adding appropriate superscripts.
The random variable $X_{n,r}$ models the tree size after reducing a plane tree
of size $n$ (chosen uniformly at random among all trees with $n$ nodes) $r$ times
iteratively according to one of our four reductions. If a tree does not “survive” $r$
rounds of reductions, we consider the size of the resulting tree to be $0$. In particular,
for $r=0$, the given plane tree is not changed and $X_{n,0}=n$.
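For small $n$, the random variable $X_{n,r}$ can be explored by exhaustive enumeration. The sketch below does this for the leaf-cutting model; the tuple encoding of trees and the helper names (`forests`, `trees`, `reduced_size`, `expectation`) are assumptions made for illustration only.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def forests(n):
    """All ordered forests with n nodes in total; a forest is a tuple of trees."""
    if n == 0:
        return ((),)
    return tuple(
        (tree,) + rest
        for first in range(1, n + 1)      # size of the first tree
        for tree in forests(first - 1)    # a tree is the forest below its root
        for rest in forests(n - first)
    )

def trees(n):
    """All plane trees with n >= 1 nodes; the single node is the empty tuple."""
    return forests(n - 1)

def size(tree):
    return 1 + sum(size(child) for child in tree)

def cut_leaves(tree):
    """Remove all leaves of a tree with at least two nodes."""
    return tuple(cut_leaves(child) for child in tree if child != ())

def reduced_size(tree, r):
    """Size after r rounds of cutting all leaves; 0 if the tree does not survive."""
    for _ in range(r):
        if tree == ():                    # a single node cannot be reduced further
            return 0
        tree = cut_leaves(tree)
    return size(tree)

def expectation(n, r):
    """Exact expected value of X_{n,r} for the leaf-cutting model."""
    sample = trees(n)
    return Fraction(sum(reduced_size(t, r) for t in sample), len(sample))

print(expectation(6, 0))  # 6
print(expectation(6, 1))  # 3
```

The enumeration grows like the Catalan numbers, so this approach is only feasible for small tree sizes; it is meant as a sanity check, not as a replacement for the generating function analysis.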
As we will see, a key aspect of the analysis of $X_{n,r}$ is the translation of the
algorithmic description of the reduction into an operator $\Phi$ that acts on the corresponding
generating functions.
In Section 2, the reduction cutting away all leaves from the tree is
discussed. Section 2.1 contains all necessary auxiliary
concepts required in order to study the $r$-fold application of this
reduction. In Section 2.2, we determine the operator $\Phi$ acting on the
corresponding generating function explicitly and prove some direct consequences. Then, in
Section 2.3 we carry out the analysis of the behavior of
$X_{n,r}$ by computing explicit expressions and asymptotic expansions for the factorial
moments of $X_{n,r}$ as well as a central limit theorem.
Section 3 is devoted to the study of the reduction that cuts away all
paths. As we will see in Section 3.1, we can actually obtain all
results regarding the behavior of $X_{n,r}$ as consequences of the corresponding results
in Section 2. In Section 3.2, we analyze the
asymptotic behavior of the expected number of paths required to construct a plane tree of
size $n$, i.e. the number of paths we can cut away until the tree cannot be reduced any
further.
Sections 4 and 5 are devoted to the
analysis of reductions removing only leftmost leaves and leftmost paths from the tree,
respectively. In particular, in Section 5.3, we study the total
number of old paths that can be removed from a tree until it cannot be reduced any
further.
On a general note, the computationally heavy parts of this paper have been carried out
with the open-source computer mathematics system SageMath [5], and
the corresponding worksheets are available for download. In particular, there are the
following files:
In this part of the paper we investigate the effect of the tree
reduction that cuts away all
leaves from a given tree. However, before we can do so, we require some auxiliary concepts,
which we discuss in this section. Most importantly, we need a generating function counting
plane trees with respect to their number of inner nodes and leaves, which is intimately
linked to Narayana numbers. The generating function presented in the following proposition
is actually well-known (see, e.g. [12, Example III.13]).
Proposition 2.1.
The generating function $T(z,t)$ which enumerates plane trees with respect to their
internal nodes (marked by the variable $z$) and leaves (marked by $t$) is given
explicitly by
\[ T(z,t) = \frac{1 - (z-t) - \sqrt{1 - 2(z+t) + (z-t)^2}}{2}. \tag{1} \]
Proof.
This can be obtained directly from the symbolic equation describing the
combinatorial class of plane trees T, which is illustrated in
Figure 2. In particular, □ and ○
represent leaves and internal nodes, respectively.
The symbolic equation translates into the functional equation
\[ T(z,t) = t + \frac{z\,T(z,t)}{1 - T(z,t)}, \]
which yields (1) after solving for $T(z,t)$ and choosing the
appropriate branch.
∎
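The choice of branch can be checked numerically. The sketch below assumes that the functional equation reads $T = t + zT/(1-T)$ (a tree is either a leaf or an inner node with a nonempty sequence of subtrees) and that the closed form is $T(z,t) = \bigl(1-(z-t)-\sqrt{1-2(z+t)+(z-t)^2}\bigr)/2$; both are reconstructions from the surrounding text, and the function names are hypothetical.

```python
from math import sqrt

def closed_form(z, t):
    """Candidate closed form for T(z, t): the branch with T(0, 0) = 0."""
    return (1 - (z - t) - sqrt(1 - 2 * (z + t) + (z - t) ** 2)) / 2

def fixed_point(z, t, iterations=200):
    """Iterate T <- t + z*T/(1 - T), starting from T = 0 (the power series branch)."""
    value = 0.0
    for _ in range(iterations):
        value = t + z * value / (1 - value)
    return value

# Small arguments inside the region of convergence.
print(closed_form(0.1, 0.05), fixed_point(0.1, 0.05))
```

For $t=z$ the closed form collapses to $(1-\sqrt{1-4z})/2$, the familiar generating function of plane trees counted by size.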
In the context of plane trees, the so-called Narayana numbers count the
number of trees with a given size and a given number of leaves
(cf. [8]). As these numbers will appear throughout the
entire paper, we introduce them formally and investigate some properties within the
following statements.
Definition.
The Narayana numbers are defined as
\[ N_{n,k} = \frac{1}{n}\binom{n}{k}\binom{n}{k-1} \]
for $1\le n$ and $1\le k\le n$, and $N_{0,0}=1$. All other indices give
$N_{n,k}=0$. Combinatorially, for $n\ge 1$ the Narayana number $N_{n,k}$ corresponds to the number of
plane trees with $n$ edges (i.e. $n+1$ nodes) and $k$ leaves.
The Narayana polynomials are defined as
[TABLE]
for n≥1 and N0(x)=1, and the associated Narayana polynomials are defined
as
[TABLE]
for n≥0. Note that
[TABLE]
is the nth Catalan number.
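The combinatorial interpretation of the Narayana numbers can be confirmed by brute force for small $n$. The sketch below assumes the standard formula $N_{n,k}=\frac{1}{n}\binom{n}{k}\binom{n}{k-1}$ and uses an illustrative tuple encoding of plane trees; the helper names are hypothetical.

```python
from functools import lru_cache
from math import comb

def narayana(n, k):
    """N_{n,k} = (1/n) C(n,k) C(n,k-1) for 1 <= k <= n, with N_{0,0} = 1."""
    if n == 0:
        return 1 if k == 0 else 0
    if not 1 <= k <= n:
        return 0
    return comb(n, k) * comb(n, k - 1) // n

def catalan(n):
    return comb(2 * n, n) // (n + 1)

@lru_cache(maxsize=None)
def forests(n):
    """All ordered forests with n nodes; equivalently, all plane trees with n edges."""
    if n == 0:
        return ((),)
    return tuple(
        (tree,) + rest
        for first in range(1, n + 1)
        for tree in forests(first - 1)
        for rest in forests(n - first)
    )

def leaves(tree):
    return 1 if tree == () else sum(leaves(child) for child in tree)

def count_by_leaves(n):
    """Map k -> number of plane trees with n edges (n + 1 nodes) and k leaves."""
    counts = {}
    for tree in forests(n):
        k = leaves(tree)
        counts[k] = counts.get(k, 0) + 1
    return counts

print(count_by_leaves(4))                       # counts of trees with 5 nodes
print([narayana(4, k) for k in range(1, 5)])    # [1, 6, 6, 1]
```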
Remark.
The generating function $\frac{1}{z}T(z,z)=\frac{1-\sqrt{1-4z}}{2z}$ enumerates Catalan
numbers, see [7, Theorem 3.2], and
the generating function $T(z,tz)$ enumerates Narayana numbers
[TABLE]
We will frequently use this relation in the form
[TABLE]
Furthermore, it is easily checked that $T(z,tz)$ satisfies the ordinary differential
equation
[TABLE]
Extracting the coefficient of $z^{n+2}$ then yields the recurrence relation
[TABLE]
for $n\ge 0$.
The following proposition gives another useful property of associated Narayana polynomials.
Proposition 2.2.
Let $n\ge 0$, then we have the relation
[TABLE]
Proof.
This relation follows from extracting the coefficient of $z^{n+1}$ from the identity $T(tz,z)=T(z,tz)+(1-t)z$ with the help of (3).
While it is straightforward to prove that the identity is valid by means of
algebraic manipulation, we also give a combinatorial proof.
From a combinatorial point of view, both generating functions $T(tz,z)$ and $T(z,tz)$
enumerate plane trees where $z$ marks the tree size; the only difference is that
the variable $t$ marks inner nodes in $T(tz,z)$ and leaves in $T(z,tz)$. We want
to show that for trees of size $n\ge 2$, these two classes are equinumerous, resulting in
$T(tz,z)-z=T(z,tz)-tz$.
To construct an appropriate bijection between the class of trees of size n with k
leaves and the class of trees of size n with k inner nodes we need to have a closer
look at the well-known rotation
correspondence [12, I.5.3], which is a bijection
between plane trees of size n and binary trees with n−1 inner nodes. In fact, the
leaves in the binary tree are strongly related to the leaves and inner nodes of the
original tree:
–
Left leaves in the binary tree are only attached to those nodes whose
companions in the plane tree have no children, i.e., to those who correspond to
leaves in the plane tree.
–
Right leaves, on the other hand, are attached to nodes whose companion nodes
in the plane tree have no sibling right of them. This means that for every node with
children, i.e., for every inner node, there is precisely one rightmost child and thus
precisely one right leaf in the binary tree.
The bijection between the two tree classes can now be described as follows: given some
tree of size n and k leaves, apply the rotation correspondence in order to obtain a
binary tree. Then mirror the binary tree by swapping all left and right
children. Transform this mirrored tree back by means of the inverse rotation
correspondence, and the result is a plane tree of size n and k inner nodes as
mirroring the binary tree swapped the number of left and right leaves in the tree. This
proves the proposition.
∎
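The bijection used in this proof is short enough to be written out. The following sketch encodes a plane tree as the tuple of its subtrees and a binary tree as either `None` (an empty position, i.e. a binary leaf) or a pair `(left, right)`; this encoding and all function names are assumptions made for illustration.

```python
from functools import lru_cache

def to_binary(forest):
    """Rotation correspondence on the forest below the root:
    first child becomes the left child, next sibling becomes the right child."""
    if not forest:
        return None
    first, siblings = forest[0], forest[1:]
    return (to_binary(first), to_binary(siblings))

def from_binary(node):
    """Inverse rotation correspondence."""
    if node is None:
        return ()
    left, right = node
    return (from_binary(left),) + from_binary(right)

def mirror(node):
    """Swap all left and right children of a binary tree."""
    if node is None:
        return None
    left, right = node
    return (mirror(right), mirror(left))

def swap_leaves_and_inner_nodes(tree):
    """Maps a plane tree with k leaves to one of the same size with k inner nodes
    (for trees with at least two nodes)."""
    return from_binary(mirror(to_binary(tree)))

def leaves(tree):
    return 1 if tree == () else sum(leaves(child) for child in tree)

def inner(tree):
    return 0 if tree == () else 1 + sum(inner(child) for child in tree)

@lru_cache(maxsize=None)
def forests(n):
    """All ordered forests with n nodes, i.e. all plane trees with n + 1 nodes."""
    if n == 0:
        return ((),)
    return tuple(
        (tree,) + rest
        for first in range(1, n + 1)
        for tree in forests(first - 1)
        for rest in forests(n - first)
    )

cherry = ((), ())                            # root with two leaves
print(swap_leaves_and_inner_nodes(cherry))   # (((),),), the path on three nodes
```

Since mirroring is an involution, applying the map twice returns the original tree, which confirms that it is a bijection on trees of a fixed size.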
Derivatives of the associated Narayana polynomials defined above will occur within the
analysis of a reduction model later, which is why we compute some special values in the
following proposition.
Proposition 2.3.
Evaluating the $r$th derivative of the associated Narayana polynomials at $1$, i.e. $\tilde N_n^{(r)}(1)$, gives the number of trees with $n+1$ nodes where precisely $r$
leaves are selected and labeled from $1$ to $r$. In particular, for $n\ge 1$ we have
[TABLE]
Proof.
The combinatorial interpretation follows immediately by rewriting
[TABLE]
where we used the notation $k^{\underline{r}}=k(k-1)\cdots(k-r+1)$ for the falling
factorial.
Explicit values can be obtained by differentiating (2) $r$ times with
respect to $t$, then setting $t=1$ and extracting the coefficient of $z^{n+1}$.
∎
Remark.
By the combinatorial interpretation of Proposition 2.3 we
find that $\tilde N_n'(1)=\frac{1}{2}\binom{2n}{n}$ enumerates
the number of leaves, summed over all trees with $n+1$ nodes. At the same time, as there
are $C_n=\frac{1}{n+1}\binom{2n}{n}$ such trees, the total number of nodes in these
trees is $\binom{2n}{n}$. This implies that exactly half of all nodes in all trees of a given size
are leaves!
In fact, this interpretation also motivates a second, purely combinatorial proof of the
explicit value of $\tilde N_n'(1)$: the rotation correspondence maps trees of size
$n+1$ to binary trees with $n$ inner nodes. In the proof of
Proposition 2.2 we already observed that the number of left
leaves in the binary tree obtained from the rotation correspondence is equal to the
number of leaves in the plane tree.
As binary trees with $n$ inner nodes have $n+1$ leaves, and as there are $C_n$ binary
trees with $n$ inner nodes, the total number of leaves in all binary trees with $n$
inner nodes is $\binom{2n}{n}$. By symmetry, there have to be equally many left leaves
as right leaves, which proves that there are $\frac{1}{2}\binom{2n}{n}$ left leaves,
and thus $\tilde N_n'(1)=\frac{1}{2}\binom{2n}{n}$.
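The claim that exactly half of all nodes are leaves is easy to confirm with exact integer arithmetic. The sketch assumes the standard formula $N_{n,k}=\frac{1}{n}\binom{n}{k}\binom{n}{k-1}$ for the Narayana numbers; the function names are hypothetical.

```python
from math import comb

def narayana(n, k):
    """N_{n,k} for n >= 1 and 1 <= k <= n (standard formula, assumed here)."""
    return comb(n, k) * comb(n, k - 1) // n

def total_leaves(n):
    """Number of leaves, summed over all plane trees with n + 1 nodes (n >= 1)."""
    return sum(k * narayana(n, k) for k in range(1, n + 1))

def total_nodes(n):
    """Number of nodes, summed over all plane trees with n + 1 nodes."""
    catalan = comb(2 * n, n) // (n + 1)
    return (n + 1) * catalan

for n in (1, 2, 5, 10):
    print(n, total_leaves(n), total_nodes(n))
```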
In addition to the polynomials related to the Narayana numbers, there is another well-known sequence
of polynomials that will occur throughout this paper.
Definition.
The Fibonacci polynomials are recursively defined by
\[ F_r(z) = F_{r-1}(z) + z\,F_{r-2}(z) \]
for $r\ge 2$ and $F_0(z)=0$, $F_1(z)=1$.
For many identities involving Fibonacci numbers, there is an analogous statement for
Fibonacci polynomials. The identity presented in the following proposition will be used
repeatedly throughout this paper.
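Proposition 2.4 (d’Ocagne’s Identity).
Let $s,r\in\mathbb{Z}_{\ge 0}$ where $s\ge r$. Then we have
\[ F_{r+1}(z)F_s(z) - F_r(z)F_{s+1}(z) = (-z)^r F_{s-r}(z). \tag{6} \]
Proof.
The left-hand side of (6) can be expressed as the determinant
of $\begin{pmatrix} F_{r+1}(z) & F_{s+1}(z) \\ F_r(z) & F_s(z) \end{pmatrix}$. At
the same time, for $r$, $s\ge 1$ we can write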
[TABLE]
Combining these two observations yields
[TABLE]
which proves the statement.
∎
Observe that setting $s=r+1$ in (6) yields the identity
\[ F_{r+1}(z)^2 - F_r(z)F_{r+2}(z) = (-z)^r, \tag{7} \]
which we will make heavy use of later on.
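As a sanity check, d’Ocagne’s identity can be tested at many integer points, which suffices to confirm a polynomial identity of bounded degree. The sketch assumes the recurrence $F_r(z)=F_{r-1}(z)+zF_{r-2}(z)$ with $F_0=0$, $F_1=1$ and the identity in the form $F_{r+1}F_s-F_rF_{s+1}=(-z)^rF_{s-r}$; both are reconstructions, and the function names are hypothetical.

```python
def fibonacci_value(z, r):
    """Value F_r(z) under the assumed recurrence F_r = F_{r-1} + z * F_{r-2}."""
    values = [0, 1]
    for _ in range(r - 1):
        values.append(values[-1] + z * values[-2])
    return values[r]

def docagne_holds(z, r, s):
    """Check F_{r+1} F_s - F_r F_{s+1} == (-z)^r F_{s-r} at the point z."""
    lhs = (fibonacci_value(z, r + 1) * fibonacci_value(z, s)
           - fibonacci_value(z, r) * fibonacci_value(z, s + 1))
    return lhs == (-z) ** r * fibonacci_value(z, s - r)

def special_case_holds(z, r):
    """The case s = r + 1: F_{r+1}^2 - F_r F_{r+2} == (-z)^r."""
    lhs = fibonacci_value(z, r + 1) ** 2 - fibonacci_value(z, r) * fibonacci_value(z, r + 2)
    return lhs == (-z) ** r

print(all(docagne_holds(z, r, s)
          for z in range(-20, 21) for s in range(8) for r in range(s + 1)))
```

For $z=1$ the recurrence produces the ordinary Fibonacci numbers, and the special case reduces to Cassini's identity.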
An important tool in the context of plane trees is the substitution $z=u/(1+u)^2$,
which allows us to write some expressions in a manageable form. It
is easy to check that with this
substitution, we can write Fibonacci polynomials as
\[ F_r(-z) = \frac{1-u^r}{(1-u)(1+u)^{r-1}}. \tag{8} \]
The fact that this substitution also works for Fibonacci polynomials is not that
surprising, as $zF_r(-z)/F_{r+1}(-z)$ is the generating function of
plane trees with height $\le r$ (see [3]).
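The effect of the substitution on Fibonacci polynomials can be verified exactly with rational arithmetic. The sketch assumes the recurrence $F_r(z)=F_{r-1}(z)+zF_{r-2}(z)$ and the closed form $F_r(-z)=(1-u^r)/\bigl((1-u)(1+u)^{r-1}\bigr)$ for $z=u/(1+u)^2$; both are reconstructions from the surrounding text, and the function names are hypothetical.

```python
from fractions import Fraction

def fibonacci_value(z, r):
    """Value F_r(z) under the assumed recurrence F_r = F_{r-1} + z * F_{r-2}."""
    values = [0, 1]
    for _ in range(r - 1):
        values.append(values[-1] + z * values[-2])
    return values[r]

def closed_form(u, r):
    """Candidate closed form for F_r(-z) with z = u / (1 + u)^2, u != 1."""
    return (1 - u ** r) / ((1 - u) * (1 + u) ** (r - 1))

u = Fraction(1, 3)
z = u / (1 + u) ** 2            # equals 3/16
for r in range(6):
    print(r, fibonacci_value(-z, r), closed_form(u, r))
```

The closed form reflects that the characteristic roots of the recurrence at $-z$ are $1/(1+u)$ and $u/(1+u)$, which is what makes the substitution convenient.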
2.2. Leaf-Reduction and the Expansion Operator
The reduction $\rho\colon\mathcal{T}\setminus\{\bullet\}\to\mathcal{T}$ (where $\bullet$ denotes the tree consisting of a single node) we want to
investigate now can be explained very easily. For any tree $\tau\in\mathcal{T}\setminus\{\bullet\}$ we obtain the reduced tree $\rho(\tau)$ simply by
removing all leaves from $\tau$. Repeated application of $\rho$ to a tree is illustrated
in Figure 3.
It is easy to see that this operator is certainly not injective: there are many trees that
reduce to the same tree. However, it is also easy to see that ρ is surjective, as we
can always construct an expanded tree that reduces to any given tree τ by attaching
leaves to all leaves of τ.
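Both observations can be checked mechanically on small trees. The sketch below uses an illustrative tuple encoding of plane trees (a leaf is the empty tuple); the encoding and the helper names are assumptions, not the paper's notation.

```python
from functools import lru_cache

def cut_leaves(tree):
    """The reduction rho: remove all leaves of a tree with at least two nodes."""
    return tuple(cut_leaves(child) for child in tree if child != ())

def attach_leaf_to_every_leaf(tree):
    """One particular preimage under rho: every leaf receives a single new leaf."""
    if tree == ():
        return ((),)
    return tuple(attach_leaf_to_every_leaf(child) for child in tree)

@lru_cache(maxsize=None)
def forests(n):
    """All ordered forests with n nodes, i.e. all plane trees with n + 1 nodes."""
    if n == 0:
        return ((),)
    return tuple(
        (tree,) + rest
        for first in range(1, n + 1)
        for tree in forests(first - 1)
        for rest in forests(n - first)
    )

# Not injective: the path on two nodes and the cherry both reduce to a single node.
print(cut_leaves(((),)), cut_leaves(((), ())))

# Surjective: the expanded tree always reduces back to the tree we started with.
print(all(cut_leaves(attach_leaf_to_every_leaf(t)) == t
          for n in range(6) for t in forests(n)))
```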
In fact, the operator $\rho^{-1}$ mapping trees $\tau\in\mathcal{T}$ to the set of
preimages is easier to handle from a combinatorial point of view. This is because we can model
the expansion of trees in the language of generating functions.
Proposition 2.5.
Let $\mathcal{F}\subseteq\mathcal{T}$ be a family of plane trees with bivariate
generating function $f(z,t)$, where $z$ marks inner nodes and $t$ marks leaves. Then the
generating function of $\rho^{-1}(\mathcal{F})$,
the family of trees whose reduction is in $\mathcal{F}$, is given by
[TABLE]
Proof.
It is obvious from a combinatorial point of view that the operator $\Phi$ has to be
linear. Thus we only have to determine how a tree represented by an arbitrary monomial
$z^n t^k$, i.e. a tree $\tau$ with $n$ inner nodes and $k$ leaves, is expanded.
In order to obtain all possible tree expansions from $\tau$, we perform the following
operations: first, all leaves of $\tau$ are expanded by appending a nonempty sequence of leaves to
each of them. Then, every inner node of $\tau$ is expanded by appending (possibly empty)
sequences of leaves between two of its children as well as before the first and after
the last one.
In terms of generating functions, expanding the leaves of $\tau$ corresponds to
replacing $t$ by $zt/(1-t)$. Expanding the inner vertices is a bit more involved: every
inner node has precisely one more available position to attach new leaves
than it has children, and the $n$ inner nodes have $n+k-1$ children in total, so there are
$2n+k-1$ available positions overall within
$\tau$. Therefore we find
The generating function for plane trees T(z,t) satisfies the functional
equation
[TABLE]
Proof.
This follows directly from the fact that $\rho\colon\mathcal{T}\setminus\{\bullet\}\to\mathcal{T}$ is surjective, i.e.
$\rho^{-1}(\mathcal{T})=\mathcal{T}\setminus\{\bullet\}$.
∎
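The expansion rule from the proof of Proposition 2.5 can be tested by counting preimages. For a tree $\tau$ with $n$ inner nodes and $k$ leaves, replacing $t$ by $zt/(1-t)$ and filling the $2n+k-1$ positions with sequences of leaves suggests that the number of trees of size $m$ reducing to $\tau$ is $\binom{L+2n+k-2}{2n+2k-2}$ with $L=m-n-k$ (and $0$ if $L<k$). This count is our own derivation from the rule, not a formula quoted from the paper, and the tuple encoding and helper names below are hypothetical.

```python
from collections import Counter
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def forests(n):
    """All ordered forests with n nodes, i.e. all plane trees with n + 1 nodes."""
    if n == 0:
        return ((),)
    return tuple(
        (tree,) + rest
        for first in range(1, n + 1)
        for tree in forests(first - 1)
        for rest in forests(n - first)
    )

def trees(n):
    return forests(n - 1)

def cut_leaves(tree):
    return tuple(cut_leaves(child) for child in tree if child != ())

def inner(tree):
    return 0 if tree == () else 1 + sum(inner(child) for child in tree)

def leaves(tree):
    return 1 if tree == () else sum(leaves(child) for child in tree)

def predicted_preimages(tree, m):
    """Number of trees of size m reducing to `tree`, derived from the expansion rule."""
    n, k = inner(tree), leaves(tree)
    new_leaves = m - n - k               # leaves of the expanded tree
    if new_leaves < k:                   # every old leaf needs at least one new leaf
        return 0
    return comb(new_leaves + 2 * n + k - 2, 2 * n + 2 * k - 2)

def observed_preimages(m):
    """Count, for each reduced tree, the trees of size m >= 2 that reduce to it."""
    return Counter(cut_leaves(t) for t in trees(m))

observed = observed_preimages(7)
print(all(predicted_preimages(tau, 7) == count for tau, count in observed.items()))
```

Summing the predicted counts over all possible reduced trees recovers the total number of trees of size $m$, which is the combinatorial content of the surjectivity statement.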
Corollary 2.7.
The Narayana numbers satisfy the identity
[TABLE]
for $n\ge 2$, $k\ge 1$.
Proof.
The result follows from extracting the coefficient of $z^n t^k$ from both sides of (10).
∎
Remark.
Note that in [1] there is a very short proof of this identity
based on Dyck paths, and the
argument there is strongly related to our tree reduction: by the well-known
glove bijection, it is easy to see that cutting away all leaves of a plane tree
translates into removing all peaks within the corresponding Dyck path.
We are now interested in determining a multivariate generating function enumerating
plane trees with respect to the tree size as well as the size of the tree after applying
the tree reduction ρ a fixed number of times.
Proposition 2.8.
Let $r\in\mathbb{N}_0$. The trivariate generating function $G_r(z,v_I,v_L)=G_r^{L}(z,v_I,v_L)$ enumerating
plane trees whose leaves can be cut at least $r$ times, where $z$ marks the tree size,
and $v_I$ and $v_L$ mark the number of inner nodes and leaves of the $r$-fold cut
tree, respectively, is given by
[TABLE]
Proof.
First, observe that formally, we can obtain the generating function enumerating
plane trees that can be reduced at least $r$ times with respect to their size by
considering $\Phi^r(T(z,t))|_{t=z}$. If we additionally track some size parameter like
the number of inner nodes or the number of leaves
before the expansion by marking their size with $v_I$ and $v_L$, then we obtain a generating function
for plane trees that can be reduced at least $r$ times where $v_I$ and $v_L$ mark
inner nodes and leaves in the original tree and $z$ marks the size of the expanded
tree. From a different point of view,
$z$ marks the size of the original tree and $v_I$ and $v_L$ mark the number of
inner nodes and leaves of the $r$-fold
reduced tree, meaning that we have
\[ G_r(z,v_I,v_L) = \Phi^r\bigl(T(zv_I, tv_L)\bigr)\big|_{t=z}. \]
As $\Phi$ is linear, we are mainly interested in finding a representation for
$\Phi^r(z^n t^k)|_{t=z}$. To do so, we consider the strongly related operator
[TABLE]
It is easy to prove by induction that iterative application of $\Phi$ can be expressed
in terms of $\Psi$ via
[TABLE]
which means that we can concentrate on the investigation of the linear operator
$\Psi$. Note that $\Psi$ is also multiplicative, meaning that $\Psi^r(z^n t^k)=\Psi^r(z)^n\,\Psi^r(t)^k$.
Again by induction, it is easy to show that the recurrences
[TABLE]
hold for $r\ge 0$. Now define $f_r:=\Psi^r(t)|_{t=z}$ and $g_r:=\Psi^r(z)|_{t=z}$. We prove by induction that these quantities can be
represented by means of Fibonacci polynomials as
[TABLE]
for $r\ge 0$, where the recurrence relations from above, the
identity (7) as well as the relation
[TABLE]
for $r\ge 0$ play integral parts in the proof.
With these explicit representations, we find
[TABLE]
Then, using (8) and rewriting the right-hand side
of (12) in terms of $u$, where $z=u/(1+u)^2$,
yields
[TABLE]
By linearity, we are allowed to apply $\Phi^r$ to every summand in the power
series expansion of $f(z,t)$ separately, which proves the statement.
∎
The generating function $G_r(z,v,v)$ tells us how many nodes (marked by $v$) are
still in the tree after $r$ reductions. For the sake of brevity we set $G_r(z,v):=G_r(z,v,v)$. It is completely described in
terms of the function $T(z,t)$, although in a non-trivial way. Results
about moments and the limiting distribution can be extracted from this explicit form.
With the help of the mathematics software system SageMath [5], the
generating function $G_r(z,v)$ can be expanded. For small values of $r$, the first few
summands are
[TABLE]
[TABLE]
[TABLE]
As announced in the introduction, we investigate the behavior of the random
variable $X_{n,r}=X_{n,r}^{L}$ that models the number of nodes which are left
after reducing a random tree $\tau$ with $n$ nodes $r$ times. In case the $r$-fold
application of $\rho$ to $\tau$ is not defined, we consider the resulting tree size to be
$0$, i.e., the random variable $X_{n,r}=0$ for these trees. Note that the tree $\tau$ is
chosen uniformly at random among all trees of size $n$. With the help of the generating
function $G_r(z,v)$ we are able to express the probability generating function of $X_{n,r}$ as
[TABLE]
where $a_{n,r}$ is the number of trees of size $n$ which are empty
after reducing $r$ times. We have $a_{n,r}=C_{n-1}-[z^n]G_r(z,1)$.
In addition to $X_{n,r}$, we also consider the random variables $I_{n,r}$ and $L_{n,r}$
that model the number of inner nodes and leaves, respectively, that remain after reducing
a random tree with $n$ nodes $r$ times. The generating functions corresponding to $I_{n,r}$
and $L_{n,r}$ are $G_r(z,v,1)$ and $G_r(z,1,v)$, respectively.
The relations $X_{n,r}\overset{d}{=}I_{n,r}+L_{n,r}$ and $I_{n,r}\overset{d}{=}X_{n,r+1}$ hold by the combinatorial interpretation of the operator $\Phi$.
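Both relations can be observed tree by tree: the inner nodes of the $r$-fold reduced tree are exactly the nodes that survive one further round. The following sketch checks this by exhaustive enumeration for small sizes, using an illustrative tuple encoding of plane trees and the convention that a tree which does not survive contributes $0$; all helper names are hypothetical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def forests(n):
    """All ordered forests with n nodes, i.e. all plane trees with n + 1 nodes."""
    if n == 0:
        return ((),)
    return tuple(
        (tree,) + rest
        for first in range(1, n + 1)
        for tree in forests(first - 1)
        for rest in forests(n - first)
    )

def trees(n):
    return forests(n - 1)

def cut_leaves(tree):
    return tuple(cut_leaves(child) for child in tree if child != ())

def inner(tree):
    return 0 if tree == () else 1 + sum(inner(child) for child in tree)

def leaves(tree):
    return 1 if tree == () else sum(leaves(child) for child in tree)

def reduce_rounds(tree, r):
    """Return the r-fold reduced tree, or None if the tree does not survive."""
    for _ in range(r):
        if tree == ():
            return None
        tree = cut_leaves(tree)
    return tree

def statistics_after(tree, r):
    """Triple (size, inner nodes, leaves) of the r-fold reduced tree; zeros if it died."""
    reduced = reduce_rounds(tree, r)
    if reduced is None:
        return (0, 0, 0)
    return (inner(reduced) + leaves(reduced), inner(reduced), leaves(reduced))

print(all(statistics_after(t, r)[1] == statistics_after(t, r + 1)[0]
          for t in trees(7) for r in range(5)))
```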
2.3. Asymptotic Analysis
We find explicit generating functions for the factorial moments of the random variables
$X_{n,r}$, $I_{n,r}$, and $L_{n,r}$.
Proposition 2.9.
The $d$th factorial moments of $X_{n,r}$, $I_{n,r}$ and $L_{n,r}$ are given by
[TABLE]
and
[TABLE]
where $z=u/(1+u)^2$ for $d\in\mathbb{Z}_{\ge 1}$.
Remark.
For $d\ge 2$, $u^d\tilde N_{d-1}(u^{-1})$ can be replaced by $\tilde N_{d-1}(u)$ in (15),
see (5).
Proof.
We use the abbreviations
[TABLE]
We consider the exponential generating function of $\frac{\partial^d}{\partial v^d}G_r(z,v)$ to be a Taylor series and obtain
[TABLE]
By Proposition 2.8, extracting the coefficient of $q^d$ yields
The expected value of $X_{n+1,r}$ is explicitly given by
[TABLE]
Proof.
Using Proposition 2.9 and Cauchy’s integral formula, we have
[TABLE]
where $\gamma$ is a circle around $0$ with a sufficiently small radius such that
$\gamma'$, the image of $\gamma$ under the transformation, is a small contour circling
$0$ exactly once as well.
Expanding $(1-u^{r+1})^{-1}$ into a geometric series and exchanging
integration and summation, we obtain
[TABLE]
which implies the result.
∎
Having determined a closed form for this generating function allows us to analyze the
asymptotic behavior of $X_{n,r}$ in a relatively
straightforward way.
Theorem 1.
Let $r\in\mathbb{N}_0$ be fixed and consider $n\to\infty$. Then the expected size and the corresponding variance of an
$r$-fold cut plane tree are given by
[TABLE]
and
[TABLE]
The factorial moments are asymptotically given by
[TABLE]
for $d\ge 1$. Note that all $O$-constants above depend implicitly on $r$.
Proof.
In a nutshell, we want to extract the growth of the coefficients of the derivatives $\frac{\partial^d}{\partial v^d}G_r(z,1)$, as dividing these quantities by $C_{n-1}$ yields the
factorial moments. We want to extract the growth by
means of singularity analysis (cf. [10]).
In order to do so, we first need to establish the location of the dominant singularity
of these generating functions, which are explicitly given in (14).
The singularities of (14) are roots of
unity in terms of $u$. Substituting back $u=\frac{1-\sqrt{1-4z}}{2z}-1$ maps these roots
of unity to real numbers greater than or equal to $1/4$, and only $u=1$
is mapped to $z=1/4$. Thus $z=1/4$ is the dominant singularity of
(14). A more detailed treatment of these analytic properties of the
substitution $z=u/(1+u)^2$ can be found in [13, Proposition
2.3].
As $N_0(x)=1$, we obtain the expansion
[TABLE]
for the function on the right-hand side of (14) with $d=1$.
Then, the expansion
[TABLE]
for fixed $\kappa\in\mathbb{C}$ yields
[TABLE]
By singularity analysis, the $n$th coefficient, normalized by $C_{n-1}$, is asymptotically
[TABLE]
using
[TABLE]
The higher order factorial moments follow similarly by expanding the function on the
right-hand side of (14) for general d>1 around u=1 with the help of
SageMath, where in particular the explicit values of the derivatives of the Narayana
polynomials from Proposition 2.3 are required.
Singularity analysis of the resulting expansion yields the expression given in the
statement of the theorem. Finally, note that the variance can be computed by using
[TABLE]
∎
Theorem 2.
The size $X_{n,r}$ of the tree obtained from a random plane tree with $n$
nodes by cutting it $r$ times is, after standardization,
asymptotically normally distributed for $n\to\infty$ and
fixed $r$,
i.e.,
[TABLE]
To be more precise, for $x\in\mathbb{R}$ we have
[TABLE]
with $\mu=\frac{1}{r+1}$ and $\sigma^2=\frac{r(r+2)}{6(r+1)^2}$ and where the
$O$-constant depends implicitly on $r$.
As $I_{n,r-1}\overset{d}{=}X_{n,r}$, the same also holds for this
random variable.
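For $r=1$ the constants of the theorem can be compared with exact values, since $X_{n,1}$ equals $n$ minus the number of leaves (for $n\ge 2$) and the number of leaves follows the Narayana distribution. The sketch below assumes the standard formula $N_{m,k}=\frac{1}{m}\binom{m}{k}\binom{m}{k-1}$; the function names are hypothetical, and the check covers only the special case $r=1$, where $\mu=1/2$ and $\sigma^2=1/8$.

```python
from fractions import Fraction
from math import comb

def leaf_distribution(n):
    """P(k leaves) for a uniformly random plane tree with n >= 2 nodes."""
    m = n - 1                                   # number of edges
    weights = {k: comb(m, k) * comb(m, k - 1) // m for k in range(1, m + 1)}
    total = sum(weights.values())               # the Catalan number C_{n-1}
    return {k: Fraction(w, total) for k, w in weights.items()}

def mean_and_variance_after_one_cut(n):
    """Exact mean and variance of X_{n,1} = n - (number of leaves)."""
    distribution = leaf_distribution(n)
    mean = sum((n - k) * p for k, p in distribution.items())
    second_moment = sum((n - k) ** 2 * p for k, p in distribution.items())
    return mean, second_moment - mean ** 2

n = 200
mean, variance = mean_and_variance_after_one_cut(n)
print(float(mean) / n)       # close to mu = 1/2
print(float(variance) / n)   # close to sigma^2 = 1/8
```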
The rest of this section is devoted to the proof of this central limit
theorem. In order to derive the fact that the number of remaining nodes after r
reductions is asymptotically normally distributed, we first show that the number of nodes
that are deleted after r reductions is asymptotically normally distributed. Then, as the
sum of the number of remaining nodes and the number of deleted nodes is equal to the
original tree size, we obtain immediately that the number of remaining nodes has to be
asymptotically normally distributed as well.
We begin by considering the function $F_r\colon\mathcal{T}\to\mathbb{N}_0$ which
maps a plane tree $\tau$ to the number of nodes that are deleted when reducing the
tree $r$ times, i.e. the difference between the size of $\tau$ and the size of
$\rho^r(\tau)$. Let $\tau_n$ now denote a plane tree with $n$ nodes.
For the sake of convenience, we consider $F_r(\tau_n)$ to be $n$ if $r$ is
larger than the maximal number of reductions that can be applied to $\tau_n$ before the
tree cannot be reduced further. In particular, this means that $F_r(\bullet)=1$
for $r\ge 1$.
It is easy to see that the parameter Fr(τn) is a so-called additive tree
parameter, meaning that
[TABLE]
holds, where τi1, …, τiℓ are the subtrees rooted at the children of
the root of τn, and fr:T→{0,1} is a toll
function recursively defined by
[TABLE]
for r≥1 and f0(τn)=0.
In order to prove asymptotic normality for additive tree parameters, we can use
[25, Theorem 2], which requires us to show that the expected
value of the toll function is exponentially decreasing in n. This is done in the
following lemma.
Lemma 2.11**.**
The expected value of fr(τn) is exponentially decreasing in n.
Remark*.*
Of course, n−Fr(τn) is also an additive parameter. However,
the expected value of the corresponding growth function is not
exponentially decreasing.
Proof.
Define
[TABLE]
and the corresponding generating function
[TABLE]
Observe that Fr(τn)=n holds if and only if τn has height less than
r, as removing all leaves from a tree reduces its height by precisely one. Therefore,
the generating function Qr(z) is the generating function enumerating trees of height
less than r.
It is well-known (cf. [3]) that the generating function for
plane trees of height less than r can be expressed in terms of Fibonacci
polynomials as
[TABLE]
The roots of Fr(−z) are also well-known and can be written as
$\alpha_{j,r}=\big(4\cos^2(j\pi/r)\big)^{-1}$ for j=1, …, ⌊(r−1)/2⌋.
Thus Qr(z) is a rational function and its coefficients have the form
[TABLE]
for constants $c_{j,r}$. We have $|\alpha_{j,r}|>1/4$, because $0<\cos^2(j\pi/r)<1$ for these j. As
[TABLE]
there exists a constant c∈(0,1) such that $q_{n,r}=O(c^n)$.
∎
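The characterization used in this proof (Fr(τn)=n exactly when τn has height less than r) can be checked by brute force on small trees. The following Python sketch is not part of the original argument: it assumes plane trees encoded as nested tuples of children, and the helper names `forests`, `trees`, `cut_leaves` and `deleted_nodes` are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def forests(m):
    """All ordered forests with m nodes in total (a forest is a tuple of trees)."""
    if m == 0:
        return ((),)
    found = []
    for k in range(1, m + 1):            # size of the first tree
        for first in forests(k - 1):     # a tree is the forest of its children
            for rest in forests(m - k):
                found.append((first,) + rest)
    return tuple(found)


def trees(n):
    """All plane trees with n nodes."""
    return forests(n - 1)


def size(t):
    return 1 + sum(size(c) for c in t)


def height(t):
    return 1 + max(height(c) for c in t) if t else 0


def cut_leaves(t):
    """Remove all leaves of a tree with at least two nodes."""
    return tuple(cut_leaves(c) for c in t if c)


def deleted_nodes(t, r):
    """F_r(t), with the convention F_r(t) = size(t) if t cannot be reduced r times."""
    n = size(t)
    for _ in range(r):
        if not t:                        # single node: no further reduction
            return n
        t = cut_leaves(t)
    return n - size(t)


def characterization_holds(n, r):
    """True if F_r(t) = n is equivalent to height(t) < r for all trees of size n."""
    return all((deleted_nodes(t, r) == n) == (height(t) < r) for t in trees(n))
```

Calling `characterization_holds(n, r)` for small n and r exercises both the convention for trees that cannot be reduced r times and the fact that one leaf reduction lowers the height by exactly one.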
Thus, by the strategy discussed above, we find that not only Fr(τn) but also
Xn,r=n−Fr(τn) is asymptotically normally distributed.
Remark*.*
Note that the fact that F1(τn) is asymptotically normally distributed means that the Narayana numbers are
asymptotically normally distributed, see for example [7, Theorem 3.13].
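For n≥2, F1(τn) is simply the number of leaves of τn, which is why the Narayana numbers appear. As an illustration only, the following Python sketch enumerates small plane trees and compares the leaf counts with the standard formula N(m,k) = (1/m)·C(m,k)·C(m,k−1) for m=n−1; the tuple encoding of trees and all function names are assumptions of this sketch.

```python
from collections import Counter
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def forests(m):
    """All ordered forests with m nodes in total (a forest is a tuple of trees)."""
    if m == 0:
        return ((),)
    found = []
    for k in range(1, m + 1):
        for first in forests(k - 1):
            for rest in forests(m - k):
                found.append((first,) + rest)
    return tuple(found)


def trees(n):
    """All plane trees with n nodes."""
    return forests(n - 1)


def leaves(t):
    """Number of leaves of t; the single node counts as one leaf."""
    return sum(leaves(c) for c in t) if t else 1


def narayana(m, k):
    """Narayana number N(m, k) = C(m, k) * C(m, k - 1) / m."""
    return comb(m, k) * comb(m, k - 1) // m


def leaf_distribution(n):
    """Counter mapping k to the number of plane trees of size n with k leaves."""
    return Counter(leaves(t) for t in trees(n))


def matches_narayana(n):
    observed = leaf_distribution(n)
    return all(observed[k] == narayana(n - 1, k) for k in range(1, n))
```

For example, `leaf_distribution(5)` groups the 14 plane trees with five nodes by their number of leaves.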
As sketched above, Lemma 2.11 allows us to
apply [25, Theorem 2] in order to prove that Fr(τn),
and therefore also Xn,r=n−Fr(τn), is asymptotically normally
distributed. All that remains to prove is that the speed of convergence is $O(n^{-1/2})$.
We do so by noting that the proof for asymptotic normality in Wagner’s theorem
is based on [7, Theorem 2.23], where a version of Hwang’s
Quasi-Power Theorem [16] without quantification of the speed of convergence is used. Replacing
this argument with the multi-dimensional quantified version given
in [15] then gives us the desired speed of convergence
of $O(n^{-1/2})$.
3. Cutting Paths
3.1. The Expansion Operator and Results
Let P denote the combinatorial class of paths, i.e. trees in which every node
is either a leaf or has precisely one child. The tree reduction ρ:T∖P→T which we will focus on in this section reduces
a tree by cutting away all paths of the tree. This operation is illustrated in
Figure 4.
Analogously to our approach in Section 2.2, we first determine
the corresponding expansion operator Φ. In order to do so, we need the generating
function for the family of paths P, which is given by $P=P(z,t)=\frac{t}{1-z}$. For the sake of readability, we omit the arguments of P.
Proposition 3.1**.**
Let F⊆T be a family of plane trees with bivariate
generating function f(z,t), where z marks inner nodes and t marks leaves. Then the
generating function for ρ−1(F), the family of trees whose reduction is
in F, is given by
[TABLE]
Proof.
The fact that Φ is a linear operator is obvious from a combinatorial point of view,
meaning that we may concentrate on some tree τ with n inner nodes and k leaves,
represented by zntk.
We follow the proof of Proposition 2.5 and observe that all
possible tree expansions of τ can be obtained by the following
operations: the leaves of τ are expanded by appending a sequence of at least two paths to each
of them. Note that appending a single path to a leaf is not allowed, because this would just
extend the path ending in that leaf, which causes ambiguity. Then, the inner nodes are
expanded as well by appending (possibly empty) sequences of paths to the 2n+k−1
available positions between, before, and after their children.
Translating this expansion into the language of generating functions yields the stated formula for Φ.
∎
Corollary 3.2**.**
The generating function for plane trees T(z,t) satisfies the functional
equation
[TABLE]
Proof.
Surjectivity of ρ implies ρ−1(T)=T∖P, which proves the statement after translating this into
the language of generating functions with the help of Φ.
∎
In the following proposition, we determine the generating function Gr(z,vI,vL) measuring
the effect of applying the path reduction r times on the size of the tree. Most
interestingly, we will see that the path reduction is in fact strongly related to the leaf
reduction from the previous section.
Proposition 3.3**.**
The trivariate generating function $G_r(z,v_I,v_L)=G_r^{P}(z,v_I,v_L)$ enumerating plane trees whose
paths can be cut at least r times, where z marks the tree size and $v_I$ and $v_L$ mark the
number of inner nodes and leaves of the r-fold cut tree, respectively, is given by
[TABLE]
where $z=u/(1+u)^2$.
Proof.
By the same reasoning as in the proof of Proposition 2.8, the
generating function we are interested in is $G_r(z,v_I,v_L)=\Phi^r\big(T(zv_I,tv_L)\big)\big|_{t=z}$,
meaning that we want to study the iterated application of Φ. To do so, we consider
the strongly related operator
[TABLE]
The relation
[TABLE]
can be proved easily by induction and enables us to determine the behavior of Φ via
Ψ.
First of all, for r≥0 and r≥1, the relations
[TABLE]
can be proved easily by induction, respectively. Also observe that we can write $\Psi^r(t)=\Psi^r(z)\,\Psi^{r-1}(P)^2$. Now let $f_r=\Psi^r(z)|_{t=z}$, $g_r=\Psi^r(t)|_{t=z}$, and $h_r=\Psi^r(P)|_{t=z}$. With the help of the identity
$\prod_{j=0}^{r}\big(1+u^{2^j}\big)=\frac{1-u^{2^{r+1}}}{1-u}$ we are able to prove the
explicit formula
[TABLE]
where $z=u/(1+u)^2$ and the second equation is a consequence
of (8). Using (23), we immediately find
[TABLE]
Putting everything together yields
[TABLE]
which directly implies the statement.
∎
The following result shows that there is an intimate connection between the “cutting
leaves”-reduction from Section 2 and the “cutting
paths”-reduction, as can be seen after comparing the statement of
Proposition 2.8 with the statement of Proposition 3.3.
Corollary 3.4**.**
The generating function $G_r(z,v_I,v_L)=G_r^{P}(z,v_I,v_L)$ measuring the change in size
after cutting away all paths from plane trees r times is equal to the generating
function $G^{L}_{2^{r+1}-2}(z,v_I,v_L)$ measuring the change in size after
cutting away all leaves from plane trees $2^{r+1}-2$ times.
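For small trees, this correspondence can be observed directly. The following Python sketch is an illustration under stated assumptions, not the authors' method: trees are nested tuples of children, a reduction returns `None` when it is not applicable (a single node for the leaf reduction, a path for the path reduction), and only the total size of the reduced tree is compared. All names are hypothetical.

```python
from collections import Counter
from functools import lru_cache


@lru_cache(maxsize=None)
def forests(m):
    """All ordered forests with m nodes in total (a forest is a tuple of trees)."""
    if m == 0:
        return ((),)
    found = []
    for k in range(1, m + 1):
        for first in forests(k - 1):
            for rest in forests(m - k):
                found.append((first,) + rest)
    return tuple(found)


def size(t):
    return 1 + sum(size(c) for c in t)


def is_path(t):
    """True if every node of t is a leaf or has exactly one child."""
    while t:
        if len(t) != 1:
            return False
        t = t[0]
    return True


def cut_leaves(t):
    """One leaf reduction; None if t is a single node."""
    return tuple(cut_leaves(c) for c in t if c) if t else None


def cut_paths(t):
    """One path reduction; None if t itself is a path."""
    if is_path(t):
        return None
    return tuple(cut_paths(c) for c in t if not is_path(c))


def sizes_after(op, r, n):
    """Sizes of the r-fold reduced trees, over all trees of size n that allow r reductions."""
    result = Counter()
    for t in forests(n - 1):
        for _ in range(r):
            t = op(t)
            if t is None:
                break
        else:
            result[size(t)] += 1
    return result


def corollary_holds(r, n):
    return sizes_after(cut_paths, r, n) == sizes_after(cut_leaves, 2 ** (r + 1) - 2, n)
```

For instance, among the five trees with four nodes, four can be path-reduced once and four can be leaf-reduced twice, and in both cases three of them end up with one node and one with two nodes.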
This connection is now especially important for the analysis of the random variable
Xn,r=Xn,rP modeling the number of nodes that are left after
reducing a random tree τ with n nodes r times by removing all paths. In fact, it
follows that
[TABLE]
meaning that the asymptotic analysis of the factorial moments of
Xn,rP as well as the limiting distribution follow directly from the
corresponding results in Section 2.3.
Theorem 3**.**
Let r∈N0 be fixed and consider n→∞. Then expectation and variance of the
random variable Xn,r=Xn,rP can be expressed as
[TABLE]
and
[TABLE]
The factorial moments are asymptotically given by
[TABLE]
Furthermore, Xn,r=Xn,rP is asymptotically normally distributed,
i.e., for x∈R we have
[TABLE]
for $\mu=\frac{1}{2^{r+1}-1}$ and $\sigma^2=\frac{2^{r+1}(2^r-1)}{3(2^{r+1}-1)^2}$. All O-constants in this theorem depend implicitly on r.
3.2. Total number of paths
In the context of this reduction it is interesting to investigate the total number of
paths needed to construct a given tree.
To determine this parameter we can reduce the tree repeatedly
and count the number of leaves. The sum of the number of leaves over all reduction steps
is equal to the number of paths, which follows from the observation that leaves mark the
endpoints of all paths.
Formally, given the random variables $P_{n,r}$ counting the number of leaves in the rth
reduction of a tree of size n, we want to analyze the random variable $P_n:=\sum_{r\ge 0}P_{n,r}$.
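The counting procedure just described can be sketched in Python as follows. This is a minimal illustration, assuming trees encoded as nested tuples of children; the names `total_paths` and `expected_paths` are hypothetical, and the expectation is taken by exhaustive enumeration, which is only feasible for small n.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def forests(m):
    """All ordered forests with m nodes in total (a forest is a tuple of trees)."""
    if m == 0:
        return ((),)
    found = []
    for k in range(1, m + 1):
        for first in forests(k - 1):
            for rest in forests(m - k):
                found.append((first,) + rest)
    return tuple(found)


def is_path(t):
    while t:
        if len(t) != 1:
            return False
        t = t[0]
    return True


def count_leaves(t):
    return sum(count_leaves(c) for c in t) if t else 1


def cut_paths(t):
    """Remove all subtrees that are paths (t itself must not be a path)."""
    return tuple(cut_paths(c) for c in t if not is_path(c))


def total_paths(t):
    """Sum of the number of leaves over all reduction steps."""
    total = 0
    while True:
        total += count_leaves(t)
        if is_path(t):          # a path is the last stage of the reduction
            return total
        t = cut_paths(t)


def expected_paths(n):
    """Exact average of total_paths over all plane trees with n nodes."""
    all_trees = forests(n - 1)
    return Fraction(sum(total_paths(t) for t in all_trees), len(all_trees))
```

A star with three leaves needs four paths (the root and one path per leaf), while a tree that is itself a path needs exactly one.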
Proposition 3.5**.**
The expected number of paths needed to construct a uniformly random tree of size n satisfies
[TABLE]
where $z=u/(1+u)^2$.
Proof.
As a consequence of Proposition 3.3, the bivariate generating function
enumerating plane trees where z marks tree size and v marks the number of
leaves after r path reductions can be written as
[TABLE]
By differentiating this generating function once with respect to v and setting v=1
afterwards, we obtain an expression from which $C_{n-1}\mathbb{E}P_{n,r}$ can be extracted as the
coefficient of $z^n$. By (15) with d=1 and r replaced by
$2^{r+1}-2$, we have
[TABLE]
Summation over r≥0 and shifting the index of summation by one completes the
proof.
∎
Our strategy for determining an asymptotic expansion for EPn as given
in (26) is based on the Mellin transform.
Theorem 4**.**
For n→∞, the expected number of paths required to construct a uniformly random
tree of size n is given by the asymptotic expansion
[TABLE]
where
[TABLE]
with $\chi_k=\frac{2k\pi i}{\log 2}$ is a fluctuation with mean [math] and $\alpha:=\sum_{k\ge 1}\frac{1}{2^k-1}\approx 1.606695$; here γ is the Euler–Mascheroni constant and ζ is the Riemann zeta function.
Remark*.*
The constant α appears in the asymptotic analysis of digital search
trees (see e.g. [20]).
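The numerical value of α quoted in Theorem 4 is easy to reproduce, since the series $\sum_{k\ge 1} 1/(2^k-1)$ converges geometrically. The following Python lines are only a sanity check; the function name and the truncation point are arbitrary choices of this sketch.

```python
def alpha(terms=80):
    """Partial sum of sum_{k >= 1} 1 / (2^k - 1); the tail is below 2^(1 - terms)."""
    return sum(1 / (2 ** k - 1) for k in range(1, terms + 1))
```

With 80 terms the truncation error is far below double precision, so `alpha()` agrees with the stated value 1.606695 to all displayed digits.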
Proof.
In order to obtain an asymptotic expansion from (26), we
rewrite
[TABLE]
where $z=u/(1+u)^2$.
The main task to obtain an asymptotic expansion of P(z) is to provide a precise analysis of
this sum, which we carry out via the Mellin transform. We consider the function
[TABLE]
obtained from substituting $u=e^{-t}$ in the sum above. With
[TABLE]
we find that the corresponding Mellin transform of this difference of harmonic sums is
given by
[TABLE]
with fundamental strip ⟨1,∞⟩. In order for the inversion formula to
be valid, we need to show that f∗(s) decays sufficiently fast along vertical lines
in the complex plane. While Γ(s) and ζ(s) are well-known to decay
exponentially and grow polynomially along vertical lines, respectively, the Dirichlet
series A(s) has to be investigated in more detail.
We want to estimate the summands in
[TABLE]
To do so, we consider $g(x)=(1-x)^{-s}$ as a function of a real variable. By means of
the integral form of the Taylor approximation error we find
[TABLE]
where the last inequality is valid under the assumption that $\operatorname{Re} s>-2$. Using this
estimate, we find
[TABLE]
where the sum converges for $\operatorname{Re} s>-2$. Therefore, A(s) has polynomial growth in $\operatorname{Im} s$ for
$\operatorname{Re} s>-2$ and $\operatorname{Im} s=\frac{2\pi i}{\log 2}\big(k+\frac{1}{2}\big)$, where
k∈Z and ∣k∣→∞, as well as on vertical lines with $\operatorname{Re} s>-2$ and $\operatorname{Re} s=-1$. This implies that f∗(s) decays sufficiently
fast, and thus the inversion formula states
[TABLE]
which is valid for real, positive t→0 (and thus $u\to 1^-$ and $z\to(1/4)^-$, as we have $z=u/(1+u)^2$ and $u=e^{-t}$). In order to extract the
coefficient growth (in terms of z) with the help of singularity analysis, we require
analyticity in a larger region (cf. [10]), e.g. in a
complex punctured neighborhood of 1/4 with $|\arg(z-1/4)|>2\pi/5$. (Footnote 1: The bound 2π/5 is
somewhat arbitrary: the argument just needs to be less than π/2.)
Substituting back t for z, we find
[TABLE]
which implies
[TABLE]
such that we have the bound ∣argt∣<2π/5 for t→0, given that the
restriction on the argument in terms of z is satisfied.
With the help of our estimates on f∗(s) that we discussed above, we find that
[TABLE]
for $-3/2\le\operatorname{Re} s\le 2$ and $\operatorname{Im} s=\frac{2\pi i}{\log 2}\big(k+\frac{1}{2}\big)$, where k∈Z and ∣k∣→∞. This is a consequence of
combining the quantified growth of Γ(s) (see [6, 5.11.3])
and the growth of ζ(s) (see [26, 13.51]) with the facts
that A(s) is of order $O(\operatorname{Im}(s)^2)$ and $\frac{s}{2^{s+1}-1}$ is of order
$O(\operatorname{Im}(s))$ for s taking values in the specified region.
We can evaluate (29) by shifting the line of integration
from Re(s)=2 to Re(s)=−3/2 and collecting the residues of the poles we
cross. This yields
[TABLE]
where P={−1,1}∪{−1+χk∣k∈Z∖{0}}.
For the error term we use the estimate above and find
[TABLE]
Evaluating the residues yields
[TABLE]
Note that with $\alpha:=\sum_{k\ge 1}\frac{1}{2^k-1}$, we have A(1)=α−1.
When substituting back in order to obtain an expansion in terms of z→1/4, we have
to carefully check that the error terms within the sum of the residues at χk for
k∈Z∖{0} can still be controlled. Considering that for some exponent
κ, we have the expansion
[TABLE]
and thus
[TABLE]
Setting κ=−1+χk shows that the errors that we sum are of order O(∣k∣(1−4z)exp(∣k∣O(1−4z))).
Choosing z sufficiently close to 1/4 ensures that the
exponential growth is negligible compared to the exponential decay proved
in (30).
Finally, it is easy to see that the factor $\frac{u}{1+u}$ can be rewritten as $\frac{1-\sqrt{1-4z}}{2}$. Multiplying our expansion of f(t) with this factor and substituting back
yields the expansion
[TABLE]
Applying singularity analysis, normalizing the result by $C_{n-1}$, and rewriting the
coefficients of the contributions from the poles at −1+χk via the duplication
formula for the Gamma function (cf. [6, 5.5.5]) then proves the asymptotic expansion for
EPn.
∎
4. Cutting Old Leaves
4.1. Preliminaries
In this section we consider a slightly more complex reduction: instead of removing all
leaves, we just remove all leftmost
leaves. Following [2], we call a leaf that is a
leftmost child an old leaf.
In order to describe the corresponding expansion in the language of generating functions,
we need to change our underlying combinatorial model of trees in a way that specifically
marks old leaves.
Let L be the combinatorial class of plane trees where ■
marks old leaves and ● marks all
nodes that are neither old leaves nor parents thereof. Now, as a first step we determine
the bivariate generating function L(z,w) of L.
Proposition 4.1**.**
The generating function L(z,w) enumerating plane trees
with respect to old leaves ■
(marked by the variable w) and all nodes that are neither old leaves nor parents
thereof (marked by z) is given by
[TABLE]
For n≥2 there are $C_{k-1}\binom{n-2}{n-2k}2^{n-2k}$ plane trees of
size n (meaning n nodes overall) with k old leaves.
For example in Figure 5, the original tree corresponds to $z^3w^3$
because it has three old leaves (dashed nodes) and three nodes which are neither old
leaves nor parents of old leaves.
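The count of trees by old leaves can be cross-checked by enumeration. The following Python sketch is illustrative only: it assumes trees encoded as nested tuples of children, treats a leaf as old exactly when it is the leftmost child of its parent, and compares the enumeration with the formula $C_{k-1}\binom{n-2}{n-2k}2^{n-2k}$. The function names are hypothetical.

```python
from collections import Counter
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def forests(m):
    """All ordered forests with m nodes in total (a forest is a tuple of trees)."""
    if m == 0:
        return ((),)
    found = []
    for k in range(1, m + 1):
        for first in forests(k - 1):
            for rest in forests(m - k):
                found.append((first,) + rest)
    return tuple(found)


def old_leaves(t):
    """Number of leaves that are the leftmost child of their parent."""
    return sum((i == 0 and not c) + old_leaves(c) for i, c in enumerate(t))


def catalan(m):
    return comb(2 * m, m) // (m + 1)


def predicted(n, k):
    """Number of plane trees of size n >= 2 with k old leaves, per the stated formula."""
    return catalan(k - 1) * comb(n - 2, n - 2 * k) * 2 ** (n - 2 * k)


def observed(n):
    """Counter mapping k to the number of trees of size n with k old leaves."""
    return Counter(old_leaves(t) for t in forests(n - 1))


def formula_holds(n):
    expected = {k: predicted(n, k) for k in range(1, n // 2 + 1)}
    return dict(observed(n)) == expected
```

For n=5, for example, the enumeration finds 8 trees with one old leaf and 6 trees with two old leaves, in agreement with the formula.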
Proof.
We consider the symbolic equation describing the combinatorial class
L of plane trees with respect to old leaves, which is illustrated
in Figure 6.
The functional equation that can be derived from the symbolic equation by marking
■ with w and ● with z is
[TABLE]
Solving this equation and choosing the correct branch of the root
yields (31).
To extract coefficients of L(z,w), we rewrite it as
[TABLE]
∎
As we will see in the next section, the polynomials defined below will play a similar role
for the “old leaves”-reduction as the Fibonacci polynomials played for the “leaves”-
and “paths”-reduction.
Definition**.**
The polynomials Br(z) are the generating functions of binary trees w.r.t. the
number of internal nodes of height ≤r satisfying
[TABLE]
for r≥1 and B0(z)=1.
4.2. The Expansion Operator and Asymptotic Results
As described in the previous section, we now concentrate on the reduction ρ:L→L, which removes all old leaves from a
tree. Note that ρ(●)=●, as
the root itself is not an old leaf.
We begin our analysis of this reduction by determining the expansion operator Φ.
Proposition 4.2**.**
Let F⊆L be a family of plane trees with bivariate
generating function f(z,w), where z marks nodes that are neither old leaves
nor parents thereof and w marks old leaves. Then the generating function
for ρ−1(F), the family of trees whose reduction is in F,
is given by
[TABLE]
Proof.
Linearity of Φ is obvious from the combinatorial interpretation, meaning that we
can focus on the expansion of any tree represented by znwk, i.e. a tree with n
nodes that are neither old leaves nor parents thereof and k old leaves.
Figure 7 illustrates all three possibilities to expand an old leaf ■:
– appending an old leaf to the parent of ■, which turns the original old leaf into ●,
– appending an old leaf to ■ itself, which turns the parent into ●,
– appending old leaves both to ■ and its parent.
In terms of generating functions, this means that w is substituted by $2zw+w^2$.
Furthermore, the nodes represented by ● can optionally be expanded by attaching an old
leaf to them, otherwise they stay as they are. This option corresponds to the substitution z↦z+w.
There are no more operations to expand the tree, so putting everything together yields
[TABLE]
which proves the statement.
∎
An immediate consequence of the fact that ρ:L→L is
surjective is the following corollary.
Corollary 4.3**.**
The generating function for plane trees L(z,w) satisfies the functional
equation
[TABLE]
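As a plausibility check, the fixed-point property can be tested on truncated series. The Python sketch below assumes that Φ acts as the substitution z↦z+w, w↦2zw+w² described in the proof of Proposition 4.2, and that the functional equation reads L=Φ(L); both the reading of the elided formula and all function names are assumptions of this sketch. Monomials $z^a w^k$ are stored as exponent pairs (a,k) and truncated by tree size a+2k.

```python
from collections import Counter
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def forests(m):
    """All ordered forests with m nodes in total (a forest is a tuple of trees)."""
    if m == 0:
        return ((),)
    found = []
    for k in range(1, m + 1):
        for first in forests(k - 1):
            for rest in forests(m - k):
                found.append((first,) + rest)
    return tuple(found)


def old_leaves(t):
    return sum((i == 0 and not c) + old_leaves(c) for i, c in enumerate(t))


def tree_polynomial(max_size):
    """Coefficients of L(z, w), restricted to trees with at most max_size nodes."""
    poly = Counter()
    for n in range(1, max_size + 1):
        for t in forests(n - 1):
            k = old_leaves(t)
            poly[(n - 2 * k, k)] += 1   # n - 2k nodes are neither old leaves nor their parents
    return poly


def phi(poly, max_size):
    """Apply z -> z + w, w -> 2zw + w^2 and drop terms of size above max_size."""
    out = Counter()
    for (a, k), coeff in poly.items():
        for i in range(a + 1):          # (z + w)^a = sum C(a, i) z^(a-i) w^i
            for j in range(k + 1):      # (2zw + w^2)^k = sum C(k, j) 2^j z^j w^(2k-j)
                za, wk = a - i + j, i + 2 * k - j
                if za + 2 * wk <= max_size:
                    out[(za, wk)] += coeff * comb(a, i) * comb(k, j) * 2 ** j
    return out


def fixed_point_holds(max_size):
    poly = tree_polynomial(max_size)
    return dict(phi(poly, max_size)) == dict(poly)
```

The truncation is consistent because the substitution never decreases the size a+2k of a monomial, so all terms of size at most `max_size` in Φ(L) come from trees with at most `max_size` nodes.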
We now focus on determining the generating function measuring the change in the tree size
after repeatedly applying the reduction ρ.
Proposition 4.4**.**
Let r∈N0. The bivariate generating function Gr(z,v)=GrOL(z,v) enumerating plane trees, where z marks the tree
size and v marks the size of the r-fold cut tree, is given by
[TABLE]
where the Br(z) are the polynomials enumerating binary trees of height ≤r
w.r.t. the number of internal nodes.
Proof.
First, note that the size of a tree with k old leaves and n nodes
that are neither old leaves nor parents thereof is actually
n+2k, as parents of old leaves are not explicitly marked. This explains why we have to
substitute $w=z^2$ in order to arrive at the tree size.
In contrast to the previous sections, the operator Φ is already linear and
multiplicative, meaning that we have
[TABLE]
Investigating the repeated application of Φ to z and w leads to the recurrences
[TABLE]
for r≥2 and r≥0, respectively.
With the recurrence for the polynomials Br from (35) it is easy to prove by induction that
[TABLE]
for r≥0. Thus, we also find $\Phi^r(w)|_{w=z^2}=z\,\big(B_{r+1}(z)-B_r(z)\big)$.
Overall, we obtain
[TABLE]
which, by linearity of Φ, proves the proposition.
∎
For the next step in our analysis, we turn to the random variable $X_{n,r}=X_{n,r}^{OL}$ which models the size of the tree
that results from reducing a random tree τ with n nodes r times.
As we have ρ(●)=● (and thus no trees vanish completely), the
probability generating function for this random variable is simply
[TABLE]
While the height polynomials Br(z) make it very difficult to obtain general results
for the factorial moments of Xn,r, special moments like
expectation and variance are no problem, and even a central limit theorem is possible.
Theorem 5**.**
Let r∈N0 be fixed and consider n→∞. Then the expected tree size after deleting old leaves
of a tree with n nodes r times and the corresponding variance are given by
[TABLE]
and
[TABLE]
All O-constants in this theorem depend implicitly on r.
Additionally, the random variable Xn,r is asymptotically normally distributed for
fixed r≥1, i.e.
[TABLE]
where $\mu=2-B_r(1/4)$ and $\sigma^2=B_r(1/4)-B_r(1/4)^2+\frac{(2-B_r(1/4))\,B_r'(1/4)}{2}$.
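The constants μ and σ² are straightforward to evaluate exactly. The following Python sketch uses the recursion $B_r(z)=1+zB_{r-1}(z)^2$ with $B_0(z)=1$, together with its derivative $B_r'(z)=B_{r-1}(z)^2+2zB_{r-1}(z)B_{r-1}'(z)$ obtained by differentiating the recursion. The function names are hypothetical, and the block only evaluates the constants; it does not reproduce the full expansions of the theorem.

```python
from fractions import Fraction


def b_and_derivative(r, z=Fraction(1, 4)):
    """Return (B_r(z), B_r'(z)) for B_0 = 1 and B_r = 1 + z * B_{r-1}^2."""
    b, db = Fraction(1), Fraction(0)
    for _ in range(r):
        # both right-hand sides use the values from the previous step
        b, db = 1 + z * b * b, b * b + 2 * z * b * db
    return b, db


def mu(r):
    b, _ = b_and_derivative(r)
    return 2 - b


def sigma_squared(r):
    b, db = b_and_derivative(r)
    return b - b * b + (2 - b) * db / 2
```

For r=1 this gives μ=3/4 and σ²=1/16, which matches the heuristic that about a quarter of the nodes of a large random plane tree are old leaves.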
Proof.
First of all, we observe that
Proposition 4.1 and Proposition 4.4 combined with the
recursion $B_r(z)=1+zB_{r-1}(z)^2$ allow us to write the bivariate generating
function as
[TABLE]
The asymptotic expansion for the expected value EXn,r can now be obtained by
determining
[TABLE]
By means of singularity analysis we find
[TABLE]
which proves (37). For the second factorial moment we obtain
[TABLE]
which yields
[TABLE]
The variance can now be obtained via $\mathbb{V}X_{n,r}=\mathbb{E}X_{n,r}^{\underline{2}}+\mathbb{E}X_{n,r}-(\mathbb{E}X_{n,r})^2$, which proves (38).
In order to show asymptotic normality of $X_{n,r}$ we investigate the random variable $n-X_{n,r}$, which counts the number of nodes that are deleted after reducing some tree r
times. Observe that this quantity can be seen as an additive tree parameter Fr
defined recursively by
[TABLE]
where τn is some tree of size n, τi1 up to τiℓ are
the subtrees rooted at the children of the root of τn, and fr:L→{0,1,…,r−1} is a toll function defined by
[TABLE]
for r≥1. Now, as fr(τn) enumerates the number of old leaves deleted
from the root of τn after r reductions, Fr(τn) equals the total
number of deleted nodes after r reductions.
The fact that r is fixed implies that fr is not only bounded, but also a
so-called local functional, meaning that the value of fr(τn) can
already be determined from the first r levels of τn. This is because one
application of ρ can reduce the distance between the root of the tree and the
closest old leaf by at most one. Thus all old leaves that are deleted from the root
during r reductions have to be found within the first r levels of τn.
As we have now established that fr is both bounded and a local functional, we are
able to apply [18, Theorem 1.13], which proves that n−Xn,r is asymptotically normally distributed. Thus Xn,r is
asymptotically normally distributed as well, which proves the statement.
∎
Remark*.*
In [9], the asymptotic behavior of a sequence strongly related
to $B_r(1/4)$ was studied: in Section 4, the authors define a sequence $f_n$ such that
$f_{r+1}=\frac{1}{2}-\frac{B_r(1/4)}{4}$, in our notation. They prove the
asymptotic expansion $f_n=\frac{1}{n+\log n+O(1)}$. This allows us to conclude
that the asymptotic behavior of $B_r(1/4)$ can be described as
[TABLE]
for r→∞.
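This behavior can be observed numerically. Writing $d_r=2-B_r(1/4)$, the recursion $B_r(z)=1+zB_{r-1}(z)^2$ at z=1/4 becomes $d_r=d_{r-1}-d_{r-1}^2/4$ with $d_0=1$, and the expansion quoted from [9] says that $4/d_r-r-\log r$ stays bounded. The Python sketch below, with hypothetical function names, iterates this recursion in floating point.

```python
from math import log


def gap(r):
    """d_r = 2 - B_r(1/4), computed via d_r = d_{r-1} - d_{r-1}^2 / 4 with d_0 = 1."""
    d = 1.0
    for _ in range(r):
        d -= d * d / 4
    return d


def remainder(r):
    """The bounded term in 4 / (2 - B_r(1/4)) = r + log r + O(1), for r >= 1."""
    return 4 / gap(r) - r - log(r)
```

Evaluating `remainder` at increasing r shows the values settling down, which is consistent with $2-B_r(1/4)$ decaying like 4/r with a logarithmic correction.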
5. Cutting Old Paths
5.1. The Expansion Operator
As in previous sections, we adapt the “old leaves” reduction
to remove all “old paths”. That is, the tree reduction ρ:L→L in this section reduces a tree by removing all paths that
end in an old leaf. This operation is illustrated in
Figure 8, where ■ marks old leaves and ●
marks all nodes that are neither old leaves nor parents thereof.
Obviously, we also need the combinatorial class of paths P for our
analysis. The bivariate generating function of P is given by $P=P(z,w)=\frac{w}{1-z}$, where
w and z mark ■ and ●,
respectively. Also, we omit the arguments of P for the sake of
readability. Now, we determine the shape of the expansion operator Φ.
Proposition 5.1**.**
Let F⊆L be a family of plane trees with bivariate
generating function f(z,w), where z marks nodes that are neither old leaves
nor parents thereof and w marks old leaves. Then the generating function
for ρ−1(F), the family of trees whose reduction is in F,
is given by
[TABLE]
Proof.
With linearity of the operator Φ being obvious from a combinatorial point of view,
we only have to investigate the expansion of any tree represented by znwk,
i.e. a tree with n nodes that are neither old leaves nor parents thereof and k old
leaves.
There are two options to expand an old leaf ■:
– either an old path is appended to the parent of ■, which turns the old leaf into ●,
– or an old path is appended to both the parent of ■ and to ■ itself.
Note that just appending an old path to ■ is not a valid expansion as this
introduces ambiguity. This is the same argument that we also used in the proof of
Proposition 3.1. Overall, this means that Φ has to map w
to $zP+P^2$.
On the other hand, the nodes represented by ● can optionally be expanded by
attaching an old path. Otherwise they stay as they are. Overall, this implies Φ(z)=z+P.
Putting everything together, we immediately arrive at the statement of the Proposition.
∎
Analogously to the previous reductions, surjectivity of ρ:L→L implies the following
corollary.
Corollary 5.2**.**
The generating function for plane trees L(z,w) satisfies the functional
equation
[TABLE]
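The same kind of truncated-series check works here. The Python sketch below assumes that Φ is the substitution z↦z+P, w↦zP+P² with P=w/(1−z), as described in the proof of Proposition 5.1, and that the functional equation reads L=Φ(L); these readings of the elided formulas and all function names are assumptions. Series are dictionaries from exponent pairs (a,k) of $z^a w^k$ to coefficients, truncated by tree size a+2k.

```python
from collections import Counter
from functools import lru_cache


@lru_cache(maxsize=None)
def forests(m):
    """All ordered forests with m nodes in total (a forest is a tuple of trees)."""
    if m == 0:
        return ((),)
    found = []
    for k in range(1, m + 1):
        for first in forests(k - 1):
            for rest in forests(m - k):
                found.append((first,) + rest)
    return tuple(found)


def old_leaves(t):
    return sum((i == 0 and not c) + old_leaves(c) for i, c in enumerate(t))


def tree_series(max_size):
    """Coefficients of L(z, w), restricted to trees with at most max_size nodes."""
    series = Counter()
    for n in range(1, max_size + 1):
        for t in forests(n - 1):
            k = old_leaves(t)
            series[(n - 2 * k, k)] += 1
    return series


def mul(p, q, max_size):
    """Product of two series, dropping monomials of size above max_size."""
    out = Counter()
    for (a1, k1), c1 in p.items():
        for (a2, k2), c2 in q.items():
            a, k = a1 + a2, k1 + k2
            if a + 2 * k <= max_size:
                out[(a, k)] += c1 * c2
    return out


def power(p, m, max_size):
    out = Counter({(0, 0): 1})
    for _ in range(m):
        out = mul(out, p, max_size)
    return out


def phi(series, max_size):
    """Substitute z -> z + P and w -> zP + P^2 with P = w / (1 - z), truncated."""
    path = Counter({(m, 1): 1 for m in range(max_size - 1)})   # P = sum_m z^m w
    z = Counter({(1, 0): 1})
    new_z = z + path
    new_w = mul(z, path, max_size) + mul(path, path, max_size)
    out = Counter()
    for (a, k), coeff in series.items():
        term = mul(power(new_z, a, max_size), power(new_w, k, max_size), max_size)
        for exponent, value in term.items():
            out[exponent] += coeff * value
    return out


def fixed_point_holds(max_size):
    series = tree_series(max_size)
    return dict(phi(series, max_size)) == dict(series)
```

As in the old-leaf case, the substitution never lowers the size of a monomial, so truncating both sides at the same size is legitimate.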
In order to carry out a detailed analysis of this reduction, we need information about the
iterated application of Φ to $L(zv_I,wv_L^2)$, which leads to the generating function
$G_r(z,v_I,v_L^2)$ measuring the change in the tree size after r applications of the
reduction. The following proposition deals with determining this generating function.
Proposition 5.3**.**
Let r∈N0. The trivariate generating function $G_r(z,v_I,v_L^2)=G_r^{OP}(z,v_I,v_L^2)$ enumerating plane trees, where z marks the tree
size, vL marks all old leaves, and vI marks all nodes that are neither old
leaves nor parents thereof, is given by
[TABLE]
where $z=u/(1+u)^2$.
Proof.
Observe that the operator Φ is already linear and multiplicative, which is why we
can concentrate on finding suitable expressions for Φr(z) and Φr(w).
First of all, for r≥1 the recurrences
[TABLE]
follow immediately from (39). Furthermore, the relation
[TABLE]
can easily be proved by induction. Then, by setting $f_r:=\Phi^r(z)|_{w=z^2}$
the recurrences above translate to
[TABLE]
As a next step, we show by induction that fr can be expressed in terms of Fibonacci
polynomials as
[TABLE]
where in particular (7) was used. As a consequence, we find
[TABLE]
This allows us to express $g_r:=\Phi^r(w)|_{w=z^2}$ as
[TABLE]
Finally, as we have $\Phi^r(z^nw^k)|_{w=z^2}=f_r^n g_r^k$,
substituting $z=u/(1+u)^2$ and using (8) completes the proof.
∎
5.2. Analysis of Tree Size and Related Parameters
We investigate the behavior of the random variable Xn,r=Xn,rOP
which models the number of nodes remaining after reducing a random tree
τ with n nodes r-times. The tree τ is chosen uniformly
among all trees of size n. Analogously to the “old leaf”-reduction from the previous
section, we also have ρ(\leavevmodeto7.47pt\vboxto7.47pt\pgfpicture\makeatletter\lower-3.73589ptto0.0pt\pgfsys@beginscope\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@setlinewidth0.4pt\pgfsys@invoke\nullfont\pgfsys@beginscope\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscopeto0.0pt\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@moveto3.53589pt0.0pt\pgfsys@curveto3.53589pt1.95284pt1.95284pt3.53589pt0.0pt3.53589pt\pgfsys@curveto-1.95284pt3.53589pt-3.53589pt1.95284pt-3.53589pt0.0pt\pgfsys@curveto-3.53589pt-1.95284pt-1.95284pt-3.53589pt0.0pt-3.53589pt\pgfsys@curveto1.95284pt-3.53589pt3.53589pt-1.95284pt3.53589pt0.0pt\pgfsys@closepath\pgfsys@moveto0.0pt0.0pt\pgfsys@fillstroke\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\pgfsys@discardpath\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\lxSVG@closescope\endpgfpicture)=\leavevmodeto7.47pt\vboxto7.47pt\pgfpicture\makeatletter\lower-3.73589ptto0.0pt\pgfsys@beginscope\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@setlinewidth0.4pt\pgfsys@invoke\nullfont\pgfsys@beginscope\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscopeto0.0pt\pgfsys@beginscope\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@moveto3.53589pt0.0pt\pgfsys@curveto3.53589pt1.95284pt1.95284pt3.53589pt0.0pt3.53589pt\pgfsys@curveto-1.95284pt3.53589pt-3.53589pt1.95284pt-3.53589pt0.0pt\pgfsys@curveto-3.53589pt-1.95284pt-1.95284pt-3.53589pt0.0pt-3.53589pt\pgfsys@curveto1.95284pt-3.53589pt3.53589pt-1.95284pt3.53589pt0.0pt\pgfsys@clos
epath\pgfsys@moveto0.0pt0.0pt\pgfsys@fillstroke\pgfsys@invoke\pgfsys@beginscope\pgfsys@invoke\pgfsys@transformcm1.00.00.01.00.0pt0.0pt\pgfsys@invoke\definecolorpgfstrokecolorrgb0,0,0\pgfsys@color@rgb@stroke000\pgfsys@invoke\pgfsys@color@rgb@fill000\pgfsys@invoke\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\pgfsys@discardpath\pgfsys@invoke\lxSVG@closescope\pgfsys@endscope\hss\lxSVG@closescope\endpgfpicture for the “old path”-reduction,
meaning that no trees vanish completely. For the sake of convenience we set Gr(z,v):=Gr(z,v,v^2), allowing us to write the probability generating function of
Xn,r as
[TABLE]
With the help of Proposition 5.3, it is easy to obtain expressions for
the dth factorial moments of Xn,r for fixed d by differentiating Gr(z,v) d times with respect to v and setting v=1 afterwards. General expressions
for d≥2 (coinciding with the value given for d=2) are available but less pleasant.
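The differentiation step can be illustrated on a toy distribution. The following is a minimal sketch and not the authors' computation: it assumes a probability generating function given as a finite coefficient list, the helper names `differentiate` and `factorial_moment` are hypothetical, and the coin-flip example is chosen only for illustration. It relies on the standard fact that the dth derivative of a probability generating function at v=1 equals the dth factorial moment.

```python
from fractions import Fraction

def differentiate(coeffs):
    """Coefficient list of p'(v) for p(v) = sum(coeffs[k] * v**k)."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def factorial_moment(pgf_coeffs, d):
    """d-th factorial moment E[X(X-1)...(X-d+1)]: differentiate d times, set v = 1."""
    coeffs = list(pgf_coeffs)
    for _ in range(d):
        coeffs = differentiate(coeffs)
    return sum(coeffs)  # evaluating a polynomial at v = 1 sums its coefficients

# Toy example (an assumption for illustration): X = number of heads in two
# fair coin flips, with probability generating function (1/2 + v/2)**2.
pgf = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
mean = factorial_moment(pgf, 1)
second = factorial_moment(pgf, 2)
variance = second + mean - mean**2  # variance from the first two factorial moments
print(mean, second, variance)
```

The same two factorial moments are all that is needed for the variance, which is why the cases d=1 and d=2 are singled out.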
Lemma 5.4.
The factorial moments of Xn,r are
[TABLE]
and
[TABLE]
for d≥2.
Proof.
The expressions for d∈{1,2} can be obtained by differentiation. We
consider the general case here.
We use the abbreviations
[TABLE]
By the same argument as in the proof of Proposition 2.9, we
have
and proceeding as in Corollary 2.10 we
obtain the given result.
∎
By expanding the expressions in Lemma 5.4
and using singularity analysis, we obtain the asymptotic growth of the expected value and
the variance.
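For orientation, the standard facts behind this step can be written out. This is a generic sketch and not the paper's specific expansion: the random variable X and the exponent α are placeholders, and the Catalan normalization assumes that plane trees of size n are counted by C_{n-1}.

```latex
% Variance from the first two factorial moments of a random variable X:
\[ \mathbb{V}X = \mathbb{E}X^{\underline{2}} + \mathbb{E}X - (\mathbb{E}X)^2 . \]

% Basic transfer of singularity analysis at the dominant singularity z = 1/4,
% valid for \alpha \notin \{0, -1, -2, \dots\}:
\[ [z^n]\,(1-4z)^{-\alpha}
   = \frac{4^n\, n^{\alpha-1}}{\Gamma(\alpha)}\Bigl(1 + O\bigl(n^{-1}\bigr)\Bigr). \]

% Catalan number used for normalization:
\[ C_{n-1} = \frac{1}{n}\binom{2n-2}{n-1}
   = \frac{4^{n-1}}{\sqrt{\pi}\, n^{3/2}}\Bigl(1 + O\bigl(n^{-1}\bigr)\Bigr). \]
```

Dividing an extracted coefficient by C_{n-1} cancels the exponential factor 4^n, so only the polynomial growth in n remains.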
Theorem 6.
Let r∈N be fixed and consider n→∞. Then the expected size and the corresponding variance of an
r-fold cut plane tree are given by
[TABLE]
and
[TABLE]
For d≥3, the dth factorial moment is
[TABLE]
All O-constants in this theorem depend implicitly on r.
Besides the analysis of the tree size, we are also interested in how the
numbers of nodes represented by ■ and by ● develop when the tree is
reduced repeatedly. Formally, this means that we consider the random
variables Xn,r■ and Xn,r● counting the number of old leaves and
the number of all nodes that are neither old leaves nor parents thereof, respectively. By
construction, the relation
[TABLE]
holds.
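For the unreduced tree (r=0), the three node classes can be checked by brute force. The sketch below rests on two assumptions that are not stated in this section: an old leaf is taken to be a leaf that is the leftmost child of its parent (the definition of Chen, Deutsch and Elizalde), and the relation checked is that the size equals twice the number of old leaves plus the number of remaining nodes, which follows because every old leaf has exactly one parent and every node has at most one leftmost child. The tuple encoding of trees and all function names are hypothetical.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def forests(m):
    """All ordered forests with m nodes; a tree is the tuple of its child subtrees."""
    if m == 0:
        return ((),)
    result = []
    for k in range(1, m + 1):             # size of the first tree in the forest
        for first in forests(k - 1):      # tree with k nodes = root + forest with k - 1 nodes
            for rest in forests(m - k):
                result.append((first,) + rest)
    return tuple(result)

def plane_trees(n):
    """All rooted plane trees with n >= 1 nodes."""
    return forests(n - 1)

def classify(tree, is_leftmost_child=False):
    """Return (old leaves, parents of old leaves, remaining nodes) of a tree."""
    if len(tree) == 0 and is_leftmost_child:
        counts = [1, 0, 0]                # old leaf
    elif len(tree) > 0 and len(tree[0]) == 0:
        counts = [0, 1, 0]                # parent of an old leaf
    else:
        counts = [0, 0, 1]                # neither
    for i, child in enumerate(tree):
        for j, c in enumerate(classify(child, i == 0)):
            counts[j] += c
    return tuple(counts)

for n in range(2, 8):
    trees = plane_trees(n)
    assert len(trees) == comb(2 * n - 2, n - 1) // n   # Catalan number C_{n-1}
    for tree in trees:
        old, parents, others = classify(tree)
        assert old == parents and 2 * old + others == n
    total_old = sum(classify(tree)[0] for tree in trees)
    print(n, len(trees), total_old)
```

Dividing the printed total by the number of trees gives the exact expected number of old leaves for small n, which can serve as a sanity check for asymptotic formulas at r=0.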
The bivariate generating functions corresponding to these random variables can be obtained
directly from Proposition 5.3. We have
[TABLE]
In contrast to Xn,r, the dth factorial moments for
Xn,r■ and Xn,r● have simpler expressions.
Proposition 5.6.
Let d∈N. Then the dth factorial moments of Xn,r■ and
Xn,r● are given by
For deriving ∂^d/(∂v)^d Gr●(z,v), we proceed
as in the proof of Proposition 2.9. The crucial
identity is
As in Section 2.3, the above proof exhibits some
identities:
Remark.
For d∈Z≥1, the power series identities
[TABLE]
and
[TABLE]
hold.
Proof.
We replace u^{r+1} by x in the proof of
Proposition 5.6 and expand L by
(34).
∎
The asymptotic behavior of the factorial moments of Xn,r■ and
Xn,r● can now be extracted straightforwardly by means of singularity
analysis from the representation given in Proposition 5.6.
Theorem 7.
Let r∈N0 be fixed and consider n→∞. Then the expected number of old
leaves as well as the expected number of nodes that are neither old leaves nor parents
thereof in an r-fold “old path”-reduced tree and the corresponding variances are
given by the asymptotic expansions
[TABLE]
Additionally, for fixed d≥2 the behavior of the factorial moments of
Xn,r■ and Xn,r● is given by
[TABLE]
and
[TABLE]
respectively.
All O-constants in this theorem depend implicitly on r.
5.3. Total number of old paths
Similarly to our approach for counting the total number of paths required to construct a
given tree from Section 3.2, we can also analyze the number of
“old path”-segments within a random tree of size n. Formally, this corresponds to an
analysis of the random variable Sn:=∑r≥0Xn,r■.
Theorem 8.
The expected number of “old path” segments within a uniformly random tree of size n
is given asymptotically by
[TABLE]
for n→∞.
Proof.
As we have Sn=∑r≥0Xn,r■, we can
use (41) to write
[TABLE]
The main part of this analysis consists of determining an appropriate expansion of the
sum in the last equation via the Mellin transform.
By setting u=e^{−t} and expanding via the geometric series, we find
[TABLE]
It is easy to determine the corresponding Mellin transform
[TABLE]
with fundamental strip ⟨2,∞⟩. The poles of
f∗(s) are located at s∈{2,1}∪−2N0. As this function behaves very nicely
along vertical lines because of the exponential decay and the polynomial growth of the
gamma function and the zeta function, respectively, we can use the inversion theorem
to find
[TABLE]
for t→0. Analyticity in a larger (complex) region can be obtained analogously to the
approach in the proof of Theorem 4.
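The pattern behind this step is the harmonic sum property of the Mellin transform. As a hedged illustration, here it is on a toy sum chosen for this sketch; the function g below is not the function f from the proof.

```latex
% Mellin transform of a single exponential, for k > 0 and \Re(s) > 0:
\[ \int_0^\infty e^{-kt}\, t^{s-1}\, dt = \Gamma(s)\, k^{-s}. \]

% Toy harmonic sum g(t) = \sum_{k \ge 1} k e^{-kt}, with fundamental strip \langle 2, \infty \rangle:
\[ g^*(s) = \Gamma(s) \sum_{k\ge 1} k^{1-s} = \Gamma(s)\,\zeta(s-1), \qquad \Re(s) > 2. \]

% Poles: s = 2 from the zeta function, s = 0, -2, -4, \dots from the gamma function;
% the gamma poles at negative odd integers are cancelled by the trivial zeros of zeta.
% The residues of g^*(s) t^{-s} at s = 2, 0, -2 are t^{-2}, \zeta(-1) = -1/12 and \zeta(-3) t^2 / 2 = t^2/240:
\[ g(t) = t^{-2} - \frac{1}{12} + \frac{t^2}{240} + O\bigl(t^4\bigr), \qquad t \to 0. \]
```

The exponential decay of the gamma function along vertical lines dominates the polynomial growth of the zeta function, which is what justifies shifting the line of integration in such an example.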
Shifting the line of integration to Re(s)=−5 and collecting residues, we find
[TABLE]
As in the proof of Theorem 4, the integral can be
estimated with an error of O(|t|^5). However, for the sake of simplicity, we will use
the contribution from the singularity at s=−4 as the expansion error. Effectively,
we obtain
[TABLE]
for t→0. Multiplication with the factor (1+u)/(1−u), expansion of everything
in terms of z→1/4, carrying out singularity analysis, and normalizing the result by
dividing by C_{n−1} yields the result.
∎
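Expansions obtained by shifting the line of integration and collecting residues can be sanity-checked numerically. The sketch below does this for a toy harmonic sum, the sum over k≥1 of k·e^{−kt}, whose Mellin transform is Γ(s)ζ(s−1). It is not the sum treated in the proof, and the function names and the truncation point are assumptions made for illustration.

```python
import math

def toy_sum(t, terms=5000):
    """Truncated harmonic sum g(t) = sum_{k>=1} k * exp(-k * t)."""
    return sum(k * math.exp(-k * t) for k in range(1, terms + 1))

def toy_expansion(t):
    """Residues of Gamma(s) * zeta(s - 1) * t**(-s) at s = 2, 0 and -2."""
    return t**-2 - 1 / 12 + t**2 / 240

# The neglected pole at s = -4 contributes a term of order t**4,
# so the error should shrink rapidly as t decreases.
for t in (0.5, 0.1, 0.05):
    error = abs(toy_sum(t) - toy_expansion(t))
    print(f"t = {t}: |g(t) - expansion| = {error:.3e}")
```

Using the first omitted residue as the error term, as done in the proof, corresponds here to the t^4 contribution that the printed differences reflect.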
6. Future Work
It seems likely that similar results also hold for reductions where
one can cut a different structure as long as it is allowed to cut a
single leaf. An example is cutting either single leaves or cherries (a
root with two children). At least a formulation as an operator as in (9) seems
possible in general. How much information about the moments and the central limit
theorem can be extracted from that may vary (as it varies in this
article already). Also the case of cutting old structures might be
more difficult to handle in general.
Bibliography
[1] David Callan, Kreweras’s Narayana number identity has a simple Dyck path interpretation, arXiv:1203.3999 [math.CO], 2012.
[2] William Y. C. Chen, Emeric Deutsch, and Sergi Elizalde, Old and young leaves on plane trees, European J. Combin. 27 (2006), no. 3, 414–427.
[3] Nicolaas G. de Bruijn, Donald E. Knuth, and Stephen O. Rice, The average height of planted plane trees, Graph theory and computing, Academic Press, New York, 1972, pp. 15–22.
[4] Mireille Vauchaussade de Chaumont, Nombre de Strahler des arbres, langages algébriques et dénombrement de structures secondaires en biologie moléculaire, Doctoral thesis, Université de Bordeaux I, 1985.
[5] The SageMath Developers, SageMath Mathematics Software (Version 7.4), 2016, http://www.sagemath.org.
[6] NIST Digital library of mathematical functions, http://dlmf.nist.gov/, Release 1.0.13 of 2016-09-16, 2016, Frank W. J. Olver, Adri B. Olde Daalhuis, Daniel W. Lozier, Barry I. Schneider, Ronald F. Boisvert, Charles W. Clark, Bruce R. Miller and Bonita V. Saunders, eds.
[7] Michael Drmota, Random trees, Springer Wien New York, 2009.
[8] Michael Drmota, Trees, Handbook of enumerative combinatorics, Discrete Math. Appl. (Boca Raton), CRC Press, Boca Raton, FL, 2015, pp. 281–334.