A tree distinguishing polynomial
Pengyu Liu

Abstract
We define a bivariate polynomial for unlabeled rooted trees and show that the polynomial of an unlabeled rooted tree $T$ is the generating function of a class of subtrees of $T$. We prove that the polynomial is a complete isomorphism invariant for unlabeled rooted trees. Then, we generalize the polynomial to unlabeled unrooted trees and we show that the generalized polynomial is a complete isomorphism invariant for unlabeled unrooted trees.
2010 Mathematics Subject Classification:
05C31 and 05C05
Keywords: Graph polynomial, Trees, Isomorphism invariant.
Department of Mathematics, Simon Fraser University. Burnaby, BC V5A 1S6, Canada.
E-mail: [email protected]
1. Introduction
Polynomial invariants are important tools in the study of graphs, knots and links. The Tutte polynomial [20] is the most investigated polynomial invariant for graphs. As an isomorphism invariant of graphs, the Tutte polynomial carries some information about a graph, for example, the chromatic polynomial and the number of spanning trees of the graph. The well-known Jones polynomial [10] and the HOMFLY polynomial [9] are important invariants for knots and links, which are related to the crossing number and the braid index of knots and links respectively [1]. However, the Tutte polynomial fails to distinguish trees. Actually, all trees with the same number of edges have the same Tutte polynomial. In the doctoral thesis of Law [11], polynomials were divided into three levels according to their tree distinguishing power, where the most powerful (level three) polynomial is the chromatic symmetric function introduced in 1995 by Stanley [19]. It is proved that the chromatic symmetric function distinguishes some classes of trees including spiders [13] and caterpillars [12], but it remains a conjecture that the chromatic symmetric function is a complete isomorphism invariant for trees. The $U$-polynomial defined by Noble and Welsh [17], which is equivalent to the polychromate introduced by Brylawski [5, 14, 18], determines the chromatic symmetric function and vice versa when restricted to trees. See [17] and [2]. The strong polychromate defined by Bollobás and Riordan [4] is also equivalent to the polynomials above when restricted to trees. Hence, whether these polynomials are complete invariants for trees depends on whether the chromatic symmetric function is a complete invariant for trees. The level two polynomials are the polynomials that distinguish rooted trees. These polynomials include the subtree polynomial introduced by Chaudhary and Gordon [6], the Ising polynomial introduced by Andrén and Markström [3] and the Negami polynomial introduced by Negami and Ota [16].
However, it has been unknown to date whether there exists a polynomial that is a complete isomorphism invariant for unrooted trees.
In the emerging fields of phylogenetics and linguistics, the information carried by the shapes of trees needs to be analyzed and compared quantitatively and accurately. A polynomial invariant, especially a complete invariant, for trees is a potentially convenient tool for this task because polynomials are well-studied mathematical objects. With this motivation, we introduce a new polynomial that is a complete isomorphism invariant for trees, which, to our knowledge, is the first of its kind.
Before demonstrating the results, we clarify the terminology used in this paper. Trees are unlabeled unless otherwise stated. The order of a tree stands for the number of vertices in the tree. A monomial always has coefficient one and a term in a polynomial may have any integer coefficient. All of the internal vertices of a rooted $n$-ary tree have exactly $n$ children. Similarly, every internal vertex of an unrooted $n$-ary tree has degree $n + 1$. This paper is structured as follows. First, we define a polynomial for rooted trees and investigate the information about the trees carried by the polynomial. Then, we show that the polynomials for rooted trees are irreducibles in the polynomial ring $\mathbb{Z}[x, y]$ and that the polynomial distinguishes rooted trees. Finally, we introduce a systematic way to generalize a polynomial that distinguishes rooted trees to a polynomial that distinguishes unrooted trees.
2. A polynomial for rooted trees
2.1. Definitions
Let $\mathcal{T}_r$ be the set of rooted trees and $n \geq 1$ be an integer. The rooted $n$-star is the tree in $\mathcal{T}_r$ of order $n + 1$ with $n$ leaf vertices in which all of the leaf vertices are adjacent to the single internal vertex that is identified as the root. The $n$-wedge operation $\wedge_n$ is defined such that for rooted trees $T_1$, $T_2$, …, $T_n$, $\wedge_n(T_1, T_2, \ldots, T_n)$ is the tree in $\mathcal{T}_r$ constructed by pasting the roots of the trees to the leaf vertices of the rooted $n$-star respectively. Note that the operation is permutative, that is, for any permutation $\sigma \in S_n$, where $S_n$ is the symmetric group, $\wedge_n(T_1, \ldots, T_n) = \wedge_n(T_{\sigma(1)}, \ldots, T_{\sigma(n)})$. Besides, the rooted $n$-ary trees can be constructed recursively using only $\wedge_n$. In particular, the rooted binary trees can be constructed recursively using only $\wedge_2$.
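Definition 2.1.
Let $\bullet$ be the trivial tree with one vertex and $T = \wedge_n(T_1, T_2, \ldots, T_n)$ be an arbitrary tree in $\mathcal{T}_r$, where $T_1, \ldots, T_n \in \mathcal{T}_r$. The polynomial $P: \mathcal{T}_r \to \mathbb{Z}[x, y]$ is defined by the following rules.
(1) $P(\bullet) = x$,
(2) $P(T) = y + \prod_{i=1}^{n} P(T_i)$.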
For example, the rooted $n$-star can be constructed by applying $\wedge_n$ to $n$ trivial trees, so its polynomial is $x^n + y$. Let $\ell \geq 0$ be an integer. The rooted path of length $\ell$ is the path with $\ell + 1$ vertices such that one of the leaf vertices is identified as the root. The rooted path of length $\ell$ can be constructed by applying the $1$-wedge operation $\ell$ times starting from the trivial tree. According to the definition above, the polynomial of the rooted path of length $\ell$ is $x + \ell y$.
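The recurrence in Definition 2.1 can be sketched in a few lines of Python. This is only an illustration under assumptions that are not part of the paper: a rooted tree is encoded as a nested tuple of its child subtrees (so `()` is the trivial tree), a polynomial is a dictionary mapping exponent pairs `(deg_x, deg_y)` to integer coefficients, and the names `poly_mul`, `tree_poly`, `rooted_star` and `rooted_path` are hypothetical helpers.

```python
from collections import Counter

def poly_mul(p, q):
    """Multiply two bivariate polynomials stored as {(deg_x, deg_y): coeff}."""
    out = Counter()
    for (a, b), c in p.items():
        for (d, e), f in q.items():
            out[(a + d, b + e)] += c * f
    return dict(out)

def tree_poly(tree):
    """P(T) for a rooted tree given as a tuple of child subtrees (assumed encoding)."""
    if not tree:
        return {(1, 0): 1}                    # rule (1): the trivial tree gives x
    prod = {(0, 0): 1}
    for child in tree:
        prod = poly_mul(prod, tree_poly(child))
    prod[(0, 1)] = prod.get((0, 1), 0) + 1    # rule (2): add y to the product
    return prod

def rooted_star(n):
    """Rooted n-star: a root with n leaf children."""
    return tuple(() for _ in range(n))

def rooted_path(length):
    """Rooted path of the given length, built by repeated 1-wedge operations."""
    tree = ()
    for _ in range(length):
        tree = (tree,)
    return tree

star = tree_poly(rooted_star(3))   # expected to represent x^3 + y
path = tree_poly(rooted_path(4))   # expected to represent x + 4y
```

Because multiplication is commutative, the order in which children are listed does not affect the result, which mirrors the permutative property of the wedge operation.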
Let $T$ be a rooted tree in $\mathcal{T}_r$ and $v$ be a vertex of $T$. The affix tree of $T$ to the vertex $v$, denoted by $T_v$, is the subtree of $T$ induced from the vertex $v$ and all of the descendants of $v$. Note that if $v$ is a leaf vertex, then $T_v$ is the trivial tree with a single vertex. If the root vertex of $T$ has degree one, the branching vertex of $T$ is defined to be the nearest vertex to the root with degree greater than two. If $T$ does not have a vertex with degree greater than two, then it is a rooted path and it does not have a branching vertex. If the root vertex of $T$ has degree greater than one, then we consider that the root vertex is also the branching vertex. In this paper, the affix tree of $T$ to the branching vertex will be mentioned many times, hence, we denote this specific affix tree by $T_b$. The stem of $T$ is the path from its root vertex to its branching vertex. See Figure 1 for an example. In particular, if the root vertex of $T$ has degree greater than one, that is, the root vertex is also the branching vertex, then the stem consists of only the root vertex and is of length zero.
Proposition 2.2.
The function $P: \mathcal{T}_r \to \mathbb{Z}[x, y]$ is well defined.
Proof.
Denote the number of leaf vertices of a tree in $\mathcal{T}_r$ by $m$. We prove the proposition by strong induction on $m$.
(1) If $m = 1$, the rooted trees in $\mathcal{T}_r$ with a single leaf vertex are rooted paths. The polynomial of the rooted path of length $\ell$ is $x + \ell y$. If two rooted paths are isomorphic, they must have the same length, hence the same polynomial.
(2) Assume that for $m \leq k$, the polynomials of isomorphic trees are identical.
(3) If $m = k + 1$, let $S$ and $T$ be two isomorphic trees in $\mathcal{T}_r$. $S \cong T$ implies that their stems are of the same length and their affix trees to the branching vertices $S_b = \wedge_n(S_1, \ldots, S_n)$ and $T_b = \wedge_{n'}(T_1, \ldots, T_{n'})$ are also isomorphic, hence $n = n'$. Note that the numbers of leaf vertices in $S_i$ and $T_i$ are fewer than $k + 1$ for all $i$. Without loss of generality, we will compare $S_i$ with $T_i$ since the $n$-wedge operations are permutative. $S_b \cong T_b$ implies $S_i \cong T_i$ for all $i$. According to the hypothesis, $P(S_i) = P(T_i)$. Assuming that the stems of $S$ and $T$ are of length $\ell$, we can construct $S$ and $T$ by recursively applying the $1$-wedge operation $\ell$ times to $S_b$ and $T_b$. According to Definition 2.1, $P(S) = \ell y + y + \prod_{i=1}^{n} P(S_i) = \ell y + y + \prod_{i=1}^{n} P(T_i) = P(T)$. ∎
Note that according to the proof above, a tree $T$ in $\mathcal{T}_r$ with a stem of length $\ell$ has the polynomial $P(T) = \ell y + P(T_b)$, where $T_b$ is the affix tree of $T$ to the branching vertex.
To compute the polynomial of a tree in $\mathcal{T}_r$, we can apply the recurrence relation in Definition 2.1. Besides, the polynomial can also be computed directly using the Dyck word [8] of the tree by placing a symbol $x$ in every pair of parentheses that represents a leaf vertex and placing the symbol $+y$ before the end of every pair of parentheses that represents an internal vertex. For example, if the Dyck word of a tree is then its polynomial should be . On the other hand, to reconstruct a tree from its polynomial, we may compute its Dyck word by recursively subtracting $y$ and factoring the rest of the polynomial. See [15] for methods to factor large multivariate polynomials.
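The Dyck word computation described above can be sketched as a single left-to-right scan with a stack. This is a minimal illustration, not the author's implementation: the function name `dyck_to_poly`, the dictionary encoding `{(deg_x, deg_y): coeff}` of polynomials, and the sample word `(()(()()))` are all assumptions chosen for this example and are not taken from the paper.

```python
from collections import Counter

def poly_mul(p, q):
    """Multiply two bivariate polynomials stored as {(deg_x, deg_y): coeff}."""
    out = Counter()
    for (a, b), c in p.items():
        for (d, e), f in q.items():
            out[(a + d, b + e)] += c * f
    return dict(out)

def dyck_to_poly(word):
    """Evaluate the tree polynomial of a rooted tree given by its Dyck word."""
    stack = [[]]                      # stack[-1] collects polynomials of finished children
    for ch in word:
        if ch == "(":
            stack.append([])          # a new vertex starts
        elif ch == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced Dyck word")
            children = stack.pop()
            if not children:
                poly = {(1, 0): 1}    # "()" is a leaf vertex: x
            else:
                poly = {(0, 0): 1}
                for child in children:
                    poly = poly_mul(poly, child)
                poly[(0, 1)] = poly.get((0, 1), 0) + 1   # internal vertex: append +y
            stack[-1].append(poly)
        else:
            raise ValueError("a Dyck word contains only parentheses")
    if len(stack) != 1 or len(stack[0]) != 1:
        raise ValueError("the word must encode exactly one rooted tree")
    return stack[0][0]

# Hypothetical sample word: a root whose children are a leaf and a rooted 2-star.
sample = dyck_to_poly("(()(()()))")   # represents x(x^2 + y) + y = x^3 + xy + y
```

The reverse direction (recovering the Dyck word by subtracting $y$ and factoring) needs a multivariate factorization routine and is not attempted in this sketch.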
2.2. Interpretation of the polynomial
To determine how the polynomial describes the features of trees in $\mathcal{T}_r$, we introduce the following definitions. Let $T$ be a tree in $\mathcal{T}_r$. A primary subtree $S$ of $T$ is a rooted subtree of $T$ such that $S$ shares the same root vertex with $T$ and any leaf vertex of $T$ is either a leaf vertex of $S$ or a descendant of a leaf vertex of $S$. In other words, whenever a vertex of $T$ is an internal vertex of $S$, all of its children in $T$ also belong to $S$. Denote the set of all primary subtrees of $T$ by $\mathcal{S}(T)$. For any primary subtree $S$ of $T$, we assign a monomial $M(S) = x^a y^b$ to the primary subtree if $S$ possesses $a + b$ leaf vertices and $a$ of them are leaf vertices of $T$ and $b$ of them are internal vertices of $T$. Hence, the total degree of $M(S)$ is the number of leaf vertices of $S$. See Figure 1 for an example. Note that if a tree in $\mathcal{T}_r$ is not the trivial tree, we always consider the root vertex of the tree as an internal vertex even if it has degree one. For example, the rooted $n$-star has two primary subtrees. One is the trivial tree with only the root vertex, which corresponds to the monomial $y$. The other is the tree that is isomorphic to the rooted $n$-star, which corresponds to the monomial $x^n$. The rooted path of length $\ell$ has $\ell + 1$ primary subtrees, which are the paths from the root to each of its vertices including the root. If $S$ is the primary subtree from the root to the leaf vertex, then $M(S) = x$ since the leaf vertex of $S$ is also the leaf vertex of the path. For any other primary subtree $S'$ of the path, $M(S') = y$ because the leaf vertex of $S'$ is an internal vertex of the path.
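Let $G_1$ and $G_2$ be two subgraphs of a graph $G$, and let $V_1$, $E_1$, $V_2$, $E_2$ be the sets of vertices and edges of $G_1$ and $G_2$ respectively. We define the intersection $G_1 \cap G_2 = (V_1 \cap V_2, E_1 \cap E_2)$ and $G_1 \cap G_2 = \emptyset$ if $V_1 \cap V_2 = \emptyset$ and $E_1 \cap E_2 = \emptyset$.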
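Lemma 2.3.
Let $T$ be a tree in $\mathcal{T}_r$ and $T_b$ be the affix tree of $T$ to the branching vertex. The following statements about primary subtrees and the monomials are true.
(1) For any primary subtree $S$ of $T$, if $S \cap T_b \neq \emptyset$, then $S \cap T_b$ is a primary subtree of $T_b$ and $M(S) = M(S \cap T_b)$.
(2) If $T_b = \wedge_n(T_1, \ldots, T_n)$ with $n \geq 2$, any primary subtree of $T_b$ is of form $\wedge_n(S_1, \ldots, S_n)$, where $S_i \in \mathcal{S}(T_i)$ for all $i$, except for the primary subtree which consists of only the root vertex of $T_b$.
(3) Suppose that $T_b = \wedge_n(T_1, \ldots, T_n)$ with $n \geq 2$ and $S$ be a primary subtree of $T_b$ such that $S = \wedge_n(S_1, \ldots, S_n)$, where $S_i \in \mathcal{S}(T_i)$ for all $i$. Then $M(S) = \prod_{i=1}^{n} M(S_i)$, which is still a monomial.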
Proof.
For (1), if , then is a primary subtree on the stem of and . This is because is a rooted tree that shares the same root with and any leaf vertex or internal vertex of is also a leaf vertex or an internal vertex of respectively, hence, any leaf vertex of is a leaf vertex of or a descendant of a leaf vertex of and according to the definition of a primary subtree and the definition of the monomial, is a primary subtree of with . If , then contains no vertex of , that is, all the vertices of are on the stem. Therefore, is a rooted path and its leaf vertex is an internal vertex of and .
For (2), note that any primary subtree $S$ of $T_b$ is rooted at the root vertex of $T_b$ and every leaf vertex of $T_b$ is a descendant of a leaf vertex of $S$. Hence, for any $i$, there exists at least one leaf vertex of $S$ in $T_i$. Let $L_i$ be the set of leaf vertices of $S$ in $T_i$ and $S_i$ be the induced subtree of $T_i$ from $L_i$ and all the ancestors of vertices in $L_i$. $S_i$ is a primary subtree of $T_i$ since $S$ is a primary subtree of $T_b$ and all the leaf vertices of $T_i$ are descendants of vertices in $L_i$. Therefore, the primary subtree $S = \wedge_n(S_1, \ldots, S_n)$. Conversely, if $T_b = \wedge_n(T_1, \ldots, T_n)$ with $n \geq 2$, any tree of form $\wedge_n(S_1, \ldots, S_n)$, where $S_i \in \mathcal{S}(T_i)$ for all $i$, is a primary subtree of $T_b$ according to the definition of primary subtrees.
For (3), the leaf vertices of $S_i$ that are leaf vertices of $T_i$ are also leaf vertices of $T_b$ and the leaf vertices of $S_i$ that are internal vertices of $T_i$ are also internal vertices of $T_b$ for any $i$. ∎
Lemma 2.4.
For any tree $T$ in $\mathcal{T}_r$, $P(T) = \sum_{S \in \mathcal{S}(T)} M(S)$.
Proof.
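We prove the lemma by strong induction on $m$, the number of leaf vertices of a tree in $\mathcal{T}_r$.
(1) If $m = 1$, we know, from Section 2.1, that the rooted trees in $\mathcal{T}_r$ with a single leaf vertex are rooted paths and the polynomial of the rooted path of length $\ell$ is $x + \ell y$. On the other hand, we also know that the rooted path of length $\ell$ has $\ell + 1$ primary subtrees. The primary subtree that is isomorphic to the path has the monomial $x$ and any of the other $\ell$ primary subtrees has the monomial $y$. Hence, $P(T) = \sum_{S \in \mathcal{S}(T)} M(S)$.
(2) Assume $P(T) = \sum_{S \in \mathcal{S}(T)} M(S)$ for all trees in $\mathcal{T}_r$ with $m \leq k$ leaf vertices.
(3) If $m = k + 1$, let $T$ be an arbitrary tree in $\mathcal{T}_r$ with $k + 1$ leaf vertices. Suppose $T$ has a stem of length $\ell$ and its affix tree to the branching vertex is $T_b = \wedge_n(T_1, \ldots, T_n)$, where $n \geq 2$ and $T_i$ is a tree in $\mathcal{T}_r$ with fewer than $k + 1$ leaf vertices for all $i$. We know that $P(T) = \ell y + P(T_b)$ according to Section 2.1. Note that statement (1) of Lemma 2.3 gives a partition of $\mathcal{S}(T)$: for any $S \in \mathcal{S}(T)$, if $S \cap T_b = \emptyset$, then $S$ is a rooted path with vertices on the stem of $T$. There exist $\ell$ such primary subtrees of $T$, namely, the paths from the root to each of the vertices of the stem except for the branching vertex, which contribute the term $\ell y$ in the polynomial $P(T)$. Besides, for any $S \in \mathcal{S}(T)$, if $S \cap T_b \neq \emptyset$, then $S \cap T_b \in \mathcal{S}(T_b)$ and $M(S) = M(S \cap T_b)$. Therefore, if we can prove that $P(T_b) = \sum_{S \in \mathcal{S}(T_b)} M(S)$, then $P(T) = \sum_{S \in \mathcal{S}(T)} M(S)$ follows. According to Definition 2.1 and the induction hypothesis, $P(T_b) = y + \prod_{i=1}^{n} P(T_i) = y + \prod_{i=1}^{n} \sum_{S_i \in \mathcal{S}(T_i)} M(S_i)$. The monomial $y$ in $P(T_b)$ corresponds to the primary subtree of $T_b$ consisting of only its root vertex, since it is the only primary subtree of $T_b$ that has exactly one leaf vertex. Any monomial in the expansion of $\prod_{i=1}^{n} \sum_{S_i \in \mathcal{S}(T_i)} M(S_i)$ is of form $\prod_{i=1}^{n} M(S_i)$ where $S_i \in \mathcal{S}(T_i)$ for any $i$. We know from statements (2) and (3) of Lemma 2.3 that $\prod_{i=1}^{n} M(S_i) = M(\wedge_n(S_1, \ldots, S_n))$ and $\wedge_n(S_1, \ldots, S_n)$ is a primary subtree of $T_b$. So every monomial in $P(T_b)$ is a monomial in $\sum_{S \in \mathcal{S}(T_b)} M(S)$. On the other hand, any $S \in \mathcal{S}(T_b)$ except for the trivial primary subtree is of form $S = \wedge_n(S_1, \ldots, S_n)$, where $S_i \in \mathcal{S}(T_i)$ for any $i$, according to statements (2) and (3) of Lemma 2.3. For the trivial primary subtree, the monomial is $y$, and $y$ is a monomial in $P(T_b)$. For any other primary subtree $S$, $M(S) = \prod_{i=1}^{n} M(S_i)$, which is also a monomial in $P(T_b)$. Therefore, $P(T_b) = \sum_{S \in \mathcal{S}(T_b)} M(S)$ and $P(T) = \sum_{S \in \mathcal{S}(T)} M(S)$. ∎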
Lemma 2.4 shows that the polynomial $P(T)$ of a tree $T$ can be interpreted as a generating function: the coefficient of $x^a y^b$ in $P(T)$ is the number of primary subtrees of $T$ whose set of leaf vertices consists of $a$ leaf vertices of $T$ and $b$ internal vertices of $T$.
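The generating function interpretation can be checked numerically on small trees. The sketch below enumerates primary subtrees directly from the definition (at every vertex of $T$ a primary subtree either stops, making that vertex one of its leaves, or contains all children of that vertex) and compares the resulting counts with the recurrence of Definition 2.1. The nested-tuple tree encoding and the names `tree_poly` and `primary_monomials` are assumptions made for this illustration.

```python
from collections import Counter
from itertools import product

def tree_poly(tree):
    """P(T) as a Counter {(deg_x, deg_y): coeff}; a tree is a tuple of child subtrees."""
    if not tree:
        return Counter({(1, 0): 1})
    prod = Counter({(0, 0): 1})
    for child in tree:
        child_poly = tree_poly(child)
        nxt = Counter()
        for (a, b), c in prod.items():
            for (d, e), f in child_poly.items():
                nxt[(a + d, b + e)] += c * f
        prod = nxt
    prod[(0, 1)] += 1
    return prod

def primary_monomials(tree):
    """List the exponent pair (a, b) of M(S) for every primary subtree S of the tree."""
    if not tree:
        return [(1, 0)]               # the trivial tree is its own only primary subtree
    result = [(0, 1)]                 # stop here: this internal vertex of T is a leaf of S
    options = [primary_monomials(child) for child in tree]
    for combo in product(*options):   # otherwise S must contain every child
        result.append((sum(a for a, _ in combo), sum(b for _, b in combo)))
    return result

# A sample tree: the root has a leaf child, a rooted 2-star, and a rooted path of length 1.
sample = ((), ((), ()), ((),))
counts = Counter(primary_monomials(sample))
same = counts == tree_poly(sample)    # Lemma 2.4 predicts equality
```

Summing the coefficients of `tree_poly(sample)` gives the total number of primary subtrees, which is the content of Corollary 2.5.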
Corollary 2.5.
Let $T$ be a tree in $\mathcal{T}_r$. Suppose $P(T)$ has $k$ terms and $c_1, \ldots, c_k$ are the corresponding coefficients, then $T$ has $\sum_{i=1}^{k} c_i$ primary subtrees.
Let $T$ be a rooted tree with $n$ leaf vertices in $\mathcal{T}_r$. According to Lemma 2.4, there exists a term $x^n$ in $P(T)$ which corresponds to the primary subtree $S$ that is isomorphic to $T$, that is, all the leaf vertices of $S$ are leaf vertices of $T$. There exists only one such primary subtree, hence the coefficient of the term $x^n$ is one. Moreover, for any other primary subtree $S$ of $T$, $M(S)$ has a factor $y$ because at least one leaf vertex of $S$ is not a leaf vertex of $T$, that is, at least one leaf vertex of $T$ is a descendant of a leaf vertex of $S$ which is an internal vertex of $T$. Last but not least, there exists at least one primary subtree with only one leaf vertex. The subtree of $T$ consisting of only the root vertex is such a primary subtree and it exists for any tree in $\mathcal{T}_r$. Note that the leaf vertex of such primary subtrees is always an internal vertex of $T$ except for the trivial tree. Therefore, if $T$ is not the trivial tree, then there is always a term $cy$ in the polynomial $P(T)$, where $c$ is the number of primary subtrees with only one leaf vertex which is an internal vertex of $T$. Indeed, $c = \ell$ if $T$ is the rooted path of length $\ell$ and $c = \ell + 1$ otherwise, where $\ell$ is the length of the stem of $T$. Besides, no primary subtrees of $T$ other than those on the stem can contribute to a monomial $y$ since if a primary subtree $S$ of $T$ contains vertices that are not on the stem of $T$, $S$ must have at least two leaf vertices.
2.3. Complete isomorphism invariants for rooted trees
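To prove that the polynomial is a complete isomorphism invariant for rooted trees, we need to prove the polynomials are irreducibles in $\mathbb{Z}[x, y]$. Eisenstein’s criterion states that if $D$ is an integral domain, $\mathfrak{p}$ is a prime ideal of $D$ and $f(x) = \sum_{i=0}^{n} a_i x^i$ is a polynomial in $D[x]$, then $f(x)$ is an irreducible in $D[x]$ if $a_n \notin \mathfrak{p}$, $a_i \in \mathfrak{p}$ for all $i < n$ and $a_0 \notin \mathfrak{p}^2$.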
Lemma 2.6.
For any tree $T$ in $\mathcal{T}_r$, $P(T)$ is an irreducible in $\mathbb{Z}[x, y]$.
Proof.
We use Eisenstein’s criterion to prove this lemma. Note that $\mathbb{Z}[x, y] = \mathbb{Z}[y][x]$. Let $\mathbb{Z}[y]$ be the integral domain and $(y)$ be the prime ideal in $\mathbb{Z}[y]$. Suppose $T$ has $n$ leaf vertices. We know, from Section 2.2, that the leading term of $P(T)$ is always $x^n$ and $a_n = 1$, hence, $a_n \notin (y)$. For any primary subtree $S$ that is not isomorphic to $T$, there is always a leaf vertex of $S$ that is an internal vertex of $T$, so $M(S)$ always has a factor $y$ and $a_i \in (y)$ for all $i < n$. Moreover, the constant term $a_0$ always contains a term $cy$. Therefore, $a_0 \notin (y)^2$ and the polynomials for all rooted trees in $\mathcal{T}_r$ are irreducibles in $\mathbb{Z}[x, y]$. ∎
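The three Eisenstein conditions used in the proof can be tested mechanically on a polynomial viewed as an element of $\mathbb{Z}[y][x]$ with the prime ideal $(y)$. The sketch below is an illustration only: the dictionary encoding `{(deg_x, deg_y): coeff}` (with no zero coefficients stored) and the names `tree_poly` and `eisenstein_in_x` are assumptions, and the check is meant for non-trivial trees, since the trivial tree has polynomial $x$ with no constant term.

```python
from collections import Counter

def poly_mul(p, q):
    """Multiply two bivariate polynomials stored as {(deg_x, deg_y): coeff}."""
    out = Counter()
    for (a, b), c in p.items():
        for (d, e), f in q.items():
            out[(a + d, b + e)] += c * f
    return dict(out)

def tree_poly(tree):
    """P(T) for a rooted tree given as a tuple of child subtrees (assumed encoding)."""
    if not tree:
        return {(1, 0): 1}
    prod = {(0, 0): 1}
    for child in tree:
        prod = poly_mul(prod, tree_poly(child))
    prod[(0, 1)] = prod.get((0, 1), 0) + 1
    return prod

def eisenstein_in_x(poly):
    """Check Eisenstein's conditions for poly in Z[y][x] with the prime ideal (y)."""
    n = max(i for (i, _) in poly)
    # a_n is not in (y): the coefficient of x^n has a term free of y.
    lead_ok = any(i == n and j == 0 for (i, j) in poly)
    # a_i is in (y) for i < n: every lower term is divisible by y.
    lower_ok = all(j >= 1 for (i, j) in poly if i < n)
    # a_0 is in (y) but not in (y)^2: the part free of x has a term of degree one in y.
    const_ok = any(i == 0 and j == 1 for (i, j) in poly)
    return lead_ok and lower_ok and const_ok

star = tree_poly(((), (), ()))              # x^3 + y
stemmed = tree_poly((((), ()),))            # stem of length one: x^2 + 2y
reducible = poly_mul(tree_poly(((), ())), tree_poly(((), ())))   # (x^2 + y)^2
checks = (eisenstein_in_x(star), eisenstein_in_x(stemmed), eisenstein_in_x(reducible))
```

The last polynomial is a product of two tree polynomials, and the check fails for it because its constant term $y^2$ lies in $(y)^2$.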
Proposition 2.7.
The function $P: \mathcal{T}_r \to \mathbb{Z}[x, y]$ is injective.
Proof.
We prove the proposition by strong induction on $m$, the number of leaf vertices of a tree in $\mathcal{T}_r$.
(1) If $m = 1$, the polynomial of a rooted path of length $\ell$ is $x + \ell y$. Two non-isomorphic rooted paths have different lengths, so their polynomials are different.
(2) Assume the function $P$ is injective for all $m \leq k$.
(3) If $m = k + 1$, let $S$ and $T$ be two non-isomorphic trees in $\mathcal{T}_r$ with $k + 1$ leaf vertices. Note that only the primary subtrees on the stem of a tree in $\mathcal{T}_r$ contribute to the $y$ term. If the stems of $S$ and $T$ are of different lengths, the $y$ terms in the polynomials of $S$ and $T$ will have different coefficients hence $P(S) \neq P(T)$. Suppose that the stems of $S$ and $T$ are of the same length, $S_b = \wedge_n(S_1, \ldots, S_n)$ and $T_b = \wedge_{n'}(T_1, \ldots, T_{n'})$ where for all $i$, $S_i$ and $T_i$ are rooted trees in $\mathcal{T}_r$ with fewer than $k + 1$ leaf vertices. If $P(S) = P(T)$, then $P(S_b) = P(T_b)$, that is, $\prod_{i=1}^{n} P(S_i) = \prod_{i=1}^{n'} P(T_i)$. Since these polynomials are irreducibles in $\mathbb{Z}[x, y]$ according to Lemma 2.6 and $\mathbb{Z}[x, y]$ is a unique factorization domain, we know $n = n'$ and, without loss of generality, $P(S_i) = P(T_i)$ for all $i$ after a rearrangement of labels. Then, the hypothesis implies that $S_i \cong T_i$ for all $i$, hence, $S_b \cong T_b$. Note that the stems of $S$ and $T$ are of the same length. Therefore, $S \cong T$, which contradicts the assumption. ∎
Proposition 2.2 and Proposition 2.7 imply that the polynomial is a complete isomorphism invariant for rooted trees.
Theorem 2.8.
$T_1, T_2 \in \mathcal{T}_r$ are isomorphic if and only if $P(T_1) = P(T_2)$.
Let $\mathcal{T}_r^{*}$ be the set of rooted trees such that every internal vertex has more than one child and $\mathcal{T}_r^{n}$ be the set of rooted $n$-ary trees where $n \geq 2$ is an integer. For any tree $T$ in $\mathcal{T}_r^{*}$ and any prime number $p$, the polynomial $P_p(T)$ is defined by substituting $p$ for $y$ in the polynomial $P(T)$. We can prove that the polynomial $P_p(T)$ is an irreducible in $\mathbb{Z}[x]$ for any tree $T$ in $\mathcal{T}_r^{*}$ by substituting a prime number $p$ for $y$ in the proof of Lemma 2.6. Then, the following corollary follows the proof of Proposition 2.7.
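Corollary 2.9.
$T_1, T_2 \in \mathcal{T}_r^{*}$ are isomorphic if and only if $P_p(T_1) = P_p(T_2)$. In particular, $T_1, T_2 \in \mathcal{T}_r^{n}$ are isomorphic if and only if $P_p(T_1) = P_p(T_2)$.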
However, for any prime number $p$, there exists a pair of non-isomorphic rooted trees in $\mathcal{T}_r$ with the same polynomial $P_p$, where there exists at least one internal vertex that has only one child. Figure 2 shows a pair of such trees. For any prime number , we can choose the length of the stem of the tree to be . An interesting question is to determine the values of the integer $k$ such that substituting $k$ for $y$ in $P(T)$ gives a complete isomorphism invariant for rooted $n$-ary trees or trees in $\mathcal{T}_r^{*}$. If , it can be checked by computer that is not a complete isomorphism invariant for rooted -ary or binary trees when . For any other integer, it is not known whether the resulting polynomial is a complete isomorphism invariant for rooted binary trees or not. It is not known either for other rooted $n$-ary trees.
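Corollary 2.9 can be sanity-checked by brute force on small rooted binary trees. The sketch below enumerates all non-isomorphic rooted binary trees with a given number of leaves, substitutes the prime $2$ for $y$, and counts distinct univariate polynomials. The canonical nested-tuple encoding and the names `binary_trees`, `uni_poly` and `all_distinct` are assumptions made for this illustration; it does not reproduce the computer check mentioned in the text for non-prime substitutions.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def binary_trees(n):
    """All non-isomorphic rooted binary trees with n leaves, in a canonical sorted form."""
    if n == 1:
        return ((),)
    found = set()
    for k in range(1, n // 2 + 1):
        for left in binary_trees(k):
            for right in binary_trees(n - k):
                found.add(tuple(sorted((left, right))))   # children are unordered
    return tuple(sorted(found))

def uni_poly(tree, p):
    """The tree polynomial with y replaced by the integer p, stored as {deg_x: coeff}."""
    if not tree:
        return {1: 1}
    prod = {0: 1}
    for child in tree:
        child_poly = uni_poly(child, p)
        nxt = {}
        for a, c in prod.items():
            for d, f in child_poly.items():
                nxt[a + d] = nxt.get(a + d, 0) + c * f
        prod = nxt
    prod[0] = prod.get(0, 0) + p
    return prod

def all_distinct(n, p):
    """True if the trees with n leaves all receive different polynomials for this p."""
    trees = binary_trees(n)
    polys = {frozenset(uni_poly(t, p).items()) for t in trees}
    return len(polys) == len(trees)

distinct_up_to_7 = all(all_distinct(n, 2) for n in range(1, 8))
```

The same `all_distinct` helper could be called with other integers in place of the prime to explore the open question raised above, although a brute-force search of this kind says nothing about trees larger than those enumerated.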
3. A tree distinguishing polynomial
3.1. The polynomial for unrooted trees
Let $\mathcal{T}_u$ be the set of unrooted trees, and $\mathcal{T} = \mathcal{T}_r \cup \mathcal{T}_u$. Suppose $T$ is a tree in $\mathcal{T}_u$ with $n$ leaf vertices. A leaf edge of $T$ is an edge of $T$ that is incident to a leaf vertex. For each leaf edge of $T$, we can construct a rooted tree by contracting the leaf edge and identifying the contracted edge as the root vertex of the rooted tree. Denote the set of such rooted trees constructed from $T$ by $R(T)$. Note that $R(T)$ has $n$ elements and some of them may be isomorphic.
Lemma 3.1.
$T_1, T_2 \in \mathcal{T}_u$ are isomorphic if and only if there exists a bijection $\phi: R(T_1) \to R(T_2)$ such that for any $S$ in $R(T_1)$, $S$ is isomorphic to $\phi(S)$.
Proof.
Suppose $T_1 \cong T_2$ and $f: T_1 \to T_2$ is the isomorphism. Let $S$ be an arbitrary tree in $R(T_1)$ and $e$ be the edge of $T_1$ that is contracted to attain $S$. We define a function $\phi: R(T_1) \to R(T_2)$ such that $\phi(S) = S'$ where $S'$ is the tree in $R(T_2)$ constructed by contracting $f(e)$. $\phi$ is a bijection because $f$ is a bijection between the set of leaf edges of $T_1$ and the set of leaf edges of $T_2$. Conversely, suppose that $S$ is a rooted tree in $R(T_1)$ and $\phi(S)$ is a rooted tree in $R(T_2)$. We can reconstruct $T_1$ and $T_2$ from $S$ and $\phi(S)$ by recovering the contracted edges, that is, adding an edge and a leaf vertex to the root vertices of $S$ and $\phi(S)$ respectively. Therefore, $S \cong \phi(S)$ implies $T_1 \cong T_2$. ∎
Now, we generalize the polynomial $P$ in Definition 2.1 to $P: \mathcal{T} \to \mathbb{Z}[x, y]$ in the following way. If a tree $T$ in $\mathcal{T}$ is rooted, then $P(T)$ is the polynomial defined as in Definition 2.1. If a tree $T$ is unrooted, then we define $P(T) = \prod_{S \in R(T)} P(S)$. For example, the polynomial of the unrooted $n$-star is $(x^{n-1} + y)^n$. Note that for any trees $T_1$ and $T_2$ in $\mathcal{T}$, if $T_1$ is rooted and $T_2$ is unrooted, we always consider that $T_1$ is not isomorphic to $T_2$ even if the only difference between $T_1$ and $T_2$ is an identified root vertex. We prove that the polynomial $P$ is a complete isomorphism invariant for trees.
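The construction for unrooted trees can be sketched as follows: for every leaf, contract its leaf edge, root the tree at the merged vertex, and multiply the resulting rooted polynomials. The adjacency-dictionary input, the dictionary encoding of polynomials and the names `tree_poly` and `unrooted_poly` are assumptions made for this illustration, and the sketch assumes a tree with at least three vertices.

```python
from collections import Counter

def poly_mul(p, q):
    """Multiply two bivariate polynomials stored as {(deg_x, deg_y): coeff}."""
    out = Counter()
    for (a, b), c in p.items():
        for (d, e), f in q.items():
            out[(a + d, b + e)] += c * f
    return dict(out)

def tree_poly(tree):
    """Rooted tree polynomial; a rooted tree is a tuple of child subtrees."""
    if not tree:
        return {(1, 0): 1}
    prod = {(0, 0): 1}
    for child in tree:
        prod = poly_mul(prod, tree_poly(child))
    prod[(0, 1)] = prod.get((0, 1), 0) + 1
    return prod

def unrooted_poly(adj):
    """Product of rooted polynomials over all leaf-edge contractions of an unrooted tree.

    adj maps each vertex to the list of its neighbours.
    """
    def rooted_at(u, parent):
        # Subtree hanging below u when the tree is viewed from `parent`.
        return tuple(rooted_at(w, u) for w in adj[u] if w != parent)

    result = {(0, 0): 1}
    for leaf, neighbours in adj.items():
        if len(neighbours) == 1:
            # Contracting the leaf edge merges the leaf into its neighbour,
            # and the merged vertex becomes the root.
            result = poly_mul(result, tree_poly(rooted_at(neighbours[0], leaf)))
    return result

star3 = unrooted_poly({0: [1, 2, 3], 1: [0], 2: [0], 3: [0]})   # (x^2 + y)^3
path4 = unrooted_poly({0: [1], 1: [0, 2], 2: [1, 3], 3: [2]})   # (x + 2y)^2
```

Renaming the vertices of the input does not change the output, since only the shapes of the rooted trees enter the product.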
Theorem 3.2.
$T_1, T_2 \in \mathcal{T}$ are isomorphic if and only if $P(T_1) = P(T_2)$.
Proof.
If $T_1 \cong T_2$, then either both of them are rooted or both of them are unrooted. If both of them are rooted, then $P(T_1) = P(T_2)$ follows from Theorem 2.8. If both of them are unrooted, it follows from Lemma 3.1 and Theorem 2.8 that $P(T_1) = P(T_2)$. On the other hand, if $T_1 \not\cong T_2$, we have three cases. First, if both of them are rooted, then $P(T_1) \neq P(T_2)$ follows from Theorem 2.8. Second, if one of them is rooted and the other is unrooted, then $P(T_1) \neq P(T_2)$ because the polynomial for the rooted tree is an irreducible in $\mathbb{Z}[x, y]$ and the polynomial for the unrooted one is not. Third, if both of them are unrooted, then $P(T_1) \neq P(T_2)$ because otherwise Theorem 2.8 and $\mathbb{Z}[x, y]$ being a unique factorization domain imply that there exists a bijection $\phi: R(T_1) \to R(T_2)$ such that $S \cong \phi(S)$ for any $S \in R(T_1)$. This contradicts Lemma 3.1, hence, $P(T_1) \neq P(T_2)$. ∎
The proof of Theorem 3.2 shows that whenever we have a polynomial that represents a class of rooted trees, if (i) the polynomial ring is a unique factorization domain, (ii) the polynomial is a complete isomorphism invariant for the class of rooted trees and (iii) the polynomials of rooted trees in the class are irreducibles in the polynomial ring, then we can generalize the polynomial to the corresponding class of unrooted trees and the resulting polynomial distinguishes these unrooted trees. In particular, the univariate polynomial $P_p$ for rooted $n$-ary trees can be generalized to distinguish $n$-ary trees. Let $\mathcal{T}^{n}$ be the set of $n$-ary trees including the rooted trees and the unrooted trees and the polynomial $P_p$ is defined such that for any $T$ in $\mathcal{T}^{n}$, if $T$ is rooted, then $P_p(T)$ is defined as in Section 2.3 and if $T$ is unrooted, then $P_p(T) = \prod_{S \in R(T)} P_p(S)$.
Corollary 3.3.
$T_1, T_2 \in \mathcal{T}^{n}$ are isomorphic if and only if $P_p(T_1) = P_p(T_2)$.
Now we know that one variable is sufficient to uniquely represent $n$-ary trees by polynomials. An interesting question is whether one variable is sufficient to uniquely represent all trees by polynomials. The zero loci of the polynomials for trees may also be interesting.
3.2. A generalization
A polynomial distinguishing leaf labeled trees has various applications in linguistics and mathematical biology, especially in phylogenetics. The coefficients of a polynomial can be considered as a vector, so norms, and hence metrics on trees, can be induced from tree distinguishing polynomials. Tree metrics, especially metrics for leaf labeled trees, have several biological applications, for example, to compare and classify phylogenetic tree shapes [7]. The polynomial $P$ can be generalized to represent leaf labeled rooted trees in a natural way. Given a tree $T$ and its polynomial $P(T)$, we can consider the tree as a vertex labeled tree such that each of its leaf vertices has a label $x$ and an internal vertex $v$ has the polynomial $P(T_v)$ as its label, where $T_v$ is the affix tree of $T$ to the vertex $v$. Thus the root vertex of $T$ has the label $P(T)$. If the leaf vertices of a tree have different labels and $\mathcal{T}_r^{L}$ denotes the set of leaf labeled rooted trees, we define an analogous polynomial $P_L$ as follows.
Definition 3.4.
Let $\bullet_i$ be the trivial tree with a single vertex that is labeled by $i$ and $T = \wedge_n(T_1, \ldots, T_n)$ be an arbitrary tree in $\mathcal{T}_r^{L}$, where $T_1, \ldots, T_n \in \mathcal{T}_r^{L}$. The polynomial $P_L$ is defined by the following rules.
(1) $P_L(\bullet_i) = x_i$,
(2) $P_L(T) = y + \prod_{j=1}^{n} P_L(T_j)$.
Note that different leaf vertices may have the same label. If we set $x_i = x$ for all $i$, then $P_L(T) = P(T)$. The polynomial $P_L$ is a complete isomorphism invariant for leaf labeled rooted trees, where two leaf labeled trees in $\mathcal{T}_r^{L}$ being isomorphic means not only that the unlabeled trees are isomorphic but also that the labels of the corresponding leaf vertices of the two trees are identical.
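The labeled variant can be sketched with the same recurrence, replacing the single variable $x$ by one variable per leaf label. In this illustration a leaf is a Python string holding its label, an internal vertex is a tuple of subtrees, and a monomial is keyed by the sorted tuple of leaf labels it contains together with its degree in $y$; the encoding and the name `labeled_poly` are assumptions, not the paper's notation.

```python
from collections import Counter

def labeled_poly(tree):
    """Leaf labeled tree polynomial as {(sorted tuple of labels, deg_y): coeff}.

    A leaf is a string label; an internal vertex is a tuple of subtrees.
    """
    if isinstance(tree, str):
        return {((tree,), 0): 1}                  # rule (1): a leaf labeled i gives x_i
    prod = {((), 0): 1}
    for child in tree:
        child_poly = labeled_poly(child)
        nxt = Counter()
        for (xs, b), c in prod.items():
            for (zs, e), f in child_poly.items():
                # Multiply monomials: merge the label multisets and add the y degrees.
                nxt[(tuple(sorted(xs + zs)), b + e)] += c * f
        prod = dict(nxt)
    prod[((), 1)] = prod.get(((), 1), 0) + 1      # rule (2): add y to the product
    return prod

t1 = ("a", ("b", "c"))      # y + x_a (y + x_b x_c)
t2 = ("b", ("a", "c"))      # same shape, labels a and b exchanged
t3 = (("c", "b"), "a")      # t1 with children listed in a different order

different = labeled_poly(t1) != labeled_poly(t2)
same = labeled_poly(t1) == labeled_poly(t3)
```

Giving every leaf the same label collapses the key of each monomial to a repeated label, which corresponds to setting all $x_i$ equal to $x$ and recovers the unlabeled polynomial.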
Corollary 3.5.
$T_1, T_2 \in \mathcal{T}_r^{L}$ are isomorphic if and only if $P_L(T_1) = P_L(T_2)$.
Proof.
To prove this corollary, we claim that if a polynomial in is an irreducible in then the polynomial in by changing each in to some is also an irreducible in . Then, the corollary follows the proof of Theorem 2.8. The proof of the claim is trivial because if a polynomial is not an irreducible, say , then by substituting any with in the equation , we have where and are in for all . This contradicts that is an irreducible in . Hence, the polynomial obtained by substituting in with some is an irreducible in .∎
Let be the set of leaf labeled unrooted trees and define . Since the polynomial is a complete isomorphism invariant for leaf labeled rooted trees and for any tree in , is an irreducible in the polynomial ring, according to Section 3.1, we can generalize the polynomial to a polynomial such that for any leaf labeled rooted tree, its polynomial is defined as in Definition 3.4 and for any leaf labeled unrooted tree , .
Corollary 3.6.
are isomorphic if and only if .
Proof.
To prove this corollary, we only need to generalize Lemma 3.1 for leaf labeled trees, that is, are isomorphic if and only if there exists a bijection such that for any in , is isomorphic to . Note that if is the isomorphism, then any leaf edge of should have the same label as the leaf edge of . Besides, for any , if is constructed by contracting a leaf edge of , we consider that the root vertex of is labeled and it is of the same label as the leaf edge of . Moreover, requires not only the corresponding leaf vertices but also the root vertices to have the same label. Thus, this can be proved similarly to the proof of Lemma 3.1. ∎
Acknowledgments
The author would like to thank Priscila Do Nascimento Biller, Caroline Colijn and Gábor Hetyei for helpful comments. This work was supported by the grant of the Federal Government of Canada’s Canada 150 Research Chair program to Dr. Caroline Colijn.
References
[1] C. Adams, The Knot Book. W. H. Freeman, New York (1994).
[2] J. Aliste-Prieto, A. de Mier and J. Zamora, On trees with the same restricted U-polynomial and the Prouhet-Tarry-Escott problem. Discrete Math. 340 (2017), 1435-1441.
[3] D. Andrén and K. Markström, The bivariate Ising polynomial of a graph. Discrete Appl. Math. 157 (2009), 2515-2524.
[4] B. Bollobás and O. Riordan, Polychromatic polynomials. Discrete Math. 219 (2000), 1-7.
[5] T. Brylawski, Intersection theory for graphs. J. Combin. Theory B 30 (1981), 233-246.
[6] S. Chaudhary and G. Gordon, Tutte polynomials for trees. J. Graph Theory 15 (1991), 317-331.
[7] C. Colijn and G. Plazzotta, A metric on phylogenetic tree shapes. Syst. Biol. 67 (2018), 113-126.
[8] É. Ghys, A singular mathematical promenade. Preprint, arXiv:1612.06373.
