A tree distinguishing polynomial

Pengyu Liu

arXiv:1904.03332·math.CO·February 13, 2020·Discret. Appl. Math.

TL;DR

This paper introduces a bivariate polynomial for unlabeled rooted trees, proves that it is a complete invariant for rooted tree isomorphism, and generalizes it to a complete invariant for unlabeled unrooted trees.

Contribution

The paper defines a new polynomial invariant for unlabeled trees and proves its completeness for both rooted and unrooted cases, extending previous invariants.

Findings

01. The polynomial uniquely identifies unlabeled rooted trees.

02. The polynomial extends to unrooted trees as a complete invariant.

03. The polynomial serves as a generating function for certain subtrees.

Abstract

We define a bivariate polynomial for unlabeled rooted trees and show that the polynomial of an unlabeled rooted tree $T$ is the generating function of a class of subtrees of $T$ . We prove that the polynomial is a complete isomorphism invariant for unlabeled rooted trees. Then, we generalize the polynomial to unlabeled unrooted trees and we show that the generalized polynomial is a complete isomorphism invariant for unlabeled unrooted trees.

2010 Mathematics Subject Classification:

05C31 and 05C05

Keywords: Graph polynomial, Trees, Isomorphism invariant.

Department of Mathematics, Simon Fraser University. Burnaby, BC V5A 1S6, Canada.

E-mail: [email protected]

1. Introduction

Polynomial invariants are important tools in the study of graphs, knots and links. The Tutte polynomial [20] is the most investigated polynomial invariant for graphs. As an isomorphism invariant of graphs, the Tutte polynomial carries some information of a graph, for example, the chromatic polynomial and the number of spanning trees of the graph. The well known Jones polynomial [10] and the HOMFLY polynomial [9] are important invariants for knots and links which are related to the crossing number and the braid index of knots and links respectively [1]. However, the Tutte polynomial fails to distinguish trees. Actually, all trees with the same number of edges have the same Tutte polynomial. In the doctoral thesis of Law [11], polynomials were divided into three levels according to their tree distinguishing power, where the most powerful (level three) polynomial is the chromatic symmetric function introduced in 1995 by Stanley [19]. It is proved that the chromatic symmetric function distinguishes some classes of trees including spiders [13] and caterpillars [12], but it remains a conjecture that the chromatic symmetric function is a complete isomorphism invariant for trees. The $U$ -polynomial defined by Noble and Welsh [17], which is equivalent to the polychromate introduced by Brylawski [5, 14, 18], determines the chromatic symmetric function and vice versa when restricted to trees. See [17] and [2]. The strong polychromate defined by Bollobás and Riordan [4] is also equivalent to the polynomials above when restricted to trees. Hence, whether these polynomials are complete invariants for trees depends on whether the chromatic symmetric function is a complete invariant for trees. The level two polynomials are the polynomials that distinguish rooted trees. These polynomials include the subtree polynomial introduced by Chaudhary and Gordon [6], the Ising polynomial introduced by Andrén and Markström [3] and the Negami polynomial introduced by Negami and Ota [16]. 
However, it has been unknown to date whether there exists a polynomial that is a complete isomorphism invariant for unrooted trees.

In the emerging fields of phylogenetics and linguistics, the information carried by the shapes of trees needs to be analyzed and compared quantitatively and accurately. A polynomial invariant, especially a complete invariant, for trees is a potentially convenient tool for this task because polynomials are well studied mathematical objects. With this motivation, we introduce a new polynomial that is a complete isomorphism invariant for trees, which, to our knowledge, is the first of its kind.

Before demonstrating the results, we clarify the terminology used in this paper. Trees are unlabeled unless otherwise stated. The order of a tree stands for the number of vertices in the tree. A monomial always has coefficient one and a term in a polynomial may have any integer coefficient. All of the internal vertices of a rooted $m$ -ary tree have exactly $m$ children. Similarly, every internal vertex of an unrooted $m$ -ary tree has degree $m+1$ . This paper is structured as follows. First, we define a polynomial for rooted trees and investigate the information about the trees carried by the polynomial. Then, we show that the polynomials for rooted trees are irreducibles in the polynomial ring and that the polynomial distinguishes rooted trees. Finally, we introduce a systematic way to generalize a polynomial that distinguishes rooted trees to a polynomial that distinguishes unrooted trees.

2. A polynomial for rooted trees

2.1. Definitions

Let $\mathcal{T}_{r}$ be the set of rooted trees and $k\geq 1$ be an integer. The rooted $k$ -star is the tree in $\mathcal{T}_{r}$ of order $k+1$ with $k$ leaf vertices in which all of the leaf vertices are adjacent to the single internal vertex that is identified as the root. The $k$ -wedge operation $\wedge_{k}:\mathcal{T}_{r}^{k}\to\mathcal{T}_{r}$ is defined such that for $k$ rooted trees $T_{1}$ , $T_{2}$ ,…, $T_{k}$ , $\wedge_{k}(T_{1},T_{2},...,T_{k})$ is the tree in $\mathcal{T}_{r}$ constructed by pasting the roots of the $k$ trees to the $k$ leaf vertices of the rooted $k$ -star respectively. Note that the operation $\wedge_{k}$ is permutative, that is, for any permutation $\pi\in S_{k}$ , where $S_{k}$ is the symmetric group, $\wedge_{k}(T_{1},T_{2},...,T_{k})=\wedge_{k}(T_{\pi(1)},T_{\pi(2)},...,T_{\pi(k)})$ . Besides, the rooted $m$ -ary trees can be constructed recursively using only $\wedge_{m}$ . In particular, the rooted binary trees can be constructed recursively using only $\wedge_{2}$ .

Definition 2.1.

Let $\bullet$ be the trivial tree with one vertex and $T=\wedge_{k}(T_{1},T_{2},...,T_{k})$ be an arbitrary tree in $\mathcal{T}_{r}$ , where $k\geq 1$ . The polynomial $P:\mathcal{T}_{r}\to\mathbb{Z}[x,y]$ is defined by the following rules.

(1) $P(\bullet)=x$,

(2) $P(T)=y+\prod_{i=1}^{k}P(T_{i})$.

For example, the rooted $k$ -star $T$ can be constructed by applying $\wedge_{k}$ to $k$ trivial trees, so its polynomial is $P(T)=y+x^{k}$ . Let $l\geq 0$ be an integer. The rooted path of length $l$ is the path with $l+1$ vertices such that one of the leaf vertices is identified as the root. The rooted path $T$ of length $l$ can be constructed by applying the $1$ -wedge operation $l$ times starting from the trivial tree. According to the definition above, the polynomial of $T$ is $P(T)=ly+x$ .
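The recurrence in Definition 2.1 translates directly into a short recursive computation. The following is a minimal sketch, not part of the original paper. It assumes a rooted tree is encoded as a nested tuple (the empty tuple `()` is the trivial tree, and a tuple of $k$ children stands for $\wedge_{k}$ applied to them) and a polynomial in $\mathbb{Z}[x,y]$ is a dictionary mapping exponent pairs `(a, b)` of $x^{a}y^{b}$ to integer coefficients. The helper names `poly_mul` and `tree_poly` are hypothetical.

```python
from collections import Counter


def poly_mul(p, q):
    """Multiply two polynomials stored as {(x_exp, y_exp): coeff} dicts."""
    out = Counter()
    for (a1, b1), c1 in p.items():
        for (a2, b2), c2 in q.items():
            out[(a1 + a2, b1 + b2)] += c1 * c2
    return dict(out)


def tree_poly(tree):
    """P(tree) for a rooted tree in the assumed nested-tuple encoding."""
    if not tree:                      # trivial tree: P = x
        return {(1, 0): 1}
    prod = {(0, 0): 1}                # the constant polynomial 1
    for child in tree:
        prod = poly_mul(prod, tree_poly(child))
    prod[(0, 1)] = prod.get((0, 1), 0) + 1   # add y
    return prod


three_star = ((), (), ())             # rooted 3-star
path_2 = (((),),)                     # rooted path of length 2
print(tree_poly(three_star))
print(tree_poly(path_2))
```

Under this encoding the rooted $3$-star yields $x^{3}+y$ and the rooted path of length $2$ yields $x+2y$, in agreement with the closed forms $y+x^{k}$ and $ly+x$ stated above.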

Let $T$ be a rooted tree in $\mathcal{T}_{r}$ and $v$ be a vertex of $T$. The affix tree of $T$ to the vertex $v$, denoted by $T^{\prime}_{v}$, is the subtree of $T$ induced by the vertex $v$ and all of the descendants of $v$. Note that if $v$ is a leaf vertex, then $T^{\prime}_{v}$ is the trivial tree with a single vertex. If the root vertex of $T$ has degree one, the branching vertex of $T$ is defined to be the nearest vertex to the root with degree greater than two. If $T$ does not have a vertex with degree greater than two, then it is a rooted path and it does not have a branching vertex. If the root vertex of $T$ has degree greater than one, then the root vertex is also the branching vertex. Since the affix tree of $T$ to the branching vertex is used repeatedly in this paper, we denote this specific affix tree by $T^{\prime}$. The stem of $T$ is the path from its root vertex to its branching vertex. See Figure 1 for an example. In particular, if the root vertex of $T$ has degree greater than one, that is, the root vertex is also the branching vertex, then the stem consists of only the root vertex and is of length zero.

Proposition 2.2.

The function $P:\mathcal{T}_{r}\to\mathbb{Z}[x,y]$ is well defined.

Proof.

Denote the number of leaf vertices of a tree in $\mathcal{T}_{r}$ by $n$ . We prove the proposition by strong induction on $n$ .

(1) If $n=1$, the rooted trees in $\mathcal{T}_{r}$ with a single leaf vertex are rooted paths. The polynomial of the rooted path $T$ of length $l$ is $P(T)=ly+x$. If two rooted paths are isomorphic, they must have the same length, hence the same polynomial.

(2) Assume that for $n\leq N$, the polynomials of isomorphic trees are identical.

(3) If $n=N+1$, let $T$ and $B$ be two isomorphic trees in $\mathcal{T}_{r}$. $T\simeq B$ implies that their stems are of the same length and their affix trees to the branching vertices $T^{\prime}=\wedge_{k}(T_{1},T_{2},...,T_{k})$ and $B^{\prime}=\wedge_{l}(B_{1},B_{2},...,B_{l})$ are also isomorphic, hence $k=l>1$. Note that the numbers of leaf vertices in $T_{i}$ and $B_{i}$ are fewer than $N+1$ for all $1\leq i\leq k$. Since the $k$-wedge operation is permutative, we may assume without loss of generality that $T^{\prime}\simeq B^{\prime}$ implies $T_{i}\simeq B_{i}$ for all $1\leq i\leq k$. According to the hypothesis, $P(T^{\prime})=y+\prod_{i=1}^{k}P(T_{i})=y+\prod_{i=1}^{k}P(B_{i})=P(B^{\prime})$. Let $s$ be the common length of the stems of $T$ and $B$. We can construct $T$ and $B$ by recursively applying the $1$-wedge operation $s$ times to $T^{\prime}$ and $B^{\prime}$. According to Definition 2.1, $P(T)=sy+P(T^{\prime})=sy+P(B^{\prime})=P(B)$. ∎

Note that according to the proof above, a tree $T$ in $\mathcal{T}_{r}$ with a stem of length $l$ has the polynomial $P(T)=ly+P(T^{\prime})$ , where $T^{\prime}$ is the affix tree of $T$ to the branching vertex.

To compute the polynomial of a tree in $\mathcal{T}_{r}$ , we can apply the recurrence relation in Definition 2.1. Besides, the polynomial can also be computed directly using the Dyck word [8] of the tree by placing a symbol $x$ in every pair of parentheses that represents a leaf vertex and placing the symbol $+y$ before the end of every pair of parentheses that represents an internal vertex. For example, if the Dyck word of a tree is $((()()))$ then its polynomial should be $(((x)(x)+y)+y)=x^{2}+2y$ . On the other hand, to reconstruct a tree from its polynomial, we may compute its Dyck word by recursively subtracting $y$ and factoring the rest of the polynomial. See [15] for methods to factor large multivariate polynomials.
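The Dyck-word rule above can be sketched as a single left-to-right pass with a stack. This is an illustrative sketch under the same assumptions as before: polynomials are dictionaries `{(a, b): coeff}` for $x^{a}y^{b}$, and the function name `dyck_to_poly` is hypothetical. A pair of parentheses with nothing inside is read as a leaf (contributing $x$), and any other pair multiplies the polynomials of its children and adds $y$.

```python
from collections import Counter


def poly_mul(p, q):
    """Multiply two polynomials stored as {(x_exp, y_exp): coeff} dicts."""
    out = Counter()
    for (a1, b1), c1 in p.items():
        for (a2, b2), c2 in q.items():
            out[(a1 + a2, b1 + b2)] += c1 * c2
    return dict(out)


def dyck_to_poly(word):
    """Evaluate the tree polynomial directly from a Dyck word."""
    stack = []          # each frame: [product of children's polynomials, child count]
    result = None
    for ch in word:
        if ch == "(":
            stack.append([{(0, 0): 1}, 0])
        elif ch == ")":
            prod, n_children = stack.pop()
            if n_children == 0:
                poly = {(1, 0): 1}                       # leaf vertex: x
            else:
                poly = dict(prod)
                poly[(0, 1)] = poly.get((0, 1), 0) + 1   # internal vertex: + y
            if stack:
                stack[-1][0] = poly_mul(stack[-1][0], poly)
                stack[-1][1] += 1
            else:
                result = poly
        else:
            raise ValueError("unexpected character in Dyck word")
    if stack or result is None:
        raise ValueError("unbalanced or empty Dyck word")
    return result


print(dyck_to_poly("((()()))"))
```

For the word $((()()))$ the sketch returns the dictionary for $x^{2}+2y$, matching the example in the text. The reverse direction (subtracting $y$ and factoring) is not sketched here because it needs a multivariate factoring routine.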

2.2. Interpretation of the polynomial

To determine how the polynomial describes the features of trees in $\mathcal{T}_{r}$ , we introduce the following definitions. Let $T$ be a tree in $\mathcal{T}_{r}$ . A primary subtree $S$ of $T$ is a rooted subtree of $T$ such that $S$ shares the same root vertex with $T$ and any leaf vertex of $T$ is either a leaf vertex of $S$ or a descendant of a leaf vertex of $S$ . In other words, no leaf vertex of $T$ can be a descendant of an internal vertex of a primary subtree $S$ of $T$ . Denote the set of all primary subtrees of $T$ by $\mathcal{S}_{T}$ . For any primary subtree $S$ of $T$ , we assign a monomial $q(S)=x^{\alpha}y^{\beta}$ to the primary subtree if $S$ possesses $\alpha+\beta$ leaf vertices and $\alpha$ of them are leaf vertices of $T$ and $\beta$ of them are internal vertices of $T$ . Hence, the total degree of $q(S)$ is the number of leaf vertices of $S$ . See Figure 1 for an example. Note that if a tree in $\mathcal{T}_{r}$ is not the trivial tree, we always consider the root vertex of the tree as an internal vertex even if it has degree one. For example, the rooted $k$ -star has two primary subtrees. One is the trivial tree with only the root vertex, which corresponds to a monomial $q(S)=y$ . The other is the tree that is isomorphic to the rooted $k$ -star, which corresponds to the monomial $q(S)=x^{k}$ . The rooted path $T$ of length $l$ has $l+1$ primary subtrees, which are the paths from the root to each of its vertices including the root. If $S$ is the primary subtree from the root to the leaf vertex, then $q(S)=x$ since the leaf vertex of $S$ is also the leaf vertex of $T$ . For any other primary subtree $S$ of $T$ , $q(S)=y$ because the leaf vertex of any $S$ is an internal vertex of $T$ .

Let $G_{1}=(V_{1},E_{1})$ and $G_{2}=(V_{2},E_{2})$ be two subgraphs of a graph $G$ and $V_{1}$ , $V_{2}$ , $E_{1}$ , $E_{2}$ are the sets of vertices and edges of $G_{1}$ and $G_{2}$ respectively. We define the intersection $G_{1}\cap G_{2}=(V_{1}\cap V_{2},E_{1}\cap E_{2})$ and $G_{1}\cap G_{2}=\emptyset$ if $V_{1}\cap V_{2}=\emptyset$ and $E_{1}\cap E_{2}=\emptyset$ .

Lemma 2.3.

Let $T$ be a tree in $\mathcal{T}_{r}$ and $T^{\prime}$ be the affix tree of $T$ to the branching vertex. The following statements about primary subtrees and the monomials are true.

(1) For any primary subtree $S$ of $T$, if $S\cap T^{\prime}\neq\emptyset$, then $S^{\prime}=S\cap T^{\prime}$ is a primary subtree of $T^{\prime}$ and $q(S)=q(S^{\prime})$.

(2) If $T=\wedge_{k}(T_{1},T_{2},...,T_{k})$ with $k>1$, any primary subtree $S$ of $T$ is of the form $\wedge_{k}(S_{1},S_{2},...,S_{k})$, where $S_{i}\in\mathcal{S}_{T_{i}}$ for all $1\leq i\leq k$, except for the primary subtree $S_{r}$ which consists of only the root vertex of $T$.

(3) Suppose that $T=\wedge_{k}(T_{1},T_{2},...,T_{k})$ with $k>1$ and let $S$ be a primary subtree of $T$ such that $S=\wedge_{k}(S_{1},S_{2},...,S_{k})$, where $S_{i}\in\mathcal{S}_{T_{i}}$ and $q(S_{i})=x^{\alpha_{i}}y^{\beta_{i}}$ for all $1\leq i\leq k$. Then $q(S)=\prod_{i=1}^{k}q(S_{i})=x^{\sum_{i=1}^{k}\alpha_{i}}y^{\sum_{i=1}^{k}\beta_{i}}$, which is still a monomial.

Proof.

For (1), suppose first that $S\cap T^{\prime}\neq\emptyset$. Then $S^{\prime}=S\cap T^{\prime}$ is a rooted tree that shares the same root with $T^{\prime}$, and any leaf vertex or internal vertex of $T^{\prime}$ is also a leaf vertex or an internal vertex of $T$ respectively. Hence, any leaf vertex of $T^{\prime}$ is a leaf vertex of $S^{\prime}$ or a descendant of a leaf vertex of $S^{\prime}$, and according to the definition of a primary subtree and the definition of the monomial, $S^{\prime}$ is a primary subtree of $T^{\prime}$ with $q(S)=q(S^{\prime})$. If instead $S\cap T^{\prime}=\emptyset$, then $S$ contains no vertex of $T^{\prime}$, that is, all the vertices of $S$ are on the stem. Therefore, $S$ is a rooted path whose leaf vertex is an internal vertex of $T$, and $q(S)=y$.

For (2), note that any primary subtree $S\not\simeq S_{r}$ of $T$ is rooted at the root vertex of $T$ and every leaf vertex of $T$ is a leaf vertex of $S$ or a descendant of a leaf vertex of $S$. Hence, for any $1\leq i\leq k$, there exists at least one leaf vertex of $S$ in $T_{i}$. Let $L_{i}$ be the set of leaf vertices of $S$ in $T_{i}$ and $S_{i}$ be the subtree of $T_{i}$ induced by $L_{i}$ and all the ancestors in $T_{i}$ of vertices in $L_{i}$. $S_{i}$ is a primary subtree of $T_{i}$ since $S$ is a primary subtree of $T$ and every leaf vertex of $T_{i}$ is in $L_{i}$ or is a descendant of a vertex in $L_{i}$. Therefore, the primary subtree $S=\wedge_{k}(S_{1},S_{2},...,S_{k})$. Conversely, if $T=\wedge_{k}(T_{1},T_{2},...,T_{k})$ with $k>1$, any tree $S$ of the form $\wedge_{k}(S_{1},S_{2},...,S_{k})$, where $S_{i}\in\mathcal{S}_{T_{i}}$ for all $1\leq i\leq k$, is a primary subtree of $T$ according to the definition of primary subtrees.

For (3), the leaf vertices of $S_{i}$ that are leaf vertices of $T_{i}$ are also leaf vertices of $T$ and the leaf vertices of $S_{i}$ that are internal vertices of $T_{i}$ are also internal vertices of $T$ for any $1\leq i\leq k$ . ∎

Lemma 2.4.

$P(T)=\sum_{S\in\mathcal{S}_{T}}q(S)$ .

Proof.

We prove the lemma by strong induction on $n$ , the number of leaf vertices of a tree in $\mathcal{T}_{r}$ .

(1) If $n=1$, we know, from Section 2.1, that the rooted trees in $\mathcal{T}_{r}$ with a single leaf vertex are rooted paths and the polynomial of the rooted path $T$ of length $l$ is $P(T)=ly+x$. On the other hand, we also know that $T$ has $l+1$ primary subtrees. The primary subtree $S$ that is isomorphic to $T$ has the monomial $q(S)=x$ and each of the other $l$ primary subtrees has the monomial $q(S)=y$. Hence, $P(T)=\sum_{S\in\mathcal{S}_{T}}q(S)$.

(2) Assume $P(T)=\sum_{S\in\mathcal{S}_{T}}q(S)$ for all trees in $\mathcal{T}_{r}$ with $n\leq N$ leaf vertices.

(3) If $n=N+1$, let $T$ be an arbitrary tree in $\mathcal{T}_{r}$ with $N+1$ leaf vertices. Suppose $T$ has a stem of length $l$ and its affix tree to the branching vertex is $T^{\prime}=\wedge_{k}(T_{1},T_{2},...,T_{k})$, where $k>1$ and $T_{i}$ is a tree in $\mathcal{T}_{r}$ with fewer than $N+1$ leaf vertices for all $1\leq i\leq k$. We know that $P(T)=ly+P(T^{\prime})$ according to Section 2.1. By Lemma 2.3 (1) and its proof, $\mathcal{S}_{T}$ is partitioned according to whether $S\cap T^{\prime}$ is empty. For any $S\in\mathcal{S}_{T}$, if $S\cap T^{\prime}=\emptyset$, then $S$ is a rooted path with vertices on the stem of $T$. There exist $l$ such primary subtrees of $T$, namely, the paths from the root to each of the vertices of the stem except for the branching vertex, which contribute the $ly$ term in the polynomial $P(T)$. Besides, for any $S\in\mathcal{S}_{T}$, if $S^{\prime}=S\cap T^{\prime}\neq\emptyset$, then $S^{\prime}\in\mathcal{S}_{T^{\prime}}$ and $q(S)=q(S^{\prime})$. Therefore, if we can prove that $P(T^{\prime})=\sum_{S\in\mathcal{S}_{T^{\prime}}}q(S)$, then $P(T)=\sum_{S\in\mathcal{S}_{T}}q(S)$ follows. According to Definition 2.1 and the induction hypothesis, $P(T^{\prime})=y+\prod_{i=1}^{k}P(T_{i})=y+\prod_{i=1}^{k}\sum_{S\in\mathcal{S}_{T_{i}}}q(S)$. The monomial $y$ in $P(T^{\prime})$ corresponds to the primary subtree $S_{r}$ of $T^{\prime}$ since it is the only primary subtree of $T^{\prime}$ with a single leaf vertex. Any monomial in $\prod_{i=1}^{k}\sum_{S\in\mathcal{S}_{T_{i}}}q(S)$ is of the form $q(S_{1})q(S_{2})...q(S_{k})$ where $S_{i}\in\mathcal{S}_{T_{i}}$ for any $1\leq i\leq k$. We know from Lemma 2.3 (2) and (3) that $q(\wedge_{k}(S_{1},S_{2},...,S_{k}))=\prod_{i=1}^{k}q(S_{i})$ and $\wedge_{k}(S_{1},S_{2},...,S_{k})$ is a primary subtree of $T^{\prime}$. So every monomial in $P(T^{\prime})$ is a monomial in $\sum_{S\in\mathcal{S}_{T^{\prime}}}q(S)$. On the other hand, for any $S\in\mathcal{S}_{T^{\prime}}$ except for the trivial primary subtree $S_{r}$, $S=\wedge_{k}(S_{1},S_{2},...,S_{k})$ where $S_{i}\in\mathcal{S}_{T_{i}}$ for any $1\leq i\leq k$ according to Lemma 2.3 (2) and (3). For the primary subtree $S_{r}\in\mathcal{S}_{T^{\prime}}$, $q(S_{r})=y$ and $y$ is a monomial in $P(T^{\prime})$. For any other primary subtree $S\in\mathcal{S}_{T^{\prime}}$, $q(S)=\prod_{i=1}^{k}q(S_{i})$ which is also a monomial in $P(T^{\prime})$. Therefore, $P(T^{\prime})=\sum_{S\in\mathcal{S}_{T^{\prime}}}q(S)$ and $P(T)=\sum_{S\in\mathcal{S}_{T}}q(S)$. ∎

Lemma 2.4 shows that the polynomial of a tree $T$ can be interpreted as the generating function of the number of primary subtrees whose set of leaf vertices consists of $\alpha$ leaf vertices of $T$ and $\beta$ internal vertices of $T$ .
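Lemma 2.4 can be checked on small trees by brute force. The sketch below is illustrative and not from the paper: it enumerates every vertex subset that contains the root and is closed under taking parents, keeps those that satisfy the definition of a primary subtree, sums their monomials $q(S)$, and compares the result with the recurrence of Definition 2.1. The nested-tuple tree encoding, the dictionary representation `{(a, b): coeff}` for $x^{a}y^{b}$, and all function names are assumptions made for this example. The enumeration is exponential in the number of vertices, so it is only meant for tiny trees.

```python
from collections import Counter


def poly_mul(p, q):
    out = Counter()
    for (a1, b1), c1 in p.items():
        for (a2, b2), c2 in q.items():
            out[(a1 + a2, b1 + b2)] += c1 * c2
    return dict(out)


def tree_poly(tree):
    """P(tree) via the recurrence of Definition 2.1."""
    if not tree:
        return {(1, 0): 1}
    prod = {(0, 0): 1}
    for child in tree:
        prod = poly_mul(prod, tree_poly(child))
    prod[(0, 1)] = prod.get((0, 1), 0) + 1
    return prod


def index_tree(tree):
    """Number the vertices (root = 0); return parent and children lists."""
    parent, children = [None], [[]]
    stack = [(tree, 0)]
    while stack:
        node, idx = stack.pop()
        for child in node:
            parent.append(idx)
            children.append([])
            cid = len(parent) - 1
            children[idx].append(cid)
            stack.append((child, cid))
    return parent, children


def primary_subtree_sum(tree):
    """Sum q(S) over all primary subtrees S, found by exhaustive search."""
    parent, children = index_tree(tree)
    n = len(parent)
    total = Counter()
    for mask in range(1, 1 << n, 2):          # odd masks: the root (bit 0) is in S
        in_s = [bool((mask >> v) & 1) for v in range(n)]
        # S is a rooted subtree: the parent of every non-root member is a member
        if any(in_s[v] and not in_s[parent[v]] for v in range(1, n)):
            continue
        s_leaves = {v for v in range(n)
                    if in_s[v] and not any(in_s[c] for c in children[v])}
        primary = True
        for v in range(n):
            if children[v]:
                continue                      # only leaf vertices of T are tested
            u = v
            while not in_s[u]:                # climb to the nearest vertex of S
                u = parent[u]
            if u not in s_leaves:
                primary = False
                break
        if not primary:
            continue
        alpha = sum(1 for v in s_leaves if not children[v])   # leaves of T
        beta = len(s_leaves) - alpha                          # internal vertices of T
        total[(alpha, beta)] += 1
    return dict(total)


t = (((), ()), ((),))
print(primary_subtree_sum(t) == tree_poly(t))
```

For the example tree, both computations give $x^{3}+x^{2}y+xy+y^{2}+y$, so the tree has five primary subtrees.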

Corollary 2.5.

Let $T$ be a tree in $\mathcal{T}_{r}$ . Suppose $P(T)$ has $m$ terms and $a_{1},a_{2},...,a_{m}$ are the corresponding coefficients, then $T$ has $\sum_{i=1}^{m}a_{i}$ primary subtrees.
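As a small worked instance of Corollary 2.5 (an illustrative example chosen here, not taken from the paper), let $T=\wedge_{2}(\wedge_{2}(\bullet,\bullet),\wedge_{1}(\bullet))$. Definition 2.1 gives

$$P(T)=y+(y+x^{2})(y+x)=x^{3}+x^{2}y+xy+y^{2}+y,$$

so $P(T)$ has $m=5$ terms, each with coefficient one, and $T$ has $1+1+1+1+1=5$ primary subtrees. Equivalently, the number of primary subtrees is the value of $P(T)$ at $x=y=1$.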

Let $T$ be a rooted tree with $n$ leaf vertices in $\mathcal{T}_{r}$ . According to Lemma 2.4, there exists a $x^{n}$ term in $P(T)$ which corresponds to the primary subtree that is isomorphic to $T$ , that is, all the leaf vertices of $T$ are leaf vertices of $S$ . There exists only one such primary subtree, hence the coefficient of the term is one. Moreover, for any other primary subtree $S$ of $T$ , $q(S)$ has a factor $y$ because at least one leaf vertex of $T$ is not a leaf vertex of $S$ , that is, at least one leaf vertex of $T$ is a descendant of a leaf vertex of $S$ which is an internal vertex of $T$ . Last but not least, there exists at least one primary subtree with only one leaf vertex. The subtree $S_{r}$ of $T$ consisting of only the root vertex is such a primary subtree and it exists for any tree in $\mathcal{T}_{r}$ . Note that the leaf vertex of such primary subtrees is always an internal vertex of $T$ except for the trivial tree. Therefore, if $T$ is not the trivial tree, then there is always a term $ty$ in the polynomial $P(T)$ , where $t$ is the number of primary subtrees with only one leaf vertex which is an internal vertex of $T$ . Indeed, $t=l$ if $T$ is a rooted path and $t=l+1$ otherwise, where $l$ is the length of the stem of $T$ . Besides, no primary subtrees of $T$ other than those on the stem can contribute to a monomial $y$ since if a primary subtree $S$ of $T$ contains vertices that are not on the stem of $T$ , $S$ must have at least two leaf vertices.

2.3. Complete isomorphism invariants for rooted trees

To prove that the polynomial $P:\mathcal{T}_{r}\to\mathbb{Z}[x,y]$ is a complete isomorphism invariant for rooted trees, we need to prove the polynomials are irreducibles in $\mathbb{Z}[x,y]$. Eisenstein’s criterion states that if $D$ is an integral domain, $I_{p}$ is a prime ideal of $D$ and $Q=a_{n}x^{n}+a_{n-1}x^{n-1}+...+a_{1}x+a_{0}$ is a polynomial in $D[x]$, then $Q$ is an irreducible in $D[x]$ if $a_{n}\not\in I_{p}$, $a_{i}\in I_{p}$ for all $0\leq i<n$, and $a_{0}\not\in I_{p}^{2}$.

Lemma 2.6.

For any tree $T$ in $\mathcal{T}_{r}$ , $P(T)$ is an irreducible in $\mathbb{Z}[x,y]$ .

Proof.

We use Eisenstein’s criterion to prove this lemma. Note that $\mathbb{Z}[x,y]=\mathbb{Z}[y][x]$ . Let $\mathbb{Z}[y]$ be the integral domain $D$ and $I_{p}=\left\langle y\right\rangle$ be the prime ideal in $\mathbb{Z}[y]$ . Suppose $T$ has $n$ leaf vertices. We know, from Section 2.2, that the leading term of $P(T)$ is always $x^{n}$ and $a_{n}=1$ , hence, $a_{n}\not\in I_{p}$ . For any primary subtree $S$ that is not isomorphic to $T$ , there is always a leaf vertex of $S$ that is an internal vertex of $T$ , so $q(S)$ always has a factor $y$ and $a_{i}\in I_{p}$ for all $0\leq i<n$ . Moreover, the constant term $a_{0}$ always contains a term $ty$ . Therefore, $a_{0}\not\in I_{p}^{2}$ and the polynomials for all rooted trees in $\mathcal{T}_{r}$ are irreducibles in $\mathbb{Z}[x,y]$ . ∎
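To see the three conditions of Eisenstein’s criterion on a concrete polynomial (an illustrative example, not taken from the paper), take $T=\wedge_{2}(\wedge_{2}(\bullet,\bullet),\wedge_{1}(\bullet))$, for which Definition 2.1 gives $P(T)=x^{3}+x^{2}y+xy+y^{2}+y$, and regard it as a polynomial in $x$ over $D=\mathbb{Z}[y]$:

$$P(T)=a_{3}x^{3}+a_{2}x^{2}+a_{1}x+a_{0},\qquad a_{3}=1,\quad a_{2}=y,\quad a_{1}=y,\quad a_{0}=y^{2}+y=y(y+1).$$

Here $a_{3}\not\in\left\langle y\right\rangle$, while $a_{2},a_{1},a_{0}\in\left\langle y\right\rangle$. Moreover $a_{0}\not\in\left\langle y\right\rangle^{2}=\left\langle y^{2}\right\rangle$ because $y+1$ is not divisible by $y$. The linear term $y$ in $a_{0}$ is exactly the term $ty$ with $t=1$, since this tree has a stem of length zero and is not a rooted path.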

Proposition 2.7.

The function $P:\mathcal{T}_{r}\to\mathbb{Z}[x,y]$ is injective.

Proof.

We prove the proposition by strong induction on $n$ , the number of leaf vertices of a tree in $\mathcal{T}_{r}$ .

(1) If $n=1$, the polynomial of a rooted path $T$ of length $l$ is $P(T)=ly+x$. Two non-isomorphic rooted paths have different lengths, so their polynomials are different.

(2) Assume the function is injective on trees with $n\leq N$ leaf vertices.

(3) If $n=N+1$, let $T$ and $B$ be two non-isomorphic trees in $\mathcal{T}_{r}$ with $N+1$ leaf vertices. Note that only the primary subtrees on the stem of a tree in $\mathcal{T}_{r}$ contribute to the $ty$ term. If the stems of $T$ and $B$ are of different lengths, the $ty$ terms in the polynomials of $T$ and $B$ will have different coefficients, hence $P(T)\neq P(B)$. Suppose that the stems of $T$ and $B$ are of the same length, $T^{\prime}=\wedge_{k}(T_{1},T_{2},...,T_{k})$ and $B^{\prime}=\wedge_{l}(B_{1},B_{2},...,B_{l})$, where all $T_{i}$ with $1\leq i\leq k$ and all $B_{j}$ with $1\leq j\leq l$ are rooted trees in $\mathcal{T}_{r}$ with fewer than $N+1$ leaf vertices. If $P(T)=P(B)$, then, since the two stems contribute the same multiple of $y$ to both polynomials, $P(T^{\prime})=y+\prod_{i=1}^{k}P(T_{i})=y+\prod_{j=1}^{l}P(B_{j})=P(B^{\prime})$, that is, $P(T_{1})P(T_{2})...P(T_{k})=P(B_{1})P(B_{2})...P(B_{l})$. Since these polynomials are irreducibles in $\mathbb{Z}[x,y]$ according to Lemma 2.6 and $\mathbb{Z}[x,y]$ is a unique factorization domain, we know $k=l$ and, without loss of generality, $P(T_{i})=P(B_{i})$ for all $1\leq i\leq k$ after a rearrangement of labels. Then, the hypothesis implies that $T_{i}\simeq B_{i}$ for all $1\leq i\leq k$, hence, $T^{\prime}\simeq B^{\prime}$. Since the stems of $T$ and $B$ are of the same length, it follows that $T\simeq B$, which contradicts the assumption. ∎

Proposition 2.2 and Proposition 2.7 imply that the polynomial is a complete isomorphism invariant for rooted trees.

Theorem 2.8.

$T_{1},T_{2}\in\mathcal{T}_{r}$ are isomorphic if and only if $P(T_{1})=P(T_{2})$.
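Theorem 2.8 can be sanity-checked on small cases by listing all rooted trees up to a given order and confirming that their polynomials are pairwise distinct. The following sketch is illustrative only and is no substitute for the proof. It assumes the nested-tuple encoding of rooted trees with children kept in sorted order as a canonical form, and polynomials stored as `{(a, b): coeff}` for $x^{a}y^{b}$. The function names are hypothetical.

```python
from collections import Counter
from functools import lru_cache


def poly_mul(p, q):
    out = Counter()
    for (a1, b1), c1 in p.items():
        for (a2, b2), c2 in q.items():
            out[(a1 + a2, b1 + b2)] += c1 * c2
    return dict(out)


def tree_poly(tree):
    if not tree:
        return {(1, 0): 1}
    prod = {(0, 0): 1}
    for child in tree:
        prod = poly_mul(prod, tree_poly(child))
    prod[(0, 1)] = prod.get((0, 1), 0) + 1
    return prod


@lru_cache(maxsize=None)
def forests(m):
    """All multisets of rooted trees with m vertices in total, as sorted tuples."""
    if m == 0:
        return frozenset({()})
    out = set()
    for s in range(1, m + 1):
        for first in forests(s - 1):          # a rooted tree with s vertices
            for rest in forests(m - s):
                out.add(tuple(sorted((first,) + rest)))
    return frozenset(out)


def rooted_trees(n):
    """Canonical forms of all unlabeled rooted trees of order n."""
    return forests(n - 1)                     # the root plus a forest of children


trees = [t for n in range(1, 8) for t in rooted_trees(n)]
polys = {frozenset(tree_poly(t).items()) for t in trees}
print(len(trees), len(polys))
```

The two printed counts agree, so no two of the rooted trees of order at most $7$ share a polynomial. Reordering the children of any vertex leaves the polynomial unchanged because the product in Definition 2.1 is commutative.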

Let $\mathcal{T}_{r}^{*}$ be the set of rooted trees such that every internal vertex has more than one child and $\mathcal{T}_{r}^{m}$ be the set of rooted $m$ -ary trees where $m\geq 2$ is an integer. For any tree $T$ in $\mathcal{T}_{r}$ and any prime number $p$ , the polynomial $P_{p}:\mathcal{T}_{r}\to\mathbb{Z}[x]$ is defined by substituting $p$ for $y$ in the polynomial $P(T)$ . We can prove that the polynomial $P_{p}(T)$ is an irreducible in $\mathbb{Z}[x]$ for any tree $T$ in $\mathcal{T}_{r}$ by substituting a prime number $p$ for $y$ in the proof of Lemma 2.6. Then, the following corollary follows the proof of Proposition 2.7.

Corollary 2.9.

$T_{1},T_{2}\in\mathcal{T}_{r}^{*}$ are isomorphic if and only if $P_{p}(T_{1})=P_{p}(T_{2})$. In particular, $T_{1},T_{2}\in\mathcal{T}_{r}^{m}$ are isomorphic if and only if $P_{p}(T_{1})=P_{p}(T_{2})$.

However, for any prime number $p$, there exists a pair of non-isomorphic rooted trees in $\mathcal{T}_{r}$ with the same polynomial $P_{p}(T):\mathcal{T}_{r}\to\mathbb{Z}[x]$, in which at least one internal vertex has only one child. Figure 2 shows a pair of such trees. For any prime number $p$, we can choose the length of the stem of the tree $T_{1}$ to be $l=p$ so that $P_{p}(T_{1})=P_{p}(T_{2})$. An interesting question is to determine the values of the integer $n$ such that $P_{n}(T):\mathcal{T}_{r}\to\mathbb{Z}[x]$ is a complete isomorphism invariant for rooted $m$-ary trees or trees in $\mathcal{T}_{r}^{*}$. If $m=2$, it can be checked by computer that $P_{n}(T):\mathcal{T}_{r}\to\mathbb{Z}[x]$ is not a complete isomorphism invariant for rooted binary trees when $n\in\{-1,0,1\}$. For any other integer $n$, it is not known whether $P_{n}(T):\mathcal{T}_{r}\to\mathbb{Z}[x]$ is a complete isomorphism invariant for rooted binary trees. The same question is open for rooted $m$-ary trees with $m>2$.
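A collision of this kind can be reproduced numerically. The pair below is an illustrative construction chosen for this sketch and is not claimed to be the pair drawn in Figure 2. The tree $T_{1}$ has a stem of length $p$ attached to $\wedge_{2}(\bullet,R)$, where $R$ is the rooted path of length $2$, so that $P(T_{1})=x^{2}+2xy+(p+1)y$. The tree $T_{2}=\wedge_{2}(\wedge_{1}(\bullet),\wedge_{1}(\bullet))$ has $P(T_{2})=x^{2}+2xy+y^{2}+y$. The two polynomials differ, but they agree after substituting $y=p$. The encoding and the helper names are assumptions of the sketch.

```python
from collections import Counter


def poly_mul(p, q):
    out = Counter()
    for (a1, b1), c1 in p.items():
        for (a2, b2), c2 in q.items():
            out[(a1 + a2, b1 + b2)] += c1 * c2
    return dict(out)


def tree_poly(tree):
    if not tree:
        return {(1, 0): 1}
    prod = {(0, 0): 1}
    for child in tree:
        prod = poly_mul(prod, tree_poly(child))
    prod[(0, 1)] = prod.get((0, 1), 0) + 1
    return prod


def substitute_y(poly, value):
    """Evaluate y at an integer; returns {x_exp: coeff} for a polynomial in x."""
    out = Counter()
    for (a, b), c in poly.items():
        out[a] += c * value ** b
    return dict(out)


def add_stem(tree, length):
    """Apply the 1-wedge operation `length` times."""
    for _ in range(length):
        tree = (tree,)
    return tree


def colliding_pair(p):
    t1 = add_stem(((), (((),),)), p)   # stem of length p above wedge_2(leaf, path of length 2)
    t2 = (((),), ((),))                # wedge_2 of two rooted paths of length 1
    return t1, t2


for p in (2, 3, 5, 7):
    t1, t2 = colliding_pair(p)
    same_univariate = substitute_y(tree_poly(t1), p) == substitute_y(tree_poly(t2), p)
    same_bivariate = tree_poly(t1) == tree_poly(t2)
    print(p, same_univariate, same_bivariate)
```

Both trees contain internal vertices with a single child, so neither lies in $\mathcal{T}_{r}^{*}$ and the example is consistent with Corollary 2.9.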

3. A tree distinguishing polynomial

3.1. The polynomial for unrooted trees

Let $\mathcal{T}_{u}$ be the set of unrooted trees, and $\mathcal{T}=\mathcal{T}_{u}\cup\mathcal{T}_{r}$ . Suppose $T$ is a tree in $\mathcal{T}_{u}$ with $n$ leaf vertices. A leaf edge of $T$ is an edge of $T$ that is incident to a leaf vertex. For each leaf edge of $T$ , we can construct a rooted tree $T_{i}$ by contracting the leaf edge and identifying the contracted edge as the root vertex of $T_{i}$ . Denote the set of such rooted trees constructed from $T$ by $\mathcal{R}_{T}$ . Note that $\mathcal{R}_{T}$ has $n$ elements and some of them may be isomorphic.

Lemma 3.1.

$T,B\in\mathcal{T}_{u}$ are isomorphic if and only if there exists a bijection $h:\mathcal{R}_{T}\to\mathcal{R}_{B}$ such that for any $T_{i}$ in $\mathcal{R}_{T}$ , $T_{i}$ is isomorphic to $h(T_{i})$ .

Proof.

Suppose $T\simeq B$ and $\phi:T\to B$ is the isomorphism. Let $T_{i}$ be an arbitrary tree in $\mathcal{R}_{T}$ and $e$ be the edge of $T$ that is contracted to obtain $T_{i}$. We define a function $h:\mathcal{R}_{T}\to\mathcal{R}_{B}$ such that $h(T_{i})=B_{j}$ where $B_{j}$ is the tree in $\mathcal{R}_{B}$ constructed by contracting $\phi(e)$. $h:\mathcal{R}_{T}\to\mathcal{R}_{B}$ is a bijection because $\phi:T\to B$ induces a bijection between the set of leaf edges of $T$ and the set of leaf edges of $B$, and $T_{i}\simeq h(T_{i})$ because $\phi$ maps the contracted edge $e$ to the contracted edge $\phi(e)$. Conversely, suppose that such a bijection $h$ exists, $T_{i}$ is a rooted tree in $\mathcal{R}_{T}$ and $B_{j}=h(T_{i})$ is a rooted tree in $\mathcal{R}_{B}$. We can reconstruct $T$ and $B$ from $T_{i}$ and $B_{j}$ by recovering the contracted edges, that is, adding an edge and a leaf vertex to the root vertices of $T_{i}$ and $B_{j}$ respectively. Therefore, $T_{i}\simeq B_{j}$ implies $T\simeq B$. ∎

Now, we generalize the polynomial in Definition 2.1 to $P:\mathcal{T}\to\mathbb{Z}[x,y]$ in the following way. If a tree $T$ in $\mathcal{T}$ is rooted, then $P(T)$ is the polynomial defined as in Definition 2.1. If a tree $T$ is unrooted, then we define $P(T)=\prod_{T_{i}\in\mathcal{R}_{T}}P(T_{i})$. For example, the polynomial of the unrooted $3$-star is $(x^{2}+y)^{3}$. Note that for any trees $T_{1}$ and $T_{2}$ in $\mathcal{T}$, if $T_{1}$ is rooted and $T_{2}$ is unrooted, we always consider that $T_{1}$ is not isomorphic to $T_{2}$ even if the only difference between $T_{1}$ and $T_{2}$ is an identified root vertex. We prove that the polynomial $P:\mathcal{T}\to\mathbb{Z}[x,y]$ is a complete isomorphism invariant for trees.
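The definition for unrooted trees can be sketched as follows. The code is illustrative and makes several assumptions that are not in the paper: an unrooted tree with at least two vertices is given as an adjacency dictionary, contracting the leaf edge at a leaf `v` is modeled by rooting the tree at the neighbor of `v` and ignoring `v`, and polynomials are dictionaries `{(a, b): coeff}` for $x^{a}y^{b}$. The names `rooted_at` and `unrooted_poly` are hypothetical.

```python
from collections import Counter


def poly_mul(p, q):
    out = Counter()
    for (a1, b1), c1 in p.items():
        for (a2, b2), c2 in q.items():
            out[(a1 + a2, b1 + b2)] += c1 * c2
    return dict(out)


def tree_poly(tree):
    if not tree:
        return {(1, 0): 1}
    prod = {(0, 0): 1}
    for child in tree:
        prod = poly_mul(prod, tree_poly(child))
    prod[(0, 1)] = prod.get((0, 1), 0) + 1
    return prod


def rooted_at(adj, root, removed):
    """Nested-tuple rooted tree obtained by rooting at `root` and dropping `removed`."""
    def build(v, parent):
        return tuple(build(w, v) for w in adj[v] if w != parent and w != removed)
    return build(root, None)


def unrooted_poly(adj):
    """Product of the rooted polynomials, one per leaf edge of the tree."""
    result = {(0, 0): 1}
    for v, neighbors in adj.items():
        if len(neighbors) == 1:               # v is a leaf vertex
            rooted = rooted_at(adj, neighbors[0], v)
            result = poly_mul(result, tree_poly(rooted))
    return result


star3 = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
path4 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(unrooted_poly(star3))
print(unrooted_poly(path4))
```

For the unrooted $3$-star the sketch returns the expansion of $(x^{2}+y)^{3}$, as in the text, and for the path on four vertices it returns $(x+2y)^{2}$, so the two trees of order $4$ are told apart.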

Theorem 3.2.

$T_{1},T_{2}\in\mathcal{T}$ are isomorphic if and only if $P(T_{1})=P(T_{2})$ .

Proof.

If $T_{1}\simeq T_{2}$, then either both of them are rooted or both of them are unrooted. If both of them are rooted, then $P(T_{1})=P(T_{2})$ follows from Theorem 2.8. If both of them are unrooted, it follows from Lemma 3.1 and Theorem 2.8 that $P(T_{1})=P(T_{2})$. On the other hand, if $T_{1}\not\simeq T_{2}$, we have three cases. First, if both of them are rooted, then $P(T_{1})\neq P(T_{2})$ follows from Theorem 2.8. Second, if one of them is rooted and the other is unrooted, then $P(T_{1})\neq P(T_{2})$ because the polynomial for the rooted tree is an irreducible in $\mathbb{Z}[x,y]$ and the polynomial for the unrooted one is not. Third, if both of them are unrooted, then $P(T_{1})\neq P(T_{2})$ because otherwise Theorem 2.8 and $\mathbb{Z}[x,y]$ being a unique factorization domain imply that there exists a bijection $h:\mathcal{R}_{T_{1}}\to\mathcal{R}_{T_{2}}$ such that $T\simeq h(T)$ for any $T\in\mathcal{R}_{T_{1}}$. By Lemma 3.1, this implies $T_{1}\simeq T_{2}$, a contradiction; hence, $P(T_{1})\neq P(T_{2})$. ∎

The proof of Theorem 3.2 shows that whenever we have a polynomial that represents a class of rooted trees, if (i) the polynomial ring is a unique factorization domain, (ii) the polynomial is a complete isomorphism invariant for the class of rooted trees and (iii) the polynomials of rooted trees in the class are irreducibles in the polynomial ring, then we can generalize the polynomial to the corresponding class of unrooted trees and the resulting polynomial distinguishes these unrooted trees. In particular, the univariate polynomial $P_{p}(T):\mathcal{T}_{r}^{m}\to\mathbb{Z}[x]$ for rooted $m$ -ary trees can be generalized to distinguish $m$ -ary trees. Let $\mathcal{T}^{m}$ be the set of $m$ -ary trees including the rooted trees and the unrooted trees and the polynomial $P_{p}(T):\mathcal{T}^{m}\to\mathbb{Z}[x]$ is defined such that for any $T$ in $\mathcal{T}^{m}$ , if $T$ is rooted, then $P_{p}(T)$ is defined as in Section 2.3 and if $T$ is unrooted, then $P_{p}(T)=\prod_{T_{i}\in\mathcal{R}_{T}}P_{p}(T_{i})$ .

Corollary 3.3.

$T_{1},T_{2}\in\mathcal{T}^{m}$ are isomorphic if and only if $P_{p}(T_{1})=P_{p}(T_{2})$ .

Now we know that one variable is sufficient to uniquely represent $m$ -ary trees by polynomials. An interesting question is whether one variable is sufficient to uniquely represent all trees by polynomials. The zero loci of the polynomials $P:\mathcal{T}\to\mathbb{Z}[x,y]$ for trees may also be interesting.

3.2. A generalization

A polynomial distinguishing leaf labeled trees has various applications in linguistics and mathematical biology, especially in phylogenetics. The coefficients of a polynomial can be considered as a vector, so norms, and hence metrics on trees, can be induced from tree distinguishing polynomials. Tree metrics, especially metrics for leaf labeled trees, have several biological applications, for example, comparing and classifying phylogenetic tree shapes [7]. The polynomial $P:\mathcal{T}_{r}\to\mathbb{Z}[x,y]$ can be generalized to represent leaf labeled rooted trees in a natural way. Given a tree $T\in\mathcal{T}_{r}$ and its polynomial $P(T)$, we can consider the tree $T$ as a vertex labeled tree such that each of its leaf vertices has the label $x$ and each internal vertex $v$ has the polynomial $P(T^{\prime}_{v})$ as its label, where $T^{\prime}_{v}$ is the affix tree of $T$ to the vertex $v$. Thus the root vertex of $T$ has the label $P(T)$. If the leaf vertices of a tree have different labels and $\mathcal{T}_{r}^{\ell}$ denotes the set of leaf labeled rooted trees, we define an analogous polynomial as follows.
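The remark about coefficient vectors can be made concrete with a minimal sketch. It assumes the recursive rule $P(\bullet)=x$, $P(\wedge_{k}(T_{1},...,T_{k}))=y+\prod_{i=1}^{k}P(T_{i})$ for rooted trees; the nested-tuple encoding of trees, the function names and the choice of the $\ell_{1}$ norm are all illustrative assumptions, not taken from the paper.

```python
from collections import Counter

# Hypothetical encoding: a rooted tree is a tuple of child subtrees,
# and the empty tuple () is a leaf.
# A polynomial in x and y is a Counter mapping (deg_x, deg_y) -> coefficient.

def poly_mul(p, q):
    """Multiply two bivariate polynomials stored as exponent -> coefficient maps."""
    out = Counter()
    for (ax, ay), ca in p.items():
        for (bx, by), cb in q.items():
            out[(ax + bx, ay + by)] += ca * cb
    return out

def tree_poly(tree):
    """P(leaf) = x; P(T) = y + product of P over the root's child subtrees."""
    if not tree:
        return Counter({(1, 0): 1})
    prod = Counter({(0, 0): 1})   # the constant polynomial 1
    for child in tree:
        prod = poly_mul(prod, tree_poly(child))
    prod[(0, 1)] += 1             # add the monomial y
    return prod

def l1_distance(t1, t2):
    """Sum of absolute coefficient differences (one illustrative choice of norm)."""
    p, q = tree_poly(t1), tree_poly(t2)
    return sum(abs(p[m] - q[m]) for m in set(p) | set(q))

cherry = ((), ())                 # root with two leaf children
caterpillar = (((), ()), ())      # a cherry and a leaf joined at the root
balanced = (((), ()), ((), ()))   # two cherries joined at the root

print(l1_distance(cherry, cherry))
print(l1_distance(caterpillar, balanced))
```

Because the polynomial is a complete invariant, a distance defined this way is zero exactly when the two rooted trees are isomorphic; other norms on the coefficient vector could be substituted in `l1_distance`.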

Definition 3.4.

Let $\bullet_{i}$ be the trivial tree with a single vertex that is labeled by $i$ and $T=\wedge_{k}(T_{1},T_{2},...,T_{k})$ be an arbitrary tree in $\mathcal{T}_{r}^{\ell}$ , where $k\geq 1$ . The polynomial $P_{\ell}:\mathcal{T}_{r}^{\ell}\to\mathbb{Z}[x_{1},x_{2},...,x_{t},y]$ is defined by the following rules.

(1) $P_{\ell}(\bullet_{i})=x_{i}$,

(2) $P_{\ell}(T)=y+\prod_{i=1}^{k}P_{\ell}(T_{i})$.

Note that different leaf vertices may have the same label. If we set $x_{i}=x$ for all $1\leq i\leq t$ , then $P_{\ell}(T)=P(T)$ . The polynomial $P_{\ell}:\mathcal{T}_{r}^{\ell}\to\mathbb{Z}[x_{1},x_{2},...,x_{t},y]$ is a complete isomorphism invariant for leaf labeled rooted trees, where two leaf labeled trees in $\mathcal{T}_{r}^{\ell}$ being isomorphic means not only that the unlabeled trees are isomorphic but also that the labels of the corresponding leaf vertices of the two trees are identical.
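The two rules of Definition 3.4 and the substitution $x_{i}=x$ can be checked mechanically. The following is a minimal sketch, not the author's implementation: the encoding of a leaf as an integer label, the encoding of an internal vertex as a tuple of subtrees, and the names `leaf_labeled_poly` and `forget_labels` are hypothetical choices made for illustration.

```python
from collections import Counter

# Hypothetical encoding: a leaf is an integer label i (standing for x_i);
# an internal vertex is a tuple of child subtrees.
# A monomial is a sorted tuple of (variable, exponent) pairs, and a
# polynomial is a Counter mapping monomials to integer coefficients.

def mono_mul(a, b):
    exps = Counter(dict(a))
    exps.update(dict(b))          # Counter.update adds exponents per variable
    return tuple(sorted(exps.items()))

def poly_mul(p, q):
    out = Counter()
    for ma, ca in p.items():
        for mb, cb in q.items():
            out[mono_mul(ma, mb)] += ca * cb
    return out

def leaf_labeled_poly(tree):
    """Rule (1): P(leaf i) = x_i.  Rule (2): P(T) = y + prod_i P(T_i)."""
    if isinstance(tree, int):
        return Counter({((f"x{tree}", 1),): 1})
    prod = Counter({(): 1})       # the constant polynomial 1
    for child in tree:
        prod = poly_mul(prod, leaf_labeled_poly(child))
    prod[(("y", 1),)] += 1        # add the monomial y
    return prod

def forget_labels(p):
    """Substitute x_i -> x for every i, merging the exponents of x."""
    out = Counter()
    for mono, c in p.items():
        exps = Counter()
        for var, e in mono:
            exps["y" if var == "y" else "x"] += e
        out[tuple(sorted(exps.items()))] += c
    return out

t1 = ((1, 2), 3)   # cherry on leaves 1, 2 joined with leaf 3
t2 = ((1, 3), 2)   # same shape, different labeling

print(leaf_labeled_poly(t1) == leaf_labeled_poly(t2))
print(forget_labels(leaf_labeled_poly(t1)) == forget_labels(leaf_labeled_poly(t2)))
```

Here `t1` has the polynomial $y+x_{3}y+x_{1}x_{2}x_{3}$ and `t2` has $y+x_{2}y+x_{1}x_{2}x_{3}$, so the labeled polynomials differ, while both reduce to $y+xy+x^{3}$ once the labels are forgotten.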

Corollary 3.5.

$T_{1},T_{2}\in\mathcal{T}_{r}^{\ell}$ are isomorphic if and only if $P_{\ell}(T_{1})=P_{\ell}(T_{2})$ .

Proof.

To prove this corollary, we claim that if a polynomial $P$ in $\mathbb{Z}[x,y]$ is an irreducible in $\mathbb{Z}[x,y]$, then any polynomial $Q$ in $\mathbb{Z}[x_{1},x_{2},...,x_{t},y]$ obtained by changing each $x$ in $P$ to some $x_{i}$ is also an irreducible in $\mathbb{Z}[x_{1},x_{2},...,x_{t},y]$. Given the claim, the corollary follows from the proof of Theorem 2.8. The proof of the claim is trivial: if such a polynomial $Q\in\mathbb{Z}[x_{1},x_{2},...,x_{t},y]$ is not an irreducible, say $Q=\prod_{i=1}^{k}Q_{i}$, then by substituting $x$ for every $x_{i}$ in the equation $Q=\prod_{i=1}^{k}Q_{i}$, we have $P=\prod_{i=1}^{k}P_{i}$, where $P$ and $P_{i}$ are in $\mathbb{Z}[x,y]$ for all $1\leq i\leq k$. This contradicts that $P$ is an irreducible in $\mathbb{Z}[x,y]$. Hence, the polynomial $Q$ obtained by substituting some $x_{i}$ for each $x$ in $P$ is an irreducible in $\mathbb{Z}[x_{1},x_{2},...,x_{t},y]$. ∎

Let $\mathcal{T}_{u}^{\ell}$ be the set of leaf labeled unrooted trees and define $\mathcal{T}^{\ell}=\mathcal{T}_{u}^{\ell}\cup\mathcal{T}_{r}^{\ell}$ . Since the polynomial $P_{\ell}:\mathcal{T}_{r}^{\ell}\to\mathbb{Z}[x_{1},x_{2},...,x_{t},y]$ is a complete isomorphism invariant for leaf labeled rooted trees and for any tree $T$ in $\mathcal{T}_{r}^{\ell}$ , $P_{\ell}(T)$ is an irreducible in the polynomial ring, according to Section 3.1, we can generalize the polynomial to a polynomial $P_{\ell}:\mathcal{T}^{\ell}\to\mathbb{Z}[x_{1},x_{2},...,x_{t},y]$ such that for any leaf labeled rooted tree, its polynomial is defined as in Definition 3.4 and for any leaf labeled unrooted tree $T$ , $P_{\ell}(T)=\prod_{T_{i}\in\mathcal{R}_{T}}P_{\ell}(T_{i})$ .

Corollary 3.6.

$T_{1},T_{2}\in\mathcal{T}^{\ell}$ are isomorphic if and only if $P_{\ell}(T_{1})=P_{\ell}(T_{2})$ .

Proof.

To prove this corollary, we only need to generalize Lemma 3.1 for leaf labeled trees, that is, $T,B\in\mathcal{T}_{u}^{\ell}$ are isomorphic if and only if there exists a bijection $h:\mathcal{R}_{T}\to\mathcal{R}_{B}$ such that for any $T_{i}$ in $\mathcal{R}_{T}$ , $T_{i}$ is isomorphic to $h(T_{i})$ . Note that if $\phi:T\to B$ is the isomorphism, then any leaf edge $e$ of $T$ should have the same label as the leaf edge $\phi(e)$ of $B$ . Besides, for any $T_{i}\in\mathcal{R}_{T}$ , if $T_{i}$ is constructed by contracting a leaf edge $e$ of $T$ , we consider that the root vertex of $T_{i}$ is labeled and it is of the same label as the leaf edge $e$ of $T$ . Moreover, $T_{i}\simeq h(T_{i})$ requires not only the corresponding leaf vertices but also the root vertices to have the same label. Thus, this can be proved similarly to the proof of Lemma 3.1. ∎

Acknowledgments

The author would like to thank Priscila Do Nascimento Biller, Caroline Colijn and Gábor Hetyei for helpful comments. This work was supported by the grant of the Federal Government of Canada’s Canada 150 Research Chair program to Dr. Caroline Colijn.

Bibliography

[1] C. Adams, The Knot Book. W. H. Freeman, New York (1994).
[2] J. Aliste-Prieto, A. de Mier and J. Zamora, On trees with the same restricted U-polynomial and the Prouhet-Tarry-Escott problem. Discrete Math. 340 (2017), 1435-1441.
[3] D. Andrén and K. Markström, The bivariate Ising polynomial of a graph. Discrete Appl. Math. 157 (2009), 2515-2524.
[4] B. Bollobás and O. Riordan, Polychromatic polynomials. Discrete Math. 219 (2000), 1-7.
[5] T. Brylawski, Intersection theory for graphs. J. Combin. Theory B 30 (1981), 233-246.
[6] S. Chaudhary and G. Gordon, Tutte polynomials for trees. J. Graph Theory 15 (1991), 317-331.
[7] C. Colijn and G. Plazzotta, A metric on phylogenetic tree shapes. Syst. Biol. 67 (2018), 113-126.
[8] É. Ghys, A singular mathematical promenade. Preprint, arXiv:1612.06373.
