Phylogenetic complexity of the Kimura 3-parameter model
Mateusz Micha{\l}ek, Emanuele Ventura

TL;DR
This paper proves that the algebraic ideals of the Kimura 3-parameter phylogenetic model are generated in degree four, confirming a longstanding conjecture in algebraic statistics.
Contribution
It establishes that the ideals for this model are generated in degree four, resolving a key conjecture by Sturmfels and Sullivant.
Findings
Ideals are generated in degree four
Confirmed a conjecture by Sturmfels and Sullivant
Advances understanding of algebraic structure of phylogenetic models
Abstract
In algebraic statistics, the Kimura 3-parameter model is one of the most interesting and classical phylogenetic models. We prove that the ideals associated to this model are generated in degree four, confirming a conjecture by Sturmfels and Sullivant.
2010 Mathematics Subject Classification. Primary 52B20, Secondary 14M25, 13P25
1 Introduction
The part of computational biology that models evolution and describes mutations in this process is called phylogenetics [40]. This is a fertile subject witnessing many connections to several parts of mathematics such as algebraic geometry [8, 23], combinatorics [4, 15, 34], and representation theory [9, 31]. The methods used in this context of research are powerful and do not only apply to biology, but are employed in several other fields [2] such as modeling changes of words in languages [21], literary studies [3] or linguistics itself [37] with ideas going back to Darwin [14].
A crucial object in phylogenetics is a tree model, which is a parametric family of probability distributions. It consists of a tree , a finite set of states and a family of transition matrices, usually given by a linear subspace of all matrices. The case of particular interest is when , where the basis elements correspond to the four nucleobases of DNA: adenine (A), cytosine (C), guanine (G), and thymine (T).
The models for which is a proper subspace of matrices reflect some symmetries among elements of . These symmetries are usually encoded by the action of a finite group on . In these terms, can be regarded as the space of -invariant matrices or tensors. Such models constitute a class of interest and they are called equivariant [18]. If is the trivial group, we obtain the general Markov model, corresponding, on the algebraic geometry side, to secant varieties of Segre products. When the elements of can be identified with those of , the model is called group-based. Henceforth we assume to be abelian.
The simplest among the equivariant, and group-based, models is the Cavender-Farris-Neyman model. This is the instance for , the group with two elements. A good understanding of this model from the algebraic geometry point of view has led to tremendous advances in this field. Sturmfels and Sullivant [41, Theorem 28] showed that the algebraic varieties arising from it are defined by quadrics. Additionally, Buczyńska and Wiśniewski described many of its remarkable algebro-geometric properties [8]. Consequently, Sturmfels and Xu [44], and Manon [32] described the connections of the model to toric degenerations of moduli spaces of rank two vector bundles on marked curves of fixed genus. For more relations to conformal field theory, we refer to [29, 31].
The Cavender-Farris-Neyman model is the simplest among the hyperbinary models [6, Section 3], which are given by . The most biologically meaningful example of those is the Kimura 3-parameter model; this corresponds to . In this case, , and, moreover, the action of reflects the pairing between purines (A,G) and pyrimidines (C,T). This model was introduced by Kimura [28] long before the setting above was developed. Using numerical experiments, Sturmfels and Sullivant conjectured that the ideals of the algebraic varieties associated to this model are generated by polynomials of degree at most four [41, Conjecture 30]. The confirmation of this conjecture is the main result of the present article. For any group , Sturmfels and Sullivant defined the phylogenetic complexity of .
Definition 1.1 (Phylogenetic complexity [41]).
Let be the star with leaves, and the variety associated to the group-based model. Let be the maximal degree of a generator in a minimal generating set of the ideal . The phylogenetic complexity of is .
In [35], it was shown that for any abelian group , its phylogenetic complexity is finite. The main contribution of this article is a more detailed study of the phylogenetic complexity of .
Main Theorem.
The phylogenetic complexity of the Kimura 3-parameter model equals four.
For more interesting results on the Kimura 3-parameter model we refer to [9, 10, 11, 30].
Algebraic varieties associated to a model.
We recall the explicit construction of the algebraic variety associated to a model. It is the Zariski closure of the locus of all probability distributions on the states of leaves allowed in the model.
A representation of a model on a tree is an association of transition matrices to edges of . The set of all representations is denoted by . (Here we do not mention the root distribution, since it does not affect the family of probability distributions we obtain.) To each vertex of we associate an dimensional vector space with basis . We may regard an element of associated to an edge as an element of the tensor product . We fix a representation and an association . Here is the set of leaves, i.e. vertices of degree one, of . Following the usual Markov rule, we may compute the probability of :
[TABLE]
where is the set of vertices of . We may identify with a basis element of . This provides the map:
[TABLE]
The image of this map is the family of probability distributions described by the model and its Zariski closure is the algebraic variety that represents the model. For group-based models, we denote this variety , where is the group defining the model and is the tree as above.
Earlier contributions.
Our proof of the main theorem relies on previous results by many authors that we now recall.
The first fundamental tool is the Discrete Fourier Transform (DFT). This is a linear change of coordinates, based on the representation theory of . For special cases in phylogenetics, it was first used by Hendy and Penny [26], and by Erdős, Székely, and Steel [42]. In higher generality, it is treated in [33, 41]. For group-based models, the DFT turns into a monomial map, proving that the associated algebraic variety is a toric variety. This translates the classical algebraic problem of finding defining equations of a variety into a combinatorial one. For more information about toric methods we refer to [12, 25, 43].
Another key result is the reduction from arbitrary trees to the so-called stars or claw-trees , i.e., trees with one inner vertex and leaves. The general procedure for group-based models to obtain ideals arising from arbitrary trees, knowing the ideals for , was discovered in [41]. Again, this turned out to be very influential, leading, on one hand, to the general constructions of toric fiber products [31, 45], and, on the other, to generalizations for equivariant models [18].
Combinatorial and computational methods in toric geometry are very well developed. As a starting point in our article we need to compute algebraic invariants of toric varieties embedded in very high dimensional ambient spaces. Here the computer algebra packages Normaliz [7], 4ti2 [47], along with previous computational results from [16] and [41] are used. In particular, Castelnuovo-Mumford regularity plays a crucial role in the proof for . These classical invariants are briefly discussed in Appendix 4, for the sake of completeness.
This work may also be seen in the framework of the stabilisation of equations of a family of algebraic varieties. Indeed, our proof not only bounds the degrees of the generators, but in principle provides an inductive procedure to obtain all generators in case of , assuming the generators for to be known. Finding equations of an infinite sequence of algebraic varieties that come naturally in families is an interesting current theme of research. This usually involves classical varieties such as secants of Segre varieties [19] and Grassmannians [20]. Indeed, the main result of Draisma and Eggermont in [17] shows that for equivariant models the associated algebraic variety can always be defined set-theoretically in some bounded degree, once and are both fixed. The fact that is finite constitutes the main result of [35]. Recently, another ideal-theoretic result was proved by Sam [38] showing that the ideal of th secant variety of th Veronese embeddings is generated in bounded degree that is independent of . Interestingly, the questions of ideal-theoretic generation in bounded degree for secants of Segre varieties and Grassmannians are still central open problems. Finiteness issues are strongly connected with the theory of twisted commutative algebras and -modules by Sam and Snowden [39], and the theory of noetherianity by Draisma and Kuttler [19], Hillar and Sullivant [27], and others.
Apart from beautiful results of existence, that are quite often non-constructive or very far from optimal, it is of interest to find an explicit description of phylogenetic algebraic varieties. One of the most well-known examples is the salmon conjecture [1], so named because the prize offered by Allman to a hypothetical solver is a smoked Copper River salmon. It asks for the description of , the algebraic variety representing the general Markov model for and . The generators of the ideal are still unknown, however a set-theoretic description was found by Friedland and Gross [24]. More recently, Daleo and Hauenstein [13] gave a numerical proof of the salmon conjecture.
As far as we know, our result is the only ideal-theoretic description, apart from the Jukes-Cantor model, where and is an arbitrary tree.
Plan of the article.
The whole article is devoted solely to the proof of the main theorem. In Section 2 we introduce the notation that is used throughout the proof. As the proof consists of several parts, some of them very technical, we present the overview of its structure in Section 3.1. The main result is established in Sections 3.2 and 3.3.
2 Preliminaries and notation
In this section we collect all the notation and terminology we will use in the rest of the paper. We divide this section into paragraphs to facilitate the reading.
Groups.
Henceforth we set , unless otherwise stated. We denote the elements of by , and . To denote unknown elements of , we use letters We also refer to an unknown element, that is not relevant in a specific argument, with a question mark “?”.
Apart from , the most natural groups that enter the picture are the symmetric group on leaves , the group of flows , and the automorphism group . The group of flows is the following.
Definition 2.1 (Group of flows).
Let be an abelian group and . The set of flows of length of forms a group under the componentwise group operation. It is non-canonically isomorphic to the group , the direct product of copies of .
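The definition can be explored by brute force. The sketch below assumes the standard convention for group-based models that a flow is a tuple of group elements adding up to the neutral element, and it encodes the group of the Kimura 3-parameter model as the integers 0..3 under bitwise XOR; both the encoding and the function names are illustrative assumptions.

```python
from functools import reduce
from itertools import product

# Assumed encoding: Z2 x Z2 as 0..3 under XOR, with 0 the neutral element.
GROUP = (0, 1, 2, 3)

def is_flow(row):
    """A row is a flow when its entries add up to the neutral element."""
    return reduce(lambda a, b: a ^ b, row, 0) == 0

def flows(n):
    """Enumerate all flows of length n by filtering all n-tuples."""
    return [row for row in product(GROUP, repeat=n) if is_flow(row)]

def add_flows(f, g):
    """Componentwise group operation on two flows of the same length."""
    return tuple(a ^ b for a, b in zip(f, g))
```

The first n-1 entries of a flow can be chosen freely and determine the last one, so the number of flows of length n is the order of the group raised to the power n-1; this is the counting behind the non-canonical isomorphism with a direct product of copies of the group.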
The automorphism group of , , is the group of bijective group homomorphisms from to itself. The automorphism of specified by is simply denoted by ; similarly for all the other automorphisms of having a non-trivial fixed element.
The toric variety .
For any abelian group , the variety is a projective toric variety of dimension living in , where the projective coordinates are in bijection with flows [33].
Let us recall here its corresponding polytope. Let be the lattice whose basis corresponds to the elements of . Consider with the basis indexed by pairs . We define a map of sets from the group of flows to the lattice, , by . The vertices of the polytope of are the images of the flows under the injective map .
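A minimal sketch of the vertex map, assuming it sends a flow to the sum of the basis vectors indexed by the pairs (column, entry in that column). The group is encoded as 0..3 and the coordinates are ordered column by column; the encoding, the ordering, and the name `vertex` are illustrative assumptions.

```python
# Assumed encoding of the four group elements.
GROUP = (0, 1, 2, 3)

def vertex(flow):
    """Return the 0/1 vector with one coordinate per pair (column, group element).

    The coordinate for (i, g) is 1 exactly when the i-th entry of the flow is g.
    """
    vec = [0] * (len(flow) * len(GROUP))
    for i, g in enumerate(flow):
        vec[i * len(GROUP) + GROUP.index(g)] = 1
    return tuple(vec)
```

Each block of four coordinates contains exactly one 1, so the flow can be read back from its image, which illustrates why the map is injective.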
Remark 2.2.
The family of varieties has a wealth of symmetries; the group , the group of flows , and the automorphism group all act on the ideals of these varieties.
Binomials, tables, and moves.
Ideals of toric varieties are binomial prime ideals. Thus they admit a minimal generating set of binomials. Binomials may be identified with a pair of tables of the same size, and , of elements of , regarded up to row permutation; this is another natural group in this setting which we implicitly take into account. Indeed, a binomial is a pair of monomials and the variables correspond to rows. Given the number of leaves , coordinates are in bijection with flows of length of . Hence rows are identified with flows of elements in . Columns are in bijection with the leaves. From the definition of the toric ideals [41], it follows that a binomial belongs to if and only if the two tables representing it are compatible, i.e., for each , the th column of and the th column of are equal as multisets. We index the columns of a given pair of tables , with columns, by integers . We refer to the element in the th column of row as .
Let be any table of elements of . The procedure consisting of selecting a subset of rows in of cardinality at most , and replacing it with a compatible set of rows is a move of degree . A binomial, represented by a pair of tables of elements of , is generated by binomials of degree at most if and only if there exists a finite sequence of moves of degree applied to or that transform into .
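The compatibility condition is easy to test mechanically. The sketch below represents a table as a list of rows (tuples of group elements encoded as 0..3, an illustrative assumption) and compares the columns of two tables as multisets; the function name and the sample tables are hypothetical.

```python
from collections import Counter

def compatible(t0, t1):
    """True when the tables have the same shape and agree columnwise as multisets."""
    rows = list(t0) + list(t1)
    if len(t0) != len(t1) or not rows:
        return False
    width = len(rows[0])
    if any(len(r) != width for r in rows):
        return False
    return all(
        Counter(r[j] for r in t0) == Counter(r[j] for r in t1)
        for j in range(width)
    )

# Two hypothetical tables with four columns; every row is a flow under XOR.
T0 = [(1, 1, 0, 0), (0, 0, 2, 2)]
T1 = [(1, 1, 2, 2), (0, 0, 0, 0)]
```

Here `compatible(T0, T1)` holds, so the pair represents a quadratic binomial in the ideal; a move of degree d replaces at most d rows of one table by a set of rows compatible with them in this same sense.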
Example 2.3.
Let be the table
[TABLE]
The table can be transformed by a move of degree three into the table
[TABLE]
Indeed, the set of the first three rows of is compatible with the set of the first three rows of . Note that if the rows in are flows, then the rows of are flows as well. The move described above is denoted by
[TABLE]
Remark 2.4.
In the notation for moves, we do not use the indices of the columns involved in the move. Instead, the indices are always clear from the move itself. For instance, the move in Example 2.3 is in columns . Also, note that, in general, the columns used for a move do not need to be consecutive.
Remark 2.5.
The groups , the group of flows , and the automorphism group act on the equations of , and hence on the tables. The group acts permuting the columns of the pair of tables corresponding to a binomial in the ideal of the variety. The groups and act on the entries of the tables in the natural way, i.e., by evaluation.
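The three actions can be sketched on tables as follows, assuming the group is encoded as 0..3 under bitwise XOR, that a flow acts by being added to every row, and that an automorphism is given by a permutation of the three nonzero elements. The function names and these conventions are illustrative assumptions.

```python
def permute_columns(table, perm):
    """Action of a permutation of the leaves: reorder the columns of every row."""
    return [tuple(row[j] for j in perm) for row in table]

def translate(table, flow):
    """Action of a flow: add it componentwise (XOR) to every row."""
    return [tuple(a ^ b for a, b in zip(row, flow)) for row in table]

def apply_automorphism(table, images):
    """Action of an automorphism, given as the images of the elements 1, 2, 3."""
    phi = {0: 0, 1: images[0], 2: images[1], 3: images[2]}
    return [tuple(phi[a] for a in row) for row in table]
```

Applying the same group element to both tables of a compatible pair yields a compatible pair again, which is what allows the normalizations "up to the action of" used throughout the proofs.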
We now introduce one of the most important concepts for our approach. Given a pair of flows, we define a distance between them, which will enable us to use an inductive procedure on tables. The distance we consider is the classical Hamming distance between two words.
Definition 2.6 (Hamming distance).
Let and be two flows in :
[TABLE]
Let and . The multiset constitutes the disagreement string of the pair of flows and . The cardinality is the Hamming distance between and . The multiset constitutes their agreement string. Up to the action of the group of flows on both flows, we may assume that the group elements for all .
Remark 2.7 (Tables and Hamming distance).
Given a pair of tables , we “compare” them using the notion of Hamming distance as follows. Since the tables come with indistinguishable rows, we may choose as first rows of and two rows that minimize the Hamming distance among all the pairs of rows from and . After fixing the first row in and in , as described in Section 3.1, one of the techniques adopted in Sections 3.2 and 3.3 is as follows. With moves of degree at most four, we create another pair of rows with strictly smaller Hamming distance than the initial one.
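The comparison described in the remark can be sketched as follows, with rows given as tuples of group elements; the function names are illustrative assumptions.

```python
from itertools import product

def hamming(f, g):
    """Number of positions in which the two rows differ."""
    return sum(1 for a, b in zip(f, g) if a != b)

def disagreement_columns(f, g):
    """Indices (0-based) of the columns where the two rows differ."""
    return [j for j, (a, b) in enumerate(zip(f, g)) if a != b]

def closest_rows(t0, t1):
    """Return (distance, i, k) minimizing the Hamming distance between
    row i of t0 and row k of t1; ties are broken by the smallest indices."""
    return min(
        (hamming(r, s), i, k)
        for (i, r), (k, s) in product(enumerate(t0), enumerate(t1))
    )
```

Moving the two rows returned by `closest_rows` to the top of their tables gives the "first rows" on which the inner induction operates.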
Counting functions.
We will make use of counting functions on the tables and . A counting function on the columns of takes the same value as the counting function on the columns of , since the pairs of tables we are interested in are compatible, i.e., columnwise they are the same as multisets. Given , we denote by the number of copies appearing in the columns in , or in .
Example 2.8.
The function counts the number of copies of in columns and minus two times the number of copies of [math] in column .
From an algebraic point of view, a counting function defines a grading of the variables, that is a specialization of the multi-grading. Thus the fact that the counting function gives the same value on two tables is equivalent to the fact that the two corresponding monomials have the same degree with respect to the induced grading. Additionally, from the perspective of toric geometry, the counting function is induced by restricting the torus action to a special one-parameter subgroup.
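A counting function can be sketched as an integer combination of counts of a given element in a given set of columns. The representation of a counting function as a list of triples (coefficient, element, columns), the 0-based column indices, and the sample tables are illustrative assumptions.

```python
def count(table, element, columns):
    """Number of copies of `element` in the given columns of the table."""
    return sum(1 for row in table for j in columns if row[j] == element)

def counting_function(table, terms):
    """Evaluate a counting function given as triples (coefficient, element, columns)."""
    return sum(c * count(table, g, cols) for c, g, cols in terms)

# Hypothetical compatible tables and a hypothetical counting function:
# copies of 1 in columns 0 and 1, minus twice the copies of 0 in column 2.
T0 = [(1, 1, 0, 0), (0, 0, 2, 2)]
T1 = [(1, 1, 2, 2), (0, 0, 0, 0)]
TERMS = [(1, 1, (0, 1)), (-2, 0, (2,))]
```

The value on a single row is obtained by passing a one-row table, for instance `counting_function([T0[0]], TERMS)`. Since the total over a table is the same for both tables of a compatible pair, a row with a large value in one table forces a row with a compensating value somewhere, which is the pigeonhole step used repeatedly in Section 3.2.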
Group homomorphisms.
We will make use of group homomorphisms in order to do counting arguments in a given pair of tables. We denote
[TABLE]
the group homomorphism given by the quotient map sending each element to its class modulo the subgroup generated by the element .
3 Complexity of the Kimura 3-parameter model
The aim of this section is to establish the phylogenetic complexity of the Kimura 3-parameter model. In Section 3.1, we discuss the structure of the proof, postponing the technical part of it to Sections 3.2 and 3.3.
3.1 Main result and structure of the proof
We now present our main result, along with an outline of the proof strategy.
Theorem 3.1.
The phylogenetic complexity of the Kimura 3-parameter model equals four.
The structure of the proof is presented in Figure 1. Our proof is an induction on the number of leaves , i.e., the number of columns of the tables. The base of our induction is . The case of leaves has been studied computationally. More precisely, for the result is presented in [41] and for it is computed in [16]. For we used the program featured in [16] to produce the vertices of the polytope. The computer algebra program 4ti2 [47] specialized for toric ideals was able to compute the Markov basis using a server equipped with a CPU 4 Intel-Xeon E7-8837/32 cores/2.67GHz and a memory of 1024Gb RAM.
Proposition 3.2.
The ideal is minimally generated by polynomials of degree at most four: quadrics, cubics, and quartics.
The case is treated in Section 3.3.3. Methods similar to the general case and bounds on Castelnuovo-Mumford regularity obtained using Normaliz [7] allow us to reduce the problem to a computation handled with 4ti2. From the computational point of view, it is interesting to note that we were not able to address the case only with computational tools. Based on our experiments with 4ti2, we expect the computation to be infeasible: it would run for several years on a server of the same capability as the one mentioned above, and a memory of 1Tb RAM would not be sufficient to finish the computation.
For , we have an induction on the degree of the generators, i.e., the number of rows of the table. Inside a specific degree , we have an induction on the Hamming distance of two rows of the tables. The strategy in this inner induction on the Hamming distance is the following. Suppose we have a binomial generator of degree . Hence, we have a pair of tables consisting of rows each and with columns. Two rows have Hamming distance and we reduce it to ; in other words, the given pair of tables is transformed into a pair of tables that have an identical row. This is a binomial which is a product of a binomial of degree and a variable. By induction on , such a binomial can be generated in degree at most .
Hence the aim of the induction on the Hamming distance is to reduce it to . In order to achieve this, we split the case into two separate propositions in Section 3.2; see Proposition 3.5 and Corollary 3.6, and Proposition 3.12. This reduces the proof to . Recall that there do not exist flows whose Hamming distance is , since they cannot disagree in only one entry.
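The last observation can be confirmed by brute force for a small number of leaves. The sketch assumes that a flow is a tuple of group elements adding up to the neutral element and encodes the group as 0..3 under bitwise XOR; the encoding and the function names are illustrative assumptions.

```python
from functools import reduce
from itertools import product

def flows(n):
    """All n-tuples over 0..3 whose XOR is 0 (assumed encoding of the flows)."""
    return [row for row in product(range(4), repeat=n)
            if reduce(lambda a, b: a ^ b, row, 0) == 0]

def hamming(f, g):
    """Number of positions in which the two rows differ."""
    return sum(1 for a, b in zip(f, g) if a != b)

def pairwise_distances(n):
    """Set of Hamming distances realized by pairs of flows of length n."""
    all_flows = flows(n)
    return {hamming(f, g) for f in all_flows for g in all_flows}
```

Two tuples that differ in exactly one entry have different sums, so they cannot both be flows; accordingly the value 1 never appears in `pairwise_distances(n)`.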
We now discuss the strategy in case , the technical heart of the proof, which is tackled in Section 3.3. In spite of many symmetries, discussed in Section 2, there are several cases one has to consider: we identify ten cases, indexed by Roman numerals, where the first two rows of the given pair of tables have a disagreement string of length . Here we provide a uniform proof for three crucial cases: Case I, II, and III. As we show them simultaneously with the very same techniques, we refer to those as the “main case”. The remaining cases are treated by reducing them to the main case.
For the proof in the main case, we look at the second rows of each of the tables and . Let denote the length of the disagreement string between those two, in columns not involving the first two. By Corollary 3.6, we are able to assume and, since , the length of the agreement string between the second row of and the second row of , outside columns and , is at least . Since the columns are indistinguishable up to the action of , we may assume that the columns and are involved in the agreement string. Now the aim is to reduce to the situation in which no row has two nonzero entries in the columns and : employing moves of degree at most four, we would like to eliminate all the strings which have nonzero entries on both columns and . We call such strings bad pairs.
Definition 3.3 (Bad pairs).
A bad pair is a string , where the elements are such that:
- (i) they are both nonzero;
- (ii) is in column and is in column .
We now show that, by eliminating all the bad pairs, we fall back to the case of leaves, which allows us to conclude by the outermost induction.
Theorem 3.4.
Suppose that a pair of compatible tables with columns does not contain rows with bad pairs. Then the corresponding binomial is generated in degree at most .
Proof.
The assumption implies that for every row of and we have either or . Summing up the columns and , we obtain two tables and . The crucial observation is that and are compatible tables with columns. Hence they correspond to a binomial in . This binomial is generated in degree at most by definition. This implies that and can be transformed into each other by a finite sequence of moves of degree at most . Each of these moves lifts to the tables and , transforming all their columns accordingly, except columns and . Here the moves permute the pairs of elements, where each pair is formed by the two elements in columns and , in a fixed row. These moves transform into . The latter need not be the same though; indeed, they may differ in columns and . As in the proof of [35, Theorem 3.12], we make quadratic moves to adjust the elements in columns and . These transform into . Hence the tables are generated in degree at most . ∎
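The crucial observation of the proof, that summing the last two columns preserves compatibility when there are no bad pairs, can be checked on an example. The sketch assumes the group is encoded as 0..3 under bitwise XOR and that the two distinguished columns are the last two; the function names and the sample tables are hypothetical.

```python
from collections import Counter

def has_bad_pair(row):
    """A bad pair: both entries in the last two columns are nonzero."""
    return row[-2] != 0 and row[-1] != 0

def merge_last_two(table):
    """Replace the last two columns of every row by their sum (XOR)."""
    return [row[:-2] + (row[-2] ^ row[-1],) for row in table]

def compatible(t0, t1):
    """Columnwise equality as multisets for tables of the same shape."""
    return len(t0) == len(t1) and all(
        Counter(r[j] for r in t0) == Counter(r[j] for r in t1)
        for j in range(len(t0[0]))
    )

# Hypothetical compatible tables with four columns and no bad pairs.
T0 = [(1, 1, 0, 0), (2, 0, 2, 0), (0, 3, 0, 3)]
T1 = [(1, 3, 2, 0), (2, 1, 0, 3), (0, 0, 0, 0)]
MERGED0, MERGED1 = merge_last_two(T0), merge_last_two(T1)
```

Without bad pairs, the merged entry of a row is simply its unique nonzero entry in the two columns (or zero), so the multiset of the merged column is determined by the multisets of the two original columns. A bad pair would break this, since distinct pairs of nonzero elements can have the same sum.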
3.2 Reduction of Hamming distance 3
In this section, we start our reduction of the Hamming distance. More precisely, we assume the Hamming distance to be at least three and we prove that we can reduce it to two; the latter will be discussed in Section 3.3. We proceed by analyzing the cases when the disagreement string is given by at least four entries.
Proposition 3.5.
The disagreement strings (i) , (ii) , (iii) , and (iv) can be reduced.
Proof.
(i). Consider the function . By the action of the group of flows , we may assume that this counting function is nonpositive on both of the tables. Since the function is strictly positive in the first row of , there exists a row in where there are strictly more copies of than copies of [math] in the columns . On the other hand, cannot contain in two of the columns , since we would exchange those with the corresponding entries in the first row and this would decrease the Hamming distance. Thus has one copy of and no copies of [math] in columns . If the row has both copies of and , we would move the string to the first row of , reducing the Hamming distance. Whence we may assume that contains the string in columns . Notice that in columns of , there are no strings of the form or , otherwise quadratic moves would decrease the Hamming distance. Additionally, in columns there is no string of the form ; for this we can apply in the cubic move . Now, we introduce the counting function on . By the previous discussion about the possible strings in columns , this function is at least one in every row of . Consequently, there exists a row in where this function is three. As a consequence, the row contains either the string or . This would decrease the Hamming distance.
(ii). Consider the counting function . By the action of the group of flows , we may assume it is nonpositive on both of the tables. Since this function is strictly positive on the first row of , there exists a row in where the function is strictly negative. Note that on the row , one has ; otherwise we would make a quadratic move, involving and the first row of , reducing the Hamming distance.
If in the row we have , then , by the value of the counting function on . Hence in the row , there exists , which allows us to make a quadratic move reducing the Hamming distance. Without loss of generality, we have , and . Thus the row contains either the string or the string . In both cases, we exchange with the first row of and we act with the flow on producing , which is (i).
(iii). Consider the function . By the action of the group of flows , we may assume it is nonpositive on both of the tables. Therefore there exists a row in where the function is strictly positive. Note that on the row one has .
If in the row we have and , then we may assume contains the string in columns . We have , as otherwise in each of these circumstances we would make a quadratic move between and the first row of , reducing the Hamming distance. Then the function is zero on , which is not possible by assumption. Analogously, we may conclude when and .
If in the row we have , and , then . In this case we have , because of a quadratic move between and the first row of . Hence the row contains the string in columns , which again would reduce the Hamming distance.
If in the row we have , then either or . If , then in columns the row contains the string ; indeed we cannot have copies of or by quadratic moves with the first row of . This implies that the counting function is zero on the row , which is not possible by the assumption. If in the row we have and , then . In the row we can now exclude all the possible elements in each column by quadratic moves, obtaining the string in columns . We exchange this string with the first row of , reducing the Hamming distance. Analogously, if in the row we have and , we obtain in columns , and we conclude in the same way.
(iv). Consider the counting function . By the action of the group of flows , we may assume it is nonpositive on the tables. Therefore there exists a row in where the function is strictly negative. Thus on the row we have , as .
Suppose that in the row we have . Then and , by the assumption on the value of the counting function on . In two of the columns we cannot have or by quadratic moves, involving and the first row of . Thus we have a copy of ; we now make a quadratic move between and the first row of , which decreases the Hamming distance.
Suppose that in the row we have . If in the row we have , then . In columns we cannot have , as otherwise we would exchange the string with the first row of , thus reducing the Hamming distance. Whence contains the string in columns . If in the row we have , then . In this situation, by the same argument, contains the string (or or ). We claim that having the string can be reduced to the case of having the string up to quadratic moves and group automorphism. Indeed, suppose we have the string in the row . We exchange from with from the first row of in columns . We act with the flow on both tables and we transpose column and column . Now the row contains the string in columns .
By the previous discussion, it is enough to deal only with the string in . Consider the counting function . Note that this function has only odd values. We now show that the function cannot be positive on a row of . Indeed, assume there is a row where the function takes a positive value. Then the row contains either , or in columns . The first two cases are not possible, because we would exchange them with the string in the row ; this would produce or in the row , which we would exchange with in the first row of . We are left with the possibility of having in columns . For this we apply in the cubic move .
In conclusion, the counting function is strictly negative on every row of . Since the value of this function on the first row of is , there exists a row in on which the function is . Thus in we have either or in columns . In this case, we would exchange them with the first row of reducing the Hamming distance. ∎
Corollary 3.6.
Suppose that a table contains two rows and having disagreement string of cardinality four. Then, using moves of degree at most three, can be transformed in such a way that the disagreement string has cardinality at most three. Moreover, only the four columns of the disagreement string are involved in the reduction.
Proof.
Assume two rows and do not agree on four elements. Up to the action of the group of flows and , the elements of in the disagreement string can be set to be ; all the possibilities for the elements of in the disagreement string are , , , and . By Proposition 3.5, these disagreement strings can be reduced. Hence, performing the moves in the proof of Proposition 3.5, we transform the tables in such a way that the cardinality of the disagreement string is at most three. ∎
Now we deal with the disagreement string of length three, . We begin with preparatory lemmas.
Lemma 3.7.
Suppose that the disagreement string between and is , in columns . Then we may assume that there exists a row in containing the string in columns .
Proof.
We introduce the counting function . By the action of the group of flows , we may assume that the sum is nonnegative on . Then there exists a row in where the function is strictly positive.
If in the row we have , then , by the assumption on the counting function evaluated at . By the action of the group of flows , we may assume without loss of generality that contains the string in columns . Then by assumption. Also, , as otherwise we would exchange the string with in the first row of , reducing the Hamming distance between and . Hence . Similarly, and , as otherwise we exchange with in the first row of . Hence contains the string in columns , which we exchange with the first row of . ∎
Lemma 3.8.
We may assume that the row of Lemma 3.7 in contains the string in columns . More generally, for every row containing the string in columns , the nonzero element of in columns coincides with the corresponding entry of the first row of .
Proof.
The row contains a string with exactly two elements equal to [math] in the columns and . By the action of , we may assume that contains the string in columns . Note that , as in both cases we make a quadratic move between and the first row of , reducing the Hamming distance between and . Thus . By the action of the group of flows , in every row containing the string in columns , the nonzero entry coincides with the corresponding entry of the first row of . ∎
Lemma 3.9.
Suppose that in we have a row containing . Then this is the only string that a row with in columns may contain.
Proof.
Since the row in contains , then it cannot contain another copy of , as we would exchange with the first row of , thus reducing the Hamming distance. Hence contains , since it is a flow. Assume there exists another row containing a string with , different from . By Lemma 3.8, the unique nonzero entry in columns of agrees with the corresponding entry of the first row of . Assume that contains in columns . Then we apply the cubic move , reducing the Hamming distance. For a row containing we conclude in the same way. ∎
Lemma 3.10.
As in the proof of Lemma 3.9, we assume that contains in columns . There exists a row in such that and, moreover, contains the string in columns .
Proof.
Such a row exists in by the compatibility of the two tables. The structure of is:
[TABLE]
By Lemma 3.9, we have . Analogously, we have by applying Lemma 3.9, upon exchanging the string in the first row with in the second row.
Note that and , as otherwise, exchanging with the first row, in the first case with and in the second with , we would reduce the Hamming distance; analogously, and . Furthermore, by Lemma 3.8, we have as otherwise we would create the string and respectively. Analogously . Hence the only remaining possibility is and . ∎
Lemma 3.11.
The counting function is at most on every row of .
Proof.
For the sake of contradiction, suppose there exists a row in , where the counting function is nonnegative. In , there exists a row with . By Lemma 3.10, the row contains the string .
If in the row we have , then , again, by Lemma 3.10. Hence we have at least two differences with and we can make a quadratic move between and . This reduces the Hamming distance. Thus on the row one has .
If on , we have the following possibilities:
- (i) contains ;
- (ii) contains ;
- (iii) contains ;
- (iv) contains .
In case (i), we have by the assumption on the value of the counting function. Additionally, , as we would exchange the string with the first row in . Consider the differences between and . If , then we can make a move involving column , at most one of columns and either column or between and . This allows us to exchange in with [math] in ; this contradicts Lemma 3.10. Hence , which on the other hand contradicts the nonnegativity of the counting function. Exchanging , the row appearing in Lemma 3.10 containing , with the first row of , case (ii) is the same as case (i).
In case (iii), by the assumption on the value of the counting function. Moreover, since we would exchange in with the string in in columns , contradicting Lemma 3.8. We also have , because we could make a quadratic move in columns between in with in , obtaining the string in . Now, we exchange in columns , the string in with in , which produces the string ; this reduces the Hamming distance. Finally, if , we exchange in columns , the string in with in , obtaining , which again reduces the Hamming distance.
In case (iv), by the assumption on the value of the counting function. Additionally, , because otherwise we would exchange in columns the string in with in , thus contradicting Lemma 3.10. Also, , as we would make a quadratic move on columns between and , contradicting again Lemma 3.10. Analogously, . Hence contains the string . We exchange in columns the string in with in , which produces in , which in turn implies by Lemma 3.8. This contradicts the nonnegativity of the counting function.
If , by symmetry, we may assume or . If , then by Lemma 3.10, contains , which contradicts the nonnegativity of the counting function. If , then contains . Then by the assumption. Moreover, , as we would exchange with in columns , contradicting Lemma 3.10.
If , we now consider the value of . We have by assumption. We have by assumption on the nonnegativity of the counting function. Moreover, , since otherwise we would exchange in columns the string in with in contradicting Lemma 3.10. Hence , i.e., contains the string . Now, , by the assumption on the value of . Moreover , by the assumption on the value of the counting function on . Also notice that , as otherwise we exchange in columns the string of with of the first row of , and then we exchange from the first row with in reducing the Hamming distance. Therefore contains the string , which we exchange with the string in in columns and , contradicting Lemma 3.10.
If , then contains . Furthermore, by assumption on the value of . Moreover, exchanging in columns the string of with of , contradicting Lemma 3.10. Analogously, we would contradict Lemma 3.10 for , exchanging in columns , the string in with in . Hence contains the string . Here , by assumption. Moreover, , because of the nonnegativity of the counting function. Also, , because we would contradict Lemma 3.10, exchanging of with of . Therefore contains , but we exchange it with in columns contradicting Lemma 3.10.
If , then , by the assumption on the nonnegativity of the function on . Thus contains different from in columns . Hence we have two identical differences between and , which allow us to make a quadratic move, contradicting Lemma 3.10. ∎
Proposition 3.12.
The disagreement string can be reduced.
Proof.
By Lemma 3.11, the counting function is at most on every row of . As a consequence, there exists a row in , where the function is at most . By the value of the counting function on the row , the entries in must agree in two, three, four or five entries with .
If agrees in five entries, it contains . We exchange with in the first row of , which reduces the Hamming distance between and . If agrees in four entries, we denote by the element where does not agree with . If , then we would have either the string or , which is also in table ; this reduces the Hamming distance. Suppose contains . If , the table contains the same flow. If or , we exchange or with in the first row of .
If agrees with in three entries, we denote by the remaining two. First, note that if are in columns or in columns , we exchange or with in the first row of ; this decreases the Hamming distance.
Assume that both of and are in columns . If contains , then , because otherwise we would exchange the string or with the first row of reducing the Hamming distance. Whence . Moreover , by definition. Additionally, , because we would move to the first row of , reducing the Hamming distance. It follows that . On the other hand, , since the counting function is at most on . Furthermore, , as otherwise we would exchange with in the first row of , reducing the Hamming distance between and . Hence contains either or . For the first, we exchange in columns , the string with in the first row of , and we exchange in with the first row of . For the second, we exchange with the first row of and in with the first row of , which reduces the Hamming distance.
If contains , then applying the automorphism and a transposition between columns and , we are in the case when the row contains .
If are both in columns , we apply analogous moves as the ones featured above. Then we may assume that is either in column or , and is either in column or . In all these cases, we have and , as all the other possibilities are excluded by exchanging with the first row of . The fact that contradicts the value of the counting function on .
If agrees with in two entries, we have on , since the value of the counting function on is at most . In columns , there is at least one entry which does not agree with the corresponding entry in , because otherwise we would move to the first row of , reducing the Hamming distance. Denoting the elements where they do not agree by , the strings that may contain are: , , and . Note that these are all the possibilities, as the remaining ones are resolved in the same way upon exchanging the string in the first row with in the second row of . If contains , then we exchange the string of in columns with in . We now exchange the string of with the first row in ; these two rows have lower Hamming distance. If contains in columns , then , by the counting function. Moreover, since . Hence or . Now we exchange or with in the first row of , reducing the Hamming distance. If contains , by definition or by quadratic moves we can exclude the cases , and . Hence contains , which we exchange with the first row of , decreasing the Hamming distance. ∎
The preceding results of this section show the following corollary.
Corollary 3.13.
The Hamming distance of two flows can be reduced to at most two.
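The basic objects used throughout this section (flows, tables, compatibility, and the Hamming distance between rows) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: the elements of the group are encoded as the integers 0..3 with addition given by bitwise XOR, a flow is taken to be a tuple summing to the neutral element, and two tables are taken to be compatible when their columns agree as multisets. All function names are hypothetical.

```python
from collections import Counter

# Assumption: Z2 x Z2 is encoded as {0, 1, 2, 3}; group addition is bitwise XOR.

def is_flow(row):
    """A row is a flow if its entries sum to the neutral element 0."""
    total = 0
    for g in row:
        total ^= g
    return total == 0

def hamming(r, s):
    """Number of columns in which the rows r and s differ."""
    return sum(1 for a, b in zip(r, s) if a != b)

def compatible(t0, t1):
    """Tables are compatible if they have the same number of rows and
    every column carries the same multiset of group elements."""
    if len(t0) != len(t1):
        return False
    cols0 = [Counter(col) for col in zip(*t0)]
    cols1 = [Counter(col) for col in zip(*t1)]
    return cols0 == cols1

# Two small compatible tables of degree 2 on four leaves.
T0 = [(1, 1, 2, 2), (2, 2, 1, 1)]
T1 = [(1, 2, 2, 1), (2, 1, 1, 2)]
```

In this toy example the first rows of `T0` and `T1` differ in exactly two columns, which is the situation Corollary 3.13 reduces to.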
3.3 The disagreement string
In this section, we proceed in the case of the disagreement string .
[TABLE]
Let us denote the row in starting with the string by and the row in starting with the string by . After fixing the first rows and the first two columns, we make moves of degree at most four on the rest of tables in such a way that the number of agreements in and is maximized.
Remark 3.14.
Corollary 3.6 ensures that, after possibly making moves of degree at most four, the rows and in and respectively, agree in at least entries. Up to the action of on the leaves, and hence on the columns, these are the last columns.
Definition 3.15.
The string in the last columns of the rows and is the agreement string between and . Up to the action of the group of flows , these entries are zeros.
Our aim is to prove the following three crucial cases, which we refer to as the main case:
[TABLE]
In Section 3.3.1, we reduce any other possible case to one of the above.
3.3.1 Reduction to the main case
Up to the action of the group of flows , there are at least as many copies of [math] as copies of in the first two columns of . Up to the action of , we may assume . We will show that all cases can be resolved, by reducing to the main case of Section 3.3.
We first collect a useful lemma which we will use to resolve easily some of the cases.
Lemma 3.16.
If in table in (1) we have , then the corresponding cases can be reduced. If in table in (1) we have , then the corresponding cases can be reduced.
Proof.
If , then in we have either the cubic move or . The second sentence is the symmetric version of the first: acting with the flow on the tables, we produce the same tables as in the first statement. ∎
We now analyze all the possible cases. We refer to the tables and in (1).
Case . In this case, the table has the form:
[TABLE]
We may have .
.
Here, is reduced by Lemma 3.16. Hence we have (Case I) or (Case II).
.
Here, (Case X), (Case VII), (Case VI).
.
Here, (Case IV), (Case V), is resolved by Lemma 3.16.
Case . In this case, the table has the form:
[TABLE]
We may have .
.
Here, (which is Case II by acting with the flow and ), (Case III), resolved by Lemma 3.16.
.
Here, (Case IX), (which is Case II by acting with the flow , transposing and ), (which is Case V by acting with the flow and a transposition).
.
Here, (which is Case V by acting with the flow ), (Case VIII).
We now reduce all the cases to the main case of Section 3.3, postponing its proof for the moment, as this requires more technical results.
Cases IV and V.
In this case we have:
[TABLE]
We may assume we do not have strings in columns of ; this is shown by the same arguments in the proof of Lemma 3.22. Hence the counting function is nonnegative on every row of . On the other hand, in the table , in columns we do not have the string , as we would reduce this case with a cubic move. In the same columns of , the string would decrease the Hamming distance. Moreover, the string is reduced by the cubic move , and is reduced by the cubic move . This is a contradiction and thus it shows the reduction.
Case VI.
In this case we have:
[TABLE]
In columns in , the string is resolved by Lemma 3.16. The string in columns of is Case V. Since we cannot have the string in columns of , the counting function is nonpositive in every row of . Thus there exists a row in with and . Hence . Acting by the flow and transposition we reduce to Case V.
Case VII.
In this case we have:
[TABLE]
We exclude the string in columns in , since it is Case II. We also exclude by Lemma 3.16. As in Case VI, there exists a row in such that and . Now, by acting with the flow , making a transposition and applying the group automorphism , we reduce to Case II.
Case VIII.
In this case we have:
[TABLE]
We may exclude in columns in the string . Also, we exclude the string by the quartic move . Moreover, in columns in , notice that we can exclude the strings and by Lemma 3.16. Hence the counting function is nonnegative on every row of . On the other hand, in we may reduce the string , and by Lemma 3.16. Finally, we are able to reduce the string by the quartic move . This is a contradiction and thus it shows the reduction.
Case IX.
In this case we have:
[TABLE]
Analogously to the proof of Lemma 3.22, we exclude in columns of . So the counting function is nonnegative on every row of . On the other hand, in columns of , the strings correspond to the case for and in tables (1), which were previously done. Thus there exists a row such that and by the positivity of the counting function in and . Exchanging the string in the first row with the string in , acting by on both and , applying the automorphisms and we obtain Case III.
Case X.
In this case we have:
[TABLE]
In , in columns we can exclude , because it is Case I. The string reduces to Case IV. As usual, the string is excluded. Hence the counting function is nonpositive in every row of . Hence there exists a row in such that and . The possible values of are either or , since for we have an immediate reduction. For we apply Lemma 3.16 and is Case IX.
3.3.2 Preliminary Lemmas
We are now ready to present our preliminary lemmas, which are devised to tackle the main case of Section 3.3. As they will be used very often, we give them specific reference names in order to facilitate the reading.
Lemma 3.17 (Difference Lemma).
Suppose we have the table whose first three rows are :
[TABLE]
where and . If one of the following holds:
- (i) and is or for some ; or
- (ii) , and is or for some ,
then we can transform the row to a row starting with the string .
Proof.
When the difference in both (i) and (ii), we make the quadratic move , which exchanges the corresponding entries in rows and , thus creating a row starting with the string . Analogously for the case (i), when the difference is . In (ii), when the difference is , we make the cubic move . ∎
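The quadratic moves used in this proof, and throughout the section, exchange entries of two rows on a chosen set of columns. The sketch below illustrates why such an exchange is only legitimate when the two rows have the same partial sum on those columns: otherwise the new rows would fail to be flows. It assumes the encoding of the group as the integers 0..3 with XOR as addition; the function names are hypothetical and this is not taken from the paper.

```python
def partial_sum(row, cols):
    """Sum (XOR) of the entries of row in the given columns."""
    total = 0
    for j in cols:
        total ^= row[j]
    return total

def quadratic_move(r, s, cols):
    """Exchange the entries of the flows r and s in the columns cols.

    The exchange keeps both rows flows exactly when the partial sums
    of r and s over cols coincide; otherwise a ValueError is raised.
    Column multisets of the table are unchanged by construction.
    """
    if partial_sum(r, cols) != partial_sum(s, cols):
        raise ValueError("partial sums differ: the result would not be a pair of flows")
    r_new, s_new = list(r), list(s)
    for j in cols:
        r_new[j], s_new[j] = s[j], r[j]
    return tuple(r_new), tuple(s_new)

# Exchange columns 1 and 3 (0-based) between two flows on four leaves.
r, s = (1, 1, 2, 2), (2, 2, 1, 1)
r_new, s_new = quadratic_move(r, s, [1, 3])
```

Cubic and quartic moves follow the same principle with three or four rows, replacing them by a compatible set of flows.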
Remark 3.18.
Note that the Difference Lemma 3.17 distinguishes one group element in each table in each of the crucial cases Case I, Case II, and Case III. In all the cases, these are in and in . In particular, if the second and third row differ on some index , then their difference must be equal to the distinguished element.
Although basic, the Difference Lemma 3.17 will be used very frequently. We apply it following the observation above. Indeed, our aim will often be to produce a row starting with a string of type and conclude by induction. To this end, after identifying the situation described in Lemma 3.17, if , then we will be able to immediately infer what can be the element ; to exclude all the other possible values we apply the Difference Lemma 3.17, obtaining a row starting with the string . This will be useful to decrease the given Hamming distance and conclude by induction on the degree.
Lemma 3.19 (Standard Lemma).
Let be a table and suppose there is an element in some row with and . Suppose there is a row of with and , where , and a row with the element in the same column as . Then we can exchange and (and appropriate entries in columns and ). The same statement holds when is a string of elements of .
Proof.
Let us consider the entries and . If or , then we make a quadratic move putting and in the row . If or , then we move the string to the row , and finally we exchange with [math] and with . If are equal, then we move the string in the row exchanging it with the string , thus we exchange with and [math] with . Hence, we may assume that and they are both different from [math] and . Hence, the sum . Thus, we may exchange with and with . The last statement is shown using the same arguments. This completes the proof. ∎
We now record some technical results on the main case of Section 3.3. Note that in the main case we have .
Lemma 3.20.
If for some then we may assume that both are equal to . In particular, both rows have [math] on the agreement string.
Proof.
Without loss of generality, let us assume . If , then a quadratic move allows us to produce the string in both tables. If , then in both tables we obtain the string by quadratic moves again. If , then in both tables we obtain . The last string is obtained in by quadratic moves, and in by the following moves:
- (i) in Case I and II, by the cubic move ;
- (ii) in Case III, by two quadratic moves, upon exchanging with .
∎
Remark 3.21.
We observe that in Case I and Case III, the tables and are in “symmetry”. More precisely, the fixed entries in table can be obtained from the ones in , by acting with the flow and applying the automorphism of , that exchanges and . In particular, if we can prove a statement for then a “symmetric” statement holds for .
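The two symmetry operations invoked in this remark, acting with a flow and applying a group automorphism, can be illustrated as follows. The sketch assumes the group is encoded as the integers 0..3 with XOR as addition, in which case every permutation of the nonzero elements {1, 2, 3} fixing 0 is an automorphism. The specific flow and permutation below are arbitrary examples, not the ones used in the paper, and the function names are hypothetical.

```python
from collections import Counter

def is_flow(row):
    total = 0
    for g in row:
        total ^= g
    return total == 0

def compatible(t0, t1):
    """Same number of rows and the same multiset in every column."""
    return len(t0) == len(t1) and \
        [Counter(c) for c in zip(*t0)] == [Counter(c) for c in zip(*t1)]

def act_by_flow(table, flow):
    """Add a fixed flow, entry by entry, to every row of the table."""
    return [tuple(a ^ b for a, b in zip(row, flow)) for row in table]

def apply_automorphism(table, perm):
    """Apply a group automorphism, given as a dict on {0, 1, 2, 3},
    to every entry of the table."""
    return [tuple(perm[g] for g in row) for row in table]

# Arbitrary illustrative data: a flow and the automorphism swapping 1 and 2.
flow = (3, 3, 0, 0)
swap = {0: 0, 1: 2, 2: 1, 3: 3}

T0 = [(1, 1, 2, 2), (2, 2, 1, 1)]
T1 = [(1, 2, 2, 1), (2, 1, 1, 2)]
S0 = apply_automorphism(act_by_flow(T0, flow), swap)
S1 = apply_automorphism(act_by_flow(T1, flow), swap)
```

Both operations send flows to flows and preserve compatibility when applied to both tables, which is why a statement proved for one table transfers to its symmetric counterpart.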
Lemma 3.22.
We may assume that no row in contains in columns any string of the form . Analogously, no row in contains in columns any of the strings of the form .
Proof.
In all the cases, one can obtain either in or in . For , these are: , , . The statement for readily follows by Remark 3.21. ∎
Lemma 3.23.
For any row in differing from on some column index not by , we may assume or . Analogously, in , if differs from on some column index not by , then or .
Proof.
In Case I and Case II, by the Difference Lemma 3.17, and a quadratic move with or with in , we may assume or . The result follows by Lemma 3.22.
In Case III we exclude . Indeed, if or , then we are in Case II (more precisely, for we also need to exchange the two columns to reduce to Case II). If or , by the quadratic moves or we produce or and apply Lemma 3.16. Remark 3.21 gives the symmetric statement for . ∎
Lemma 3.24.
If there exists an index such that in , then we may assume that in . Analogously, if there exists an index such that in , then we may assume that .
Proof.
Assume . Suppose or in . Then there exists a row in with . The row contains the string in columns for some . Let us determine the possible values of . If we would have the string in both tables. By Lemma 3.23, . Whence the counting function is nonpositive on every row of . It follows that in there exists a row with and . By Lemma 3.23, we have . For , by quadratic moves, we obtain in both tables. Now consider the case . In for every row with and (as above), we have . Indeed is either [math] or by Lemma 3.23. On the other hand, because otherwise we would produce the string in , which is also in . Hence . Since in we have the row with , there exists a row in with . If we are done, as we produce in both tables. For we have the cubic move in , . For , we have the quartic move in , . For , we have the quartic move in , . ∎
3.3.3 The case of six leaves
After having set up the cornerstone of our approach, we are ready to first establish the case of six leaves. Let be the lattice polytope of the Kimura 3-parameter model for six leaves. Here we are in the setting of polytopes. To be consistent with standard terminology, binomials in the ideal of the Kimura 3-parameter model are identified with relations among lattice points, which in turn are naturally identified with variables. The minimal generating relations among the vertices of the polytope constitute a Markov basis. The degree of an element of a Markov basis is the total degree of the corresponding binomial in the standard grading. The degree of the corresponding table is the number of rows. Only in this section, given a Markov basis element , which we think of as a binomial, we introduce the notation to denote its degree.
As recalled in Section 2, the polytope is dimensional. Following the notation of Section 2, a generating set of the full lattice is . However, our lattice is a sublattice of . Since we have the six linear relations for satisfied by the vertices of the polytope, we can choose the elements for to serve as a basis of the -dimensional lattice of interest.
Proposition 3.25.
The polytope defines an dimensional projectively normal (in particular, Cohen-Macaulay) toric variety in . Its Hilbert series is , where
[TABLE]
Its Hilbert polynomial is
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
In particular, the Markov basis has elements of degree at most .
Let us consider the following two codimension two faces of :
- (i) contains points corresponding to flows that have [math] or on the sixth leaf. This is the intersection of with the linear subspace .
- (ii) contains points corresponding to flows that do not have on the sixth leaf and on the fifth leaf. This is the intersection of with the linear subspace .
The Hilbert series of (i) is , where
[TABLE]
The Hilbert series of (ii) is , where
[TABLE]
In particular, the Markov basis in both cases has elements of degree at most .
Proof.
The computation of Hilbert series and verification of normality were obtained using Normaliz [7]. The statements about the degree of Markov basis are a consequence of well-known theorems on regularity of normal toric varieties, see Appendix 4. ∎
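As a sanity check independent of Normaliz, the vertices of the polytope and the dimension of its affine hull can be computed directly. The sketch below is an illustration under assumptions, not the computation performed by the authors: it encodes the group as the integers 0..3 with XOR as addition and represents a flow on n leaves by a 0/1 vector with one block of four coordinates per leaf. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def flows(n):
    """All flows on n leaves: the first n-1 entries are free,
    the last one is forced by the condition that the sum is 0."""
    for head in product(range(4), repeat=n - 1):
        last = 0
        for g in head:
            last ^= g
        yield head + (last,)

def indicator(flow):
    """0/1 vector with a single 1 in each block of four coordinates."""
    v = [0] * (4 * len(flow))
    for leaf, g in enumerate(flow):
        v[4 * leaf + g] = 1
    return v

def rank(rows):
    """Rank over the rationals by incremental Gaussian elimination."""
    basis = {}  # pivot index -> row scaled to have 1 at its pivot
    for row in rows:
        v = [Fraction(x) for x in row]
        for pivot, b in basis.items():
            if v[pivot] != 0:
                c = v[pivot]
                v = [x - c * y for x, y in zip(v, b)]
        new_pivot = next((i for i, x in enumerate(v) if x != 0), None)
        if new_pivot is not None:
            inv = 1 / v[new_pivot]
            basis[new_pivot] = [x * inv for x in v]
    return len(basis)

def polytope_data(n):
    """Return (number of vertices, dimension of the affine hull)."""
    vertices = [indicator(f) for f in flows(n)]
    # All vertices have coordinate sum n, so they lie on an affine
    # hyperplane missing the origin: affine dimension = rank - 1.
    return len(vertices), rank(vertices) - 1
```

Calling `polytope_data(6)` returns the vertex count and affine dimension for six leaves; each vertex satisfies the six linear relations saying that the coordinates in every block sum to one.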
Lemma 3.26.
The following three codimension three faces of have Markov basis with elements of degree at most four:
- (i) contains points corresponding to flows that have [math] on the sixth leaf. This is the intersection of with the linear subspace and is isomorphic to the Kimura 3-parameter model polytope for five leaves.
- (ii) contains points corresponding to flows that do not have or on the sixth leaf and do not have on the fifth leaf. This is the intersection of with the linear subspace .
- (iii) contains points corresponding to flows that do not have on the fourth, the fifth and the sixth leaf. This is the intersection of with the linear subspace .
Proof.
We employed 4ti2 [47] to compute explicitly the Markov basis in all three cases. More specifically, for we obtained relations: quadrics, cubics, and quartics. For , we obtained relations: quadrics, cubics, and quartics. ∎
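For readers who wish to reproduce a computation of this kind, the following sketch prepares an input matrix for the face in (i), which is isomorphic to the five-leaf polytope. It is a hedged illustration, not the authors' script. It assumes the encoding of the group as the integers 0..3 with XOR as addition, the 0/1 indicator-vector encoding of flows, and the plain-text matrix convention of 4ti2 (a first line with the number of rows and columns, followed by the rows), with the vertices placed as columns. The function names and the file name mentioned afterwards are hypothetical.

```python
from itertools import product

def flows(n):
    """All flows on n leaves over Z2 x Z2 (encoded 0..3, addition = XOR)."""
    for head in product(range(4), repeat=n - 1):
        last = 0
        for g in head:
            last ^= g
        yield head + (last,)

def indicator(flow):
    """0/1 vector with a single 1 in each block of four coordinates."""
    v = [0] * (4 * len(flow))
    for leaf, g in enumerate(flow):
        v[4 * leaf + g] = 1
    return v

def mat_file_text(n):
    """Text of a matrix whose columns are the vertices of the n-leaf
    polytope, in the assumed 'rows cols' header format."""
    points = [indicator(f) for f in flows(n)]
    n_rows, n_cols = 4 * n, len(points)
    lines = [f"{n_rows} {n_cols}"]
    for i in range(n_rows):
        lines.append(" ".join(str(p[i]) for p in points))
    return "\n".join(lines) + "\n"

text = mat_file_text(5)
```

Under these assumptions, writing `text` to a file such as `k3p5.mat` and running 4ti2's `markov` on the project name `k3p5` would be the natural next step; the degrees of the resulting relations could then be compared with the bound of four stated in the lemma.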
Remark 3.27.
The polytopes and are not isomorphic, although they have the same dimension. One can easily see that have vertices respectively. Similarly, and have and vertices respectively.
Let us consider a Markov basis element of . We show that one of the following holds:
- (i) has degree less than or equal to four;
- (ii) has , which is not possible by Proposition 3.25;
- (iii) is a Markov basis element of or of degree at least , which is not possible by Proposition 3.25;
- (iv) is a Markov basis element for a polytope isomorphic to or (in this case, it has degree at most four by Lemma 3.26).
Proposition 3.28.
Any Markov basis element for has degree at most four.
Proof.
It is enough to restrict to the main case of Section 3.3. We first prove two claims, Claim (i) and (ii).
Claim (i): For any row of distinct from the first one, for any pair of indices , we have that either or . The analogous statement holds for , with the group homomorphism replaced by .
Proof of Claim (i).
Suppose the statement is not true for some pair of indices . If , then we can make a quadratic move on , and conclude using the Difference Lemma 3.17. Thus, without loss of generality, we may assume and . If there exists another index such that , then we can make a move on a subset of and, again, conclude by means of the Difference Lemma 3.17. In conclusion, . As and are flows and , this contradicts Lemma 3.23, which prescribes the first two columns of a row differing not by with . ∎
By Proposition 3.5 and Lemma 3.20, we may assume that , as the disagreement string between the two rows has length at most three, outside the first two columns.
Claim (ii): There exists at most one index such that .
Proof of Claim (ii).
As the number of such indices must be odd, it is enough to prove that not all are equal to or . Not all can be equal to since, by Lemma 3.24, that would contradict the fact that is a flow. Say and . Then we have by Lemma 3.24 and thus . However, we may exclude by Lemma 3.20 and we may exclude by Lemma 3.24. ∎
To continue our proof, we need to introduce some terminology, which we will use only here. A column index is of type:
- (a) if all elements of appear in the corresponding th column of (and of );
- (b) if exactly three elements of appear in the th column;
- (c) if exactly two elements of appear in the th column;
- (d) if exactly one element of appears in the th column.
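The four column types depend only on how many distinct group elements occur in a column, so they are easy to compute for a given table. The following is a minimal sketch assuming group elements are encoded as the integers 0..3 and a table is a list of rows; the function name is hypothetical.

```python
def column_type(table, j):
    """Type of column j: 'a' if all four group elements appear,
    'b' for exactly three, 'c' for exactly two, 'd' for exactly one."""
    distinct = len({row[j] for row in table})
    return {4: "a", 3: "b", 2: "c", 1: "d"}[distinct]

# A toy table with four rows (each row is a flow under XOR) and three columns.
table = [
    (0, 0, 0),
    (1, 1, 0),
    (2, 3, 1),
    (3, 2, 1),
]
types = [column_type(table, j) for j in range(3)]
```

Since compatible tables share the same multiset in every column, the type of a column is the same whichever of the two tables is used to compute it.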
Step 0: We suppose that all columns are of type .
By Claim (ii), there exists one index such that . For , there must exist at least two rows such that and . Note that , are not the first row. Further, for the index there must exist one row different from the first one such that or . All these rows are distinct by Claim (i). Hence, we obtain seven rows; we call them difference rows for . Note that the difference rows for may only have and in columns by Lemma 3.23. Analogously, we obtain at least seven difference rows in , with copies of [math] or in columns and .
If there exist difference rows in and with in the first two columns, then we obtain the string in both tables and we conclude by induction on the degree of .
Thus suppose that there is no string in columns of . It follows that there must be at least seven copies of [math] in columns in the difference rows of . Consequently, there are at least nine copies of [math] in columns in . By Lemma 3.22, there is no string in columns in , and the difference rows for do not have copies of [math] in columns . In conclusion, we have at least this amount of distinct rows in :
- (i) three, that are the first ones;
- (ii) seven, that are the difference rows;
- (iii) seven, that contain copies of [math] in column or ;
- (iv) two, that have in column or .
Summing these counts, we obtain at least 3 + 7 + 7 + 2 = 19 distinct rows, so the degree is at least 19. This is impossible for a Markov basis element by Proposition 3.25.
Step 1: We suppose that there exists exactly one column of type and all others are of type . We may proceed as before; however, we obtain only six difference rows in the case when the column of type has column index . In the case when the column index of the column of type is either or , we obtain seven difference rows, but we cannot assume that there exists an additional row with in the same column index of the column of type . In either of these cases, we have , which contradicts Proposition 3.25.
Step 2: We suppose that there exist exactly two columns of type (resp. one column of type ). Here, we obtain five difference rows. However, represents a Markov element for (resp. ), whose ideals have regularity ; see Appendix 4 for the definition of the associated ideal. We obtain the bound which contradicts Proposition 3.25.
Step 3: We suppose there exist either:
- (i) three columns of type , or
- (ii) one column of type or and one column of type , or
- (iii) one column of type .
In such cases we conclude by Lemma 3.26. ∎
3.3.4 Proof of the main case
In this last part, we finish our proof by dealing with the main case of Section 3.3. This will be done uniformly, i.e., with the same arguments in all three instances of the main case; only technical details differ. Here the number of leaves is . The outline is as follows:
- (i) We show that, if , then we have ;
- (ii) Among the pairs of tables we consider (tables where we have fixed the first two entries of the rows and and performed moves of degree at most four so that and have the agreement string as large as possible) using at most moves of degree four, we attain the situation where the number of bad pairs, i.e., strings , with , in columns and is as small as possible;
- (iii) We show that we can kill all the bad pairs, i.e., we can make moves of degree at most four killing all of them. Summing up the two columns indexed by and allows us to conclude by induction on the number of leaves ; see Theorem 3.4.
We are now ready to establish the main case in the following lemmas.
Lemma 3.29.
We may assume that no row in has the string or in columns and . Analogously, no row in has the string or in columns and .
Proof.
In such a case we make a quadratic move in columns and and we conclude by applying the Difference Lemma 3.17. ∎
Lemma 3.30.
We may assume that no row in has the string or in columns and . Analogously, no row in has the string or in columns and .
Proof.
Let be such a row with such a string in columns and of . If for some other column index , we have then we may exchange and a nonempty subset of elements under the agreement string. Then we conclude by applying the Difference Lemma 3.17. As is a flow, we have . This contradicts Lemma 3.23. ∎
Lemma 3.31.
We may assume that under the agreement string no row in has . Analogously, no row in has .
Proof.
Let be a row in with under the agreement string. We first claim we may assume that does not have in any column. For the sake of contradiction, suppose for some column index . Whence, by Lemma 3.24, we have . By compatibility of the tables and , there exists a row in with . By the Standard Lemma 3.19, we can make a move to obtain and conclude by applying Lemma 3.20.
We divide the rest of the proof into two steps according to whether or not there exists in .
Step 1: Suppose there exists another in . The tables and are the following:
[TABLE]
By Lemma 3.23, the counting function is nonnegative on . Let be a row in , where the function is strictly positive. We now exclude the case . Indeed, in this case, if , then we obtain in both tables. If , by the positivity of the counting function on , we have and we may perform a quadratic move to obtain the string . Whence .
Let be a row in with . By the Standard Lemma 3.19, we can make a move between and involving this entry. In particular, if , then we make a quadratic move between and on the first two entries and conclude by Lemma 3.20. Thus . Let be a row in with . We finish the proof of Step 1 by proving that we can always obtain in . First, suppose for or . Then we may exchange with on column indices and , obtaining a row such that and . Then we can make a quadratic move between and to obtain in both tables. Also, notice that as this immediately leads to in both tables. If we may exchange and on column indices and , obtaining in both tables. If , we can make a quadratic move between and . Similarly, if and we can make an exchange with . If and we first exchange it with on column indices , then we apply . Finally, if and , we apply the cubic move
[TABLE]
Step 2: Suppose there is no in ; without loss of generality we may assume we have and in columns . In column , in row of we cannot have by Lemma 3.20; moreover, we cannot have by the Standard Lemma 3.19 applied to table , as we would produce in the row , contradicting Lemma 3.20. Thus we have [math] in column in the row , since is excluded in row by the claim in the very first part of the proof. Since the disagreement string has length at most three by Corollary 3.6, we have the following tables and :
[TABLE]
Furthermore . By the disagreement string length, we have . Consider the group morphism and apply it to columns . Note that the evaluation of under in column indices is the vector . We claim that no row of can differ by more than one element with respect to in column indices . Indeed, suppose a row in differs on . Then . Thus, by the Standard Lemma 3.19, we can make a quadratic move on and conclude by Difference Lemma 3.17. By double counting, there must exist a row in such that and . By a quadratic move and the claim at the very first part of the proof, we may assume . Now we can make a quadratic move between and involving the entry in column and the entry in either column or . However, we may conclude as in the first part of Step 2. ∎
Lemma 3.32.
We may assume that under the agreement string no row in has or (resp. or ).
Proof.
Step 0: Assume that there exists in and in ; without loss of generality we may assume that they are in columns . In this case the tables are:
[TABLE]
Let be the row in that contains the string (resp. ) under the agreement string. By Lemma 3.20 and Lemma 3.23, we see that . Let be a row in such that . By Lemma 3.20, we can exclude under the agreement string. Furthermore, performing a quadratic move, we notice that if has under the agreement string, we could reduce (resp. ) to (resp. ), contradicting the minimality of the number of bad pairs. Also, cannot have or (resp. or ) under the agreement string, as we could exchange it with (resp. ) and conclude as before. Thus, under the agreement string, has either the string or (resp. or ). Now, Lemma 3.20 and Lemma 3.23 allow us to conclude that . Hence, the counting function is strictly positive in . Let be a row in such that and . By Lemma 3.20, we may exclude in column in . Consequently, by Lemma 3.23, has either [math] or in column . If , we obtain the same string in both tables. If , we obtain the string in ; we now show we may also obtain it in . We discuss this according to the three crucial cases:
- (i) Case I and II: We apply the move , where is under the agreement string and . (resp. We consider the first two entries of , which by Lemma 3.23 could be: , , , . The last three allow us to obtain in both tables. As must agree on all nonspecified entries with this contradicts the fact that is a flow.);
- (ii) Case III: We apply the move where . (resp. We proceed as before, noting that we do not use the third row, except for , in which case we obtain in both tables).
Step 1: Assume there exists in and no in . The tables are:
[TABLE]
As the disagreement string is of length at most three, we must have . Further, by Lemma 3.20 or . Consider the group morphism . We claim that after applying to column indices , no row can differ on more than one index from . Indeed, if a row differs on two indices , then, by the Difference Lemma 3.17, we may assume and . The rows and must differ by either in column index or , and by on the other. In particular, by reducing the number of bad pairs (resp. ) under the agreement string, we exclude the situation when has under the agreement string. By the Difference Lemma 3.17, we also know that does not appear in under the agreement string. In the same way, if or appears under the agreement string, we may exchange it along with the index or , again contradicting the Difference Lemma 3.17. By double counting, there exists a row in such that . In particular, there exist two indices such that we can make a quadratic move between and . This either contradicts Lemma 3.24 or decreases the Hamming distance.
Step 2: Assume there is no in and there exists in . The tables are:
[TABLE]
As before equals or [math]. Further . We apply to column indices .
We claim we may assume that no row in differs from on more than one index. For the sake of contradiction, suppose there exists in differing on and . If , then we make a quadratic move between and on . If the difference equals , we conclude by the Difference Lemma 3.17. Thus we assume the difference equals . If we conclude by Lemma 3.20. Hence, ; on the other hand, this reduces the Hamming distance. Consequently we have and . Notice that we cannot have , thus at least one difference must be equal to . Hence, we exclude in under the agreement string, as then we could reduce the number of (resp. ) under the agreement string. Further, and also cannot appear under the agreement string, as otherwise we may conclude by the Difference Lemma 3.17. Whence has or under the agreement string. By Lemma 3.20, we have . Let be a row of with . As before, we conclude that has or under the agreement string (resp. or ), and , . We now exclude the case , i.e., . In such a case, we could exchange and on column and under the agreement string; then with on column indices and ; finally with on column indices and the last entry to conclude by Lemma 3.20. (Resp. We apply the relation on and the agreement string .)
In conclusion, our discussion leads to and . However, we may exchange with on and . Consequently we exchange with on and under the agreement string to conclude by Lemma 3.20. This concludes the verification of our claim.
By the claim, there must exist a row in , such that . On two of these indices, differs from by the same element: either or . We can make a quadratic move on these two column indices and conclude by Difference Lemma 3.17.
Step 3: Assume there is no in and no in . The tables are:
[TABLE]
Suppose . Let be a row of with . As in the previous steps, we may assume that has or (resp. or ) under the agreement string and for . By Lemma 3.23, we have or (resp. or ; we may obtain in both tables by the move: ). However, easily leads to in both tables by the cubic move in . Furthermore, we may assume that does not appear on column indices in any row in , otherwise we would exchange with obtaining in columns . It follows that is positive on . However, a positive row in contradicts Lemma 3.23.
Thus we may assume . Without loss of generality . We apply the homomorphism to column indices . We prove that no row may differ on two indices from in . This is analogous to Step 1. Whence there exists a row in , such that . We may assume , as otherwise we can make a quadratic move on column indices and conclude by previous steps. However, in such a case we may exchange with (on column index and on column index either or ), conclude by Lemma 3.20 or reduce to the first part of this step, where we assume . ∎
Lemma 3.33.
We may assume that under the agreement string no row in has or (resp. or ).
Proof.
Let us act on tables , by the flow and then apply the group automorphism . This translates Case I and Case III to Case III and Case I of Lemma 3.32 respectively; cf. Remark 3.21. However, Case II is not transformed to the previous cases, due to the rows in and in . We note that in Steps 1, 2, and 3 of Lemma 3.32 we are only using the rows in and in that still appear after translating Case II.
Thus, we only need to conclude in Case II and Step 0, i.e., there exists in and in . Without loss of generality, we may assume that they are in columns . The tables are:
[TABLE]
Let be the row in with a bad pair of the form (resp. ). First we exclude by quadratic exchange with , and the Difference Lemma 3.17 and Lemma 3.20. Let be the row in such that . We note that if then, exchanging with we could reduce the number of bad pairs. Moreover, by Lemma 3.20 we know that . Furthermore, as we already know that must be equal to zero, we have . Thus, we must have (resp. ). Note that gives in both tables, thus we may assume , by Lemma 3.23. Moreover, we have and hence the counting function is negative on . Let be the row in on which the function is negative, i.e., and . Now, , as otherwise we exchange and and conclude by Lemma 3.20. Thus, by Lemma 3.23, we have . If then we obtain in both tables, thus we may assume . We may obtain the flow in , by exchanging and , and in , by exchanging with . We finish the proof by showing that we may obtain the latter in , by the quadratic move . ∎
4 Appendix
We present known algebraic results for algebras over monoids that are cones over normal lattice polytopes. Much more information can be found in [5, 22, 36, 43, 46].
Let be a lattice and be a normal lattice polytope generating the ambient lattice. Let be the cone over . The cone , equipped with addition, has a natural structure of a graded monoid, with the grading induced by the first coordinate. The algebraic properties of the graded algebra are strongly related to combinatorial properties of .
Proposition 4.1.
The function defined by is a polynomial known as the Ehrhart polynomial. For all , it coincides with the Hilbert function (and hence with the Hilbert polynomial) of the algebra . Moreover, it satisfies Ehrhart reciprocity, i.e. for , where int denotes the interior points of the polytope.
We immediately see that the polynomial may agree with the Hilbert function even for negative . This happens if and only if , as the algebra is positively graded.
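As a small illustration of Proposition 4.1, the following sketch checks the Ehrhart polynomial and Ehrhart reciprocity on one example. The choice of polytope (the unit square, which is a normal lattice polytope) and all function names are assumptions made for this illustration; they do not come from the paper. The lattice points are counted by brute force, and the polynomial is recovered by interpolation, using the fact that the Ehrhart polynomial of a d-dimensional lattice polytope has degree d.

```python
from fractions import Fraction
from itertools import product


def count_points(n, d, interior=False):
    """Count lattice points of n*[0,1]^d, or of its interior, by brute force."""
    lo, hi = (1, n - 1) if interior else (0, n)
    return sum(1 for _ in product(range(lo, hi + 1), repeat=d))


def interpolate(xs, ys):
    """Return the Lagrange interpolant through the points (xs[i], ys[i])."""
    def poly(t):
        total = Fraction(0)
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = Fraction(yi)
            for j, xj in enumerate(xs):
                if j != i:
                    term *= Fraction(t - xj, xi - xj)
            total += term
        return total
    return poly


d = 2  # dimension of the unit square (illustrative choice)
xs = list(range(d + 1))  # d + 1 samples determine a polynomial of degree d
ys = [count_points(n, d) for n in xs]
ehrhart = interpolate(xs, ys)

# The interpolated polynomial agrees with further brute-force counts.
assert all(ehrhart(n) == count_points(n, d) for n in range(6))

# Ehrhart reciprocity: E(-n) = (-1)^d * #(interior lattice points of nP).
assert all(
    ehrhart(-n) == (-1) ** d * count_points(n, d, interior=True)
    for n in range(1, 6)
)
```

In this example the polynomial vanishes at -1 and is nonzero at -2, matching the fact that the square itself has no interior lattice point while its second dilation has one.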
Definition 4.2 (a-invariant, Hilbert regularity).
The a-invariant of an algebra is the largest integer such that the Hilbert function differs from the Hilbert polynomial. Hilbert regularity equals the a-invariant plus one.
Corollary 4.3.
The a-invariant of is always negative. It equals for the smallest such that contains an interior point.
Proposition 4.4.
If , then for some polynomial . The a-invariant of equals .
We note that is the smallest dilation of that contains an interior lattice point.
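The characterization of the a-invariant through the smallest dilation containing an interior lattice point can be tried out on small polytopes. The sketch below is only an illustration under stated assumptions: the two polytopes (the standard simplex and the unit cube), the helper names, and the search bound `max_n` are chosen here and are not taken from the paper.

```python
from itertools import product


def smallest_interior_dilation(strictly_inside, d, max_n=20):
    """Smallest n >= 1 such that the n-th dilation has an interior lattice point.

    `strictly_inside(x, n)` must decide whether the lattice point x lies in the
    interior of the n-th dilation; the polytope is assumed to lie in [0, 1]^d.
    """
    for n in range(1, max_n + 1):
        if any(strictly_inside(x, n) for x in product(range(n + 1), repeat=d)):
            return n
    return None  # no interior point found up to the search bound


def simplex_interior(x, n):
    # Interior of n * conv(0, e_1, ..., e_d): all x_i > 0 and sum(x) < n.
    return all(c > 0 for c in x) and sum(x) < n


def cube_interior(x, n):
    # Interior of n * [0, 1]^d: 0 < x_i < n for every coordinate.
    return all(0 < c < n for c in x)


d = 3
n_simplex = smallest_interior_dilation(simplex_interior, d)
n_cube = smallest_interior_dilation(cube_interior, d)

# By the corollary above, the a-invariant is minus this smallest dilation.
print("simplex:", n_simplex, "a-invariant:", -n_simplex)
print("cube:   ", n_cube, "a-invariant:", -n_cube)
```

For the simplex the associated algebra is a polynomial ring in d + 1 variables, whose a-invariant is -(d + 1), consistent with the value found by the search.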
Proposition 4.5 (Hochster’s Theorem).
The algebra is Cohen-Macaulay.
Throughout the article we were interested in generators of the ideal such that . These are usually very hard to understand even for specific instances. However, there is an algebraic invariant that bounds their degree, known as Castelnuovo-Mumford regularity, or simply, the regularity.
Definition 4.6 (Castelnuovo-Mumford regularity).
For an -module its regularity is defined as
[TABLE]
where
[TABLE]
is the minimal free resolution of .
As is an module, its regularity in particular bounds the degree of generators; this is the case in the definition. It can be seen that is the maximal degree of standard monomials under rev-lex in generic coordinates. Hence bounds the degree of such a Gröbner basis, as . The following proposition relates both notions of regularity introduced above.
Proposition 4.7.
and equality holds if is Cohen-Macaulay. In particular, and is generated in degree at most .
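To see how the two notions of regularity interact in the Cohen-Macaulay case, here is a worked example on the unit square. The notation ($R$, $S$, $I$, $H_R$, $a(R)$) is an assumption of this sketch and need not match the paper's. It relies on the standard fact that for a Cohen-Macaulay standard graded algebra $R$ one has $\operatorname{reg}(R) = a(R) + \dim R$.

```latex
% Worked example (assumed notation): P = [0,1]^2, R = k[P],
% S = k[x_0, x_1, x_2, x_3], and I is the kernel of S -> R.
\begin{align*}
  R &\cong S/I, \qquad I = (x_0 x_3 - x_1 x_2), \\
  H_R(n) &= (n+1)^2 \quad (n \ge 0), \qquad
  \sum_{n \ge 0} H_R(n)\, t^n = \frac{1+t}{(1-t)^3}, \\
  a(R) &= \deg(1+t) - 3 = -2, \\
  \operatorname{reg}(R) &= a(R) + \dim R = -2 + 3 = 1, \\
  0 &\to S(-2) \to S \to R \to 0, \qquad
  \operatorname{reg}(I) = \operatorname{reg}(R) + 1 = 2 .
\end{align*}
```

The bound is sharp here: the ideal is generated by a single quadric, so its generators have degree exactly 2.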
Acknowledgements. Mateusz Michałek was supported by Polish National Science Centre grant no. 2015/19/D/ST1/01180, the Foundation for Polish Science (FNP) and is a member of the AGATES group. The authors acknowledge the kind hospitality of UC Berkeley and FU Berlin, where this research was in part conducted.
References
- [1] Elizabeth S. Allman, Open Problem: Determine the Ideal Defining $\sigma_{4}(\mathbb{P}^{3}\times\mathbb{P}^{3}\times\mathbb{P}^{3})$, available online (http://www.dms.uaf.edu/~eallman/Papers/salmonPrize.pdf), 2010.
- [2] Quentin Atkinson and Russell D. Gray, Curious Parallels and Curious Connections–Phylogenetic Thinking in Biology and Historical Linguistics, Systematic Biology 54 (4) (2005), 513–526.
- [3] Adrian C. Barbrook et al., The Phylogeny of the Canterbury Tales, Nature 394 (6696) (1998), 839.
- [4] Louis J. Billera, Susan P. Holmes, and Karen Vogtmann, Geometry of the space of phylogenetic trees, Adv. in Appl. Math. 27 (4) (2001), 733–767.
- [5] Winfried Bruns and Joseph Gubeladze, Polytopes, Rings, and K-Theory, Springer Monographs in Mathematics, Springer, 2009.
- [6] Weronika Buczyńska, Maria Donten-Bury, and Jarosław A. Wiśniewski, Isotropic models of evolution with symmetries, Contemporary Mathematics 496 (2009), 111–132.
- [7] Winfried Bruns, Richard Sieg, Tim Römer, and Christof Söger, Normaliz, http://www.home.uni-osnabrueck.de/wbruns/normaliz/ (2001).
- [8] Weronika Buczyńska and Jarosław A. Wiśniewski, On geometry of binary symmetric models of phylogenetic trees, J. Eur. Math. Soc. 9 (3) (2007), 609–635.
