Normality of the Kimura 3-parameter model
Martin Vodička

Martin Vodička
Max Planck Institute for Mathematics in the Sciences,
Inselstrasse 22,
04103 Leipzig
Germany
Abstract.
The Kimura 3-parameter model is one of the most fundamental phylogenetic models in algebraic statistics. We prove that all algebraic varieties associated to this model are projectively normal, confirming a conjecture of Michałek.
1. Introduction
Phylogenetics is a science that models evolution. One of the central objects in phylogenetics is the tree model. In general, a statistical model is a parametric family of probability distributions. The tree model is based on a rooted tree $T$ and a finite set $S$ and gives a probability distribution on $S^n$, where $n$ is the number of leaves of the tree. The parameters are the distribution on the root and the transition matrices along the edges of the tree. A group-based model is a tree model where the set $S$ is a group $G$ which acts on itself and the parameters are $G$-invariant.
Since everything is finite, a distribution allowed by a tree model may be represented as a vector $(p_1, \dots, p_N)$, where $N$ is the number of possible joint states of the leaves and the $p_i$'s are nonnegative and sum to one. Thus a tree model may be regarded as a map from the parameter space to the $N$-dimensional vector space.
In algebraic phylogenetics we are interested in the geometric locus of all probability distributions allowed by a given model. Precisely, the Zariski closure of this locus is an algebraic variety and one is interested in its geometric and algebraic properties [Eri+04, Sul19].
For example, one asks for polynomials defining the variety (the so-called phylogenetic invariants) or for properties of the singular set. In this article, we investigate the latter, namely we show that for the well-known 3-Kimura model [Kim81], the singularities are always normal. This confirms a conjecture of Michałek [Mic13, Conjecture 9.5], [Mic15, Conjecture 12.1].
The 3-parameter Kimura model is a group-based model given by the group $\mathbb{Z}_2 \times \mathbb{Z}_2$. Group-based models in general and the 3-Kimura model in particular have recently been intensively studied within algebraic statistics [SS05, BW07, MRV17, DE15, DK09, Mic11, Mic14, MV17a, CF08, CFM15, CFM17, Don16, MV17].
Apart from the fact that it was an open conjecture, there are several important reasons to study normality of the 3-Kimura model.
- Group-based models allow a monomial parametrization [SS05]. Thus, one may say that they are toric varieties. However, in pure mathematics one often requires a toric variety to be normal [Ful93]. The reason is that in such a case the variety admits a nice combinatorial description in terms of a fan [CLS11]. Our result in particular implies that the normal fan of the polytope associated to the 3-Kimura model describes the toric variety representing the model.
- Not all group-based models give rise to normal toric varieties: for example, for the group $\mathbb{Z}_6$ one obtains a nonnormal variety [DM12]. Normality also fails for the 2-Kimura model. Thus the 3-Kimura model is distinguished in this regard.
- Normality played an important role in the $\mathbb{Z}_2$ group-based model [SX10, BW07, SS05].
- Normality of toric varieties provides automatic bounds on degrees of phylogenetic invariants [Stu96]. In the special case of a tree with six leaves this was used in a recent proof [MV17a] of the Sturmfels-Sullivant conjecture [SS05, Conjecture 30]. In that example normality was checked by computer using the software Normaliz [Bru+]. Our proof, in particular, confirms normality in this case without the need to rely on computer software.
It would not be possible to obtain our theorem without many great previous results. We list the most important below.
- Application of the Discrete Fourier Transform to unravel the toric structure. The DFT may be considered as a clever change of coordinates that changes the parametrization of the phylogenetic model into one given by monomials. The first such applications were made by Hendy and Penny [HP89]. The toric structure was studied in detail in the work of Sturmfels and Sullivant [SS05] and Michałek [Mic15].
- Reduction to claw trees. Recall that a claw tree is a tree with just one inner vertex. It is known that one can extend many properties that hold for claw trees to arbitrary trees. This technique is well developed for obtaining phylogenetic invariants [DK09]. Further, it is known that normality in the case of claw trees implies normality for arbitrary trees. For phylogenetic group-based models this was first observed in [Mic11, Lemma 5.1]. The joining of trees is a special case of a more general construction of toric fiber products [Sul07, RS16, EKS14].
- Facet description. The vertex description of the polytopes representing group-based models is well known [SS05, Mic11, BW07]. However, obtaining the facet description from the vertex description is hard in general, and for phylogenetic models in particular. For the 3-Kimura model such a description was provided in [MRV17].
The first two results above allow us to translate the question about projective normality of the variety associated to the 3-Kimura model into a purely combinatorial statement about normality of a family of polytopes. We prove the normality using only combinatorial methods. A strong tool is the facet description of the polytope, because it allows us to prove that a point lies inside the polytope by checking inequalities.
2. The polytope of the 3-Kimura model
We start by fixing notation.
Let be the group . Let us denote its elements by . We also denote the elements of by .
Let be the set of the group-based flows of length of , i.e.
[TABLE]
It is easy to see that is a subgroup of .
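The subgroup claim can be checked by brute force for small lengths. The following is a minimal sketch, assuming that a group-based flow is an $n$-tuple of elements of $\mathbb{Z}_2 \times \mathbb{Z}_2$ whose sum is the identity. The encoding of the group as the integers 0 to 3 with bitwise XOR as addition, and the names `flows` and `add`, are illustrative choices and not notation from the paper.

```python
from itertools import product

# Assumed encoding: Z_2 x Z_2 as 2-bit integers, group addition is bitwise XOR.
G = (0, 1, 2, 3)

def flows(n):
    """All n-tuples of group elements whose sum (XOR) is the identity 0."""
    result = []
    for t in product(G, repeat=n):
        total = 0
        for g in t:
            total ^= g
        if total == 0:
            result.append(t)
    return result

def add(f, h):
    """Componentwise group addition of two tuples of equal length."""
    return tuple(a ^ b for a, b in zip(f, h))

n = 3
F = set(flows(n))
# Closure under addition is enough for a finite subset containing the identity tuple.
closed = all(add(f, h) in F for f in F for h in F)
print(len(F), closed)  # 16 True
```

For length 3 there are $4^{n-1} = 16$ flows, because the first $n-1$ entries can be chosen freely and determine the last one.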
The goal of this article is to prove normality of a family of polytopes for 3-Kimura model indexed by . Before we formally define them, we introduce further notation.
We denote the coordinates of a point by where and . Although we are using upper indices, there will be no ambiguity since we will not use any powers in this article.
Definition 1.
We say that the -presentation of a point is an -tuple of multisets of elements of such that the element appears exactly times in the multiset . We may identify the -tuple with the -tuple of multisets .
Definition 2.
The vertices of are all points of which -presentations are the -tuples from . Therefore, is a convex hull of these points.
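To make the vertex description concrete, here is a small sketch that builds the vertices as 0/1 vectors and recovers the presentation of a point as a tuple of multisets. It assumes the encoding of $\mathbb{Z}_2 \times \mathbb{Z}_2$ as integers 0 to 3 with XOR as addition, and it stores the coordinate with upper index $j$ and group element $g$ at position `4*j + g`. The function names and this coordinate layout are hypothetical conveniences.

```python
from itertools import product
from functools import reduce
from operator import xor

G = (0, 1, 2, 3)  # assumed encoding of Z_2 x Z_2, addition is XOR

def flows(n):
    """n-tuples of group elements summing to the identity."""
    return [t for t in product(G, repeat=n) if reduce(xor, t, 0) == 0]

def vertex(flow):
    """0/1 vector of length 4n: position 4*j + g is 1 exactly when flow[j] == g."""
    x = [0] * (4 * len(flow))
    for j, g in enumerate(flow):
        x[4 * j + g] = 1
    return tuple(x)

def presentation(x):
    """For each j, the sorted multiset in which g appears x[4*j + g] times."""
    n = len(x) // 4
    return [sorted(g for g in G for _ in range(x[4 * j + g])) for j in range(n)]

n = 3
V = [vertex(f) for f in flows(n)]
print(len(V))                           # 16
print(presentation(vertex((1, 2, 3))))  # [[1], [2], [3]]
```

Each vertex has exactly one nonzero coordinate in every block of four, so its presentation consists of singletons, one per position of the flow.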
Equivalent characterization of is given in [MRV17]. The polytope is defined by the following inequalities:
- •
for all ,
- •
,
- •
For all with being an odd number:
[TABLE]
[TABLE]
[TABLE]
We denote the left sides of the last three inequalities by respectively. Each inequality gives us a facet of . We define
[TABLE]
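The displayed inequalities did not survive in this copy, so the following sketch works under a stated assumption about their shape: for an odd set $A$ and a nonzero element $h$ with subgroup $H = \{0, h\}$, the left side is taken to be the sum of the coordinates $x_g^j$ over $j \in A$, $g \in H$ plus the sum over $j \notin A$, $g \notin H$, and the inequality asks this to be at least 1. Under that assumption, the code checks that every vertex satisfies every inequality and that every inequality is attained with equality at some vertex. The function `lhs` and the integer encoding of the group are illustrative.

```python
from itertools import product, combinations
from functools import reduce
from operator import xor

G = (0, 1, 2, 3)  # assumed encoding of Z_2 x Z_2, addition is XOR

def flows(n):
    return [t for t in product(G, repeat=n) if reduce(xor, t, 0) == 0]

def lhs(flow, A, h):
    """Assumed left side evaluated at the vertex of `flow`: positions j in A
    with flow[j] in {0, h}, plus positions j outside A with flow[j] outside {0, h}."""
    H = (0, h)
    return sum(1 for j, g in enumerate(flow) if (j in A) == (g in H))

n = 4
odd_sets = [frozenset(A) for r in range(1, n + 1, 2)
            for A in combinations(range(n), r)]
values = {(A, h): [lhs(f, A, h) for f in flows(n)]
          for A in odd_sets for h in (1, 2, 3)}

valid = all(min(v) >= 1 for v in values.values())  # no vertex violates an inequality
tight = all(min(v) == 1 for v in values.values())  # each inequality is attained
print(valid, tight)  # True True
```

The value 0 is impossible at a vertex because it would force an odd number of entries outside $H$, which contradicts the sum of the flow being the identity.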
The lattice generated by vertices of is
[TABLE]
where the last sum is in . Alternatively, we can characterize -presentations of points in as follows: Every multiset has the same size and sum of all elements in multisets is [math].
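The characterization of lattice points by their presentations can be turned into a short membership test for points with non-negative integer coordinates. This is a sketch under the assumption that the condition reads: all multisets have the same size and the sum of all their elements is the identity. The encoding (integers 0 to 3 with XOR, coordinate $x_g^j$ stored at position `4*j + g`) and the name `in_lattice` are hypothetical.

```python
from itertools import product
from functools import reduce
from operator import xor

G = (0, 1, 2, 3)  # assumed encoding of Z_2 x Z_2, addition is XOR

def flows(n):
    return [t for t in product(G, repeat=n) if reduce(xor, t, 0) == 0]

def vertex(t):
    """Indicator vector of a tuple of group elements (not necessarily a flow)."""
    x = [0] * (4 * len(t))
    for j, g in enumerate(t):
        x[4 * j + g] = 1
    return tuple(x)

def in_lattice(x):
    """Assumed test for non-negative integer points: equal multiset sizes
    and total sum of all elements equal to the identity."""
    n = len(x) // 4
    sizes = {sum(x[4 * j:4 * j + 4]) for j in range(n)}
    total = 0
    for j in range(n):
        for g in G:
            if x[4 * j + g] % 2:  # an element counts only if it appears an odd number of times
                total ^= g
    return len(sizes) == 1 and total == 0

n = 3
V = [vertex(f) for f in flows(n)]
triple_sums = [tuple(a + b + c for a, b, c in zip(u, v, w))
               for u in V for v in V for w in V]
print(all(in_lattice(x) for x in triple_sums))  # True
print(in_lattice(vertex((0, 0, 1))))            # False
```

Sums of vertices always pass the test, while the indicator vector of a tuple that is not a flow fails it.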
Definition 3.
Let be the vertex corresponding to the -tuple and be the vertex corresponding to the -tuple which has on -th and -th place and all other places 0.
Let be the following set of vertices of :
[TABLE]
Our goal is to prove that is normal for every positive integer . Let us recall that polytope is normal if every point in can be written as a sum of lattice points from . Normality of polytope is equivalent to the fact that the associated projective toric variety is projectively normal.
It is easy to check that , and are normal. Hence, in this article we consider only .
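For length 3 and the second dilation, the definition of normality can be tested exhaustively. The sketch below is not the argument of the paper. It assumes that the facet inequalities have the following form: for an odd set $A$ and a nonzero element $h$ with $H = \{0, h\}$, the sum of $x_g^j$ over $j \in A$, $g \in H$ plus the sum over $j \notin A$, $g \notin H$ is at least the dilation factor. It also assumes that lattice points are those whose elements sum to the identity. All names and the integer encoding of the group are illustrative.

```python
from itertools import product, combinations, combinations_with_replacement
from functools import reduce
from operator import xor

G = (0, 1, 2, 3)  # assumed encoding of Z_2 x Z_2, addition is XOR
n, k = 3, 2

flows = [t for t in product(G, repeat=n) if reduce(xor, t, 0) == 0]

def point(multisets):
    """Point whose coordinate 4*j + g counts occurrences of g in the j-th multiset."""
    x = [0] * (4 * n)
    for j, m in enumerate(multisets):
        for g in m:
            x[4 * j + g] += 1
    return tuple(x)

V = [point([(g,) for g in f]) for f in flows]

def lhs(x, A, h):
    H = (0, h)
    return sum(x[4 * j + g] for j in range(n) for g in G if (j in A) == (g in H))

odd_sets = [set(A) for r in range(1, n + 1, 2) for A in combinations(range(n), r)]

def in_dilation(x):
    """Assumed membership test for lattice points of the k-th dilation."""
    total = 0
    for j in range(n):
        for g in G:
            if x[4 * j + g] % 2:
                total ^= g
    return total == 0 and all(lhs(x, A, h) >= k for A in odd_sets for h in (1, 2, 3))

# Candidates: every choice of n multisets of size k.
blocks = list(combinations_with_replacement(G, k))
candidates = [point(m) for m in product(blocks, repeat=n)]
lattice_points = {x for x in candidates if in_dilation(x)}
sums_of_two = {tuple(a + b for a, b in zip(u, v)) for u in V for v in V}
print(lattice_points == sums_of_two)  # True
```

Under these assumptions, the points passing the membership test are exactly the sums of two vertices, which is what normality requires at this dilation.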
3. Symmetries of
Polytope has a lot of symmetries that can be described by group actions on :
- Action of : For and we define . Intuitively, we only permute quadruples of coordinates by the upper index.
- Action of : For and and we define . Intuitively, if we look at the -presentation of a point in we add to elements in .
- Action of : For and we define . Again, if we consider the -presentation of this is the application of the automorphism to elements in multisets.
All of these actions only permute coordinates in and therefore are automorphisms of as a vector space. It can be easily verified that they map vertices of to vertices of and therefore preserve . It follows that these actions restricted to are automorphisms of this lattice.
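The claim that these actions map vertices to vertices can be verified on the level of flows. The sketch below assumes that the three actions are: permuting positions, adding a fixed flow componentwise, and applying an automorphism of $\mathbb{Z}_2 \times \mathbb{Z}_2$ to every entry. The group is encoded as integers 0 to 3 with XOR as addition, and the function names are hypothetical.

```python
from itertools import product, permutations
from functools import reduce
from operator import xor

G = (0, 1, 2, 3)  # assumed encoding of Z_2 x Z_2, addition is XOR
n = 3
F = {t for t in product(G, repeat=n) if reduce(xor, t, 0) == 0}

def act_perm(sigma, f):
    """Permute the positions of a flow."""
    return tuple(f[sigma[j]] for j in range(n))

def act_flow(h, f):
    """Add the flow h to f componentwise."""
    return tuple(a ^ b for a, b in zip(h, f))

def act_aut(phi, f):
    """Apply a group automorphism to every entry."""
    return tuple(phi[g] for g in f)

# Every permutation of the three nonzero elements is an automorphism.
auts = [dict(zip(G, (0,) + p)) for p in permutations((1, 2, 3))]
are_automorphisms = all(phi[a ^ b] == phi[a] ^ phi[b]
                        for phi in auts for a in G for b in G)

preserved = (
    all(act_perm(s, f) in F for s in permutations(range(n)) for f in F)
    and all(act_flow(h, f) in F for h in F for f in F)
    and all(act_aut(phi, f) in F for phi in auts for f in F)
)
print(are_automorphisms, preserved)  # True True
```

Since vertices correspond to flows, preserving the set of flows is the same as permuting the vertices.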
We want to prove that every point decomposes into a sum of lattice points from . It is enough to prove it for an image of under any of the group actions described above, since implies for any group action .
Let us define linear ordering on multisets of four real numbers with sum equal to . Consider two multisets and . Without loss of generality we may assume and . We say
[TABLE]
Consider . If we order multisets then by acting with corresponding permutation from we can ensure that multiset for is the smallest one in this ordering.
Let us denote the most frequent element (or one of the most frequent elements) in -th multiset from -presentation of , i.e. . Then by acting with we obtain a point in which the element [math] is the most frequent in all multisets except the last one.
This means that if we need to, for a point we may without loss of generality assume the following two facts:
[TABLE]
[TABLE]
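The reduction that makes the identity the most frequent element in all multisets except the last one can be sketched as follows. The code assumes the group is encoded as integers 0 to 3 with XOR as addition and that a point is given by its presentation as a list of multisets. The name `normalize` is hypothetical, and ties between equally frequent elements are broken arbitrarily.

```python
from collections import Counter
from functools import reduce
from operator import xor

def normalize(multisets):
    """Translate by the flow (h_1, ..., h_{n-1}, h_1 + ... + h_{n-1}),
    where h_j is a most frequent element of the j-th multiset."""
    h = [Counter(m).most_common(1)[0][0] for m in multisets[:-1]]
    flow = h + [reduce(xor, h, 0)]  # the last entry makes the tuple sum to the identity
    return [sorted(g ^ s for g in m) for m, s in zip(multisets, flow)]

x = [[1, 1, 2], [3, 3, 3], [1, 2, 2]]
print(normalize(x))  # [[0, 0, 3], [0, 0, 0], [0, 0, 3]]
```

In this example the translating flow is (1, 3, 2). Adding $h_j$ sends $h_j$ to the identity because every element of the group is its own inverse.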
We add another definition:
Definition 4.
Let . The vertex of is called -good if all coordinates of the point are non-negative.
4. Preliminary results
Lemma 1.
Let and . Suppose that and let . Then and the equality holds if and only if and for .
Proof.
[TABLE]
We divide by 3 and use the fact that are integers to obtain the desired inequality. The part about the equality is obvious. ∎
Lemma 2.
Let be a point such that for all . Let and be a set of odd cardinality. Then
[TABLE]
Proof.
We consider only the case , other cases are analogous.
Consider the homomorphism
[TABLE]
[TABLE]
For we get
[TABLE]
Therefore must be even. This implies
[TABLE]
[TABLE]
[TABLE]
∎
The following lemma implies that it is sufficient to consider only such points for which the following condition holds:
[TABLE]
Lemma 3.
Suppose that for all positive integers and every such that for all we can write as a sum of vertices of . Then is normal for every positive integer .
Proof.
Proof by induction on . and are normal.
Suppose that is normal. We prove that also is normal. Consider a point . If for all then decomposes by assumption. Therefore we may assume that for some . By acting with suitable permutation we can assume that and then by acting with we obtain .
Consider now the projection on the first coordinates. Since there exist positive real numbers with such that , where are some vertices of . But and implies for all .
Consequently, is a vertex of and . By induction hypothesis decomposes to , where are vertices of .
Now we simply put such that and for . Obviously all are vertices of and we have . ∎
Lemma 4.
Let . Then can be written in the form where are vertices of .
Proof.
We assume, without loss of generality, , and . Thus for . If also then . Further, must be a vertex of since it has non-negative coordinates and sum of elements in -presentation of is [math] since it is [math] for both and . If then by acting with suitable we have since for all by condition .
Since at least one of the numbers for ; is greater than 0. Then for such . By the same arguments as above must be a vertex of . ∎
Lemma 5.
Let be such that there are at least three multisets in -presentation of . Then , where is a vertex of and .
Proof.
By acting with suitable permutation from we may assume that these three multisets are the first three. Then by acting with suitable we may assume for . We describe the -presentation of (which is a -tuple of elements from ). We pick the last elements arbitrarily, the only condition is that belongs to the -th multiset from -presentation of . Then we pick and such that sum of this -tuple is 0 and are not all equal. Since and can be any from , it is possible.
Now we need to check that . We only need to check the inequalities for sets . However, if we try to compute we always get at least already on the first three coordinates. Therefore, due to Lemma 2 the inequalities hold. ∎
From now on, we may assume that satisfies the following condition, since the other case is covered by the previous lemma.
[TABLE]
Lemma 6.
Let satisfy , . Let be a set with . Then for any and any -good vertex of .
Proof.
Let be the set of those indices for which multisets in -presentation of are equal to . This together with condition yields for . Condition implies . It follows that
[TABLE]
∎
5. The proof
5.1. Idea of the proof
We prove for all positive integers that every point can be written in the form where and is a vertex of . This, of course, means that also since all vertices of belong to and this implies that is normal.
Consider a point . It is sufficient to consider because the case is solved by Lemma 4. Without loss of generality, from now we will suppose that satisfies , , and . To conclude we need to pick an -good vertex and then check that belongs to . We prove this by checking all inequalities from facet characterization of for every set with odd cardinality.
Regarding the vertex , we show we can always use some vertex as in Definition 3.
5.2. Big sets
Proposition 7.
Let satisfy and let be a set with odd cardinality.
a) If then for any and any -good vertex of .
b) If and satisfies then for any and any -good vertex of .
Proof.
Let . Clearly, it is sufficient to prove the inequality for . We begin with part a):
[TABLE]
The last inequality holds for . Case is covered in Lemma 6. We also used Lemma 1 and . Inequality together with Lemma 2 implies .
Proof of part b) is similar:
[TABLE]
where we again used Lemma 1. Lemma 2 implies that . Therefore the only bad case is when we have an equality. This is possible only if we have equality everywhere, in particular for all . But this means that does not satisfy which is a contradiction. ∎
Therefore it is sufficient to check inequalities for and such that .
5.3. Small sets
Since we have the inequalities for any and any set with odd cardinality. For the big sets discussed in Section 5.2 we have not used them. However, we use them for smaller sets. Our first step is to observe how changes when we subtract some vertex from .
Lemma 8.
Let , , and or with . Then
[TABLE]
Moreover, for we have if and only if one of the following conditions holds:
- •
**
- •
* for any *
- •
* for and or *
Also .
Proof.
For the first part, one checks how many summands in will decrease by 1 when we subtract . The last part is a clear consequence since for . ∎
Now we consider the following:
Proposition 9.
Let satisfy conditions . Suppose that [math] is also the most frequent element in the -th multiset from -presentation of . Then .
Proof.
Obviously, every multiset from the -presentation of contains [math] so has non-negative coordinates and therefore is -good. Inequalities for sets with hold for by Proposition 7, since for sets with and we can use the same arguments. Inequalities for hold by Lemma 8 since we are subtracting . It follows that . ∎
The previous proposition implies that we can assume that for satisfying also the following condition holds:
[TABLE]
Proposition 10.
Let satisfy . Then:
a) does not belong to any facet for , i.e. for all such and .
b) for all , and .
Proof.
We prove part a) by contradiction: Suppose that we have an equality for and . We may get to this situation by acting with suitable and . We compute :
[TABLE]
[TABLE]
An equality holds only if there is equality in all inequalities. In particular, it means that and . But from ordering of multisets, we get that also some for or . By acting with we get to the situation where [math] is the most frequent also in -th multiset and still is also most frequent on the first one. This is a contradiction with condition .
We continue with proof of part b). Part a) together with Lemma 2 implies that . Consequently, Lemma 8 implies for any . ∎
Proposition 11.
Let satisfy and . Then .
Proof.
Clearly, is -good. Inequalities for hold by Propositions 7 and 10. For we have , then by Lemma 8 we get . Since all inequalities hold . ∎
Therefore we are left only with the case .
5.4. Special case
In this case we will subtract a vertex for a special choice of and . Propositions 7 and 10 and Lemma 8 imply that it is enough to check inequalities for . We distinguish two cases depending on whether lies or does not lie on a facet for such .
Proposition 12.
Let satisfy , and does not belong to any facet for . Then there exists a vertex such that .
Proof.
For any Lemma 8 implies that for any with we have . We used Lemma 2 to deduce inequality . Therefore, inequalities for every set hold for any -good vertex , since bigger sets are taken care of by Propositions 7 and 10. Consequently, it is sufficient to pick any -good vertex .
At least two of the numbers for must be non-zero by condition and the fact that . Without loss of generality, let those two coordinates be and .
Since and , at least one of the numbers and for must be non-zero. Let it be . For all coordinates of are non-negative since condition implies for . This means we have found -good vertex and the proposition is proved. ∎
If belongs to a facet we prove that it belongs to only one facet and that we can as well subtract a vertex :
Proposition 13.
Let satisfy , and belongs to some facet for . Then
a) belongs to only one such facet.
b) There exists a vertex such that .
Proof.
By acting with suitable permutation from and we can get to situation where . We have
[TABLE]
To get an equality, there must be an equality in all inequalities, specifically for all and .
The assumption that belongs to a facet gives us strong conditions. It is easy to see that cannot belong to some other facet for because it would imply . But this is a contradiction with condition . Also cannot belong to some for because it would imply for which is again a contradiction with . The same arguments hold for . This proves part a).
For part b), by the same arguments as in the proof of Proposition 12 for any -good vertex and set we have , except the case when and .
Since at least one of the numbers , must be greater than 0. Also one of the numbers and is greater than zero by condition .
If both numbers and are greater than zero for or then the vertex is -good. By Lemma 8 also and therefore .
Suppose the opposite, i.e. and (we can get to this case by acting with ). Then implies that at least one of the numbers for is greater than 0. Then we can subtract for such . Again has non-negative coordinates and by Lemma 8 .
∎
Theorem 14.
The polytope representing the 3-Kimura model is normal for every positive integer .
Proof.
Consider a point for some positive integer . If then decomposes due to Lemma 4. To prove normality of it is sufficient for to prove that there exists a vertex of such that . Also, it is sufficient to consider only points which satisfy . The existence of such is implied by Lemma 5 and Propositions 9, 11, 12 and 13. ∎
References
- [Bru+] W. Bruns et al. “Normaliz. Algorithms for rational cones and affine monoids”, Available at https://www.normaliz.uni-osnabrueck.de
- [BW07] Weronika Buczyńska and Jarosław A. Wiśniewski “On geometry of binary symmetric models of phylogenetic trees” In J. Eur. Math. Soc. 9(3), 2007, pp. 609–635
- [CF08] Marta Casanellas and Jesús Fernández-Sánchez “Geometry of the Kimura 3-parameter model” In Advances in Applied Mathematics 41.3, Elsevier, 2008, pp. 265–292
- [CFM15] Marta Casanellas, Jesús Fernández-Sánchez and Mateusz Michałek “Low degree equations for phylogenetic group-based models” In Collectanea Mathematica 66.2, Springer, 2015, pp. 203–225
- [CFM17] Marta Casanellas, Jesús Fernández-Sánchez and Mateusz Michałek “Local equations for equivariant evolutionary models” In Advances in Mathematics 315, Elsevier, 2017, pp. 285–323
- [CLS11] David A. Cox, John B. Little and Henry K. Schenck “Toric varieties”, Graduate Studies in Mathematics 124, American Mathematical Society, Providence, RI, 2011, pp. xxiv+841
- [DE15] Jan Draisma and Rob H. Eggermont “Finiteness results for Abelian tree models” In J. Eur. Math. Soc. (JEMS) 17.4, 2015, pp. 711–738
- [DK09] Jan Draisma and Jochen Kuttler “On the ideals of equivariant tree models” In Math. Ann. 344(3), 2009, pp. 619–644
