Phylogenetic mixtures and linear invariants for equal input models
Marta Casanellas, Mike Steel

TL;DR
This paper studies the equal input model in phylogenetics. It shows how linear invariants can restore identifiability of the tree topology under mixture models, and it characterizes the structure of these invariants for any number of leaves.
Contribution
It extends the understanding of linear invariants in the equal input model, generalizing previous models and providing a detailed algebraic description of the space of mixtures and invariants.
Findings
Characterizes the space of mixtures both for any fixed tree and for all trees.
Provides a detailed description of invariants for four-leaf trees.
Builds on classic results to connect random processes and linear algebra in phylogenetics.
Abstract
The reconstruction of phylogenetic trees from molecular sequence data relies on modelling site substitutions by a Markov process, or a mixture of such processes. In general, allowing mixed processes can result in different tree topologies becoming indistinguishable from the data, even for infinitely long sequences. However, when the underlying Markov process supports linear phylogenetic invariants, then provided these are sufficiently informative, the identifiability of the tree topology can be restored. In this paper, we investigate a class of processes that support linear invariants once the stationary distribution is fixed, the 'equal input model'. This model generalizes the 'Felsenstein 1981' model (and thereby the Jukes-Cantor model) from four states to an arbitrary number of states (finite or infinite), and it can also be described by a 'random cluster' process. We describe the…
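As a rough illustration of the equal input model named in the abstract, the sketch below builds its transition matrix for a finite state space. It assumes the standard parameterization in which the substitution rate into state j is proportional to the stationary frequency pi[j], giving P(t)[i][j] = exp(-mu*t) * [i == j] + (1 - exp(-mu*t)) * pi[j]. The function name `equal_input_transition` and the rate parameter `mu` are illustrative choices, not notation taken from the paper.

```python
import math


def equal_input_transition(pi, t, mu=1.0):
    """Transition matrix of an equal input model on len(pi) states.

    Assumed parameterization (not taken from the paper): with probability
    exp(-mu * t) the state is unchanged, otherwise it is redrawn from pi.
    """
    stay = math.exp(-mu * t)
    k = len(pi)
    return [
        [stay * (1.0 if i == j else 0.0) + (1.0 - stay) * pi[j] for j in range(k)]
        for i in range(k)
    ]


# Four states with unequal frequencies: a Felsenstein 1981 style example.
pi = [0.1, 0.2, 0.3, 0.4]
P = equal_input_transition(pi, t=0.5)

# Each row of P is a probability distribution.
row_sums = [sum(row) for row in P]

# pi is stationary: (pi P)[j] equals pi[j].
pi_P = [sum(pi[i] * P[i][j] for i in range(len(pi))) for j in range(len(pi))]

# Uniform frequencies give the Jukes-Cantor special case,
# where all off-diagonal entries coincide.
JC = equal_input_transition([0.25] * 4, t=0.5)
```

The "redraw from pi with probability 1 - exp(-mu*t)" reading of each row is one informal way to see why such a model admits a cluster-style description; the paper's own construction may differ in detail.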
