Identifiability of 2-tree mixtures for group-based models
Elizabeth S. Allman, Sonja Petrović, John A. Rhodes, Seth Sullivant

TL;DR
This paper investigates the identifiability of 2-tree mixture models in phylogenetics. It shows that for the 4-state JC (Jukes-Cantor) and K2P (Kimura 2-parameter) DNA models the tree parameters can be uniquely determined, in contrast to the 2-state symmetric model, where they cannot.
Contribution
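It asks whether the non-identifiability previously shown for 2-tree mixtures of a 2-state symmetric model carries over to the 4-state group-based models, and shows that it does not for key DNA models: the tree parameters of JC and K2P 2-tree mixtures are identifiable.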
Findings
Tree parameters are identifiable for JC and K2P models.
Generic substitution parameters are identifiable for JC mixture models.
Identifiability results hold for mixtures on the same tree in K2P and K3P models.
Abstract
Phylogenetic data arising on two possibly different tree topologies might be mixed through several biological mechanisms, including incomplete lineage sorting or horizontal gene transfer in the case of different topologies, or simply different substitution processes on characters in the case of the same topology. Recent work on a 2-state symmetric model of character change showed such a mixture model has non-identifiable parameters, and thus it is theoretically impossible to determine the two tree topologies from any amount of data under such circumstances. Here the question of identifiability is investigated for 2-tree mixtures of the 4-state group-based models, which are more relevant to DNA sequence data. Using algebraic techniques, we show that the tree parameters are identifiable for the JC and K2P models. We also prove that generic substitution parameters for the JC mixture models…
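To make the object of study concrete, the following is a minimal sketch of the site-pattern distribution of a 2-tree mixture under the JC model on four taxa. It is an illustration only, not the paper's algebraic method: the function names, the per-edge change probability `p`, the uniform root distribution, and all numerical values are assumptions chosen for the example. Identifiability asks whether the tree topologies and numerical parameters can be recovered from a distribution such as `mix`.

```python
from itertools import product

STATES = range(4)  # A, C, G, T encoded as 0..3


def jc_prob(p, x, y):
    """JC edge transition probability; p is the total change probability,
    split evenly over the three other states (assumed parametrization)."""
    return 1.0 - p if x == y else p / 3.0


def quartet_distribution(split, leaf_p, internal_p):
    """Site-pattern distribution for a quartet tree under JC.

    split: ((a, b), (c, d)) with leaves numbered 0..3, topology ab|cd.
    leaf_p: four change probabilities for the pendant edges.
    internal_p: change probability on the internal edge.
    Returns a dict mapping patterns (x0, x1, x2, x3) to probabilities.
    """
    (a, b), (c, d) = split
    dist = {}
    for pattern in product(STATES, repeat=4):
        total = 0.0
        # Sum over states u, v of the two internal nodes; uniform root at u.
        for u, v in product(STATES, repeat=2):
            total += (0.25
                      * jc_prob(leaf_p[a], u, pattern[a])
                      * jc_prob(leaf_p[b], u, pattern[b])
                      * jc_prob(internal_p, u, v)
                      * jc_prob(leaf_p[c], v, pattern[c])
                      * jc_prob(leaf_p[d], v, pattern[d]))
        dist[pattern] = total
    return dist


def two_tree_mixture(weight, dist1, dist2):
    """Convex combination of two pattern distributions, weight in [0, 1]."""
    return {pat: weight * dist1[pat] + (1.0 - weight) * dist2[pat]
            for pat in dist1}


# Illustrative values only (not taken from the paper).
d1 = quartet_distribution(((0, 1), (2, 3)), [0.1, 0.1, 0.1, 0.1], 0.1)
d2 = quartet_distribution(((0, 2), (1, 3)), [0.1, 0.1, 0.1, 0.1], 0.1)
mix = two_tree_mixture(0.5, d1, d2)
```

Each single-tree distribution favors patterns compatible with its own split, while the mixture blends the two signals. With equal weights and equal edge parameters, as chosen here, the mixture assigns the same probability to the patterns AACC and ACAC.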
Taxonomy
Topics: Genomics and Phylogenetic Studies · Glycosylation and Glycoproteins Research · Algorithms and Data Compression
