Identifiability of Large Phylogenetic Mixtures for Many Phylogenetic Model Structures
Bryson Kagy, Seth Sullivant

TL;DR
This paper proves that several large phylogenetic mixture models are identifiable, meaning their parameters can in principle be uniquely determined from data. Identifiability is a prerequisite for reliable evolutionary analysis of DNA sequences.
Contribution
It extends the main theorem of Rhodes and Sullivant (2012) to establish identifiability of mixture models for several equivariant phylogenetic models: Jukes-Cantor, Kimura 2-parameter, Kimura 3-parameter, and Strand Symmetric.
Findings
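Proves identifiability of mixtures under the Jukes-Cantor model
Establishes identifiability of mixtures under the Kimura 2-parameter and Kimura 3-parameter models
Establishes identifiability of mixtures under the Strand Symmetric model
Extends the main theorem of Rhodes and Sullivant (2012)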
Abstract
Identifiability of phylogenetic models is a necessary condition to ensure that the model parameters can be uniquely determined from data. Mixture models are phylogenetic models where the probability distributions in the model are convex combinations of distributions in simpler phylogenetic models. Mixture models are used to model heterogeneity in the substitution process in DNA sequences. While many basic phylogenetic models are known to be identifiable, mixture models in general have only been shown to be identifiable in certain cases. We expand the main theorem of [Rhodes, Sullivant 2012] to prove identifiability of mixture models in equivariant phylogenetic models, specifically the Jukes-Cantor, Kimura 2-parameter, Kimura 3-parameter, and Strand Symmetric models.
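To make the phrase "convex combinations of distributions" concrete, here is a minimal sketch that builds two Jukes-Cantor site-pattern distributions and mixes them. It is only an illustration of the definition, not the paper's construction or proof. The three-leaf star tree with a uniform root, the branch lengths, the mixing weights, and all function names are assumptions chosen for the example.

```python
import math
from itertools import product

STATES = "ACGT"

def jc_transition(t):
    """Jukes-Cantor transition matrix for branch length t
    (measured in expected substitutions per site)."""
    decay = math.exp(-4.0 * t / 3.0)
    same = 0.25 + 0.75 * decay   # probability of observing the same base
    diff = 0.25 - 0.25 * decay   # probability of each specific other base
    return [[same if i == j else diff for j in range(4)] for i in range(4)]

def star_tree_distribution(branch_lengths):
    """Joint distribution of leaf states on a star tree.

    Assumption for this sketch: a single internal root with a uniform
    base distribution and one branch per leaf.
    """
    mats = [jc_transition(t) for t in branch_lengths]
    dist = {}
    for pattern in product(range(4), repeat=len(branch_lengths)):
        prob = 0.0
        for root in range(4):            # sum over the unobserved root state
            term = 0.25
            for mat, leaf in zip(mats, pattern):
                term *= mat[root][leaf]
            prob += term
        dist["".join(STATES[s] for s in pattern)] = prob
    return dist

def mixture(weights, components):
    """Convex combination of site-pattern distributions."""
    assert all(w >= 0 for w in weights)
    assert abs(sum(weights) - 1.0) < 1e-12
    return {
        key: sum(w * comp[key] for w, comp in zip(weights, components))
        for key in components[0]
    }

# Two hypothetical rate classes on the same star tree: slow and fast sites.
slow = star_tree_distribution([0.05, 0.05, 0.05])
fast = star_tree_distribution([0.8, 0.8, 0.8])
mixed = mixture([0.7, 0.3], [slow, fast])

print(len(mixed), "site patterns; total probability", sum(mixed.values()))
```

The identifiability question is whether the component parameters and weights can be recovered from a distribution like `mixed` alone. This sketch only evaluates the forward map from parameters to the mixed distribution and says nothing about inverting it.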
Taxonomy
Topics: Genomics and Phylogenetic Studies · Fractal and DNA sequence analysis · Biomedical Text Mining and Ontologies
