Identifiability of a Markovian model of molecular evolution with Gamma-distributed rates
Elizabeth S. Allman, Cecile Ane, John A. Rhodes

TL;DR
This paper proves that the widely used GTR+Gamma model of molecular evolution is identifiable for generic parameters, and for all parameter choices in the 4-state (DNA) case, giving phylogenetic inference under rate heterogeneity a rigorous theoretical footing.
Contribution
It provides the first rigorous proof of identifiability for a phylogenetic model with a continuous distribution of rates, specifically the GTR+Gamma model.
Findings
The GTR+Gamma model is identifiable for generic parameters.
For 4-state (DNA) models, identifiability holds for all parameter choices.
This is the first proof of identifiability for a phylogenetic model with a continuous distribution of rates.
Abstract
Inference of evolutionary trees and rates from biological sequences is commonly performed using continuous-time Markov models of character change. The Markov process evolves along an unknown tree while observations arise only from the tips of the tree. Rate heterogeneity is present in most real data sets and is accounted for by the use of flexible mixture models where each site is allowed its own rate. Very little has been rigorously established concerning the identifiability of the models currently in common use in data analysis, although non-identifiability was proven for a semi-parametric model and an incorrect proof of identifiability was published for a general parametric model (GTR+Gamma+I). Here we prove that one of the most widely used models (GTR+Gamma) is identifiable for generic parameters, and for all parameter choices in the case of 4-state (DNA) models. This is the first…
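To make the model in the abstract concrete, the following is a minimal sketch of how a GTR+Gamma transition matrix along a single branch could be computed. It is not taken from the paper. It assumes the common convention that site rates follow a Gamma distribution with mean 1 (shape and rate both equal to alpha), in which case averaging exp(Q r t) over the rate r gives the closed form (I - Q t / alpha)^(-alpha). The function names `gtr_rate_matrix` and `gamma_averaged_transition`, and the ordering of the exchangeabilities, are hypothetical choices made for illustration.

```python
import numpy as np

def gtr_rate_matrix(pi, exch):
    """Build a 4-state GTR rate matrix Q.

    pi   : stationary base frequencies (length 4, summing to 1)
    exch : six exchangeabilities, assumed order AC, AG, AT, CG, CT, GT
    """
    pi = np.asarray(pi, dtype=float)
    S = np.zeros((4, 4))
    S[np.triu_indices(4, k=1)] = exch
    S = S + S.T                           # symmetric exchangeability matrix
    Q = S * pi[None, :]                   # q_ij = s_ij * pi_j for i != j
    np.fill_diagonal(Q, -Q.sum(axis=1))   # rows of a rate matrix sum to 0
    Q /= -np.dot(pi, np.diag(Q))          # scale to one expected substitution per unit time
    return Q

def gamma_averaged_transition(Q, pi, t, alpha):
    """E[exp(Q r t)] for r ~ Gamma(shape=alpha, rate=alpha), i.e. mean 1.

    Uses reversibility: diag(sqrt(pi)) Q diag(1/sqrt(pi)) is symmetric,
    so Q has real eigenvalues and the matrix function can be applied to them.
    """
    sqrt_pi = np.sqrt(np.asarray(pi, dtype=float))
    B = sqrt_pi[:, None] * Q / sqrt_pi[None, :]
    B = (B + B.T) / 2.0                   # remove round-off asymmetry
    lam, U = np.linalg.eigh(B)
    # Gamma moment generating function evaluated at each eigenvalue
    f = (1.0 - lam * t / alpha) ** (-alpha)
    P_sym = (U * f) @ U.T
    return P_sym * sqrt_pi[None, :] / sqrt_pi[:, None]

pi = [0.1, 0.2, 0.3, 0.4]
Q = gtr_rate_matrix(pi, [1.0, 2.0, 1.0, 1.0, 2.0, 1.0])
P = gamma_averaged_transition(Q, pi, t=0.5, alpha=0.7)
print(P.sum(axis=1))  # each row should sum to 1 up to round-off
```

The identifiability question studied in the paper is whether the tree, Q, and alpha can be recovered from the joint distribution of states at the tips, which is built from matrices of this form on every branch. The sketch covers only the single-branch building block.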
Taxonomy
Topics: Evolution and Genetic Dynamics
