Identifiability of the unrooted species tree topology under the coalescent model with time-reversible substitution processes, site-specific rate variation, and invariable sites
Julia Chifman, Laura Kubatko

TL;DR
This paper proves that the unrooted species tree topology is generically identifiable from DNA sequence data under the coalescent model with time-reversible substitution processes, site-specific rate variation, and invariable sites.
Contribution
It establishes the formal identifiability of the unrooted species tree topology under the coalescent model combined with time-reversible substitution, site-specific rate variation, and invariable sites, a result that was previously unproven.
Findings
Unrooted species tree topology is identifiable from sequence data.
Identifiability holds under models with rate variation and invariable sites.
The result provides a theoretical foundation for coalescent-based phylogenetic inference methods.
Abstract
The inference of the evolutionary history of a collection of organisms is a problem of fundamental importance in evolutionary biology. The abundance of DNA sequence data arising from genome sequencing projects has led to significant challenges in the inference of these phylogenetic relationships. Among these challenges is the inference of the evolutionary history of a collection of species based on sequence information from several distinct genes sampled throughout the genome. It is widely accepted that each individual gene has its own phylogeny, which may not agree with the species tree. Many possible causes of this gene tree incongruence are known. The best studied is incomplete lineage sorting, which is commonly modeled by the coalescent process. Numerous methods based on the coalescent process have been proposed for estimation of the phylogenetic species tree given DNA sequence…
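The gene tree incongruence the abstract attributes to incomplete lineage sorting can be illustrated with the standard three-taxon coalescent result. This is a minimal sketch and not the paper's method: for a species tree ((A,B),C) whose internal branch has length t in coalescent units, the gene tree matches the species tree with probability 1 - (2/3)e^(-t). The function names and the parameter t are hypothetical, chosen here for illustration only.

```python
import math


def concordance_probability(t: float) -> float:
    """Probability that a three-taxon gene tree matches the species tree.

    t is the internal branch length of the species tree in coalescent
    units (an illustrative parameter, not taken from the paper).
    """
    if t < 0:
        raise ValueError("branch length must be non-negative")
    # The two lineages fail to coalesce on the internal branch with
    # probability exp(-t); in that case each of the three topologies is
    # equally likely, so two thirds of that mass is discordant.
    return 1.0 - (2.0 / 3.0) * math.exp(-t)


def discordance_probability(t: float) -> float:
    """Probability that the gene tree disagrees with the species tree."""
    return 1.0 - concordance_probability(t)


if __name__ == "__main__":
    # Short internal branches make incongruent gene trees common.
    for t in (0.0, 0.5, 1.0, 2.0, 5.0):
        print(f"t = {t:>3}: P(discordant) = {discordance_probability(t):.4f}")
```

With t = 0 the three topologies are equally likely, so the gene tree agrees with the species tree only one third of the time. As t grows, discordance decays exponentially.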
