Determining species tree topologies from clade probabilities under the coalescent
Elizabeth S. Allman, James H. Degnan, John A. Rhodes

TL;DR
This paper investigates whether clade probabilities on gene trees, as they arise under the multispecies coalescent model, can be used to reconstruct the species tree topology. The question is motivated by the known statistical inconsistency of greedy consensus under that model.
Contribution
It shows that features of gene-tree clade probabilities, in particular certain linear invariants, can identify the species tree topology, advancing the theoretical understanding of species tree inference.
Findings
Gene-tree clades with probability greater than 1/3 under the multispecies coalescent reflect clades on the species tree.
Linear invariants of clade probabilities help identify the species tree topology.
Clade probabilities contain full information on the species tree topology.
Abstract
One approach to estimating a species tree from a collection of gene trees is to first estimate probabilities of clades from the gene trees, and then to construct the species tree from the estimated clade probabilities. While a greedy consensus algorithm, which consecutively accepts the most probable clades compatible with previously accepted clades, can be used for this second stage, this method is known to be statistically inconsistent under the multispecies coalescent model. This raises the question of whether it is theoretically possible to reconstruct the species tree from known probabilities of clades on gene trees. We investigate clade probabilities arising from the multispecies coalescent model, with an eye toward identifying features of the species tree. Clades on gene trees with probability greater than 1/3 are shown to reflect clades on the species tree, while those with…
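The two-stage procedure described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the nested-tuple tree encoding, the function names (`clades_of`, `clade_frequencies`, `greedy_consensus`, `clades_above`), and the toy gene trees are all assumptions made here. Note that greedy consensus is the method the abstract describes as statistically inconsistent, so the sketch shows the baseline procedure rather than the paper's proposal. The 1/3 result is about true clade probabilities, so applying the threshold to sample frequencies, as done below, is only a heuristic illustration.

```python
from collections import Counter
from fractions import Fraction


def clades_of(tree):
    """Return the non-trivial clades of a rooted tree given as nested tuples.

    Leaves are string labels; a clade is the frozenset of leaves below a node.
    (The nested-tuple encoding is an assumption of this sketch.)
    """
    clades = []

    def visit(node):
        if not isinstance(node, tuple):  # a leaf label
            return frozenset([node])
        leaves = frozenset().union(*(visit(child) for child in node))
        clades.append(leaves)
        return leaves

    all_leaves = visit(tree)
    # Drop the trivial clade that contains every taxon.
    return [c for c in clades if c != all_leaves]


def clade_frequencies(gene_trees):
    """Stage 1: estimate each clade's probability by its sample frequency."""
    counts = Counter()
    for tree in gene_trees:
        counts.update(set(clades_of(tree)))
    n = len(gene_trees)
    return {clade: Fraction(k, n) for clade, k in counts.items()}


def compatible(a, b):
    """Two clades can share a tree only if they are nested or disjoint."""
    return a <= b or b <= a or a.isdisjoint(b)


def greedy_consensus(freqs):
    """Stage 2: accept clades in order of decreasing frequency, skipping any
    clade that conflicts with one already accepted."""
    accepted = []
    # Ties are broken by sorted labels so the result is deterministic.
    for clade in sorted(freqs, key=lambda c: (-freqs[c], sorted(c))):
        if all(compatible(clade, other) for other in accepted):
            accepted.append(clade)
    return accepted


def clades_above(freqs, threshold=Fraction(1, 3)):
    """Clades whose estimated probability exceeds the threshold."""
    return {clade for clade, p in freqs.items() if p > threshold}


# Toy input (hypothetical data): five gene trees on taxa a, b, c, d.
gene_trees = [
    ((("a", "b"), "c"), "d"),
    ((("a", "b"), "c"), "d"),
    ((("a", "b"), "c"), "d"),
    (("a", "b"), ("c", "d")),
    ((("a", "c"), "b"), "d"),
]
freqs = clade_frequencies(gene_trees)
consensus = greedy_consensus(freqs)
frequent = clades_above(freqs)
```

On this toy input, the clades {a, b} and {a, b, c} each appear in four of the five trees, so both are accepted, while {a, c} and {c, d} are rejected because each conflicts with an accepted clade.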
