Properties of Consensus Methods for Inferring Species Trees from Gene Trees
James H. Degnan

TL;DR
This paper analyzes the theoretical properties of three consensus methods for inferring species trees from gene trees, showing that R* consensus is statistically consistent and that majority-rule consensus is not misleading, and discussing how greedy consensus can be misleading.
Contribution
It provides a theoretical comparison of consensus methods, demonstrating the statistical consistency of R* and the potential pitfalls of greedy consensus in species tree inference.
Findings
R* consensus converges to the true species tree as the number of genes increases
Majority-rule consensus is not misleading, although for some branch lengths it can remain partially unresolved as loci are added
Greedy consensus can be misleading even though it resolves quickly
Abstract
Consensus methods provide a useful strategy for combining information from a collection of gene trees. An important application of consensus methods is to combine gene trees to estimate a species tree. To investigate the theoretical properties of consensus trees that would be obtained from large numbers of loci evolving according to a basic evolutionary model, we construct consensus trees from independent gene trees that occur in proportion to gene tree probabilities derived from coalescent theory. We consider majority-rule, rooted triple (R*), and greedy consensus trees constructed from known gene trees, both in the asymptotic case as numbers of gene trees approach infinity and for finite numbers of genes. Our results show that for some combinations of species tree branch lengths, increasing the number of independent loci can make the majority-rule consensus tree more likely to be at…
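As an illustration of two of the consensus methods named in the abstract, here is a minimal Python sketch of majority-rule and greedy consensus. It is not the authors' implementation. It assumes each gene tree is already reduced to its set of non-trivial clades, and the names `clade_counts`, `majority_rule`, `greedy`, and `example_trees` are hypothetical. Tie-breaking in the greedy step is an arbitrary choice made here for determinism, and R* consensus (built from rooted triples) is not shown.

```python
from collections import Counter

def clade_counts(gene_trees):
    """Count in how many gene trees each clade appears."""
    counts = Counter()
    for clades in gene_trees:
        counts.update(set(clades))
    return counts

def majority_rule(gene_trees):
    """Keep clades found in more than half of the gene trees."""
    n = len(gene_trees)
    counts = clade_counts(gene_trees)
    return {c for c, k in counts.items() if k > n / 2}

def compatible(a, b):
    """Two clades can share a tree if they are disjoint or nested."""
    return a.isdisjoint(b) or a <= b or b <= a

def greedy(gene_trees):
    """Add clades in decreasing frequency, skipping any that conflict."""
    counts = clade_counts(gene_trees)
    # Assumption: ties are broken alphabetically, only to make the result deterministic.
    ordered = sorted(counts, key=lambda c: (-counts[c], sorted(c)))
    accepted = []
    for c in ordered:
        if all(compatible(c, d) for d in accepted):
            accepted.append(c)
    return set(accepted)

# Hypothetical toy data: five gene trees on taxa A, B, C, D,
# each written as its set of non-trivial clades.
example_trees = [
    {frozenset("AB"), frozenset("ABC")},
    {frozenset("AB"), frozenset("CD")},
    {frozenset("AC"), frozenset("ABC")},
    {frozenset("BC"), frozenset("BCD")},
    {frozenset("AB"), frozenset("ABD")},
]

print(majority_rule(example_trees))  # only {A, B} exceeds 50%
print(greedy(example_trees))         # {A, B} plus the compatible clade {A, B, C}
```

In this toy input the clade {A, B, C} appears in only two of five trees, so majority-rule leaves it out while greedy consensus accepts it because it does not conflict with {A, B}. This shows why greedy trees tend to be more resolved, and why a frequent but incorrect clade could be accepted by the greedy rule.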
Taxonomy
Topics: Genomics and Phylogenetic Studies · Bioinformatics and Genomic Networks · Genetic diversity and population structure
