Estimating phylogenetic trees from genome-scale data
Liang Liu, Zhenxiang Xi, Shaoyuan Wu, Charles Davis, Scott V. Edwards

TL;DR
This paper reviews the challenges of reconstructing the Tree of Life from large genomic datasets, emphasizing the advantages of species tree methods over concatenation, and discusses recent theoretical and empirical insights into their performance.
Contribution
It clarifies conflicts between concatenation and species tree methods, and highlights the robustness and computational efficiency of modern species tree approaches for genome-scale data.
Findings
Species tree methods are more robust than concatenation to heterogeneity across genes and to long-branch attraction.
Concatenation can be statistically inconsistent and can distort gene tree distributions.
Species tree methods incorporating biological realism are essential for genome-scale phylogenetics.
Abstract
As researchers collect increasingly large molecular data sets to reconstruct the Tree of Life, the heterogeneity of signals in the genomes of diverse organisms poses challenges for traditional phylogenetic analysis. A class of phylogenetic methods known as "species tree methods" has been proposed to directly address one important source of gene tree heterogeneity, namely the incomplete lineage sorting or deep coalescence that occurs when evolving lineages radiate rapidly, resulting in a diversity of gene trees from a single underlying species tree. Although such methods are gaining in popularity, they are being adopted with caution in some quarters, in part because of an increasing number of examples of strong phylogenetic conflict between concatenation or supermatrix methods and species tree methods. Here we review theory and empirical examples that help clarify these conflicts.…
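To make the abstract's point about incomplete lineage sorting concrete, here is a minimal sketch of the standard three-species result from multispecies coalescent theory. It is an illustration, not code from the paper. For a species tree ((A,B),C) with an internal branch of length t in coalescent units, a gene tree matches the species tree with probability 1 - (2/3)e^(-t), and each of the two discordant topologies has probability (1/3)e^(-t). The function names below are hypothetical and chosen only for this example.

```python
import math
import random


def concordance_probability(t):
    """Probability that a gene tree matches the species tree ((A,B),C).

    t is the internal branch length in coalescent units (an assumption of
    this sketch: a single lineage sampled per species, no gene flow).
    """
    return 1.0 - (2.0 / 3.0) * math.exp(-t)


def discordant_probability(t):
    """Probability of each of the two discordant gene tree topologies."""
    return (1.0 / 3.0) * math.exp(-t)


def simulate_concordance(t, n_genes, seed=0):
    """Estimate the concordant fraction by simulating independent genes."""
    rng = random.Random(seed)
    matches = 0
    for _ in range(n_genes):
        # The A and B lineages coalesce inside the internal branch if an
        # exponential(1) waiting time is shorter than the branch length.
        if rng.expovariate(1.0) < t:
            matches += 1
        # Otherwise three lineages reach the ancestral population, and each
        # of the three pairs is equally likely to coalesce first.
        elif rng.randrange(3) == 0:
            matches += 1
    return matches / n_genes


if __name__ == "__main__":
    for t in (0.1, 0.5, 1.0, 3.0):
        print(t, round(concordance_probability(t), 3),
              round(simulate_concordance(t, 20000, seed=1), 3))
```

As t approaches zero, as in a rapid radiation, the three topologies become nearly equally likely. This is the gene tree diversity from a single species tree that the abstract describes.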
