The impact and interplay of long and short branches on phylogenetic information content
Iain Martyn, Mike Steel

TL;DR
This paper investigates how the lengths of long and short branches in phylogenetic trees affect the amount of sequence data needed for accurate reconstruction, extending previous results to more recent speciation events and complex models.
Contribution
It generalizes sequence length requirements for phylogenetic trees beyond four taxa and recent divergence, considering variable rates and broader models.
Findings
Sequence length requirements are similar for recent and ancient speciation if at least one taxon is distantly related.
Molecular clock assumptions reduce the importance of long outgroup branches in recent divergences.
Results extend to variable substitution rates and larger taxon sets.
Abstract
In molecular systematics, evolutionary trees are reconstructed from sequences at the tips under simple models of site substitution. A central question is how much sequence data is required to reconstruct a tree accurately. The answer depends on the lengths of the branches (edges) of the tree, with very short and very long edges requiring long sequences for accurate tree inference, particularly when these branch lengths are arranged in certain ways. For four-taxon trees, the sequence length question was settled for the case of a rapid speciation event in the distant past. Here, we generalize this result and show that the same sequence length requirement holds even when the speciation event is recent, provided that at least one of the four taxa is distantly related to the others. However, this equivalence disappears if a molecular clock applies, since the length of the long outgroup edge…
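The dependence of reconstruction accuracy on sequence length and branch lengths can be explored with a small simulation. The sketch below is illustrative only and is not the authors' method: it assumes a two-state symmetric substitution model on the four-taxon tree ((A,B),(C,D)) and infers the split with a simple distance-based four-point rule. All function names and parameter values (`flip_prob`, `simulate_quartet`, `infer_split`, `accuracy`) are hypothetical.

```python
import math
import random


def flip_prob(t):
    """Probability that a two-state symmetric character differs across an
    edge of length t (expected substitutions per site). Assumed model."""
    return 0.5 * (1.0 - math.exp(-2.0 * t))


def simulate_quartet(n_sites, pendant, interior, rng):
    """Simulate n_sites binary characters on the tree ((A,B),(C,D)).

    pendant:  dict of pendant edge lengths keyed by 'A', 'B', 'C', 'D'
    interior: length of the central edge separating {A,B} from {C,D}
    """
    p_int = flip_prob(interior)
    p_leaf = {leaf: flip_prob(t) for leaf, t in pendant.items()}
    columns = []
    for _ in range(n_sites):
        u = rng.random() < 0.5          # state at the node joining A and B
        v = u ^ (rng.random() < p_int)  # state at the node joining C and D
        site = {}
        for leaf, parent in (("A", u), ("B", u), ("C", v), ("D", v)):
            site[leaf] = parent ^ (rng.random() < p_leaf[leaf])
        columns.append(site)
    return columns


def corrected_distance(columns, x, y):
    """Model-corrected distance between taxa x and y from the observed
    proportion of differing sites."""
    diff = sum(1 for s in columns if s[x] != s[y]) / len(columns)
    diff = min(diff, 0.499)  # guard against log of a non-positive number
    return -0.5 * math.log(1.0 - 2.0 * diff)


def infer_split(columns):
    """Four-point rule: choose the pairing with the smallest distance sum."""
    def d(x, y):
        return corrected_distance(columns, x, y)

    scores = {
        "AB|CD": d("A", "B") + d("C", "D"),
        "AC|BD": d("A", "C") + d("B", "D"),
        "AD|BC": d("A", "D") + d("B", "C"),
    }
    return min(scores, key=scores.get)


def accuracy(n_sites, pendant, interior, reps=200, seed=0):
    """Fraction of replicates in which the true split AB|CD is recovered."""
    rng = random.Random(seed)
    hits = sum(
        infer_split(simulate_quartet(n_sites, pendant, interior, rng)) == "AB|CD"
        for _ in range(reps)
    )
    return hits / reps
```

For example, calling `accuracy(n, {"A": 0.05, "B": 0.05, "C": 0.05, "D": 2.0}, 0.01)` for increasing `n` gives a rough picture of how a short interior edge combined with one long outgroup edge affects the number of sites needed; the specific lengths here are arbitrary choices for illustration.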
Taxonomy
Topics: Evolution and Paleontology Studies · Genomics and Phylogenetic Studies · Genetic diversity and population structure
