Shrinkage Effect in Ancestral Maximum Likelihood
Elchanan Mossel, Sebastien Roch, Mike Steel

TL;DR
This paper proves that ancestral maximum likelihood (AML), a method for jointly reconstructing phylogenetic trees and ancestral sequences, is statistically inconsistent: it tends to shrink short edges, which yields unresolved trees as sequence length grows.
Contribution
The paper provides the first formal proof that AML is statistically inconsistent due to its tendency to shrink short edges in phylogenetic trees.
Findings
AML can shrink short edges in phylogenetic trees
AML can return unresolved trees as sequence length grows
Statistical inconsistency of AML is demonstrated
Abstract
Ancestral maximum likelihood (AML) is a method that simultaneously reconstructs a phylogenetic tree and ancestral sequences from extant data (sequences at the leaves). The tree and ancestral sequences maximize the probability of observing the given data under a Markov model of sequence evolution, in which branch lengths are also optimized but constrained to take the same value on any edge across all sequence sites. AML differs from the more usual form of maximum likelihood (ML) in phylogenetics because ML averages over all possible ancestral sequences. ML has long been known to be statistically consistent -- that is, it converges on the correct tree with probability approaching 1 as the sequence length grows. However, the statistical consistency of AML has not been formally determined, despite informal remarks in a literature that dates back 20 years. In this short note we prove a…
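The distinction the abstract draws between ML and AML can be sketched numerically. The snippet below is a minimal illustration, not the paper's construction: it assumes a star tree with a single ancestral node, a symmetric two-state model, and a uniform root distribution, and all function names are hypothetical. For each site, ML sums the joint probability over the possible ancestral states, while AML keeps only the best single ancestral state; in both cases the per-edge parameters are shared across all sites.

```python
from math import log


def edge_prob(parent, child, p):
    """Transition probability on one edge of a symmetric two-state model (assumed)."""
    return p if parent != child else 1.0 - p


def site_terms(leaf_states, flip_probs):
    """Joint probability of the leaf states with each possible ancestral state.

    Assumes a star tree: one ancestral node joined to every leaf, with a
    uniform prior (0.5) on the ancestral state.
    """
    terms = []
    for root in (0, 1):
        prob = 0.5
        for state, p in zip(leaf_states, flip_probs):
            prob *= edge_prob(root, state, p)
        terms.append(prob)
    return terms


def ml_log_likelihood(sites, flip_probs):
    """Standard ML: average (sum) over ancestral states at each site."""
    return sum(log(sum(site_terms(s, flip_probs))) for s in sites)


def aml_log_likelihood(sites, flip_probs):
    """AML: keep only the best ancestral state at each site."""
    return sum(log(max(site_terms(s, flip_probs))) for s in sites)


if __name__ == "__main__":
    sites = [(0, 0, 1), (1, 1, 1)]      # toy data: two sites, three leaves
    flip_probs = [0.1, 0.1, 0.1]        # same edge parameters at every site
    print(ml_log_likelihood(sites, flip_probs))
    print(aml_log_likelihood(sites, flip_probs))
```

Because a maximum over ancestral states never exceeds the sum, the AML objective is at most the ML objective for any fixed edge parameters. The two criteria can therefore favor different parameters and trees once they are optimized.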
Taxonomy
Topics: Evolution and Paleontology Studies · Genomics and Phylogenetic Studies · Genetic diversity and population structure
