Maximum Parsimony on Subsets of Taxa
Mareike Fischer, Bhalchandra D. Thatte

TL;DR
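This paper studies how accurately Fitch's maximum parsimony algorithm reconstructs the ancestral state at the root of a phylogenetic tree. It shows by example that using a subset of the taxa can be more accurate than using all of them, and it confirms a conjecture of Li, Steel and Zhang for the two-state symmetric model under a molecular clock.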
Contribution
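It gives an example in which maximum parsimony applied to a subset of taxa reconstructs the root state more accurately than when applied to all taxa. For the two-state symmetric model under a molecular clock, it proves a conjecture of Li, Steel and Zhang: the accuracy of guessing the ancestral state from a single taxon is a lower bound on the reliability of the overall reconstruction.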
Findings
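Applying maximum parsimony to a subset of taxa can reconstruct the root state more accurately than applying it to all taxa.
In the paper's example, ignoring a taxon closer to the root improves the reliability of the method.
Under a molecular clock and the two-state symmetric model, single-taxon accuracy is a lower bound on the reliability of the overall method.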
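To make "single-taxon accuracy" concrete: under the two-state symmetric model, the probability that a leaf shows the same state as the root depends only on the substitution probabilities along the root-to-leaf path. The sketch below uses the standard closed form for that model. The function name and the per-edge probability list are illustrative assumptions, not notation from the paper.

```python
from math import prod

def single_taxon_accuracy(edge_probs):
    """Probability that a leaf has the same state as the root under the
    two-state symmetric model.

    edge_probs: substitution probabilities on the edges of the root-to-leaf
    path (hypothetical input format; each value assumed to lie in [0, 0.5]).
    """
    # Two endpoints of a path agree with probability (1 + prod(1 - 2p)) / 2.
    return 0.5 * (1.0 + prod(1.0 - 2.0 * p for p in edge_probs))

# A path of two edges, each with substitution probability 0.1.
print(single_taxon_accuracy([0.1, 0.1]))
```

Under a molecular clock every root-to-leaf path has the same length, so this quantity is the same for all taxa.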
Abstract
In this paper we investigate mathematical questions concerning the reliability (reconstruction accuracy) of Fitch's maximum parsimony algorithm for reconstructing the ancestral state given a phylogenetic tree and a character. In particular, we consider the question whether the maximum parsimony method applied to a subset of taxa can reconstruct the ancestral state of the root more accurately than when applied to all taxa, and we give an example showing that this indeed is possible. A surprising feature of our example is that ignoring a taxon closer to the root improves the reliability of the method. On the other hand, in the case of the two-state symmetric substitution model, we answer affirmatively a conjecture of Li, Steel and Zhang which states that under a molecular clock the probability that the state at a single taxon is a correct guess of the ancestral state is a lower bound on…
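The bottom-up pass of Fitch's algorithm, and what it means to apply it to a subset of taxa, can be sketched as follows. This is a minimal illustration and not the paper's example: the nested-tuple tree encoding, the helper names `fitch_root_set` and `restrict`, and the toy character are all assumptions made here. It shows only that the root state set can change when a taxon is dropped; it does not compute reconstruction accuracy under a substitution model.

```python
def fitch_root_set(tree, states):
    """Bottom-up pass of Fitch's algorithm on a rooted binary tree.

    tree: a leaf name (str) or a pair (left, right) of subtrees
          (hypothetical encoding chosen for this sketch).
    states: dict mapping each leaf name to its observed character state.
    Returns the set of most parsimonious states at the root.
    """
    if isinstance(tree, str):
        return {states[tree]}
    left, right = (fitch_root_set(child, states) for child in tree)
    common = left & right
    # Fitch's rule: intersection if it is non-empty, otherwise union.
    return common if common else left | right

def restrict(tree, keep):
    """Prune the tree to the taxa in `keep`, suppressing any internal
    node that is left with a single child."""
    if isinstance(tree, str):
        return tree if tree in keep else None
    kids = [k for k in (restrict(child, keep) for child in tree) if k is not None]
    if not kids:
        return None
    return kids[0] if len(kids) == 1 else tuple(kids)

# Toy character on four taxa (made-up data for illustration).
tree = (("A", "B"), ("C", "D"))
states = {"A": 0, "B": 1, "C": 0, "D": 1}

full = fitch_root_set(tree, states)                 # all taxa
subset_tree = restrict(tree, {"A", "C", "D"})       # ignore taxon B
partial = fitch_root_set(subset_tree, states)
print(full, partial)
```

On all four taxa the root set is ambiguous, while on the subset without B it is a single state. Whether the subset is more reliable depends on the tree's edge lengths and the substitution model, which is the question the paper addresses.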
Taxonomy
Topics: Topological and Geometric Data Analysis · Stochastic processes and statistical mechanics · Bioinformatics and Genomic Networks
