Twisted trees and inconsistency of tree estimation when gaps are treated as missing data -- the impact of model mis-specification in distance corrections
Emily Jane McTavish, Mike Steel, Mark T. Holder

TL;DR
This paper investigates how model mis-specification, especially treating gaps as missing data, can lead to inconsistent phylogenetic tree estimation, demonstrating that certain distance correction functions can favor incorrect trees under mild conditions.
Contribution
It extends previous results on inconsistency in tree estimation, introducing the concept of twisted Farris-zone trees and analyzing the impact of gap treatment on distance functions.
Findings
Corrected distances that are a convex function of the true distance can favor incorrect trees.
Treating gaps as missing data can produce non-linear distance functions.
Inconsistent tree inference can occur even with unlimited data.
Abstract
Statistically consistent estimation of phylogenetic trees or gene trees is possible if pairwise sequence dissimilarities can be converted to a set of distances that are proportional to the true evolutionary distances. Susko et al. (2004) reported some strikingly broad results about the forms of inconsistency in tree estimation that can arise if corrected distances are not proportional to the true distances. They showed that if the corrected distance is a concave function of the true distance, then inconsistency due to long branch attraction will occur. If these functions are convex, then two "long branch repulsion" trees will be preferred over the true tree -- though these two incorrect trees are expected to be tied as the preferred tree. Here we extend their results, and demonstrate the existence of a tree shape (which we refer to as a "twisted Farris-zone" tree) for which a single…
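The concave and convex cases described in the abstract can be illustrated with a small numerical sketch. This is a toy example and not the authors' analysis: the four-taxon tree AB|CD, the branch lengths, the two transformation functions, and the helper names pairwise_distances and preferred_splits are all assumptions made for illustration. Each of the three possible splits is scored by the sum of transformed distances within its two pairs, and the split with the smallest sum is taken as preferred (the four-point criterion).

```python
import math

def pairwise_distances(branches, internal):
    """True path-length distances on the four-taxon tree AB|CD.

    `branches` maps each leaf to its terminal branch length and
    `internal` is the length of the internal branch (toy values).
    """
    a, b, c, d = (branches[k] for k in "ABCD")
    return {
        ("A", "B"): a + b,
        ("C", "D"): c + d,
        ("A", "C"): a + internal + c,
        ("B", "D"): b + internal + d,
        ("A", "D"): a + internal + d,
        ("B", "C"): b + internal + c,
    }

def preferred_splits(dist, f):
    """Return the split(s) with the smallest sum of transformed distances."""
    scores = {
        "AB|CD": f(dist[("A", "B")]) + f(dist[("C", "D")]),
        "AC|BD": f(dist[("A", "C")]) + f(dist[("B", "D")]),
        "AD|BC": f(dist[("A", "D")]) + f(dist[("B", "C")]),
    }
    best = min(scores.values())
    return sorted(s for s, v in scores.items() if math.isclose(v, best, rel_tol=1e-12))

# Hypothetical branch lengths: long = 2.0, short = 0.1, internal = 0.1.
# Long branches on opposite sides of the internal branch (not sisters).
long_apart = pairwise_distances({"A": 2.0, "B": 0.1, "C": 2.0, "D": 0.1}, 0.1)
# Long branches as sisters (the usual "Farris zone" shape).
long_sisters = pairwise_distances({"A": 2.0, "B": 2.0, "C": 0.1, "D": 0.1}, 0.1)

identity = lambda x: x                  # distances proportional to the truth
concave = lambda x: 1.0 - math.exp(-x)  # assumed concave transformation
convex = math.expm1                     # assumed convex transformation, exp(x) - 1

print(preferred_splits(long_apart, identity))    # ['AB|CD'], the true tree
print(preferred_splits(long_sisters, identity))  # ['AB|CD'], the true tree
print(preferred_splits(long_apart, concave))     # ['AC|BD'], long branches grouped
print(preferred_splits(long_sisters, convex))    # ['AC|BD', 'AD|BC'], tied
```

With these assumed values, the concave transformation groups the two long branches together (long branch attraction), while the convex transformation separates the long sister branches and leaves the two incorrect splits exactly tied, matching the behavior attributed to Susko et al. (2004) above. Because the scores are computed from true distances rather than sampled data, the wrong preference here corresponds to the unlimited-data limit.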
