Combinatorics of least squares trees
Radu Mihaescu, Lior Pachter

TL;DR
This paper explores the combinatorial properties of least squares methods in phylogenetics, providing new formulas, characterizations, and an efficient algorithm for estimating evolutionary trees.
Contribution
It gives a complete characterization of the weighted least squares methods that satisfy a phylogenetically desirable property: the necessary and sufficient condition is a multiplicative four point condition on the variance matrix. It also offers a time-optimal algorithm.
Findings
Characterization of the weighted least squares methods satisfying the property via a multiplicative four point condition on the variance matrix
Complete generalization of previous least squares and evolution models
Development of a time-optimal algorithm for tree estimation
Abstract
A recurring theme in the least squares approach to phylogenetics has been the discovery of elegant combinatorial formulas for the least squares estimates of edge lengths. These formulas have proved useful for the development of efficient algorithms, and have also been important for understanding connections among popular phylogeny algorithms. For example, the selection criterion of the neighbor-joining algorithm is now understood in terms of the combinatorial formulas of Pauplin for estimating tree length. We highlight a phylogenetically desirable property that weighted least squares methods should satisfy, and provide a complete characterization of methods that satisfy the property. The necessary and sufficient condition is a multiplicative four point condition that the variance matrix needs to satisfy. The proof is based on the observation that the Lagrange multipliers in the…
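As a rough illustration of the kind of combinatorial formula the abstract refers to, the sketch below evaluates Pauplin's tree length formula as it is commonly stated for binary trees: the length is the sum over leaf pairs of 2^(1 - p_ij) * d_ij, where p_ij is the number of edges on the path between leaves i and j. The tree, its edge lengths, and all function names are hypothetical and chosen only for this example. The distances are taken to be exact tree distances, so the formula should recover the sum of the edge lengths. This is not the algorithm developed in the paper.

```python
from itertools import combinations

# Hypothetical binary tree (every internal node u, v, w has degree 3).
# Each entry is (node, node, edge length); the values are made up.
EDGES = [
    ("A", "u", 1.0), ("B", "u", 2.0), ("u", "v", 0.5), ("C", "v", 1.5),
    ("v", "w", 0.25), ("D", "w", 3.0), ("E", "w", 1.0),
]
LEAVES = ["A", "B", "C", "D", "E"]


def build_adjacency(edges):
    """Build an undirected adjacency map {node: [(neighbor, length), ...]}."""
    adj = {}
    for a, b, length in edges:
        adj.setdefault(a, []).append((b, length))
        adj.setdefault(b, []).append((a, length))
    return adj


def path_stats(adj, source):
    """Return {node: (path length, number of edges)} measured from source."""
    stats = {source: (0.0, 0)}
    stack = [source]
    while stack:
        node = stack.pop()
        dist, hops = stats[node]
        for nxt, length in adj[node]:
            if nxt not in stats:  # paths in a tree are unique
                stats[nxt] = (dist + length, hops + 1)
                stack.append(nxt)
    return stats


def pauplin_tree_length(adj, leaves):
    """Sum 2^(1 - p_ij) * d_ij over all unordered leaf pairs."""
    total = 0.0
    for i, j in combinations(leaves, 2):
        d_ij, p_ij = path_stats(adj, i)[j]
        total += 2.0 ** (1 - p_ij) * d_ij
    return total


adj = build_adjacency(EDGES)
true_length = sum(length for _, _, length in EDGES)
estimated_length = pauplin_tree_length(adj, LEAVES)
print(true_length, estimated_length)
```

In practice the d_ij would be dissimilarities estimated from sequence data rather than exact path lengths, in which case the formula gives an estimate of the tree length instead of reproducing it exactly.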
