Provably Fast and Accurate Recovery of Evolutionary Trees through Harmonic Greedy Triplets
Miklos Csuros, Ming-Yang Kao

TL;DR
This paper introduces a greedy algorithm that efficiently reconstructs evolutionary trees from sequence data using harmonic averages on triplets, achieving optimal time and space complexity and high accuracy under the Jukes-Cantor model.
Contribution
The paper presents a harmonic greedy triplet algorithm for evolutionary tree reconstruction, with time and space complexity proven optimal and high-probability correctness proven under the Jukes-Cantor model.
Findings
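Once pairwise distances are estimated, the algorithm runs in O(n^2) time using O(n) work space, where n is the number of terminal taxa.
Recovers the correct tree topology with high probability under the Jukes-Cantor model.
Requires sample sequences of length polynomial in n, the logarithm of the error probability, and the inverses of two small parameters.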
Abstract
We give a greedy learning algorithm for reconstructing an evolutionary tree based on a certain harmonic average on triplets of terminal taxa. After the pairwise distances between terminal taxa are estimated from sequence data, the algorithm runs in O(n^2) time using O(n) work space, where n is the number of terminal taxa. These time and space complexities are optimal in the sense that the size of an input distance matrix is n^2 and the size of an output tree is n. Moreover, in the Jukes-Cantor model of evolution, the algorithm recovers the correct tree topology with high probability using sample sequences of length polynomial in (1) n, (2) the logarithm of the error probability, and (3) the inverses of two small parameters.
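The abstract assumes that pairwise distances between terminal taxa have already been estimated from sequence data. As a rough illustration of that preprocessing step, the sketch below uses the textbook Jukes-Cantor correction, d = -(3/4) ln(1 - (4/3) p), where p is the fraction of mismatched sites between two aligned sequences. The function names `jc_distance` and `distance_matrix` are hypothetical, the sequences are assumed to be gap-free and equally long, and the paper itself may define or scale its distances differently.

```python
import math


def jc_distance(seq_a, seq_b):
    """Textbook Jukes-Cantor distance between two aligned, equal-length sequences.

    Illustrative only: the paper's own distance estimator may differ.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    # p is the observed fraction of sites at which the two sequences differ.
    mismatches = sum(1 for x, y in zip(seq_a, seq_b) if x != y)
    p = mismatches / len(seq_a)
    # The correction is undefined once p reaches 3/4 (saturation).
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def distance_matrix(seqs):
    """Symmetric n-by-n matrix of pairwise Jukes-Cantor distances."""
    n = len(seqs)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = jc_distance(seqs[i], seqs[j])
    return dist
```

Filling the matrix already takes n(n-1)/2 sequence comparisons, which is consistent with the abstract's remark that an input distance matrix has size n^2.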
Taxonomy
Topics: Genomics and Phylogenetic Studies · Algorithms and Data Compression
