Fast phylogeny reconstruction through learning of ancestral sequences
Radu Mihaescu, Cameron Hill, Satish Rao

TL;DR
This paper introduces a fast, reliable method for phylogeny reconstruction that learns ancestral sequences. It improves on the running time of previous polynomial-time algorithms while remaining accurate when only limited DNA sequence data is available.
Contribution
It develops a new algorithm that combines ancestral sequence learning with a minimum spanning tree approach, reconstructing phylogenetic trees faster and without requiring a priori bounds on edge lengths.
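To make the minimum spanning tree idea concrete, here is a minimal sketch and not the authors' algorithm. It assumes binary (two-state) leaf sequences, estimates pairwise distances with the standard CFN correction $d = -\tfrac{1}{2}\ln(1 - 2p)$, where $p$ is the fraction of differing sites, and runs Prim's algorithm on those distances. It does no ancestral sequence learning, and the function names `cfn_distance` and `minimum_spanning_forest` are hypothetical. Pairs whose observed difference rate reaches 1/2 get an infinite distance and are never joined, so the output may be a forest rather than a single tree.

```python
import math


def cfn_distance(a, b):
    """Estimate the CFN distance between two equal-length binary sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    # Fraction of sites at which the two sequences differ.
    p = sum(x != y for x, y in zip(a, b)) / len(a)
    if p >= 0.5:
        # Saturated pair: the correction is undefined, treat as unreachable.
        return math.inf
    return -0.5 * math.log(1.0 - 2.0 * p)


def minimum_spanning_forest(seqs):
    """Prim's algorithm over pairwise CFN distances.

    Returns a list of (i, j) index pairs. Pairs at infinite distance are
    never connected, so the result can have several components.
    """
    n = len(seqs)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection cost per vertex
    parent = [-1] * n       # vertex providing that connection, if any
    edges = []
    for _ in range(n):
        # Pick the unvisited vertex with the cheapest connection.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u))
        # Relax the connection costs of the remaining vertices.
        for v in range(n):
            if not in_tree[v]:
                d = cfn_distance(seqs[u], seqs[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges


if __name__ == "__main__":
    leaves = ["0000000000", "0000000001", "0000000011", "0000000111"]
    print(minimum_spanning_forest(leaves))
```

This sketch recomputes each distance on demand; caching the distance matrix would avoid repeated work on long sequences.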
Findings
Runs in $O(n^3)$ time, significantly faster than previous methods.
Maintains sequence length requirements for accurate full tree reconstruction.
Provides reliable sub-forest reconstruction without known edge length bounds.
Abstract
Given natural limitations on the length of DNA sequences, designing phylogenetic reconstruction methods that are reliable under limited information is a crucial endeavor. There have been two approaches to this problem: reconstructing partial but reliable information about the tree (\cite{Mo07,DMR08,DHJ06,GMS08}), and reaching "deeper" into the tree through reconstruction of ancestral sequences. In the latter category, \cite{DMR06} settled an important conjecture of M. Steel, showing that, under the CFN model of evolution, all trees on $n$ leaves with edge lengths bounded by the Ising model phase transition can be recovered with high probability from genomes of length $O(\log n)$ with a polynomial-time algorithm. Their methods had a running time of $O(n^{10})$. Here we enhance our methods from \cite{DHJ06} with the learning of ancestral sequences and provide an algorithm for reconstructing…
Taxonomy
Topics: Algorithms and Data Compression · Genomics and Phylogenetic Studies · Genome Rearrangement Algorithms
