Sequence-Length Requirement of Distance-Based Phylogeny Reconstruction: Breaking the Polynomial Barrier
Sebastien Roch

TL;DR
This paper presents a distance-based phylogeny reconstruction method that provably needs only polylogarithmic sequence length at sufficiently short branch lengths, improving on the polynomial bounds previously known for distance-based methods. It also extends earlier phase transition results to general time-reversible models.
Contribution
The paper introduces an averaging-based technique that implicitly reconstructs ancestral sequences, lowering the sequence-length requirement of distance-based reconstruction, and extends phase transition analysis to general time-reversible models.
Findings
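Polylogarithmic sequence length suffices for reconstruction in the Kesten-Stigum zone (sufficiently short branch lengths) when branch lengths are discretized.
The new method improves significantly on the polynomial sequence-length bounds previously known for distance-based methods.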
Distance estimates computed from sequence data carry more information about the phylogeny than previously believed.
Abstract
We introduce a new distance-based phylogeny reconstruction technique which provably achieves, at sufficiently short branch lengths, a polylogarithmic sequence-length requirement -- improving significantly over previous polynomial bounds for distance-based methods. The technique is based on an averaging procedure that implicitly reconstructs ancestral sequences. By the same token, we extend previous results on phase transitions in phylogeny reconstruction to general time-reversible models. More precisely, we show that in the so-called Kesten-Stigum zone (roughly, a region of the parameter space where ancestral sequences are well approximated by ``linear combinations'' of the observed sequences) sequences of length $\mathrm{poly}(\log n)$ suffice for reconstruction when branch lengths are discretized. Here $n$ is the number of extant species. Our results challenge, to some extent, the…
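To make "distance-based" concrete, here is a minimal Python sketch of the classical pipeline that such methods build on: estimate pairwise evolutionary distances from aligned sequences, then pick the quartet topology with the smallest within-pair distance sum (the four-point method). It is not the paper's averaging procedure. The symmetric two-state (CFN) model, the function names, and the toy sequences are all assumptions made for illustration; the paper itself treats general time-reversible models.

```python
import math


def hamming_fraction(a, b):
    """Fraction of sites at which two equal-length sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cfn_distance(a, b):
    """Corrected distance under the symmetric two-state (CFN) model.

    Assumed model for illustration: d = -1/2 * ln(1 - 2p), where p is the
    observed fraction of differing sites. Returns infinity when p >= 1/2
    (the estimate is saturated).
    """
    p = hamming_fraction(a, b)
    if p >= 0.5:
        return math.inf
    return -0.5 * math.log(1.0 - 2.0 * p)


def quartet_split(seqs):
    """Four-point method on exactly four named sequences.

    Returns the pairing ((w, x), (y, z)) that minimizes
    d(w, x) + d(y, z) over the three possible pairings.
    """
    if len(seqs) != 4:
        raise ValueError("expected exactly four sequences")
    a, b, c, d = sorted(seqs)

    def dist(u, v):
        return cfn_distance(seqs[u], seqs[v])

    pairings = [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]
    return min(pairings, key=lambda p: dist(*p[0]) + dist(*p[1]))


# Hypothetical toy data: A and B are close, C and D are close.
toy = {
    "A": "0000000000",
    "B": "0000000001",
    "C": "1111000000",
    "D": "1111000001",
}
print(quartet_split(toy))
```

With sequences this short the distance estimates are very noisy. The sequence-length requirement discussed in the abstract concerns how long the sequences must be before estimates of this kind become reliable enough to recover the whole tree.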
