Phylogenetic information complexity: Is testing a tree easier than finding it?

Mike Steel; Laszlo Szekely; Elchanan Mossel

arXiv:0807.1756·q-bio.PE·July 14, 2008

Open Access

TL;DR

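This paper compares the amount of sequence data needed to test whether a candidate phylogenetic tree is the true tree with the amount needed to reconstruct the tree. For one type of model, the two amounts are not significantly different as functions of the number of species; for another, the data needed for testing is independent of the number of leaves, while the data needed for reconstruction grows with it.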

Contribution

It gives an analytical comparison, combining probabilistic and combinatorial arguments, of the information required to test versus reconstruct phylogenetic trees under different types of model, identifying when testing requires less data than reconstruction.

Findings

01

Under some models, testing a candidate tree requires about as much sequence data as reconstructing it, as a function of the number of species.

02

Under other models, the amount of data needed to test a tree is independent of the number of species.

03

In those models, the amount of data needed to reconstruct the tree grows with the number of species.

Abstract

Phylogenetic trees describe the evolutionary history of a group of present-day species from a common ancestor. These trees are typically reconstructed from aligned DNA sequence data. In this paper we analytically address the following question: is the amount of sequence data required to accurately reconstruct a tree significantly more than the amount required to test whether or not a candidate tree was the 'true' tree? By 'significantly', we mean that the two quantities behave the same way as a function of the number of species being considered. We prove that, for a certain type of model, the amount of information required is not significantly different; while for another type of model, the information required to test a tree is independent of the number of leaves, while that required to reconstruct it grows with this number. Our results combine probabilistic and combinatorial arguments.

Taxonomy

Topics: Genomics and Phylogenetic Studies · Fractal and DNA sequence analysis · Biomedical Text Mining and Ontologies