pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree
Frederick A Matsen, Robin B Kodner, E Virginia Armbrust

TL;DR
pplacer is a software tool that efficiently places large numbers of short sequencing reads onto a fixed reference phylogenetic tree using likelihood and Bayesian methods, with uncertainty quantification and visualization features.
Contribution
Introduces an algorithm for phylogenetic placement of sequences onto a fixed reference tree whose time and memory use are essentially linear in the number of reference taxa, enabling scalable likelihood-based analysis of large sequencing datasets.
Findings
Places 20,000 short reads per hour per processor on a 1,000-taxon reference tree
Achieves high accuracy across diverse alignments
Provides reliable uncertainty estimates for placements
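
The throughput finding can be turned into a rough run-time estimate. The sketch below is illustrative only: the function name `estimated_hours` and its parameters are hypothetical, and it assumes that per-read cost scales exactly linearly with the number of reference taxa and that work divides evenly across processors. The summary claims only "essentially linear" scaling, so treat the output as a back-of-the-envelope figure rather than a benchmark.

```python
# Figures quoted in the summary: 20,000 reads per hour per processor
# on a reference tree of 1,000 taxa.
BASELINE_READS_PER_HOUR = 20_000
BASELINE_TAXA = 1_000


def estimated_hours(n_reads, n_taxa, n_processors=1):
    """Rough wall-clock estimate in hours.

    ASSUMPTIONS (not stated by the source): cost per read grows exactly
    linearly with the number of reference taxa, and the reads split
    evenly across processors with no overhead.
    """
    if n_reads < 0 or n_taxa <= 0 or n_processors <= 0:
        raise ValueError("n_reads must be >= 0; n_taxa and n_processors must be > 0")
    # Linear-scaling assumption: doubling the taxa halves the reads per hour.
    reads_per_hour = BASELINE_READS_PER_HOUR * BASELINE_TAXA / n_taxa
    return n_reads / (reads_per_hour * n_processors)


if __name__ == "__main__":
    # The baseline case reproduces the quoted figure: one hour.
    print(estimated_hours(20_000, 1_000))
    # One million reads on a 2,000-taxon tree using 8 processors.
    print(estimated_hours(1_000_000, 2_000, n_processors=8))
```

Under these assumptions, one million reads on a 2,000-taxon tree with 8 processors would take about 12.5 hours; actual run time will depend on read length, hardware, and the substitution model.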
Abstract
Likelihood-based phylogenetic inference is generally considered to be the most reliable classification method for unknown sequences. However, traditional likelihood-based phylogenetic methods cannot be applied to large volumes of short reads from next-generation sequencing due to computational complexity issues and lack of phylogenetic signal. "Phylogenetic placement," where a reference tree is fixed and the unknown query sequences are placed onto the tree via a reference alignment, is a way to bring the inferential power of likelihood-based approaches to large data sets. This paper introduces pplacer, a software package for phylogenetic placement and subsequent visualization. The algorithm can place twenty thousand short reads on a reference tree of one thousand taxa per hour per processor, has essentially linear time and memory complexity in the number of reference taxa, and is easy…
Taxonomy
Topics: Genomics and Phylogenetic Studies · Evolution and Paleontology Studies · Genetic diversity and population structure
