A format for phylogenetic placements
Frederick A. Matsen, Noah G. Hoffman, Aaron Gallagher, Alexandros Stamatakis

TL;DR
This paper introduces a standardized, lightweight JSON-based format for storing and sharing phylogenetic placement data, facilitating tool interoperability and advancing research in environmental sequencing analysis.
Contribution
The paper presents a unified, extensible format for phylogenetic placements, addressing the lack of standardization and improving tool development and data sharing.
Findings
Format is implemented in multiple tools
Works well in practice for parsimony and likelihood placements
Facilitates development of portable post-analysis tools
Abstract
We have developed a unified format for phylogenetic placements, that is, mappings of environmental sequence data (e.g. short reads) into a phylogenetic tree. We are motivated to do so by the growing number of tools for computing and post-processing phylogenetic placements, and the lack of an established standard for storing them. The format is lightweight, versatile, extensible, and is based on the JSON format which can be parsed by most modern programming languages. Our format is already implemented in several tools for computing and post-processing parsimony- and likelihood-based phylogenetic placements, and has worked well in practice. We believe that establishing a standard format for analyzing read placements at this early stage will lead to a more efficient development of powerful and portable post-analysis tools for the growing applications of phylogenetic placement.
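To make the idea of a JSON-based placement file concrete, here is a minimal sketch in Python of reading such a document and picking the best-supported edge for each read. The abstract does not spell out the schema, so the key names used here (`tree`, `fields`, `placements`, `p`, `n`), the edge-numbered Newick string, and the helper `best_placements` are all assumptions made for illustration, not the authors' definitive specification.

```python
import json

# Hypothetical placement document. All key names and values below are
# assumptions for illustration; they are not taken from the abstract.
EXAMPLE = """
{
  "version": 3,
  "tree": "((A:0.2{0},B:0.3{1}):0.1{2},C:0.4{3}){4};",
  "fields": ["edge_num", "likelihood", "like_weight_ratio"],
  "placements": [
    {"n": ["read_1"], "p": [[0, -1234.5, 0.8], [2, -1236.0, 0.2]]},
    {"n": ["read_2"], "p": [[3, -987.1, 1.0]]}
  ],
  "metadata": {"invocation": "hypothetical example"}
}
"""


def best_placements(doc):
    """Map each read name to the edge with the highest like_weight_ratio."""
    fields = doc["fields"]
    # Column positions are looked up by name, so the order of "fields"
    # in the file does not matter.
    edge_i = fields.index("edge_num")
    weight_i = fields.index("like_weight_ratio")
    best = {}
    for placement in doc["placements"]:
        # Each row in "p" is one candidate location for the read(s).
        top = max(placement["p"], key=lambda row: row[weight_i])
        for name in placement["n"]:
            best[name] = top[edge_i]
    return best


doc = json.loads(EXAMPLE)
result = best_placements(doc)
print(result)
```

Because the column layout is declared in the document itself, a post-analysis tool written this way needs only a standard JSON parser and no knowledge of which program produced the placements, which is the kind of portability the abstract argues for.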
