A Bayesian approach to inferring the phylogenetic structure of communities from metagenomic data
John O'Brien, Xavier Didelot, Zamin Iqbal, Lucas Amenga-Etego, Bartu Ahiska, and Daniel Falush

TL;DR
This paper introduces a Bayesian statistical method for reconstructing the phylogenetic relationships of organisms within metagenomic samples, enabling better understanding of microbial evolution from complex environmental data.
Contribution
It presents a novel Bayesian approach that infers phylogenetic structure, haplotypes, and sample frequencies simultaneously from metagenomic data, addressing a gap in existing methods.
Findings
Successfully recovers phylogenetic structure from simulated data.
Accurately infers haplotypes and relationships in mixed samples.
Demonstrates applicability to real-world microbial datasets.
Abstract
Metagenomics provides a powerful new tool set for investigating evolutionary interactions with the environment. However, an absence of model-based statistical methods means that researchers are often not able to make full use of this complex information. We present a Bayesian method for inferring the phylogenetic relationship among related organisms found within metagenomic samples. Our approach exploits variation in the frequency of taxa among samples to simultaneously infer each lineage haplotype, the phylogenetic tree connecting them, and their frequency within each sample. Applications of the algorithm to simulated data show that our method can recover a substantial fraction of the phylogenetic structure even in the presence of strong mixing among samples. We provide examples of the method applied to data from green sulfur bacteria recovered from an Antarctic lake, plastids from…
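To make the idea of exploiting frequency variation among samples concrete, here is a minimal sketch of a mixture likelihood. It is an illustration under stated assumptions, not the authors' model: it assumes biallelic sites, binary haplotypes, a binomial read-count likelihood, and a symmetric sequencing error rate. The function names (`expected_alt_freq`, `log_likelihood`) and the `err` parameter are hypothetical. The sketch omits the phylogenetic tree prior and any sampling over haplotypes or frequencies.

```python
import numpy as np


def expected_alt_freq(haplotypes, freqs, err=0.01):
    """Expected alternate-allele frequency per sample and site.

    haplotypes: (K, S) array of 0/1 alleles for K lineages at S sites.
    freqs:      (N, K) array; row n holds lineage frequencies in sample n
                (each row is assumed to sum to 1).
    err:        assumed symmetric per-read error rate (hypothetical parameter).
    """
    haplotypes = np.asarray(haplotypes, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    # Mixture: each sample's allele frequency is a frequency-weighted
    # average of the lineage haplotypes.
    p = freqs @ haplotypes  # shape (N, S)
    # Fold in read error so probabilities stay strictly inside (0, 1).
    return p * (1.0 - err) + (1.0 - p) * err


def log_likelihood(alt_counts, depth, haplotypes, freqs, err=0.01):
    """Binomial log-likelihood of read counts, up to an additive constant."""
    alt_counts = np.asarray(alt_counts, dtype=float)
    depth = np.asarray(depth, dtype=float)
    p = expected_alt_freq(haplotypes, freqs, err)
    return float(np.sum(alt_counts * np.log(p)
                        + (depth - alt_counts) * np.log1p(-p)))
```

In a Bayesian treatment, a likelihood of this kind would be combined with priors on the haplotypes, the tree, and the frequencies, and explored by a sampler. Because the same haplotypes appear at different frequencies in different samples, the read counts constrain both the haplotypes and the frequencies jointly.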
Taxonomy
Topics: Genomics and Phylogenetic Studies · Microbial Community Ecology and Physiology · Genetic diversity and population structure
