A phylogenetic scan test on Dirichlet-tree multinomial model for microbiome data
Yunfan Tang, Li Ma, Dan L. Nicolae

TL;DR
This paper introduces PhyloScan, a phylogenetic scan test for detecting differences in microbiome compositions across groups using a Dirichlet-tree multinomial model, improving detection power by leveraging phylogenetic structure.
Contribution
The paper develops a novel phylogenetic scan test (PhyloScan) that models microbiome data with a Dirichlet-tree multinomial approach and incorporates a scan statistic for enhanced detection of group differences.
Findings
PhyloScan outperforms existing methods in power on simulated data.
Application to the American Gut dataset identified diet-associated taxa.
The method effectively captures cluster-wise distributional differences along phylogenetic trees.
Abstract
In this paper we introduce the phylogenetic scan test (PhyloScan) for investigating cross-group differences in microbiome compositions using the Dirichlet-tree multinomial (DTM) model. DTM models the microbiome data through a cascade of independent local Dirichlet-multinomial (DM) models on the internal nodes of the phylogenetic tree. Each local DM captures the count distribution of a certain number of operational taxonomic units (OTUs) at a given resolution. Since distributional differences tend to occur in clusters along evolutionary lineages, we design a scan statistic over the phylogenetic tree that allows nodes to borrow signal strength from their parents and children. We also derive a formula to bound the tail probability of the scan statistic, and verify its accuracy through simulations. The PhyloScan procedure is applied to the American Gut dataset to identify taxa associated with diet habits. Empirical…
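The cascade of local DMs described in the abstract can be illustrated with a minimal sampling sketch. It is not the authors' code: the four-leaf tree, the node names, and the Dirichlet parameters in `TREE` and `ALPHA` are hypothetical values chosen only for illustration, and `sample_dtm` is an assumed helper name. At each internal node, the reads arriving at that node are split among its children by an independent Dirichlet-multinomial draw, so the leaf counts always add up to the sample's total read count.

```python
import random
from collections import Counter

# Hypothetical toy phylogeny: internal node -> ordered list of children.
# Names that never appear as keys are leaves (OTUs).
TREE = {
    "root": ["A", "B"],
    "A": ["otu1", "otu2"],
    "B": ["otu3", "otu4"],
}

# Assumed Dirichlet parameters for the local DM at each internal node.
ALPHA = {
    "root": [2.0, 2.0],
    "A": [1.0, 3.0],
    "B": [5.0, 5.0],
}


def sample_dtm(total, rng, tree=TREE, alpha=ALPHA, root="root"):
    """Draw OTU counts for one sample by splitting reads down the tree."""
    leaf_counts = {}
    stack = [(root, total)]
    while stack:
        node, n = stack.pop()
        if node not in tree:
            # Leaf: record the reads that reached this OTU.
            leaf_counts[node] = n
            continue
        children = tree[node]
        # Dirichlet draw via normalized gamma variates.
        gammas = [rng.gammavariate(a, 1.0) for a in alpha[node]]
        norm = sum(gammas)
        probs = [g / norm for g in gammas]
        # Multinomial split of the n reads among the children.
        split = Counter(rng.choices(children, weights=probs, k=n))
        for child in children:
            stack.append((child, split[child]))
    return leaf_counts


rng = random.Random(0)
sample = sample_dtm(1000, rng)
print(sample)
```

Because each internal node has its own parameters, a group difference can be confined to a single split (for example, only node `A`), which is the kind of localized signal a scan over the tree is designed to pick up.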
Taxonomy
Topics: Bayesian Methods and Mixture Models · Gut microbiota and health · Gene expression and cancer classification
