A Bayesian Semiparametric Mixture Model for Clustering Zero-Inflated Microbiome Data
Suppapat Korsurat, Matthew D. Koslovsky

TL;DR
This paper introduces a Bayesian semiparametric mixture model tailored for zero-inflated microbiome data, enabling automatic determination of the number of clusters and improving clustering accuracy over existing methods.
Contribution
The novel model accommodates zero-inflation and learns the number of clusters simultaneously, addressing limitations of prior clustering approaches in microbiome research.
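To make the two ingredients above concrete, here is a minimal generative sketch. It is not the authors' model. It assumes a Chinese restaurant process as a stand-in for a prior that leaves the number of clusters open, and a multinomial with randomly masked taxa as a stand-in for zero-inflated compositional counts. The function names `sample_crp` and `simulate_zi_counts` and all parameter values are hypothetical.

```python
import numpy as np

def sample_crp(n, alpha, rng):
    """Draw cluster labels for n samples from a Chinese restaurant process."""
    labels = np.zeros(n, dtype=int)
    sizes = [1]  # the first sample opens the first cluster
    for i in range(1, n):
        # Existing clusters are weighted by size, a new cluster by alpha.
        weights = np.array(sizes + [alpha], dtype=float)
        k = rng.choice(len(weights), p=weights / weights.sum())
        if k == len(sizes):
            sizes.append(1)   # open a new cluster
        else:
            sizes[k] += 1     # join an existing cluster
        labels[i] = k
    return labels

def simulate_zi_counts(labels, n_taxa, depth, zero_prob, rng):
    """Simulate zero-inflated compositional counts given cluster labels."""
    n_clusters = labels.max() + 1
    # One Dirichlet-distributed taxon composition per cluster (assumed).
    profiles = rng.dirichlet(np.ones(n_taxa), size=n_clusters)
    counts = np.zeros((len(labels), n_taxa), dtype=int)
    for i, k in enumerate(labels):
        # Each taxon is structurally absent with probability zero_prob.
        present = rng.random(n_taxa) >= zero_prob
        if not present.any():
            present[rng.integers(n_taxa)] = True  # keep at least one taxon
        p = profiles[k] * present
        counts[i] = rng.multinomial(depth, p / p.sum())
    return counts

rng = np.random.default_rng(0)
labels = sample_crp(n=50, alpha=1.0, rng=rng)
counts = simulate_zi_counts(labels, n_taxa=20, depth=1000, zero_prob=0.3, rng=rng)
```

In this sketch the number of clusters is an outcome of the draw rather than an input, and each row of `counts` sums to the assumed sequencing depth while containing more zeros than the multinomial alone would produce.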
Findings
Outperforms existing clustering methods in simulations
Effectively identifies meaningful microbiome-based clusters
Highlights importance of modeling zero-inflation
Abstract
Microbiome research has immense potential for unlocking insights into human health and disease. A common goal in human microbiome research is identifying subgroups of individuals with similar microbial composition that may be linked to specific health states or environmental exposures. However, existing clustering methods are often not equipped to accommodate the complex structure of microbiome data and typically make limiting assumptions regarding the number of clusters in the data, which can bias inference. We propose a novel Bayesian semiparametric mixture modeling framework, designed for zero-inflated multivariate compositional count data collected in microbiome research, that simultaneously learns the number of clusters in the data while performing cluster allocation. In simulation, we demonstrate the clustering performance of our method compared to distance- and model-based…
Taxonomy
Topics: Bayesian Methods and Mixture Models · Gut microbiota and health
