Sparse tree-based clustering of microbiome data to characterize microbiome heterogeneity in pancreatic cancer
Yushu Shi, Liangliang Zhang, Kim-Anh Do, Robert Jenq, Christine Peterson

TL;DR
This paper introduces a Bayesian clustering method that combines feature selection with tree-structure information to identify microbiome-based subgroups of pancreatic cancer patients, extending existing model-based approaches such as the Dirichlet multinomial mixture model.
Contribution
The novel unsupervised clustering approach incorporates feature selection, learns the number of clusters, and uses tree structure information, advancing microbiome data analysis.
Findings
Effective on simulated microbiome data
Identifies meaningful patient subgroups
Outperforms existing clustering methods
Abstract
There is a keen interest in characterizing variation in the microbiome across cancer patients, given increasing evidence of its important role in determining treatment outcomes. Here our goal is to discover subgroups of patients with similar microbiome profiles. We propose a novel unsupervised clustering approach in the Bayesian framework that innovates over existing model-based clustering approaches, such as the Dirichlet multinomial mixture model, in three key respects: we incorporate feature selection, learn the appropriate number of clusters from the data, and integrate information on the tree structure relating the observed features. We compare the performance of our proposed method to existing methods on simulated data designed to mimic real microbiome data. We then illustrate results obtained for our motivating data set, a clinical study aimed at characterizing the tumor…
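For orientation, the following is a minimal sketch of the baseline named in the abstract, the Dirichlet multinomial mixture model, and not the authors' proposed method. It assumes a fixed number of clusters with known mixture weights and Dirichlet parameters, and it includes neither feature selection nor tree structure. The function names `dirichlet_multinomial_logpmf` and `cluster_responsibilities` are hypothetical and chosen only for illustration.

```python
from math import exp, lgamma, log


def dirichlet_multinomial_logpmf(counts, alpha):
    """Log-probability of a taxon count vector under a Dirichlet-multinomial.

    counts: non-negative integer counts, one per taxon.
    alpha:  positive Dirichlet parameters, one per taxon.
    """
    if len(counts) != len(alpha):
        raise ValueError("counts and alpha must have the same length")
    n = sum(counts)
    a0 = sum(alpha)
    # Multinomial coefficient: n! / prod(x_j!)
    log_coef = lgamma(n + 1) - sum(lgamma(x + 1) for x in counts)
    # Ratio of gamma functions left after integrating out the proportions
    log_ratio = lgamma(a0) - lgamma(n + a0)
    log_terms = sum(lgamma(x + a) - lgamma(a) for x, a in zip(counts, alpha))
    return log_coef + log_ratio + log_terms


def cluster_responsibilities(counts, weights, alphas):
    """Posterior cluster probabilities for one sample in a fixed-K mixture.

    weights: mixture weights, one per cluster (assumed known here).
    alphas:  one Dirichlet parameter vector per cluster (assumed known here).
    """
    log_post = [
        log(w) + dirichlet_multinomial_logpmf(counts, a)
        for w, a in zip(weights, alphas)
    ]
    # Subtract the maximum before exponentiating for numerical stability
    m = max(log_post)
    unnorm = [exp(lp - m) for lp in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Toy example with three taxa and two clusters (made-up parameters)
sample = [90, 5, 5]
probs = cluster_responsibilities(
    sample,
    weights=[0.5, 0.5],
    alphas=[[9.0, 0.5, 0.5], [0.5, 0.5, 9.0]],
)
```

In this toy setup the sample is dominated by the first taxon, so the first cluster receives the larger posterior probability. The approach described in the abstract differs from this baseline in the three respects it lists: it selects features, learns the number of clusters from the data, and uses the tree relating the features.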
Taxonomy
Topics: Bayesian Methods and Mixture Models · Machine Learning in Healthcare · Colorectal Cancer Screening and Detection
