Outcome-guided Bayesian Clustering for Disease Subtype Discovery Using High-dimensional Transcriptomic Data
Lingsong Meng, Zhiguang Huo

TL;DR
This paper introduces a Bayesian clustering method that integrates clinical and high-dimensional transcriptomic data to discover disease subtypes that are clinically meaningful, with improved accuracy demonstrated on breast cancer data.
Contribution
The novel GuidedBayesianClustering method fully combines clinical outcomes and omics data for disease subtype discovery, feature selection, and outcome-guided clustering within a Bayesian framework.
Findings
Outperforms existing methods in simulations and breast cancer data analysis.
Effectively identifies clinically relevant disease subtypes.
Selects disease-related genes with controlled false discovery rate.
Abstract
The discovery of disease subtypes is an essential step for developing precision medicine, and disease subtyping via omics data has become a popular approach. While promising, subtypes obtained from conventional approaches may not necessarily be associated with clinical outcomes. The collection of rich clinical data along with omics data has provided an unprecedented opportunity to facilitate the disease subtyping process and to discover clinically meaningful disease subtypes. Thus, we developed an outcome-guided Bayesian clustering (GuidedBayesianClustering) method to fully integrate the clinical data and the high-dimensional omics data. A Gaussian mixed model framework was applied to perform sample clustering; a spike-and-slab prior was utilized to perform gene selection; a mixture model prior was employed to incorporate the guidance from a clinical outcome variable; and a decision…
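The sample-clustering idea mentioned in the abstract can be illustrated with a plain Gaussian mixture fitted by EM. This is only a minimal sketch under stated assumptions, not the GuidedBayesianClustering method: it uses maximum-likelihood EM rather than Bayesian inference, and it leaves out the spike-and-slab gene selection and the clinical-outcome guidance. The function name `gmm_em`, the diagonal-covariance choice, the farthest-point initialisation, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np

def gmm_em(X, K, n_iter=50):
    """Fit a K-component diagonal-covariance Gaussian mixture by EM.

    X has shape (n_samples, n_genes). Returns hard cluster labels and
    the (n_samples, K) matrix of posterior membership probabilities.
    """
    n, p = X.shape
    # Farthest-point initialisation (illustrative choice): start at sample 0,
    # then repeatedly add the sample farthest from the means chosen so far.
    mu = [X[0]]
    for _ in range(1, K):
        d = np.min([np.sum((X - m) ** 2, axis=1) for m in mu], axis=0)
        mu.append(X[np.argmax(d)])
    mu = np.array(mu, dtype=float)
    var = np.tile(X.var(axis=0) + 1e-6, (K, 1))  # per-gene variances
    pi = np.full(K, 1.0 / K)                     # mixing proportions

    for _ in range(n_iter):
        # E-step: log of (mixing weight * Gaussian density) per component.
        log_r = np.empty((n, K))
        for k in range(K):
            log_r[:, k] = (np.log(pi[k])
                           - 0.5 * np.sum(np.log(2 * np.pi * var[k]))
                           - 0.5 * np.sum((X - mu[k]) ** 2 / var[k], axis=1))
        log_r -= log_r.max(axis=1, keepdims=True)  # numerical stability
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)

        # M-step: update proportions, means, and variances.
        nk = r.sum(axis=0) + 1e-12
        pi = nk / n
        mu = (r.T @ X) / nk[:, None]
        for k in range(K):
            var[k] = (r[:, k] @ (X - mu[k]) ** 2) / nk[k] + 1e-6
    return r.argmax(axis=1), r

# Toy data (synthetic, not from the paper): 20 samples, 5 genes,
# where only the first 2 genes separate the two groups.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 0.5, size=(20, 5))
X[10:, :2] += 5.0
labels, resp = gmm_em(X, K=2)
print(labels)
```

In this toy setting only two of the five genes carry cluster signal. That is the situation a gene-selection prior is meant to handle when the number of uninformative genes is large.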
Taxonomy
Topics: Gene expression and cancer classification · Bioinformatics and Genomic Networks · Bayesian Methods and Mixture Models
