Annealed variational mixtures for disease subtyping and biomarker discovery
Emma Prevot, Rory Toogood, Filippo Pagani, Paul D. W. Kirk

TL;DR
This paper introduces an efficient annealed variational Bayes algorithm for high-dimensional mixture models, enabling disease subtyping and biomarker discovery in omics data, with superior performance demonstrated on biomedical datasets.
Contribution
The paper presents a novel scalable variational Bayes method with variable selection for high-dimensional clustering, improving disease subtyping and biomarker identification.
Findings
Outperforms the current state of the art on a range of simulated and real biomedical examples
Demonstrated on cancer subtyping and biomarker discovery
Ships with an open source Python implementation, VBVarSel
Abstract
Cluster analyses of high-dimensional data are often hampered by the presence of large numbers of variables that do not provide relevant information, as well as the perennial issue of choosing an appropriate number of clusters. These challenges are frequently encountered when analysing 'omics datasets, such as in molecular precision medicine, where a key goal is to identify disease subtypes and the biomarkers that define them. Here we introduce an annealed variational Bayes algorithm for fitting high-dimensional mixture models while performing variable selection. Our algorithm is scalable and computationally efficient, and we provide an open source Python implementation, VBVarSel. In a range of simulated and real biomedical examples, we show that VBVarSel outperforms the current state of the art, and demonstrate its use for cancer subtyping and biomarker discovery.
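To give a rough sense of what "annealing" means when fitting a mixture model, the sketch below tempers the cluster assignments of a simple Gaussian mixture. It is not the VBVarSel algorithm or its API: it assumes spherical unit-variance components, uses plain EM-style updates rather than variational Bayes, and performs no variable selection. The function names, the temperature schedule, and the choice to divide the log joint by the temperature (one common form of deterministic annealing) are all illustrative assumptions, since this summary does not give the paper's exact update rules.

```python
import numpy as np


def annealed_responsibilities(log_joint, temperature):
    """Tempered soft assignments: row-wise softmax of log_joint / temperature.

    Hypothetical helper. Temperatures above 1 flatten the assignments,
    which makes early iterations less sensitive to initialisation.
    """
    scaled = log_joint / temperature
    scaled -= scaled.max(axis=1, keepdims=True)  # stabilise the exponentials
    resp = np.exp(scaled)
    return resp / resp.sum(axis=1, keepdims=True)


def fit_annealed_mixture(X, n_clusters, temperatures=(4.0, 2.0, 1.0),
                         n_iter=25, seed=0):
    """Fit a spherical, unit-variance Gaussian mixture with a cooling schedule.

    Assumed simplification for illustration only; the schedule ends at
    temperature 1, where the updates reduce to ordinary EM.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Initialise the means at distinct, randomly chosen data points.
    means = X[rng.choice(n, size=n_clusters, replace=False)].copy()
    weights = np.full(n_clusters, 1.0 / n_clusters)

    for temperature in temperatures:
        for _ in range(n_iter):
            # log pi_k + log N(x_n | mu_k, I), up to a constant shared by all k
            sq_dist = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
            log_joint = np.log(weights) - 0.5 * sq_dist
            resp = annealed_responsibilities(log_joint, temperature)

            # Update means and mixing weights from the soft assignments.
            nk = resp.sum(axis=0) + 1e-12  # guard against empty clusters
            means = (resp.T @ X) / nk[:, None]
            weights = nk / n

    return resp, means, weights
```

Calling `fit_annealed_mixture(X, n_clusters)` on an `(n_samples, n_features)` array returns the final soft assignments, the cluster means, and the mixing weights. Hard cluster labels can be read off with `resp.argmax(axis=1)`.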
Taxonomy
Topics: Gene expression and cancer classification · AI in cancer detection
