VICatMix: variational Bayesian clustering and variable selection for discrete biomedical data
Jackie Rao, Paul D. W. Kirk

TL;DR
VICatMix is a variational Bayesian clustering method for high-dimensional categorical biomedical data that improves efficiency, accuracy, and feature selection, aiding in cancer subtyping and gene discovery.
Contribution
It introduces a novel variational Bayesian finite mixture model with variable selection for categorical data, enhancing clustering performance and interpretability in biomedical applications.
Findings
Outperforms competitors in efficiency and accuracy
Effectively identifies relevant features in noisy data
Successfully applied to cancer datasets for subtyping and gene discovery
Abstract
Effective clustering of biomedical data is crucial in precision medicine, enabling accurate stratification of patients or samples. However, the growth in availability of high-dimensional categorical data, including 'omics data, necessitates computationally efficient clustering algorithms. We present VICatMix, a variational Bayesian finite mixture model designed for the clustering of categorical data. The use of variational inference (VI) in its training allows the model to outperform competitors in terms of efficiency, while maintaining high accuracy. VICatMix furthermore performs variable selection, enhancing its performance on high-dimensional, noisy data. The proposed model incorporates summarisation and model averaging to mitigate poor local optima in VI, allowing for improved estimation of the true number of clusters simultaneously with feature saliency. We demonstrate the…
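To make the idea of a variational Bayesian finite mixture model for categorical data concrete, here is a minimal sketch of generic mean-field coordinate-ascent updates for a mixture of categorical distributions with Dirichlet priors. It is not the authors' implementation: it omits the variable selection, summarisation, and model averaging described in the abstract, and the function name `fit_categorical_mixture_vi`, the hyperparameters `alpha0` and `eps0`, the choice of `K`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def digamma(x):
    """Digamma for x > 0 via the recurrence psi(x) = psi(x+1) - 1/x and an asymptotic series."""
    x = np.array(x, dtype=float)
    result = np.zeros_like(x)
    for _ in range(6):  # shift the argument up by 6 so the series is accurate
        result = result - 1.0 / x
        x = x + 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    series = np.log(x) - 0.5 * inv - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + series

def fit_categorical_mixture_vi(X, n_categories, K=10, alpha0=0.01, eps0=1.0,
                               n_iter=100, seed=0):
    """Mean-field VI for a finite mixture of categoricals (illustrative sketch only).

    X: integer array of shape (N, P) with entries in {0, ..., n_categories - 1}.
    K: number of components; deliberately larger than the expected cluster count.
    alpha0, eps0: assumed symmetric Dirichlet prior parameters for the mixture
    weights and the per-feature category probabilities, respectively.
    """
    rng = np.random.default_rng(seed)
    N, P = X.shape
    onehot = np.eye(n_categories)[X]               # (N, P, L) indicator array
    r = np.eye(K)[rng.integers(K, size=N)]         # hard random initial responsibilities

    for _ in range(n_iter):
        # Update Dirichlet posteriors given the current responsibilities.
        alpha = alpha0 + r.sum(axis=0)                      # (K,)
        eps = eps0 + np.einsum('nk,npl->kpl', r, onehot)    # (K, P, L)

        # Expected log parameters under the Dirichlet posteriors.
        e_log_pi = digamma(alpha) - digamma(alpha.sum())
        e_log_phi = digamma(eps) - digamma(eps.sum(axis=2, keepdims=True))

        # Update responsibilities with a numerically stable softmax.
        log_rho = e_log_pi + np.einsum('npl,kpl->nk', onehot, e_log_phi)
        log_rho -= log_rho.max(axis=1, keepdims=True)
        r = np.exp(log_rho)
        r /= r.sum(axis=1, keepdims=True)

    return r.argmax(axis=1), r, alpha

# Synthetic binary data with two well-separated groups (made up for this example).
rng = np.random.default_rng(1)
truth = np.repeat([0, 1], 100)
probs = np.where(truth[:, None] == 0, 0.9, 0.1)
X = (rng.random((200, 20)) < probs).astype(int)

labels, r, alpha = fit_categorical_mixture_vi(X, n_categories=2, K=5)
print("occupied components:", np.unique(labels))
```

Starting with more components than expected and a small `alpha0` lets surplus components lose their members during fitting, which is one common way an overfitted mixture can suggest the number of clusters. Because the updates only reach a local optimum, results can vary with `seed`, which is the kind of sensitivity the abstract's model averaging is meant to mitigate.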
Taxonomy
Topics: Bayesian Methods and Mixture Models · Gene expression and cancer classification · AI in cancer detection
Methods: Variational Inference
