Sparse Bayesian Unsupervised Learning
Stephane Gaiffas, Bertrand Michel

TL;DR
This paper introduces a Bayesian method for simultaneous variable selection and clustering in high-dimensional unsupervised learning. It fits constrained Gaussian mixture models under a sparsity-inducing prior, proves a sparsity oracle inequality for the procedure, and implements it with a fast Metropolis-Hastings algorithm.
Contribution
It develops a Bayesian framework for simultaneous variable selection and clustering, supported by a sparsity oracle inequality and by a Metropolis-Hastings algorithm whose clustering-oriented greedy proposal speeds up convergence to the posterior.
Findings
Proves a sparsity oracle inequality for the procedure.
Reports very fast convergence of the Metropolis-Hastings algorithm to the posterior, attributed to its clustering-oriented greedy proposal.
Learns both the number of clusters and the set of relevant variables.
Abstract
This paper addresses variable selection, clustering and estimation in an unsupervised high-dimensional setting. Our approach is based on fitting constrained Gaussian mixture models, where we learn the number of clusters and the set of relevant variables using a generalized Bayesian posterior with a sparsity-inducing prior. We prove a sparsity oracle inequality which shows that this procedure selects the optimal values of these two parameters, namely the number of clusters and the set of relevant variables. This procedure is implemented using a Metropolis-Hastings algorithm, based on a clustering-oriented greedy proposal, which makes the convergence to the posterior very fast.
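To make the idea of sampling over variable subsets under a sparsity prior concrete, here is a minimal Python sketch. It is not the authors' algorithm: it uses a plain symmetric "flip one variable" proposal rather than the clustering-oriented greedy proposal, fixes the number of clusters at two, and replaces the constrained Gaussian mixture fit with a toy per-variable score. The names `two_means_gain`, `make_toy_data`, `mh_variable_selection` and the penalty `lam` are hypothetical and chosen only for illustration.

```python
import math
import random


def two_means_gain(values):
    """Toy fit score for one variable (an assumption, not the paper's criterion):
    (n/2) * log(total SSE / best within-cluster SSE over all 2-way splits)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    total_sq = sum(x * x for x in xs)
    sse_all = total_sq - total * total / n
    if sse_all <= 0.0:
        return 0.0  # constant variable: splitting cannot help
    best = sse_all
    left = left_sq = 0.0
    for i in range(1, n):  # split sorted values into xs[:i] and xs[i:]
        left += xs[i - 1]
        left_sq += xs[i - 1] ** 2
        right, right_sq = total - left, total_sq - left_sq
        sse = (left_sq - left * left / i) + (right_sq - right * right / (n - i))
        best = min(best, sse)
    return 0.5 * n * math.log(sse_all / max(best, 1e-12))


def make_toy_data(n=60, n_signal=2, n_noise=4, seed=1):
    """Two well-separated clusters on the first n_signal variables, pure noise elsewhere."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        center = 5.0 if i % 2 == 0 else -5.0
        row = [rng.gauss(center, 1.0) for _ in range(n_signal)]
        row += [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        rows.append(row)
    return rows


def mh_variable_selection(data, lam, n_steps, seed=0):
    """Metropolis-Hastings over subsets S with log-target sum(gains in S) - lam * |S|."""
    rng = random.Random(seed)
    d = len(data[0])
    gains = [two_means_gain([row[j] for row in data]) for j in range(d)]

    def log_target(subset):
        # Fit term plus a sparsity prior that penalises each selected variable by lam.
        return sum(gains[j] for j in subset) - lam * len(subset)

    current = frozenset()
    visits = {}
    for _ in range(n_steps):
        j = rng.randrange(d)
        proposal = current ^ {j}  # symmetric proposal: flip inclusion of variable j
        log_ratio = log_target(proposal) - log_target(current)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current = proposal
        visits[current] = visits.get(current, 0) + 1
    most_visited = max(visits, key=visits.get)
    return most_visited, visits


if __name__ == "__main__":
    selected, _ = mh_variable_selection(make_toy_data(), lam=65.0, n_steps=2000)
    print(sorted(selected))
```

Because the flip proposal is symmetric, the acceptance ratio reduces to the ratio of target values. The penalty `lam=65.0` is hand-picked for this toy data so that the informative variables clear it and the noise variables do not. A greedy, clustering-aware proposal such as the one the abstract describes would additionally need its proposal probabilities in the acceptance ratio.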
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Face and Expression Recognition · Speech Recognition and Synthesis
