From Unsupervised to Guided Clustering: A Variational Implementation
Violaine Courrier (DATAVERS), Christophe Biernacki (DATAVERS)

TL;DR
This paper introduces GCVAE, a deep generative model that uses a guiding variable to steer clustering toward meaningful, context-specific clusters, demonstrated on image data (MNIST-SVHN) and proprietary connected health device data.
Contribution
It formalizes guided clustering with a variational autoencoder that incorporates a guiding variable to steer the clustering process.
Findings
GCVAE learns a latent space structured as a Gaussian Mixture Model.
It produces coherent, task-relevant clusters in complex datasets.
Changing the guiding variable reorients the clustering results.
Abstract
Clustering is viewed as an unsupervised technique, but in practice it requires guidance to uncover meaningful structures. We formalize this with guided clustering, a paradigm that uses a guiding variable to steer the discovery process, and introduce the Guided Clustering Variational Autoencoder (GCVAE) as its deep generative realization. GCVAE learns a latent space structured as a Gaussian Mixture Model by optimizing a variational objective that forces the representation to be maximally informative about the guiding variable. This framework allows the resulting clustering to be reoriented by changing the guiding variable, yielding clusters that are meaningful for the specified context. Experiments on public (MNIST-SVHN) and proprietary connected health devices data demonstrate GCVAE's ability to discover coherent and task-relevant clusters in complex settings.
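To make the idea of a latent space structured as a Gaussian Mixture Model more concrete, the sketch below shows how cluster assignments could be read off latent codes once a mixture has been fitted. It is an illustration only, not the authors' implementation: the diagonal covariance, the function name `gmm_responsibilities`, and the toy parameters are all assumptions, and the guiding-variable term of the GCVAE objective is not modeled here.

```python
import numpy as np

def gmm_responsibilities(z, weights, means, variances):
    """Posterior cluster probabilities p(c | z) under a diagonal-covariance GMM.

    z: (n, d) latent codes; weights: (k,); means, variances: (k, d).
    All names and shapes are illustrative assumptions.
    """
    diff = z[:, None, :] - means[None, :, :]              # (n, k, d)
    # Log-density of each code under each diagonal Gaussian component.
    log_pdf = -0.5 * np.sum(
        diff ** 2 / variances[None] + np.log(2.0 * np.pi * variances[None]),
        axis=2,
    )                                                      # (n, k)
    log_joint = np.log(weights)[None, :] + log_pdf
    # Subtract the row-wise max before exponentiating for numerical stability.
    log_joint -= log_joint.max(axis=1, keepdims=True)
    resp = np.exp(log_joint)
    return resp / resp.sum(axis=1, keepdims=True)

# Toy mixture with two well-separated components in a 2-D latent space.
weights = np.array([0.5, 0.5])
means = np.array([[-2.0, 0.0], [2.0, 0.0]])
variances = np.ones((2, 2))

# Hypothetical latent codes, standing in for encoder outputs.
z = np.array([[-2.1, 0.1], [1.9, -0.2], [0.0, 0.0]])
resp = gmm_responsibilities(z, weights, means, variances)
labels = resp.argmax(axis=1)  # hard cluster assignment per code
```

In a model of this kind, the mixture parameters would be learned jointly with the encoder and decoder. Under the paper's framing, changing the guiding variable would change which latent structure is learned, and therefore which clusters these responsibilities describe.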
