Deep clustering with concrete k-means
Boyan Gao, Yongxin Yang, Henry Gouk, Timothy M. Hospedales

TL;DR
This paper introduces a novel deep clustering method called concrete k-means that jointly learns feature representations and clustering assignments in an end-to-end manner using a differentiable approximation of k-means.
Contribution
It develops a gradient estimator for the non-differentiable k-means objective using the Gumbel-Softmax reparameterisation trick, so that the feature extractor and the clustering can be trained end-to-end without alternating optimisation.
Findings
Demonstrates the method's efficacy on standard clustering benchmarks.
Enables end-to-end training of deep clustering models.
Provides a differentiable k-means objective approximation.
Abstract
We address the problem of simultaneously learning a k-means clustering and deep feature representation from unlabelled data, which is of interest due to the potential of deep k-means to outperform traditional two-step feature extraction and shallow-clustering strategies. We achieve this by developing a gradient-estimator for the non-differentiable k-means objective via the Gumbel-Softmax reparameterisation trick. In contrast to previous attempts at deep clustering, our concrete k-means model can be optimised with respect to the canonical k-means objective and is easily trained end-to-end without resorting to alternating optimisation. We demonstrate the efficacy of our method on standard clustering benchmarks.
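The idea in the abstract can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation and makes several assumptions: the assignment logits are taken to be negative squared distances to the centroids, the temperature is fixed, and the names `gumbel_softmax`, `relaxed_kmeans_loss`, `features`, and `centroids` are hypothetical. The hard, non-differentiable nearest-centroid assignment is replaced by a relaxed one-hot sample, so the resulting loss is a smooth function of both the features and the centroids.

```python
import math
import random


def gumbel_softmax(logits, temperature, rng):
    """Draw a relaxed one-hot sample from the categorical defined by `logits`."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise via inverse transform: g = -log(-log(u)).
        # u is clamped away from 0 and 1 to keep both logs finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    top = max(noisy)
    exps = [math.exp(v - top) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def squared_distance(x, c):
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def relaxed_kmeans_loss(features, centroids, temperature, rng):
    """Mean k-means cost with soft assignments in place of hard ones."""
    total = 0.0
    assignments = []
    for z in features:
        dists = [squared_distance(z, c) for c in centroids]
        # Assumption: closer centroids get larger logits.
        soft = gumbel_softmax([-d for d in dists], temperature, rng)
        assignments.append(soft)
        # Assignment-weighted distance; equals the usual k-means cost
        # when `soft` is exactly one-hot.
        total += sum(a * d for a, d in zip(soft, dists))
    return total / len(features), assignments


rng = random.Random(0)
# Stand-ins for encoder outputs; in a deep model these would come from a network.
features = [[0.0, 0.1], [0.2, -0.1], [5.0, 5.1], [4.9, 5.2]]
centroids = [[0.0, 0.0], [5.0, 5.0]]
loss, assignments = relaxed_kmeans_loss(features, centroids, 0.5, rng)
```

This sketch only evaluates the loss. In practice the same computation would be written in an automatic-differentiation framework so that gradients reach the encoder parameters and the centroids in a single backward pass, which is what makes alternating optimisation unnecessary.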
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Anomaly Detection Techniques and Applications
Methods: k-Means Clustering
