How to Achieve the Intended Aim of Deep Clustering Now, without Deep Learning
Kai Ming Ting, Wei-Jie Xu, Hang Zhang

TL;DR
This paper questions whether deep clustering methods like DEC truly overcome $k$-means limitations and finds that a non-deep learning approach using distributional data can achieve similar or better results.
Contribution
The paper demonstrates that leveraging data distribution explicitly can match or surpass deep clustering methods without using deep learning.
Findings
Non-deep methods can effectively address $k$-means limitations.
Distributional information is key to clustering performance.
Deep clustering does not inherently overcome fundamental limitations.
Abstract
Deep clustering (DC) is often quoted to have a key advantage over $k$-means clustering. Yet, this advantage is often demonstrated using image datasets only, and it is unclear whether it addresses the fundamental limitations of $k$-means clustering. Deep Embedded Clustering (DEC) learns a latent representation via an autoencoder and performs clustering based on a $k$-means-like procedure, while the optimization is conducted in an end-to-end manner. This paper investigates whether the deep-learned representation has enabled DEC to overcome the known fundamental limitations of $k$-means clustering, i.e., its inability to discover clusters of arbitrary shapes, varied sizes and densities. Our investigations on DEC have a wider implication on deep clustering methods in general. Notably, none of these methods exploit the underlying data distribution. We uncover that a non-deep learning…
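The limitation named in the abstract, that $k$-means cannot discover clusters of arbitrary shape, can be illustrated with a small sketch. This is not the paper's method or experimental setup: the data are a synthetic pair of concentric rings, and the helper names `make_rings`, `kmeans`, and `two_cluster_accuracy` are hypothetical, introduced only for this illustration.

```python
import math
import random


def make_rings(n_per_ring=200, radii=(1.0, 4.0), noise=0.05, seed=0):
    """Sample two noisy concentric rings (synthetic data, assumed for illustration)."""
    rng = random.Random(seed)
    points, labels = [], []
    for label, r in enumerate(radii):
        for _ in range(n_per_ring):
            theta = rng.uniform(0.0, 2.0 * math.pi)
            x = r * math.cos(theta) + rng.gauss(0.0, noise)
            y = r * math.sin(theta) + rng.gauss(0.0, noise)
            points.append((x, y))
            labels.append(label)
    return points, labels


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with randomly chosen initial centres."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centre.
        new_assign = [
            min(
                range(k),
                key=lambda j: (p[0] - centers[j][0]) ** 2 + (p[1] - centers[j][1]) ** 2,
            )
            for p in points
        ]
        # Update step: each centre moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, a in zip(points, new_assign) if a == j]
            if members:
                new_centers.append(
                    (
                        sum(m[0] for m in members) / len(members),
                        sum(m[1] for m in members) / len(members),
                    )
                )
            else:
                # Keep the old centre if a cluster ends up empty.
                new_centers.append(centers[j])
        if new_centers == centers:
            assign = new_assign
            break
        assign, centers = new_assign, new_centers
    return assign, centers


def two_cluster_accuracy(y_true, y_pred):
    """Agreement with the true ring labels, up to swapping the two cluster ids."""
    agree = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return max(agree, 1.0 - agree)


points, labels = make_rings()
assign, centers = kmeans(points, k=2)
print("agreement with true rings:", two_cluster_accuracy(labels, assign))
```

With two centres, $k$-means partitions the plane by a straight line (the perpendicular bisector of the centres), so it cannot place the inner ring in one cluster and the surrounding outer ring in the other; the agreement with the true rings therefore stays well below 1.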
Taxonomy
Topics: Face recognition and analysis · Stochastic Gradient Optimization Techniques · Domain Adaptation and Few-Shot Learning
