A Comprehensive Survey on Deep Clustering: Taxonomy, Challenges, and Future Directions
Sheng Zhou, Hongjia Xu, Zhuonan Zheng, Jiawei Chen, Zhao Li, Jiajun Bu, Jia Wu, Xin Wang, Wenwu Zhu, Martin Ester

TL;DR
This survey comprehensively reviews deep clustering methods, categorizing approaches, discussing datasets and metrics, and highlighting future research challenges in the context of deep learning's impact on clustering tasks.
Contribution
It introduces a new taxonomy for deep clustering approaches, summarizes key components, and discusses practical applications and future research directions.
Findings
Deep clustering effectively combines representation learning and clustering.
A comprehensive taxonomy categorizes recent deep clustering methods.
Open-source tools and benchmarks facilitate future research.
Abstract
Clustering is a fundamental machine learning task which has been widely studied in the literature. Classic clustering methods follow the assumption that data are represented as features in a vectorized form through various representation learning techniques. As the data become increasingly complicated and complex, the shallow (traditional) clustering methods can no longer handle the high-dimensional data type. With the huge success of deep learning, especially the deep unsupervised learning, many representation learning techniques with deep architectures have been proposed in the past decade. Recently, the concept of Deep Clustering, i.e., jointly optimizing the representation learning and clustering, has been proposed and hence attracted growing attention in the community. Motivated by the tremendous success of deep learning in clustering, one of the most fundamental machine learning…
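To make "jointly optimizing the representation learning and clustering" concrete, here is a minimal sketch of one well-known clustering objective of the kind such surveys cover: the soft-assignment and self-training loss used by DEC (Xie et al., 2016). It is an illustration under stated assumptions, not the survey's own formulation. The embeddings `z` and the `centroids` are hard-coded, hypothetical stand-ins for encoder outputs and learnable cluster centres, and no gradient step is performed. In an actual deep clustering model the loss would be backpropagated into both the encoder and the centroids.

```python
import math

def soft_assign(z, centroids, alpha=1.0):
    """Student's t soft assignment of each embedding to each centroid.

    Returns one row per embedding; each row sums to 1.
    """
    q = []
    for point in z:
        row = []
        for c in centroids:
            d2 = sum((a - b) ** 2 for a, b in zip(point, c))  # squared distance
            row.append((1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])
    return q

def target_distribution(q):
    """Sharpened targets: square q, divide by soft cluster frequency, renormalize."""
    freq = [sum(col) for col in zip(*q)]  # soft count per cluster
    p = []
    for row in q:
        w = [v * v / f for v, f in zip(row, freq)]
        total = sum(w)
        p.append([v / total for v in w])
    return p

def kl_loss(p, q):
    """KL(P || Q), averaged over samples."""
    total = sum(
        pi * math.log(pi / qi)
        for p_row, q_row in zip(p, q)
        for pi, qi in zip(p_row, q_row)
        if pi > 0.0
    )
    return total / len(q)

# Hypothetical 2-D embeddings forming two loose groups (assumed, not real data).
z = [[0.0, 0.1], [0.2, -0.1], [3.0, 3.1], [2.9, 3.2]]
centroids = [[0.0, 0.0], [3.0, 3.0]]

q = soft_assign(z, centroids)
p = target_distribution(q)
loss = kl_loss(p, q)
print(loss)
```

Minimizing this loss pulls each embedding toward the centroid it already favors, which is why the representation and the cluster structure end up being learned together rather than in two separate stages.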
Taxonomy
Topics: Text and Document Classification Technologies · Domain Adaptation and Few-Shot Learning · Speech Recognition and Synthesis
