Deep Clustering Evaluation: How to Validate Internal Clustering Validation Measures
Zeya Wang, Chenglong Ye

TL;DR
This paper critically examines the challenges of evaluating deep clustering methods, proposing a theoretical framework and systematic approach to improve the reliability of internal validation measures in high-dimensional deep learning contexts.
Contribution
It introduces a theoretical framework and systematic methodology for applying clustering validation measures effectively in deep clustering, addressing issues caused by data embedding and model variability.
Findings
Internal validation scores obtained under the proposed framework align more closely with external validation measures.
It reduces the misguidance caused by improper use of validation indices.
Experiments confirm improved evaluation consistency in deep clustering.
Abstract
Deep clustering, a method for partitioning complex, high-dimensional data using deep neural networks, presents unique evaluation challenges. Traditional clustering validation measures, designed for low-dimensional spaces, are problematic for deep clustering, which involves projecting data into lower-dimensional embeddings before partitioning. Two key issues are identified: 1) the curse of dimensionality when applying these measures to raw data, and 2) the unreliable comparison of clustering results across different embedding spaces stemming from variations in training procedures and parameter settings in different clustering models. This paper addresses these challenges in evaluating clustering quality in deep learning. We present a theoretical framework to highlight ineffectiveness arising from using internal validation measures on raw and embedded data and propose a systematic…
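The second issue, comparing internal scores across different embedding spaces, can be illustrated with a toy example. The sketch below is not the paper's framework. It is a minimal pure-Python illustration in which the data, the two "models", and the helper names `silhouette` and `rand_index` are all hypothetical. Each model is scored with the silhouette coefficient (an internal measure) in its own embedding, then checked against ground-truth labels with the Rand index (an external measure).

```python
from itertools import combinations
from math import dist


def silhouette(points, labels):
    """Mean silhouette coefficient under Euclidean distance.

    Assumes at least two clusters are present in `labels`.
    """
    n = len(points)
    total = 0.0
    for i in range(n):
        # Mean distance from point i to the other members of each cluster.
        mean_d = {}
        for c in set(labels):
            others = [j for j in range(n) if labels[j] == c and j != i]
            if others:
                mean_d[c] = sum(dist(points[i], points[j]) for j in others) / len(others)
        if labels[i] not in mean_d:
            continue  # singleton cluster contributes 0 by convention
        a = mean_d[labels[i]]  # cohesion: mean distance within own cluster
        b = min(d for c, d in mean_d.items() if c != labels[i])  # nearest other cluster
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / n


def rand_index(truth, pred):
    """Fraction of point pairs on which two partitions agree."""
    pairs = list(combinations(range(len(truth)), 2))
    agree = sum((truth[i] == truth[j]) == (pred[i] == pred[j]) for i, j in pairs)
    return agree / len(pairs)


# Hypothetical ground truth for four samples.
truth = [0, 0, 1, 1]

# Model A (assumed): modest separation in its embedding, correct partition.
emb_a = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
pred_a = [0, 0, 1, 1]

# Model B (assumed): very tight, well-separated clusters in its own
# embedding, but the clusters pair up the wrong samples.
emb_b = [(0.0, 0.0), (10.0, 0.0), (0.0, 0.1), (10.0, 0.1)]
pred_b = [0, 1, 0, 1]

sil_a, sil_b = silhouette(emb_a, pred_a), silhouette(emb_b, pred_b)
ri_a, ri_b = rand_index(truth, pred_a), rand_index(truth, pred_b)

print(f"Model A: silhouette={sil_a:.3f}, Rand index={ri_a:.3f}")
print(f"Model B: silhouette={sil_b:.3f}, Rand index={ri_b:.3f}")
```

In this toy setup, model B receives the higher silhouette score even though its partition agrees less with the ground truth. Each score is computed in a different space, so ranking models by raw internal scores can be misleading.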
Taxonomy
Topics: Advanced Clustering Algorithms Research
