To Cluster, or Not to Cluster: An Analysis of Clusterability Methods

A. Adolfsson, M. Ackerman, and N. C. Brownstein

arXiv:1808.08317 · stat.ML · October 30, 2018

TL;DR

This paper compares various methods for assessing whether data has meaningful cluster structure, providing guidelines to help users select appropriate measures for their specific applications.

Contribution

The paper offers an extensive comparison of clusterability measures and practical guidelines for choosing among them, which matters because existing methods for evaluating clusterability vary radically.

Findings

01 Identifies key differences among clusterability measures

02 Provides recommendations for measure selection based on data characteristics

03 Enhances understanding of when to apply clustering analysis

Abstract

Clustering is an essential data mining tool that aims to discover inherent cluster structure in data. For most applications, applying clustering is only appropriate when cluster structure is present. As such, the study of clusterability, which evaluates whether data possesses such structure, is an integral part of cluster analysis. However, methods for evaluating clusterability vary radically, making it challenging to select a suitable measure. In this paper, we perform an extensive comparison of measures of clusterability and provide guidelines that clustering users can reference to select suitable measures for their applications.
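To make the idea of a clusterability measure concrete, here is a minimal sketch of one widely used measure, the Hopkins statistic. This summary does not name the measures the paper compares, so the choice of Hopkins, the function name `hopkins_statistic`, the parameters `m` and `seed`, and the variant that raises distances to the power of the dimension are all assumptions made for illustration, not the authors' recommendation. Values near 0.5 suggest data that look uniformly scattered, while values near 1 suggest cluster structure.

```python
import math
import random


def hopkins_statistic(points, m=None, seed=0):
    """Illustrative Hopkins statistic (hypothetical helper, not from the paper).

    points: list of equal-length numeric sequences.
    m: number of probe points (assumed default: about 10% of the data).
    """
    rng = random.Random(seed)
    n = len(points)
    d = len(points[0])
    if m is None:
        m = max(1, n // 10)

    # Bounding box of the data, used to draw uniform reference points.
    lo = [min(p[k] for p in points) for k in range(d)]
    hi = [max(p[k] for p in points) for k in range(d)]

    # w: distance from each sampled data point to its nearest other data point.
    sample_idx = rng.sample(range(n), m)
    w = [
        min(math.dist(points[i], points[j]) for j in range(n) if j != i)
        for i in sample_idx
    ]

    # u: distance from each uniform reference point to its nearest data point.
    uniform = [[rng.uniform(lo[k], hi[k]) for k in range(d)] for _ in range(m)]
    u = [min(math.dist(q, p) for p in points) for q in uniform]

    # Variant that raises distances to the power d (an assumption of this sketch).
    su = sum(x ** d for x in u)
    sw = sum(x ** d for x in w)
    return su / (su + sw)
```

A typical use would be to compute the statistic before clustering and to treat values close to 0.5 as a warning that the data may lack cluster structure. Any threshold for "close" is a judgment call that this sketch does not settle.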
