Comprehensive cluster validity Index based on structural simplicity
Anri Mutoh, Masamichi Wada, Kou Amano

TL;DR
This paper introduces a new cluster validity index (CVI), the simplicity index, which evaluates the structural simplicity of clusters. It addresses limitations of existing indices by being invariant to scale shift transforms and by being unbiased, with reference values that represent the worst partition.
Contribution
The paper proposes the simplicity index, a novel CVI that measures cluster simplicity and overcomes limitations of existing indices regarding scale invariance and bias.
Findings
The existing CVIs evaluated in the experiments do not fulfill the desired properties, such as scale shift transform invariance.
The simplicity index is invariant to scale shift transforms and unbiased.
The simplicity index effectively measures the structural simplicity of clusters.
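The scale shift invariance property named in the findings can be checked numerically for any score that maps a partition to a number. The sketch below is only an illustration under stated assumptions: it assumes the transform is x -> a*x + b applied uniformly to every coordinate, and it uses two toy scores (the within-cluster sum of squares and its fraction of the total sum of squares) that are stand-ins, not the simplicity index and not the CVIs evaluated in the paper. All function names are hypothetical.

```python
from math import isclose

def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def within_ss(clusters):
    """Toy score 1: sum of squared distances of points to their cluster centroid."""
    return sum(sq_dist(p, centroid(c)) for c in clusters for p in c)

def total_ss(clusters):
    """Sum of squared distances of all points to the global centroid."""
    all_points = [p for c in clusters for p in c]
    g = centroid(all_points)
    return sum(sq_dist(p, g) for p in all_points)

def within_fraction(clusters):
    """Toy score 2: within-cluster share of the total sum of squares."""
    return within_ss(clusters) / total_ss(clusters)

def scale_shift(clusters, a, b):
    """Assumed transform: x -> a*x + b on every coordinate of every point."""
    return [[tuple(a * x + b for x in p) for p in c] for c in clusters]

def is_invariant(score, clusters, a=10.0, b=-3.0, rel_tol=1e-9):
    """True if the score is numerically unchanged by the assumed transform."""
    before = score(clusters)
    after = score(scale_shift(clusters, a, b))
    return isclose(before, after, rel_tol=rel_tol)

# Two small, well-separated clusters in 2D (made-up data for illustration).
clusters = [
    [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
    [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0)],
]

print("within_ss invariant:", is_invariant(within_ss, clusters))
print("within_fraction invariant:", is_invariant(within_fraction, clusters))
```

The raw sum of squares grows by a factor of a squared under the transform, so it fails the check, while the normalized fraction passes because the factor cancels between numerator and denominator. The same is_invariant helper could be applied to any other score function that takes a list of clusters.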
Abstract
Nonhierarchical clustering depending on unsupervised algorithms may not retrieve the optimal partition of datasets. Determining if clusters fit "natural partitions" can be achieved using cluster validity indices (CVIs). Most existing CVIs consider criteria such as cohesion, separation, and their equivalents. However, these binary relations may provide neither the optimal measure of partition suitability nor reference values corresponding to the worst partition. Moreover, previous CVI studies have mostly focused on fitting correct partitions according to researchers' a priori assumptions. In contrast, we investigated desirable properties of CVIs, namely, scale shift transform invariance, optimal clustering, and unbiased clustering that represents the worst partition. Then, we conducted experiments to evaluate whether existing CVIs fulfill these properties. As none of these CVIs…
Taxonomy
Topics: Advanced Clustering Algorithms Research · Complex Network Analysis Techniques · Bayesian Methods and Mixture Models
