Wide Gaps and Clustering Axioms
Mieczysław A. Kłopotek

TL;DR
This paper introduces new clusterability properties and extensions of k-means to reconcile it with Kleinberg's axioms, enhancing theoretical understanding and practical dataset construction for clustering evaluation.
Contribution
It proposes variational and residual k-separability as new clusterability properties and extends k-means to satisfy Kleinberg's axioms, bridging theory and practice.
Findings
k-means violates Kleinberg's consistency axiom without clusterability assumptions
New clusterability properties ensure k-means aligns with axiomatic expectations
Method for constructing clusterable datasets with known global optima
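The consistency axiom named in the first finding can be made concrete with a small check. In Kleinberg's formulation, a clustering function is consistent if its output does not change when distances within clusters shrink (or stay equal) and distances between clusters grow (or stay equal). The sketch below only tests whether a new distance matrix is such a transformation of an old one with respect to a given partition. It is an illustration, not code from the paper; the function names `pairwise` and `is_consistency_transform` and the toy points are assumptions made for this example.

```python
import numpy as np


def pairwise(points):
    """Euclidean distance matrix for an (n, d) array of points."""
    pts = np.asarray(points, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def is_consistency_transform(d_old, d_new, labels):
    """True if d_new never increases a within-cluster distance and
    never decreases a between-cluster distance, relative to d_old."""
    d_old = np.asarray(d_old, dtype=float)
    d_new = np.asarray(d_new, dtype=float)
    labels = np.asarray(labels)
    same = labels[:, None] == labels[None, :]  # True where both points share a cluster
    within_ok = np.all(d_new[same] <= d_old[same])
    between_ok = np.all(d_new[~same] >= d_old[~same])
    return bool(within_ok and between_ok)


# Hypothetical toy data: two clusters on a line.
points = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0], [11.0, 0.0]])
labels = np.array([0, 0, 1, 1])

# Push the second cluster farther away: between-cluster distances grow.
points_far = points.copy()
points_far[labels == 1] += np.array([5.0, 0.0])

# Pull the second cluster closer: between-cluster distances shrink.
points_near = points.copy()
points_near[labels == 1] -= np.array([5.0, 0.0])

far_ok = is_consistency_transform(pairwise(points), pairwise(points_far), labels)
near_ok = is_consistency_transform(pairwise(points), pairwise(points_near), labels)
```

Here `far_ok` is `True` and `near_ok` is `False`. Passing this check says nothing about whether k-means returns the same partition on the transformed data; that is the question the clusterability properties are meant to address.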
Abstract
The widely applied k-means algorithm produces clusterings that violate our expectations with respect to high/low similarity/density, and it conflicts with Kleinberg's axiomatic system for distance-based clustering algorithms, which formalizes those expectations in a natural way. In particular, k-means violates the consistency axiom. We hypothesise that this clash stems from an unstated expectation: the data themselves should be clusterable if the algorithm clustering them is to fit a clustering axiomatic system. To demonstrate this, we introduce two new clusterability properties, variational k-separability and residual k-separability, and show that Kleinberg's consistency axiom then holds for k-means operating in Euclidean or non-Euclidean space. Furthermore, we propose extensions of the k-means algorithm that fit approximately the…
Taxonomy
Topics: Advanced Clustering Algorithms Research · Bayesian Methods and Mixture Models · Topological and Geometric Data Analysis
