On uniqueness of the set of k-means

Javier Cárcamo; Antonio Cuevas; Luis A. Rodríguez

arXiv:2410.13495 · math.ST · October 18, 2024

Open Access

TL;DR

This paper gives necessary and sufficient conditions for the set of k-means of a probability distribution to be unique, relates non-uniqueness to the choice of k, and derives a bootstrap test for uniqueness.

Contribution

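The paper treats consistency of the empirical k-means when the population set of k-means is not unique, determines the asymptotic distribution of the within-cluster sum of squares (WCSS), characterizes uniqueness through the asymptotic behavior of the empirical WCSS, and derives a bootstrap test for uniqueness from that characterization.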

Findings

01. Conditions for k-means uniqueness are characterized.

02. A bootstrap test for assessing uniqueness is developed.

03. Simulation results demonstrate the effectiveness of the proposed methodology.

Abstract

We provide necessary and sufficient conditions for the uniqueness of the k-means set of a probability distribution. This uniqueness problem is related to the choice of k: depending on the underlying distribution, some values of this parameter could lead to multiple sets of k-means, which hampers the interpretation of the results and/or the stability of the algorithms. We give a general assessment on consistency of the empirical k-means adapted to the setting of non-uniqueness and determine the asymptotic distribution of the within cluster sum of squares (WCSS). We also provide statistical characterizations of k-means uniqueness in terms of the asymptotic behavior of the empirical WCSS. As a consequence, we derive a bootstrap test for uniqueness of the set of k-means. The results are illustrated with examples of different types of non-uniqueness and we check by simulations the…
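To make the non-uniqueness described in the abstract concrete, here is a minimal sketch that is not taken from the paper. It assumes a toy distribution that is uniform on finitely many points and relies on the standard fact that, for such a distribution, an optimal set of centers consists of the centroids of some partition of the support. The function names `centroids_and_wcss` and `optimal_center_sets`, and the two example point sets, are hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import product


def centroids_and_wcss(points, labels, k):
    """Return (set of cluster centroids, WCSS) for one assignment of points to clusters."""
    centers, total = set(), Fraction(0)
    for c in range(k):
        cluster = [p for p, lab in zip(points, labels) if lab == c]
        if not cluster:
            continue  # assignments using fewer than k clusters are also considered
        # Exact centroid, coordinate by coordinate (Fractions avoid rounding ties).
        center = tuple(Fraction(sum(coord), len(cluster)) for coord in zip(*cluster))
        centers.add(center)
        total += sum((x - m) ** 2 for p in cluster for x, m in zip(p, center))
    # WCSS under the uniform distribution on the points: mean squared distance.
    return frozenset(centers), total / len(points)


def optimal_center_sets(points, k):
    """Brute-force every assignment; collect all center sets attaining the minimal WCSS."""
    best, minimizers = None, set()
    for labels in product(range(k), repeat=len(points)):
        centers, value = centroids_and_wcss(points, labels, k)
        if best is None or value < best:
            best, minimizers = value, {centers}
        elif value == best:
            minimizers.add(centers)
    return best, minimizers


# Illustrative data (assumption): corners of a unit square vs. a 2-by-1 rectangle.
square = [(0, 0), (0, 1), (1, 0), (1, 1)]
rectangle = [(0, 0), (0, 1), (2, 0), (2, 1)]

square_best, square_sets = optimal_center_sets(square, k=2)
rect_best, rect_sets = optimal_center_sets(rectangle, k=2)

print("square:   ", square_best, len(square_sets), "optimal set(s) of 2-means")
print("rectangle:", rect_best, len(rect_sets), "optimal set(s) of 2-means")
```

For the square, the left/right split and the top/bottom split both attain the minimal WCSS of 1/4, so k = 2 gives two distinct sets of 2-means. Stretching the square into a rectangle breaks the tie and leaves a single optimal set. Exhaustive search is only feasible for very small supports; it is used here for clarity, not as a practical algorithm.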

Taxonomy

Topics: Face and Expression Recognition

Methods: Sparse Evolutionary Training