Strong Consistency of Sparse K-means Clustering

Jeungju Kim; Johan Lim

arXiv:2501.09983·math.ST·April 15, 2025

Open Access

TL;DR

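This paper proves the strong consistency of sparse K-means clustering for high-dimensional data: consistency in both risk and clustering under the Euclidean distance, and consistency in risk under general (non-Euclidean) distances. The results extend to models with the same objective function but different constraints, such as l0 or l1 penalties.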

Contribution

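It establishes strong consistency of sparse K-means clustering for high-dimensional data, in both risk and clustering under the Euclidean distance and in risk under general (non-Euclidean) distances, and extends the results to models that share the objective function but use other constraints such as l0 or l1 penalties.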

Findings

01

Proves consistency in risk and clustering for Euclidean distance.

02

Characterizes the clustering limit in special cases.

03

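Proves consistency in risk for general (non-Euclidean) distances and extends the results to models with the same objective function but different constraints, such as l0 or l1 penalties.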

Abstract

In this paper, we study the strong consistency of sparse K-means clustering for high-dimensional data. We prove consistency in both risk and clustering for the Euclidean distance. We discuss the characterization of the limit of the clustering in some special cases. For a general (non-Euclidean) distance, we prove consistency in risk. Our result naturally extends to other models with the same objective function but different constraints, such as the l0 or l1 penalties in recent literature.
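
As a rough illustration of the kind of objective the abstract refers to, the sketch below assumes the feature-weighted between-cluster sum of squares (BCSS) formulation commonly associated with sparse K-means, with non-negative weights constrained in l2 and l1 norm. This formulation is an assumption rather than something stated on this page, and the paper's exact objective and constraints may differ. The function names (`per_feature_bcss`, `sparse_kmeans_objective`, `is_feasible`) and the bound `s` are hypothetical and chosen only for this example.

```python
from math import sqrt


def per_feature_bcss(X, labels):
    """Between-cluster sum of squares for each feature (total SS minus within SS)."""
    p = len(X[0])

    def sum_of_squares(rows):
        # Per-feature sum of squared deviations from the mean of `rows`.
        means = [sum(r[j] for r in rows) / len(rows) for j in range(p)]
        return [sum((r[j] - means[j]) ** 2 for r in rows) for j in range(p)]

    total = sum_of_squares(X)
    within = [0.0] * p
    for k in set(labels):
        rows = [x for x, lab in zip(X, labels) if lab == k]
        for j, v in enumerate(sum_of_squares(rows)):
            within[j] += v
    return [t - w for t, w in zip(total, within)]


def sparse_kmeans_objective(X, labels, w):
    """Weighted BCSS: sum_j w_j * BCSS_j (assumed form of the shared objective)."""
    return sum(wj * bj for wj, bj in zip(w, per_feature_bcss(X, labels)))


def is_feasible(w, s, tol=1e-9):
    """Assumed l1-type constraint set: w_j >= 0, ||w||_2 <= 1, ||w||_1 <= s."""
    l2 = sqrt(sum(wj * wj for wj in w))
    l1 = sum(abs(wj) for wj in w)
    return all(wj >= 0 for wj in w) and l2 <= 1 + tol and l1 <= s + tol


# Toy data: feature 0 separates the two clusters, feature 1 is pure noise.
X = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
labels = [0, 0, 1, 1]
w = [1.0, 0.0]  # all weight on the informative feature
objective = sparse_kmeans_objective(X, labels, w)
```

In this toy case only the first feature contributes to the objective, so a sparse weight vector loses nothing. Under the same assumption, an l0-type variant would keep `sparse_kmeans_objective` unchanged and replace only the feasibility check, which is the sense in which models can share an objective function while differing in constraints.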

Taxonomy

Topics: Face and Expression Recognition · Advanced Clustering Algorithms Research · Text and Document Classification Technologies