Strong Consistency of Sparse K-means Clustering
Jeungju Kim, Johan Lim

TL;DR
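This paper proves the strong consistency of sparse K-means clustering for high-dimensional data: consistency in both risk and clustering under the Euclidean distance, and consistency in risk under general (non-Euclidean) distances. The results carry over to models with the same objective function but different constraints, such as l0 or l1 penalties.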
Contribution
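It establishes the strong consistency of sparse K-means clustering for high-dimensional data, in both risk and clustering under the Euclidean distance and in risk under general (non-Euclidean) distances, and extends the results to models with the same objective function under l0 or l1 penalty constraints.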
Findings
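Proves consistency in both risk and clustering for the Euclidean distance.
Characterizes the limit of the clustering in some special cases.
Proves consistency in risk for general (non-Euclidean) distances and extends the results to models with the same objective function but different constraints, such as l0 or l1 penalties.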
Abstract
In this paper, we study the strong consistency of sparse K-means clustering for high-dimensional data. We prove consistency in both risk and clustering for the Euclidean distance. We discuss the characterization of the limit of the clustering in some special cases. For a general (non-Euclidean) distance, we prove consistency in risk. Our result naturally extends to other models in the recent literature that have the same objective function but different constraints, such as an l0 or l1 penalty.
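The abstract refers to an objective function and its constraints without writing them out. As a rough illustration only, the sketch below assumes the commonly cited sparse K-means formulation of Witten and Tibshirani, in which a feature-weighted between-cluster sum of squares (BCSS) is maximized subject to w_j >= 0, ||w||_2 <= 1 and ||w||_1 <= s. Whether the paper uses exactly this form is an assumption, and the function names `per_feature_bcss`, `sparse_kmeans_objective`, and `is_feasible` are hypothetical helpers, not taken from the paper.

```python
from collections import defaultdict


def per_feature_bcss(X, labels):
    """Per-feature between-cluster sum of squares (squared Euclidean distance).

    X is a list of rows (lists of floats); labels gives one cluster id per row.
    BCSS_j = total sum of squares of feature j minus its within-cluster part.
    """
    n, p = len(X), len(X[0])
    groups = defaultdict(list)
    for row, k in zip(X, labels):
        groups[k].append(row)

    bcss = []
    for j in range(p):
        col = [row[j] for row in X]
        mean = sum(col) / n
        total = sum((v - mean) ** 2 for v in col)
        within = 0.0
        for rows in groups.values():
            g = [r[j] for r in rows]
            m = sum(g) / len(g)
            within += sum((v - m) ** 2 for v in g)
        bcss.append(total - within)
    return bcss


def sparse_kmeans_objective(X, labels, w):
    """Weighted BCSS: sum_j w_j * BCSS_j (assumed form of the objective)."""
    return sum(wj * bj for wj, bj in zip(w, per_feature_bcss(X, labels)))


def is_feasible(w, s, tol=1e-12):
    """Check the assumed constraints: w_j >= 0, ||w||_2 <= 1, ||w||_1 <= s."""
    nonneg = all(wj >= 0 for wj in w)
    l2_ok = sum(wj * wj for wj in w) <= 1 + tol
    l1_ok = sum(abs(wj) for wj in w) <= s + tol
    return nonneg and l2_ok and l1_ok


# Toy data: feature 0 separates the two clusters, feature 1 does not.
X = [[0.0, 5.0], [0.0, 7.0], [2.0, 5.0], [2.0, 7.0]]
labels = [0, 0, 1, 1]
print(per_feature_bcss(X, labels))
print(sparse_kmeans_objective(X, labels, [1.0, 0.0]))
```

In this toy example all of the between-cluster variation sits in the first feature, so a weight vector concentrated on that feature maximizes the weighted objective within the constraint set. Replacing the l1 bound with an l0 bound would change only `is_feasible`, not the objective, which matches the abstract's remark about models sharing the objective but differing in constraints.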
Taxonomy
Topics: Face and Expression Recognition · Advanced Clustering Algorithms Research · Text and Document Classification Technologies
