Adaptively Robust and Sparse K-means Clustering

Hao Li; Shonosuke Sugasawa; Shota Katayama

arXiv:2407.06945·stat.CO·November 8, 2024·Trans. Mach. Learn. Res.·2 cites



TL;DR

This paper introduces ARSK, a novel clustering method that enhances K-means by improving robustness to outliers and selecting informative variables in high-dimensional data, using penalized error components and weights.

Contribution

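It proposes an adaptively robust and sparse K-means algorithm that handles outliers and noisy variables simultaneously: a penalized error component is added for each observation, a sparsity penalty is placed on the variable weights, and the tuning parameters for both are selected by Gap statistics.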

Findings

01 ARSK outperforms existing algorithms in simulations.

02 It effectively identifies clusters without outliers.

03 It selects informative variables in high-dimensional data.

Abstract

While K-means is known to be a standard clustering algorithm, its performance may be compromised due to the presence of outliers and high-dimensional noisy variables. This paper proposes adaptively robust and sparse K-means clustering (ARSK) to address these practical limitations of the standard K-means algorithm. For robustness, we introduce a redundant error component for each observation, and this additional parameter is penalized using a group sparse penalty. To accommodate the impact of high-dimensional noisy variables, the objective function is modified by incorporating weights and implementing a penalty to control the sparsity of the weight vector. The tuning parameters to control the robustness and sparsity are selected by Gap statistics. Through simulation experiments and real data analysis, we demonstrate the proposed method's superiority to existing algorithms in identifying…
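The robustness idea in the abstract (one error component per observation, shrunk by a group sparse penalty) can be illustrated with a minimal sketch. It assumes a plain group-lasso penalty, which may differ from the penalty used in the paper, and it leaves out the variable weights and the Gap-statistic tuning entirely. The names `group_soft_threshold`, `robust_kmeans_sketch`, `lam`, and `E` are hypothetical and are not taken from the authors' repository.

```python
import numpy as np


def group_soft_threshold(residuals, lam):
    """Row-wise minimizer of ||r - e||^2 + lam * ||e||_2 (assumed group-lasso penalty).

    Rows whose norm is at most lam / 2 are set exactly to zero, so most
    observations carry no error component; only large residuals survive.
    """
    norms = np.linalg.norm(residuals, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - lam / (2.0 * np.maximum(norms, 1e-12)))
    return scale * residuals


def robust_kmeans_sketch(X, k, lam, n_iter=50, seed=0):
    """Alternate K-means updates with an update of per-observation error terms E.

    Illustrative only: not the authors' ARSK implementation.
    """
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    centers = X[rng.choice(n, size=k, replace=False)].copy()
    E = np.zeros_like(X)
    for _ in range(n_iter):
        cleaned = X - E  # observations with the estimated error removed
        # Squared distance from every cleaned observation to every center.
        d2 = ((cleaned[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = labels == j
            if members.any():  # keep the old center if a cluster is empty
                centers[j] = cleaned[members].mean(axis=0)
        # Error components: shrunken residuals from the assigned centers.
        E = group_soft_threshold(X - centers[labels], lam)
    return labels, centers, E
```

Under these assumptions, observations with a nonzero row in `E` are the ones treated as outliers, and a larger `lam` flags fewer of them. `X` is expected to be a floating-point array of shape `(n, p)`.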


Code & Models

Repositories

lee1995hao/arsk (Official)


Taxonomy

Topics: Advanced Clustering Algorithms Research · Face and Expression Recognition

Methods: k-Means Clustering