Robust Clustering Using Outlier-Sparsity Regularization

Pedro A. Forero; Vassilis Kekatos; Georgios B. Giannakis

arXiv:1104.4512 · stat.ML · May 27, 2015

TL;DR

This paper introduces robust clustering algorithms that cluster the data and identify outliers at the same time. The algorithms exploit the sparsity of outliers through regularization, come with proven convergence, and apply to high-dimensional data.

Contribution

The paper presents outlier-aware robust K-means and probabilistic clustering methods that combine sparsity regularization with block coordinate descent, and it includes kernelized versions for high-dimensional data.

Findings

01. Robust algorithms outperform traditional methods on synthetic and real datasets.

02. Effective outlier detection integrated with clustering improves reliability.

03. Kernelized versions handle high-dimensional and nonlinear data effectively.

Abstract

Notwithstanding the popularity of conventional clustering algorithms such as K-means and probabilistic clustering, their clustering results are sensitive to the presence of outliers in the data. Even a few outliers can compromise the ability of these algorithms to identify meaningful hidden structures, rendering their outcome unreliable. This paper develops robust clustering algorithms that aim not only to cluster the data but also to identify the outliers. The novel approaches rely on the infrequent presence of outliers in the data, which translates to sparsity in a judiciously chosen domain. Capitalizing on the sparsity in the outlier domain, outlier-aware robust K-means and probabilistic clustering approaches are proposed. Their novelty lies in identifying outliers while effecting sparsity in the outlier domain through carefully chosen regularization. A block coordinate descent…
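To make the idea in the abstract concrete, the following is a minimal sketch of an outlier-aware K-means loop solved by block coordinate descent. It is an illustration under stated assumptions, not the paper's exact algorithm: the cost function (squared error plus a group-lasso penalty on one outlier vector per point), the function name `robust_kmeans`, and the parameters `lam`, `n_iters`, `init`, and `seed` are all assumed here, since the abstract on this page is truncated before the algorithmic details.

```python
import numpy as np

def robust_kmeans(X, n_clusters, lam, n_iters=50, init=None, seed=0):
    """Illustrative outlier-aware K-means (hypothetical helper, not from the paper).

    Assumed cost, minimized by cycling over three blocks of variables:
        sum_n ||x_n - m_{c(n)} - o_n||^2 + lam * sum_n ||o_n||_2
    where o_n is an outlier vector that should be zero for most points.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    if init is None:
        rng = np.random.default_rng(seed)
        M = X[rng.choice(n, size=n_clusters, replace=False)].copy()
    else:
        M = np.array(init, dtype=float)
    O = np.zeros((n, d))
    labels = np.zeros(n, dtype=int)

    for _ in range(n_iters):
        # Block 1: assign each outlier-compensated point to its nearest centroid.
        Z = X - O
        dists = ((Z[:, None, :] - M[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)

        # Block 2: each centroid is the mean of its compensated points.
        for c in range(n_clusters):
            mask = labels == c
            if mask.any():
                M[c] = Z[mask].mean(axis=0)

        # Block 3: group soft-thresholding of the residuals. Residuals with
        # norm below lam / 2 are set to zero, so those points count as inliers.
        R = X - M[labels]
        norms = np.linalg.norm(R, axis=1, keepdims=True)
        scale = np.maximum(0.0, 1.0 - lam / (2.0 * np.maximum(norms, 1e-12)))
        O = R * scale

    return labels, M, O
```

In this sketch, points whose returned row of `O` is nonzero are the ones flagged as outliers, and a larger `lam` flags fewer points. Passing `init` fixes the starting centroids; with random initialization the loop can settle in a poor local minimum, as ordinary K-means can.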
