The K-modes algorithm for clustering

Miguel Á. Carreira-Perpiñán; Weiran Wang

arXiv:1304.6478 · cs.LG · April 25, 2013 · 22 cites

Open Access

TL;DR

The paper introduces K-modes, a clustering algorithm that returns exactly K meaningful modes as cluster centers. It handles nonconvex clusters, appears robust to outliers, and is only slightly slower than K-means.

Contribution

It defines a K-modes objective function that combines the notions of density and cluster assignment. Unlike K-means, the resulting algorithm finds centroids that are valid patterns, truly representative of their clusters.

Findings

01

K-modes finds meaningful cluster modes even in nonconvex data.

02

It appears robust to outliers and to misspecification of the scale and the number of clusters.

03

It is slightly slower than K-means but much faster than mean-shift or K-medoids.

Abstract

Many clustering algorithms exist that estimate a cluster centroid, such as K-means, K-medoids or mean-shift, but no algorithm seems to exist that clusters data by returning exactly K meaningful modes. We propose a natural definition of a K-modes objective function by combining the notions of density and cluster assignment. The algorithm becomes K-means and K-medoids in the limit of very large and very small scales. Computationally, it is slightly slower than K-means but much faster than mean-shift or K-medoids. Unlike K-means, it is able to find centroids that are valid patterns, truly representative of a cluster, even with nonconvex clusters, and appears robust to outliers and misspecification of the scale and number of clusters.
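
The abstract does not spell out the objective or the update rules, so the following is only a minimal sketch under stated assumptions: a Gaussian kernel with a single fixed scale `sigma`, hard nearest-centroid assignments as in K-means, and one mean-shift style update per centroid restricted to the points assigned to it. The function name `k_modes`, its parameters, and the fixed iteration count are illustrative choices, not taken from the paper.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def k_modes(points, centroids, sigma, n_iter=50):
    """Sketch of an alternating scheme (assumed, not the authors' exact method).

    points    : list of equal-length numeric tuples
    centroids : initial centroids, one per cluster
    sigma     : kernel scale (assumed Gaussian kernel)
    """
    centroids = [list(c) for c in centroids]
    dim = len(points[0])
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: sq_dist(x, centroids[k]))
            for x in points
        ]
        # Mode step: one kernel-weighted mean update per cluster.
        for k, c in enumerate(centroids):
            members = [x for x, lab in zip(points, labels) if lab == k]
            if not members:
                continue  # leave an empty cluster's centroid where it is
            dists = [sq_dist(x, c) for x in members]
            d_min = min(dists)
            # Subtracting d_min rescales all weights by a common factor,
            # which leaves the weighted mean unchanged and avoids underflow.
            weights = [math.exp(-0.5 * (d - d_min) / sigma ** 2) for d in dists]
            total = sum(weights)
            centroids[k] = [
                sum(w * x[j] for w, x in zip(weights, members)) / total
                for j in range(dim)
            ]
    return centroids, labels
```

In this sketch, a very large `sigma` makes all weights nearly equal, so the mode step reduces to the K-means mean update, in line with the large-scale limit the abstract mentions. With a moderate `sigma`, distant points receive almost no weight, so a call such as `k_modes([(0.0,), (0.1,), (0.2,), (3.0,)], [(0.825,)], sigma=0.5)` moves the centroid toward the dense group rather than leaving it at the mean.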

Taxonomy

Topics: Advanced Clustering Algorithms Research · Bayesian Methods and Mixture Models · Face and Expression Recognition