The k-means-u* algorithm: non-local jumps and greedy retries improve k-means++ clustering

Bernd Fritzke

arXiv:1706.09059 · cs.LG · July 18, 2017 · 5 cites

TL;DR

The paper introduces k-means-u*, an enhanced clustering algorithm that improves upon k-means++ by using non-local jumps and retries, leading to better clustering solutions in Euclidean spaces.

Contribution

The paper proposes k-means-u*, a novel algorithm combining non-local jumps and greedy retries to improve clustering quality over k-means++, with theoretical guarantees.

Findings

01

k-means-u* outperforms k-means++ in solution quality on various datasets.

02

The algorithm often finds better local minima through non-local jumps.

03

Theoretical bounds similar to k-means++ apply to k-means-u*.

Abstract

We present a new clustering algorithm called k-means-u* which in many cases is able to significantly improve the clusterings found by k-means++, the current de-facto standard for clustering in Euclidean spaces. First we introduce the k-means-u algorithm which starts from a result of k-means++ and attempts to improve it with a sequence of non-local "jumps" alternated by runs of standard k-means. Each jump transfers the "least useful" center towards the center with the largest local error, offset by a small random vector. This is continued as long as the error decreases and often leads to an improved solution. Occasionally k-means-u terminates despite obvious remaining optimization possibilities. By allowing a limited number of retries for the last jump it is frequently possible to reach better local minima. The resulting algorithm is called k-means-u* and dominates k-means++ wrt.…
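The jump loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the abstract does not define "least useful" or "local error", so the sketch assumes that a center's local error is the summed squared distance of its assigned points, and that its utility is the error increase that would result from removing it (its points falling back to their second-nearest center). The function names `lloyd` and `k_means_u_sketch`, the parameters `max_jumps` and `eps`, and the scaling of the random offset are all hypothetical. The retry step of k-means-u* is left out because the abstract does not specify it.

```python
import numpy as np

def lloyd(X, C, iters=100):
    """Standard k-means (Lloyd) iterations starting from centers C.
    Returns the final centers and the point-to-center squared distances."""
    C = C.copy()
    for _ in range(iters):
        d = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new_C = C.copy()
        for j in range(len(C)):
            members = X[labels == j]
            if len(members):          # empty clusters keep their old center
                new_C[j] = members.mean(axis=0)
        if np.allclose(new_C, C):
            break
        C = new_C
    d = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    return C, d

def k_means_u_sketch(X, C0, rng, max_jumps=50, eps=1e-3):
    """Jump loop sketch (requires at least two centers).
    C0 would normally be a k-means++ result; any initial centers work here."""
    C, d = lloyd(X, C0)
    best_err = d.min(axis=1).sum()
    k = len(C)
    for _ in range(max_jumps):
        labels = d.argmin(axis=1)
        sorted_d = np.sort(d, axis=1)
        nearest, second = sorted_d[:, 0], sorted_d[:, 1]
        # Assumed definitions (not given in the abstract):
        local_error = np.bincount(labels, weights=nearest, minlength=k)
        utility = np.bincount(labels, weights=second - nearest, minlength=k)
        target = local_error.argmax()   # center with the largest local error
        utility[target] = np.inf        # never move the target itself
        mover = utility.argmin()        # "least useful" center
        trial = C.copy()
        # Jump: place the mover at the target, offset by a small random vector.
        offset = eps * X.std(axis=0) * rng.standard_normal(X.shape[1])
        trial[mover] = C[target] + offset
        trial, trial_d = lloyd(X, trial)
        trial_err = trial_d.min(axis=1).sum()
        if trial_err >= best_err:       # stop once the error no longer decreases
            break
        C, d, best_err = trial, trial_d, trial_err
    return C, best_err
```

Because a jump is only accepted when the total error decreases, the returned error is never worse than that of plain k-means run from the same initial centers.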

Code & Models

Repositories

gittar/k-means-u-star
none · Official

Taxonomy

Topics: Data Mining Algorithms and Applications · Algorithms and Data Compression · Advanced Clustering Algorithms Research