k2-means for fast and accurate large scale clustering

Eirikur Agustsson; Radu Timofte; Luc Van Gool

arXiv:1605.09299·cs.LG·May 31, 2016

TL;DR

k^2-means is a clustering algorithm that speeds up large-scale clustering by combining a low-cost divisive initialization with an assignment step restricted to the k_n nearest clusters, converging faster than standard k-means while reaching energies comparable to k-means++.

Contribution

It introduces k^2-means, a scalable clustering method with a new initialization and assignment strategy that reduces computational complexity for large datasets.
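The abstract describes the initialization as divisive and based on "an optimal 2-clustering along a direction". A minimal sketch of that idea follows, under stated assumptions: the function name `best_split_along_direction` is hypothetical, the direction is supplied by the caller, and the energy is measured on the 1D projections only. The paper's exact procedure is not given on this page, so this is an illustration rather than the authors' implementation.

```python
def best_split_along_direction(points, direction):
    """Split points into two groups by thresholding their projections.

    Returns (left_indices, right_indices, energy), where energy is the sum
    of squared deviations of the projections from their group means.
    Assumption: energy is measured on the 1D projections only.
    """
    # Project every point onto the direction (plain dot product).
    proj = [sum(x * u for x, u in zip(p, direction)) for p in points]
    order = sorted(range(len(points)), key=lambda i: proj[i])
    values = [proj[i] for i in order]
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to split")

    total = sum(values)
    total_sq = sum(v * v for v in values)

    best_energy, best_cut = float("inf"), 1
    left_sum = left_sq = 0.0
    # Try every cut in sorted order; running sums make each cut O(1).
    for cut in range(1, n):
        v = values[cut - 1]
        left_sum += v
        left_sq += v * v
        right_sum = total - left_sum
        right_sq = total_sq - left_sq
        # sum (x - mean)^2 = sum x^2 - (sum x)^2 / count, per side.
        energy = (left_sq - left_sum ** 2 / cut) \
            + (right_sq - right_sum ** 2 / (n - cut))
        if energy < best_energy:
            best_energy, best_cut = energy, cut
    return order[:best_cut], order[best_cut:], best_energy
```

After sorting, every candidate 2-clustering along the direction is a prefix/suffix pair, so a single sweep finds the best one. The cost is dominated by the projection and the sort.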

Findings

01

k^2-means is orders of magnitude faster than standard methods.

02

It achieves low energy solutions comparable to k-means++.

03

The method performs well on high-dimensional, large-cluster datasets.
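The "energy" in finding 02 is presumably the standard k-means objective: the sum, over all points, of the squared Euclidean distance to the nearest cluster center. A small sketch of that quantity, with the hypothetical helper name `kmeans_energy`:

```python
def kmeans_energy(points, centers):
    """Sum over points of the squared Euclidean distance to the nearest center."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return sum(min(sq_dist(p, c) for c in centers) for p in points)
```

Lower energy means the centers fit the data more tightly, which is the sense in which two initializations can be called comparable.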

Abstract

We propose k^2-means, a new clustering method which efficiently copes with large numbers of clusters and achieves low energy solutions. k^2-means builds upon the standard k-means (Lloyd's algorithm) and combines a new strategy to accelerate the convergence with a new low time complexity divisive initialization. The accelerated convergence is achieved through only looking at k_n nearest clusters and using triangle inequality bounds in the assignment step while the divisive initialization employs an optimal 2-clustering along a direction. The worst-case time complexity per iteration of our k^2-means is O(nk_nd+k^2d), where d is the dimension of the n data points and k is the number of clusters and usually n >> k >> k_n. Compared to k-means' O(nkd) complexity, our k^2-means complexity is significantly lower, at the expense of slightly increasing the memory complexity by O(nk_n+k^2). In our…
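To make the restricted assignment step concrete, here is a simplified sketch. It assumes that each point is compared only against the k_n centers nearest to its currently assigned center, and it omits the triangle inequality bounds mentioned in the abstract. The names `center_neighbourhoods` and `restricted_assign` are hypothetical and do not come from the paper.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def center_neighbourhoods(centers, k_n):
    """For each center, the indices of its k_n nearest centers (itself included).

    Uses k^2 distance evaluations of cost d each, plus a sort per center.
    """
    k = len(centers)
    return [
        sorted(range(k), key=lambda j: sq_dist(centers[c], centers[j]))[:k_n]
        for c in range(k)
    ]


def restricted_assign(points, centers, labels, k_n):
    """Reassign each point, looking only near its current center.

    Uses n * k_n distance evaluations instead of the n * k of plain Lloyd.
    Simplification: no triangle-inequality pruning is applied here.
    """
    neighbours = center_neighbourhoods(centers, k_n)
    new_labels = []
    for p, current in zip(points, labels):
        candidates = neighbours[current]
        new_labels.append(min(candidates, key=lambda j: sq_dist(p, centers[j])))
    return new_labels
```

The restriction is an approximation: a point whose true nearest center lies outside its current center's neighbourhood is not moved there in this step. For illustrative values n = 10^6, k = 1000, k_n = 10, d = 128, the abstract's bounds give n·k·d = 1.28 × 10^11 operations for plain k-means versus n·k_n·d + k^2·d = 1.408 × 10^9, roughly a 90-fold reduction per iteration.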
