Random Projections for $k$-means Clustering
Christos Boutsidis, Anastasios Zouzias, Petros Drineas

TL;DR
This paper demonstrates that random projections can efficiently reduce the dimensionality of data for $k$-means clustering while approximately preserving the optimal clustering, supported by theoretical proofs and empirical validation.
Contribution
The paper introduces a method using random projections to reduce dimensions for $k$-means clustering, with provable guarantees on preserving the optimal partition within a factor of $2+\varepsilon$.
Findings
Projection dimension $t = \Omega(k / \varepsilon^2)$ suffices.
Algorithm runs in $O(nd \lceil \varepsilon^{-2} k / \log(d) \rceil)$ time.
Experiments on a large face images dataset confirm the speed and accuracy of the method.
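The bound on the projection dimension hides a constant inside the $\Omega$ notation, so it does not by itself give a concrete value. The sketch below shows how the dimension scales with $k$ and $\varepsilon$. The function name `target_dimension` and the constant `c` are illustrative assumptions and do not come from the paper.

```python
import math


def target_dimension(k, eps, c=1.0):
    """Return ceil(c * k / eps**2), a candidate projection dimension.

    c is a hypothetical stand-in for the constant hidden by the
    Omega notation; it is an assumption, not a value from the paper.
    """
    if k < 1 or not 0 < eps < 1:
        raise ValueError("need k >= 1 and 0 < eps < 1")
    return math.ceil(c * k / eps**2)


print(target_dimension(k=10, eps=0.25))
```

With the assumed `c = 1`, $k = 10$ and $\varepsilon = 0.25$ give $t = 160$. Halving $\varepsilon$ quadruples the dimension, while doubling $k$ only doubles it.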
Abstract
This paper discusses the topic of dimensionality reduction for $k$-means clustering. We prove that any set of $n$ points in $d$ dimensions (rows in a matrix $A \in \mathbb{R}^{n \times d}$) can be projected into $t = \Omega(k / \varepsilon^2)$ dimensions, for any $\varepsilon \in (0, 1/3)$, in $O(nd \lceil \varepsilon^{-2} k / \log(d) \rceil)$ time, such that with constant probability the optimal $k$-partition of the point set is preserved within a factor of $2+\varepsilon$. The projection is done by post-multiplying $A$ with a $d \times t$ random matrix $R$ having entries $+1/\sqrt{t}$ or $-1/\sqrt{t}$ with equal probability. A numerical implementation of our technique and experiments on a large face images dataset verify the speed and the accuracy of our theoretical results.
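The projection step in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation. The function names, the synthetic data, and the plain Lloyd iteration are all assumptions made for the example. It uses an ordinary dense matrix product, so it does not attempt to reach the running time stated in the abstract.

```python
import numpy as np


def random_sign_projection(A, t, seed=None):
    """Post-multiply A (n x d) by a d x t matrix with entries +-1/sqrt(t)."""
    rng = np.random.default_rng(seed)
    d = A.shape[1]
    # Each entry is +1 or -1 with equal probability, then rescaled.
    R = rng.choice([-1.0, 1.0], size=(d, t)) / np.sqrt(t)
    return A @ R


def kmeans_cost(A, labels, k):
    """Sum of squared distances from each point to its cluster centroid."""
    cost = 0.0
    for j in range(k):
        members = A[labels == j]
        if len(members) > 0:
            cost += np.sum((members - members.mean(axis=0)) ** 2)
    return cost


def lloyd(A, k, iters=30, seed=None):
    """Plain Lloyd iterations (a heuristic, not an optimal k-means solver)."""
    rng = np.random.default_rng(seed)
    centers = A[rng.choice(len(A), size=k, replace=False)]
    for _ in range(iters):
        # Centroid update from the current assignment; empty clusters are kept.
        dists = ((A[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = A[labels == j].mean(axis=0)
    # Final assignment against the updated centers.
    dists = ((A[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


if __name__ == "__main__":
    # Hypothetical synthetic data: 3 Gaussian blobs in 500 dimensions.
    rng = np.random.default_rng(0)
    k, d, t = 3, 500, 50
    means = rng.normal(scale=5.0, size=(k, d))
    A = np.vstack([m + rng.normal(size=(100, d)) for m in means])

    labels_full = lloyd(A, k, seed=1)
    labels_proj = lloyd(random_sign_projection(A, t, seed=2), k, seed=1)

    # Both partitions are scored in the ORIGINAL space, as in the guarantee.
    print("cost, clustered in d dims:", kmeans_cost(A, labels_full, k))
    print("cost, clustered in t dims:", kmeans_cost(A, labels_proj, k))
```

The partition found in the projected space is evaluated on the original points, which is the quantity the $2+\varepsilon$ guarantee refers to. Because Lloyd's iteration is only a heuristic, the printed costs illustrate the workflow and do not verify the bound.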
Taxonomy
Topics: Face and Expression Recognition · Sparse and Compressive Sensing Techniques · Topological and Geometric Data Analysis
