Randomized Dimensionality Reduction for k-means Clustering
Christos Boutsidis, Anastasios Zouzias, Michael W. Mahoney, and Petros Drineas

TL;DR
This paper introduces new provably accurate randomized methods for feature selection and extraction in k-means clustering, improving efficiency and theoretical guarantees over previous approaches.
Contribution
It presents the first provably accurate feature selection algorithm for k-means and two improved feature extraction methods based on random projections and approximate SVD.
Findings
First provably accurate feature selection method for k-means.
Random projection-based feature extraction with improved efficiency.
Fast approximate SVD-based feature extraction with better time complexity.
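As a rough illustration of the random projection idea listed above, the following sketch reduces the feature dimension with a rescaled random sign matrix, clusters the reduced data, and then scores the resulting partition on the original features. The function names (`random_sign_projection`, `lloyd_kmeans`, `kmeans_cost`), the synthetic data, and the target dimension `r = 30` are assumptions made for this example. The random sign matrix is a generic construction and is not claimed to match the paper's exact algorithm or parameter choices.

```python
import numpy as np

def random_sign_projection(A, r, rng):
    """Map n x d data to n x r using a rescaled random +/-1 matrix."""
    d = A.shape[1]
    R = rng.choice([-1.0, 1.0], size=(d, r)) / np.sqrt(r)
    return A @ R

def lloyd_kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd iterations; returns one cluster label per row of X."""
    centers = X[rng.choice(X.shape[0], size=k, replace=False)]
    for _ in range(n_iter):
        # squared distances from every point to every center
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels

def kmeans_cost(A, labels, k):
    """k-means objective of a labeling, evaluated on the original features."""
    cost = 0.0
    for j in range(k):
        members = A[labels == j]
        if len(members) > 0:
            cost += ((members - members.mean(axis=0)) ** 2).sum()
    return cost

rng = np.random.default_rng(0)
k, d, r = 3, 500, 30  # illustrative sizes, chosen only for this sketch

# synthetic data: three Gaussian blobs of 100 points each in d dimensions
means = rng.normal(scale=10.0, size=(k, d))
A = np.vstack([m + rng.normal(size=(100, d)) for m in means])

A_reduced = random_sign_projection(A, r, rng)      # 300 x 30
labels_reduced = lloyd_kmeans(A_reduced, k, rng)   # cluster in r dimensions
cost_reduced = kmeans_cost(A, labels_reduced, k)   # score in d dimensions
```

To judge the quality of the reduction, compare `cost_reduced` with `kmeans_cost(A, lloyd_kmeans(A, k, rng), k)`, which runs the same clustering routine on all `d` features.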
Abstract
We study the topic of dimensionality reduction for k-means clustering. Dimensionality reduction encompasses the union of two approaches: feature selection and feature extraction. A feature selection based algorithm for k-means clustering selects a small subset of the input features and then applies k-means clustering on the selected features. A feature extraction based algorithm for k-means clustering constructs a small set of new artificial features and then applies k-means clustering on the constructed features. Despite the significance of k-means clustering as well as the wealth of heuristic methods addressing it, provably accurate feature selection methods for k-means clustering are not known. On the other hand, two provably accurate feature extraction methods for k-means clustering are known in the literature; one is based on random projections and the…
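The distinction the abstract draws between the two approaches can be made concrete with a small sketch. Feature selection keeps a subset of the original columns, while feature extraction builds new columns as linear combinations of all of them. The matrix sizes, the uniform column sampling, and the use of an exact SVD are assumptions made for this illustration. In particular, uniform sampling is only a placeholder and is not the paper's selection rule.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, k, r = 200, 50, 4, 10   # illustrative sizes
A = rng.normal(size=(n, d))   # rows are points, columns are features

# Feature selection: keep r of the original columns unchanged.
# Uniform sampling is a placeholder assumption, not the paper's rule.
selected = rng.choice(d, size=r, replace=False)
A_selected = A[:, selected]

# Feature extraction: build k new features as linear combinations of all
# d columns, here by projecting onto the top-k right singular vectors.
_, _, Vt = np.linalg.svd(A, full_matrices=False)
A_extracted = A @ Vt[:k].T
```

In both cases k-means would then be run on the smaller matrix (`A_selected` or `A_extracted`) instead of on `A`.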
Taxonomy
Topics: Face and Expression Recognition · Advanced Image and Video Retrieval Techniques · Advanced Clustering Algorithms Research
