Randomized Dimensionality Reduction for Facility Location and Single-Linkage Clustering
Shyam Narayanan, Sandeep Silwal, Piotr Indyk, Or Zamir

TL;DR
This paper shows that random dimensionality reduction approximately preserves the quality of solutions to the facility location and single-linkage clustering problems. The required target dimension depends only on the intrinsic (doubling) dimension of the data, which enables faster algorithms without significant loss of accuracy.
Contribution
The paper gives target-dimension bounds for random projections that depend on the doubling dimension of the data rather than on the number of clusters, which earlier bounds depended on.
Findings
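Projecting onto a random subspace whose dimension is proportional to the doubling dimension of the input preserves the optimum facility location cost up to a constant factor.
For single-linkage clustering (equivalently, the minimum spanning tree), the cost is preserved up to a factor arbitrarily close to 1, at the price of an extra term in the target dimension.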
Experimental results confirm speedups and solution quality preservation.
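To make the first finding concrete, here is a minimal sketch of the quantity being approximated. It assumes the uniform-cost variant of facility location, in which facilities are chosen among the input points and each has the same opening cost; the function name `facility_location_cost` and the parameter `opening_cost` are illustrative and not taken from the paper.

```python
import math


def facility_location_cost(points, facility_idx, opening_cost=1.0):
    """Total cost of opening the facilities at the given indices.

    Assumed objective: opening_cost per open facility, plus each point's
    Euclidean distance to its nearest open facility.
    """
    if not facility_idx:
        raise ValueError("at least one facility must be open")
    connection = sum(
        min(math.dist(p, points[j]) for j in facility_idx)
        for p in points
    )
    return opening_cost * len(facility_idx) + connection


# Two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
one_facility = facility_location_cost(points, [0])
two_facilities = facility_location_cost(points, [0, 2])
```

With one facility per pair, the extra opening cost is more than offset by the shorter connections. The claim in the findings concerns the minimum of this objective over all facility sets, computed before and after projection.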
Abstract
Random dimensionality reduction is a versatile tool for speeding up algorithms for high-dimensional problems. We study its application to two clustering problems: the facility location problem, and the single-linkage hierarchical clustering problem, which is equivalent to computing the minimum spanning tree. We show that if we project the input pointset $X$ onto a random $d = O(d_X)$-dimensional subspace (where $d_X$ is the doubling dimension of $X$), then the optimum facility location cost in the projected space approximates the original cost up to a constant factor. We show an analogous statement for minimum spanning tree, but with the dimension $d$ having an extra $\log \log n$ term and the approximation factor being arbitrarily close to $1$. Furthermore, we extend these results to approximating solutions instead of just their costs. Lastly, we provide experimental results to…
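The minimum spanning tree statement can be tried out on synthetic data. The following is a rough sketch, not the authors' experimental setup: it uses a scaled Gaussian matrix as a common stand-in for projection onto a random subspace, a simple quadratic-time Prim's algorithm, and an arbitrary target dimension of 40 that is not derived from the paper's bound. The names `random_projection` and `mst_cost` are illustrative.

```python
import math
import random


def random_projection(points, target_dim, rng):
    """Map points to target_dim dimensions with a scaled Gaussian matrix.

    Assumption: i.i.d. N(0, 1/target_dim) entries, which preserve squared
    lengths in expectation.
    """
    ambient_dim = len(points[0])
    scale = 1.0 / math.sqrt(target_dim)
    matrix = [
        [rng.gauss(0.0, 1.0) * scale for _ in range(ambient_dim)]
        for _ in range(target_dim)
    ]
    return [
        [sum(g * x for g, x in zip(row, p)) for row in matrix]
        for p in points
    ]


def mst_cost(points):
    """Weight of the Euclidean minimum spanning tree (Prim's algorithm)."""
    n = len(points)
    if n == 0:
        return 0.0
    best = [math.inf] * n      # cheapest known edge into the tree
    in_tree = [False] * n
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return total


rng = random.Random(0)
# Synthetic data: 60 points on a 3-dimensional subspace of R^100,
# so the intrinsic dimension is far below the ambient one.
basis = [[rng.gauss(0.0, 1.0) for _ in range(100)] for _ in range(3)]
latent = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(60)]
points = [
    [sum(c * b[k] for c, b in zip(z, basis)) for k in range(100)]
    for z in latent
]
projected = random_projection(points, 40, rng)
ratio = mst_cost(projected) / mst_cost(points)
print(f"projected / original MST cost: {ratio:.3f}")
```

On data like this, the ratio would be expected to stay reasonably close to 1 even though the ambient dimension drops from 100 to 40; how close depends on the target dimension and on the random draw.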
Taxonomy
Topics: Face and Expression Recognition · Advanced Clustering Algorithms Research · Bayesian Methods and Mixture Models
