Faster Clustering via Preprocessing

Tsvi Kopelowitz; Robert Krauthgamer

arXiv:1208.5247 · cs.DS · August 28, 2012 · 1 citation

Open Access

TL;DR

This paper introduces fast approximation algorithms for clustering queries in metric spaces with low doubling dimension. The metric is preprocessed once, so that each later query under objectives like p-center and p-median runs in time near-linear in the size of the query set.

Contribution

It presents preprocessing-based algorithms that approximately solve clustering queries in metric spaces of low doubling dimension, with query time near-linear in the query size and (almost) independent of the total number of points.

Findings

01

Query time is near-linear in the size of the query set.

02

Preprocessing makes the query time (almost) independent of the total number of points.

03

The algorithms approximately solve standard clustering objectives like p-center and p-median.

Abstract

We examine the efficiency of clustering a set of points, when the encompassing metric space may be preprocessed in advance. In computational problems of this genre, there is a first stage of preprocessing, whose input is a collection of points $M$; the next stage receives as input a query set $Q \subset M$, and should report a clustering of $Q$ according to some objective, such as 1-median, in which case the answer is a point $a \in M$ minimizing $\sum_{q \in Q} d_M(a, q)$. We design fast algorithms that approximately solve such problems under standard clustering objectives like $p$-center and $p$-median, when the metric $M$ has low doubling dimension. By leveraging the preprocessing stage, our algorithms achieve query time that is near-linear in the query size $n = |Q|$, and is (almost) independent of the total number of points $m = |M|$.
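To make the objectives concrete, the following is a minimal sketch of the exact 1-median query defined in the abstract, together with the cost function of the $p$-center objective. It is a brute-force baseline, not the paper's algorithm: it uses no preprocessing and scans all of $M$, so it needs about $m \cdot n$ distance evaluations per query, which is the dependence on $m$ that preprocessing is meant to avoid. The function names are hypothetical, and the Euclidean distance stands in for $d_M$ as an assumption (any metric could be substituted).

```python
from math import dist  # Euclidean distance; assumed here as a stand-in for d_M
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def one_median_brute_force(M: Sequence[Point], Q: Sequence[Point]) -> Point:
    """Return a point a in M minimizing sum_{q in Q} d(a, q).

    Exact but slow: evaluates len(M) * len(Q) distances per query.
    """
    if not M or not Q:
        raise ValueError("M and Q must be non-empty")
    return min(M, key=lambda a: sum(dist(a, q) for q in Q))


def p_center_cost(centers: Sequence[Point], Q: Sequence[Point]) -> float:
    """Cost of a p-center solution: the largest distance from any
    query point to its nearest center."""
    if not centers or not Q:
        raise ValueError("centers and Q must be non-empty")
    return max(min(dist(c, q) for c in centers) for q in Q)


if __name__ == "__main__":
    # Illustrative data only: four collinear points, query on the first three.
    M = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
    Q = M[:3]
    a = one_median_brute_force(M, Q)
    print(a)                      # (1.0, 0.0)
    print(p_center_cost([a], Q))  # 1.0
```

In this toy instance the middle query point has total distance 2 to $Q$, while every other candidate in $M$ has total distance at least 3, so it is the 1-median.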

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Data Management and Algorithms · Computational Geometry and Mesh Generation