Balancing clusters to reduce response time variability in large scale image search

Romain Tavenard (INRIA - IRISA), Laurent Amsaleg (INRIA - IRISA), Hervé Jégou (INRIA - IRISA)

arXiv:1009.4739 · cs.CV · September 27, 2010 · 1 citation

TL;DR

This paper modifies $k$-means clustering so that clusters have more comparable sizes, which reduces the variance of query response times in large-scale image search without seriously impacting search quality.

Contribution

It proposes a modification of the $k$-means centroids that yields clusters of more comparable cardinality, addressing response time variability in cluster-based approximate nearest neighbor search in high-dimensional spaces.

Findings

01 Significantly reduces the variance of query response times

02 Does not seriously impact search quality

03 Evaluated on a large-scale collection of image descriptors

Abstract

Many algorithms for approximate nearest neighbor search in high-dimensional spaces partition the data into clusters. At query time, in order to avoid exhaustive search, an index selects the few (or a single) clusters nearest to the query point. Clusters are often produced by the well-known $k$-means approach since it has several desirable properties. On the downside, it tends to produce clusters having quite different cardinalities. Imbalanced clusters negatively impact both the variance and the expectation of query response times. This paper proposes to modify $k$-means centroids to produce clusters with more comparable sizes without sacrificing the desirable properties. Experiments with a large-scale collection of image descriptors show that our algorithm significantly reduces the variance of response times without seriously impacting the search quality.
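
The claim that imbalanced clusters hurt both the expectation and the variance of response times can be illustrated with a small sketch. It is not the paper's algorithm. It assumes, for illustration only, that a single cluster is visited per query, that the query falls in cluster i with probability n_i / n (queries distributed like the indexed data), and that response time is proportional to the number of vectors scanned. The function names and the normalized `imbalance_factor` measure are hypothetical helpers introduced here.

```python
from typing import Sequence


def expected_scan_cost(sizes: Sequence[int]) -> float:
    """Expected number of vectors scanned when one cluster is visited.

    Assumption: the query lands in cluster i with probability sizes[i] / n,
    so the expectation is sum(n_i * n_i / n).
    """
    n = sum(sizes)
    return sum(s * s for s in sizes) / n


def scan_cost_variance(sizes: Sequence[int]) -> float:
    """Variance of the scanned-vector count under the same assumption."""
    n = sum(sizes)
    second_moment = sum(s ** 3 for s in sizes) / n  # E[cost^2]
    return second_moment - expected_scan_cost(sizes) ** 2


def imbalance_factor(sizes: Sequence[int]) -> float:
    """Expected cost divided by its balanced optimum n / k (1.0 when balanced)."""
    n, k = sum(sizes), len(sizes)
    return k * sum((s / n) ** 2 for s in sizes)


# Two hypothetical partitions of the same 1000 vectors into 4 clusters.
balanced = [250, 250, 250, 250]
skewed = [700, 100, 100, 100]

for name, sizes in (("balanced", balanced), ("skewed", skewed)):
    print(name, expected_scan_cost(sizes), scan_cost_variance(sizes))
```

Under these assumptions the balanced partition scans 250 vectors per query with zero variance, while the skewed partition scans 520 on average with a large variance, even though both index the same data with the same number of clusters.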

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization · Image Retrieval and Classification Techniques