Efficient Similarity Indexing and Searching in High Dimensions

Yu Zhong

arXiv:1505.03090·cs.IR·May 13, 2015

TL;DR

This paper introduces an indexing method for high-dimensional data based on random partitions of the feature space. On real data sets (handwritten digits and 3-D shape descriptors), it is reported to be more efficient and effective than locality sensitive hashing.

Contribution

The paper proposes a random-partition-based approach to indexing and searching high-dimensional features, reported to outperform locality sensitive hashing, the state-of-the-art baseline it is compared against, in speed and accuracy.

Findings

01. Effective on datasets with hundreds of dimensions

02. Outperforms locality sensitive hashing in experiments

03. Suitable for high-dimensional feature spaces

Abstract

Efficient indexing and searching of high dimensional data has been an area of active research due to the growing exploitation of high dimensional data and the vulnerability of traditional search methods to the curse of dimensionality. This paper presents a new approach for fast and effective searching and indexing of high dimensional features using random partitions of the feature space. Experiments on both handwritten digits and 3-D shape descriptors have shown the proposed algorithm to be highly effective and efficient in indexing and searching real data sets of several hundred dimensions. We also compare its performance to that of the state-of-the-art locality sensitive hashing algorithm.
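The abstract names the core idea, random partitions of the feature space, without giving the construction. As a rough illustration only, the sketch below shows one generic way such an index could work: several independent partitions, each made of a few random axis-aligned splits, with a query examining only the points that share a cell with it. The class `RandomPartitionIndex`, its parameters (`num_partitions`, `splits_per_partition`, `seed`), and the choice of axis-aligned splits with thresholds drawn from the data are all assumptions made for this example; they are not taken from the paper and should not be read as its algorithm.

```python
import math
import random
from collections import defaultdict


class RandomPartitionIndex:
    """Toy index built from several independent random partitions.

    Hypothetical illustration; not the construction used in the paper.
    """

    def __init__(self, points, num_partitions=8, splits_per_partition=6, seed=0):
        rng = random.Random(seed)
        self.points = [tuple(p) for p in points]
        dim = len(self.points[0])
        self.partitions = []  # one (splits, buckets) pair per partition
        for _ in range(num_partitions):
            splits = []
            for _ in range(splits_per_partition):
                d = rng.randrange(dim)           # random dimension to split on
                t = rng.choice(self.points)[d]   # threshold taken from the data
                splits.append((d, t))
            # Group point indices by the cell they fall into.
            buckets = defaultdict(list)
            for i, p in enumerate(self.points):
                buckets[self._cell(p, splits)].append(i)
            self.partitions.append((splits, buckets))

    @staticmethod
    def _cell(p, splits):
        # A cell is identified by which side of each split the point lies on.
        return tuple(p[d] > t for d, t in splits)

    def query(self, q, k=1):
        """Return indices of up to k candidates, nearest first (Euclidean)."""
        candidates = set()
        for splits, buckets in self.partitions:
            candidates.update(buckets.get(self._cell(q, splits), ()))
        ranked = sorted(candidates, key=lambda i: math.dist(q, self.points[i]))
        return ranked[:k]
```

In this sketch, more partitions raise the chance that a true neighbor shares at least one cell with the query, while more splits per partition shrink each cell and so reduce the number of exact distance computations. Results are approximate: a true nearest neighbor that never shares a cell with the query is missed.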

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Data Management and Algorithms · Image Retrieval and Classification Techniques