Efficient Similarity Indexing and Searching in High Dimensions
Yu Zhong

TL;DR
This paper introduces an indexing and search method for high-dimensional data based on random partitions of the feature space. On real data sets of several hundred dimensions (handwritten digits and 3-D shape descriptors), it is reported to be more efficient and effective than locality sensitive hashing.
Contribution
The paper proposes a random-partition-based approach to indexing and searching high-dimensional features and reports that it outperforms the state-of-the-art locality sensitive hashing algorithm in speed and accuracy.
Findings
Effective and efficient on real data sets of several hundred dimensions (handwritten digits and 3-D shape descriptors)
Outperforms locality sensitive hashing in experiments
Suitable for high-dimensional feature spaces
Abstract
Efficient indexing and searching of high dimensional data has been an area of active research due to the growing exploitation of high dimensional data and the vulnerability of traditional search methods to the curse of dimensionality. This paper presents a new approach for fast and effective searching and indexing of high dimensional features using random partitions of the feature space. Experiments on both handwritten digits and 3-D shape descriptors have shown the proposed algorithm to be highly effective and efficient in indexing and searching real data sets of several hundred dimensions. We also compare its performance to that of the state-of-the-art locality sensitive hashing algorithm.
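The abstract does not describe how the random partitions are built, so the following is only a minimal sketch of the general idea, not the author's algorithm. It assumes each partition consists of a few axis-aligned splits on randomly chosen dimensions, placed at the data median; that choice, the function names (`build_partition`, `cell_of`, `build_index`, `search`), the parameter values, and the synthetic data are all hypothetical and chosen for illustration.

```python
import random
from collections import defaultdict


def build_partition(points, num_splits, rng):
    """One random partition: random dimensions, each split at the data median.

    Assumption: axis-aligned median splits stand in for whatever
    partitioning scheme the paper actually uses.
    """
    dim = len(points[0])
    splits = []
    for _ in range(num_splits):
        d = rng.randrange(dim)
        values = sorted(p[d] for p in points)
        splits.append((d, values[len(values) // 2]))
    return splits


def cell_of(point, splits):
    """Cell id of a point within a partition: one boolean per split."""
    return tuple(point[d] >= t for d, t in splits)


def build_index(points, num_partitions=8, num_splits=4, seed=0):
    """Index every point under several independent random partitions."""
    rng = random.Random(seed)
    index = []
    for _ in range(num_partitions):
        splits = build_partition(points, num_splits, rng)
        cells = defaultdict(list)
        for i, p in enumerate(points):
            cells[cell_of(p, splits)].append(i)
        index.append((splits, cells))
    return index


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search(index, points, query, k=1):
    """Gather points that share a cell with the query, then rank them exactly."""
    candidates = set()
    for splits, cells in index:
        candidates.update(cells.get(cell_of(query, splits), ()))
    ranked = sorted(candidates, key=lambda i: squared_distance(points[i], query))
    return ranked[:k]


# Synthetic toy data: 500 random points in 200 dimensions (not the paper's data).
rng = random.Random(1)
points = [[rng.random() for _ in range(200)] for _ in range(500)]
index = build_index(points)

# A stored point shares every cell with itself, so it is its own nearest match.
result = search(index, points, points[42], k=1)
```

Using several independent partitions trades memory for recall: a true neighbor that falls on the wrong side of a split in one partition can still share a cell with the query in another, and only the gathered candidates are compared by exact distance.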
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Data Management and Algorithms · Image Retrieval and Classification Techniques
