PDET-LSH: Scalable In-Memory Indexing for High-Dimensional Approximate Nearest Neighbor Search with Quality Guarantees
Jiuqi Wei, Xiaodong Lee, Botao Peng, Quanqing Xu, Chuanhui Yang, Themis Palpanas

TL;DR
This paper introduces PDET-LSH, a scalable in-memory indexing method for high-dimensional approximate nearest neighbor search that significantly improves indexing and query efficiency while maintaining theoretical accuracy guarantees.
Contribution
It proposes a novel encoding-based tree (DE-Tree) and a new LSH scheme (DET-LSH), along with a parallel in-memory version (PDET-LSH), enhancing efficiency and scalability in high-dimensional ANN search.
Findings
Up to 6x faster indexing compared to state-of-the-art methods.
Up to 62x faster query answering at the same accuracy.
Probabilistic guarantees on query accuracy.
Abstract
Locality-sensitive hashing (LSH) is a well-known solution for approximate nearest neighbor (ANN) search with theoretical guarantees. Traditional LSH-based methods mainly focus on improving the efficiency and accuracy of the query phase by designing different query strategies, but pay little attention to improving the efficiency of the indexing phase. They typically fine-tune existing data-oriented partitioning trees to index data points and support their query strategies. However, their strategy of directly partitioning the multidimensional space is time-consuming, and performance degrades as the space dimensionality increases. In this paper, we design an encoding-based tree called Dynamic Encoding Tree (DE-Tree) to improve the indexing efficiency and support efficient range queries. Based on DE-Tree, we propose a novel LSH scheme called DET-LSH. DET-LSH adopts a novel query strategy, which…
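For readers new to LSH, the general idea the abstract builds on can be sketched in a few lines. This is a minimal illustration of the classic random-projection (p-stable) hash family, h(x) = floor((a·x + b) / w), with a plain dictionary as the bucket index. It is not DE-Tree, DET-LSH, or PDET-LSH, whose details are not given in this summary. The function names (`make_hash`, `build_index`, `query`) and the parameters `w` and `num_hashes` are assumptions chosen for illustration only.

```python
import math
import random


def make_hash(dim, w, rng):
    """Build one p-stable LSH function h(x) = floor((a.x + b) / w).

    `a` has i.i.d. standard normal entries and `b` is uniform in [0, w).
    Nearby points tend to fall into the same integer bucket.
    """
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    b = rng.uniform(0.0, w)

    def h(x):
        dot = sum(ai * xi for ai, xi in zip(a, x))
        return math.floor((dot + b) / w)

    return h


def build_index(points, hashes):
    """Indexing phase: group point ids by their compound hash key."""
    buckets = {}
    for i, p in enumerate(points):
        key = tuple(h(p) for h in hashes)
        buckets.setdefault(key, []).append(i)
    return buckets


def query(q, points, hashes, buckets):
    """Query phase: scan only the bucket that q hashes to.

    Returns the id of the closest candidate, or None if the bucket is empty.
    """
    key = tuple(h(q) for h in hashes)
    candidates = buckets.get(key, [])
    if not candidates:
        return None
    return min(candidates, key=lambda i: math.dist(q, points[i]))


if __name__ == "__main__":
    rng = random.Random(0)
    dim, num_hashes, w = 16, 4, 4.0  # illustrative values, not from the paper
    points = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(1000)]
    hashes = [make_hash(dim, w, rng) for _ in range(num_hashes)]
    buckets = build_index(points, hashes)
    print(query(points[0], points, hashes, buckets))
```

The sketch separates the two phases the abstract distinguishes: `build_index` is the indexing phase and `query` is the query phase. A single hash table like this can miss true neighbors that land in a different bucket, which is why practical LSH schemes use multiple tables or more elaborate query strategies.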
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Data Management and Algorithms · Algorithms and Data Compression
