Norm-Range Partition: A Universal Catalyst for LSH based Maximum Inner Product Search (MIPS)
Xiao Yan, Xinyan Dai, Jie Liu, Kaiwen Zhou, James Cheng

TL;DR
This paper introduces the norm-range partition technique for LSH-based MIPS, which partitions datasets by norm to improve query efficiency and reduce probing, demonstrating significant empirical improvements.
Contribution
The paper proposes norm-range partition, a method that enhances existing LSH-based MIPS algorithms by lowering their query processing complexity and improving query efficiency.
Findings
Reduces query processing complexity for LSH-based MIPS
Significantly decreases the number of probes needed for the same recall
Applicable to multiple existing LSH algorithms with proven theoretical benefits
Abstract
Recently, locality sensitive hashing (LSH) was shown to be effective for MIPS and several algorithms including L2-ALSH, Sign-ALSH and Simple-LSH have been proposed. In this paper, we introduce the norm-range partition technique, which partitions the original dataset into sub-datasets containing items with similar 2-norms and builds a hash index independently for each sub-dataset. We prove that norm-range partition reduces the query processing complexity for all existing LSH based MIPS algorithms under mild conditions. The key to the performance improvement is that norm-range partition allows the use of a smaller normalization factor in most sub-datasets. For efficient query processing, we also formulate a unified framework to rank the buckets from the hash indexes of different sub-datasets. Experiments on real datasets show that norm-range partition significantly reduces the number of probed for…
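The partitioning step described in the abstract can be illustrated with a minimal sketch. It assumes equal-size partitions obtained by sorting items on their 2-norm, and it uses a Simple-LSH style transform (scale by the maximum norm, append sqrt(1 - ||x||^2), then take signs of random projections) as the per-partition hash. The function names, the dictionary layout, and the equal-size split are illustrative assumptions and are not taken from the paper; the unified bucket-ranking framework for queries is not shown.

```python
import numpy as np


def norm_range_partition(items, num_parts):
    """Group row indices so that each group holds items with similar 2-norms.

    Assumption: equal-size groups after sorting by norm (hypothetical choice).
    """
    norms = np.linalg.norm(items, axis=1)
    order = np.argsort(norms)
    parts = [idx for idx in np.array_split(order, num_parts) if idx.size > 0]
    return parts, norms


def simple_lsh_codes(items, max_norm, projections):
    """Sign random projection codes after a Simple-LSH style transform."""
    scaled = items / max_norm                       # every row now has norm <= 1
    sq_norms = np.sum(scaled ** 2, axis=1)
    tail = np.sqrt(np.clip(1.0 - sq_norms, 0.0, None))
    augmented = np.hstack([scaled, tail[:, None]])  # rows have unit norm
    return (augmented @ projections.T) >= 0


def build_indexes(items, num_parts, num_bits, seed=0):
    """Build one independent hash index per norm range."""
    rng = np.random.default_rng(seed)
    parts, norms = norm_range_partition(items, num_parts)
    dim = items.shape[1]
    indexes = []
    for idx in parts:
        # Local normalization factor: the largest norm inside this partition,
        # which is no larger than the global maximum norm.
        local_max = max(float(norms[idx].max()), 1e-12)
        projections = rng.standard_normal((num_bits, dim + 1))
        codes = simple_lsh_codes(items[idx], local_max, projections)
        indexes.append({
            "ids": idx,
            "max_norm": local_max,
            "projections": projections,
            "codes": codes,
        })
    return indexes
```

In this sketch only the last partition is scaled by the global maximum norm; every other partition uses a smaller local factor, which is the effect the abstract identifies as the source of the improvement.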
Taxonomy
Topics: Algorithms and Data Compression · DNA and Biological Computing
