Norm-Ranging LSH for Maximum Inner Product Search
Xiao Yan, Jinfeng Li, Xinyan Dai, Hongzhi Chen, James Cheng

TL;DR
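This paper introduces Norm-ranging LSH, a hashing method for maximum inner product search (MIPS) that partitions a dataset into sub-datasets and builds a hash index for each one independently. It has provably lower query time complexity than Simple-LSH and, in experiments, reaches the same recall with an order of magnitude speedup.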
Contribution
The paper proposes partitioning the dataset before hashing for MIPS. This addresses the excessive normalization that long tails in the 2-norm distribution cause in Simple-LSH, and it lowers query time complexity relative to Simple-LSH.
Findings
Norm-ranging LSH is proven to have lower query time complexity than Simple-LSH.
Partitioning the dataset can also improve other hashing-based methods for MIPS.
Experiments demonstrate an order of magnitude speedup for the same recall.
Abstract
Neyshabur and Srebro proposed Simple-LSH, which is the state-of-the-art hashing method for maximum inner product search (MIPS) with a performance guarantee. We found that the performance of Simple-LSH, in both theory and practice, suffers from long tails in the 2-norm distribution of real datasets. We propose Norm-ranging LSH, which addresses the excessive normalization problem caused by long tails in Simple-LSH by partitioning a dataset into multiple sub-datasets and building a hash index for each sub-dataset independently. We prove that Norm-ranging LSH has lower query time complexity than Simple-LSH. We also show that the idea of partitioning the dataset can improve other hashing-based methods for MIPS. To support efficient query processing on the hash indexes of the sub-datasets, a novel similarity metric is formulated. Experiments show that Norm-ranging LSH achieves an order of…
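The partition-then-index idea in the abstract can be illustrated with a minimal Python sketch. It uses the Simple-LSH transform of Neyshabur and Srebro (scale items so every norm is at most 1, append sqrt(1 - ||x||^2), then apply sign random projections). Several details are assumptions made for illustration and are not taken from this page: the dataset is split into equal-size parts by 2-norm rank, each part is scaled by its own maximum norm, and each part draws its own random projections. The names `build_norm_ranging_index` and `query_candidates` are hypothetical. The query step simply takes the closest codes by Hamming distance within each part; it does not reproduce the similarity metric the paper formulates for ranking across sub-indexes.

```python
import numpy as np

def simple_lsh_transform(items, max_norm):
    """Scale items so all norms are <= 1, then append sqrt(1 - ||x||^2)."""
    scaled = items / max_norm
    sq_norms = np.sum(scaled ** 2, axis=1)
    extra = np.sqrt(np.clip(1.0 - sq_norms, 0.0, None))
    return np.hstack([scaled, extra[:, None]])

def transform_query(query):
    """Normalize the query to unit length and append a zero coordinate."""
    return np.append(query / np.linalg.norm(query), 0.0)

def sign_hash(vectors, projections):
    """Sign random projection: one bit per projection direction."""
    return (vectors @ projections >= 0).astype(np.uint8)

def build_norm_ranging_index(items, num_parts, num_bits, rng):
    """Assumed scheme: split by 2-norm rank, index each part on its own."""
    norms = np.linalg.norm(items, axis=1)
    order = np.argsort(norms)
    index = []
    for ids in np.array_split(order, num_parts):
        local_max = norms[ids].max()  # per-part scaling, not the global max
        proj = rng.standard_normal((items.shape[1] + 1, num_bits))
        codes = sign_hash(simple_lsh_transform(items[ids], local_max), proj)
        index.append({"ids": ids, "max_norm": local_max,
                      "proj": proj, "codes": codes})
    return index

def query_candidates(index, query, per_part):
    """Collect the per_part closest codes (Hamming) from every sub-index."""
    q = transform_query(query)
    out = []
    for part in index:
        q_code = sign_hash(q[None, :], part["proj"])[0]
        hamming = np.count_nonzero(part["codes"] != q_code, axis=1)
        out.extend(part["ids"][np.argsort(hamming)[:per_part]])
    return np.array(out)

# Usage on synthetic data with long-tailed norms (illustrative only).
rng = np.random.default_rng(0)
data = rng.standard_normal((1000, 16)) * rng.lognormal(0.0, 1.0, size=(1000, 1))
index = build_norm_ranging_index(data, num_parts=8, num_bits=32, rng=rng)
query = rng.standard_normal(16)
candidates = query_candidates(index, query, per_part=10)
# Rerank the candidates by exact inner product with the query.
best = candidates[np.argmax(data[candidates] @ query)]
```

Scaling each part by its own maximum norm keeps the scaled item norms closer to 1 than a single global maximum would when a few items have very large norms, which is the excessive normalization effect the abstract describes.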
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Caching and Content Delivery · Algorithms and Data Compression
