Sequential Hypothesis Tests for Adaptive Locality Sensitive Hashing

Aniket Chakrabarti; Srinivasan Parthasarathy

arXiv:1412.3103·cs.IR·June 29, 2016

TL;DR

This paper introduces sequential hypothesis testing methods to improve the efficiency of Locality Sensitive Hashing (LSH) algorithms for high-dimensional similarity search, enabling more aggressive candidate pruning with controlled accuracy loss.

Contribution

It formulates sequential hypothesis tests for LSH, proposing a vanilla SPRT and two novel variants, including extensions for approximate similarity computation.
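As a rough illustration of the vanilla SPRT idea, the sketch below runs Wald's sequential probability ratio test over a stream of 0/1 hash-match outcomes for one candidate pair. It is a minimal sketch under stated assumptions, not the paper's exact formulation: the function name `sprt_prune`, the indifference region `(p0, p1)` placed around the similarity threshold, and the default error rates `alpha` and `beta` are all hypothetical choices made for this example. It also assumes a hash family such as minhash, where the per-hash match probability equals the similarity.

```python
import math


def sprt_prune(matches, p0, p1, alpha=0.05, beta=0.05):
    """Wald's SPRT on a stream of 0/1 hash-match outcomes (illustrative only).

    H0: per-hash match probability is p0 (pair is below threshold, prune it).
    H1: per-hash match probability is p1 (pair is above threshold, keep it).
    Requires 0 < p0 < p1 < 1. Returns (decision, hashes_compared).
    """
    upper = math.log((1 - beta) / alpha)   # crossing this accepts H1
    lower = math.log(beta / (1 - alpha))   # crossing this accepts H0
    step_match = math.log(p1 / p0)              # log-likelihood ratio of a match
    step_miss = math.log((1 - p1) / (1 - p0))   # log-likelihood ratio of a miss

    llr = 0.0
    n = 0
    for n, matched in enumerate(matches, start=1):
        llr += step_match if matched else step_miss
        if llr >= upper:
            return "keep", n
        if llr <= lower:
            return "prune", n
    # Ran out of hashes before either boundary was crossed.
    return "undecided", n


if __name__ == "__main__":
    # Hypothetical threshold of 0.7 with an indifference region of +/- 0.05.
    clearly_dissimilar = [0] * 100
    print(sprt_prune(clearly_dissimilar, p0=0.65, p1=0.75))
```

The point of the sequential form is that the number of hashes compared adapts to the pair: clearly dissimilar pairs cross the lower boundary after only a few comparisons, while borderline pairs consume more hashes before a decision is reached.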

Findings

01. Sequential tests enable adaptive candidate pruning in LSH.

02. Proposed methods improve search efficiency while maintaining accuracy.

03. Extensions handle approximate similarity with confidence intervals.
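To make the confidence-interval idea concrete, the sketch below estimates similarity from the fraction of matching hashes and attaches a two-sided interval. It uses a plain Hoeffding bound, which is a swapped-in textbook technique chosen for simplicity and not the procedure from the paper; the function name `similarity_interval` and the parameter `delta` are hypothetical. It assumes each hash comparison is an independent Bernoulli trial whose success probability equals the similarity, as with minhash and Jaccard similarity.

```python
import math


def similarity_interval(num_matches, num_hashes, delta=0.05):
    """Estimate similarity from hash matches with a Hoeffding interval.

    Illustrative only: with probability at least 1 - delta the true match
    probability lies within `half_width` of the empirical match fraction.
    Returns (estimate, lower, upper), clipped to [0, 1].
    """
    if num_hashes <= 0:
        raise ValueError("num_hashes must be positive")
    estimate = num_matches / num_hashes
    half_width = math.sqrt(math.log(2 / delta) / (2 * num_hashes))
    return estimate, max(0.0, estimate - half_width), min(1.0, estimate + half_width)


if __name__ == "__main__":
    est, lo, hi = similarity_interval(80, 100)
    print(f"estimate={est:.3f}, interval=[{lo:.3f}, {hi:.3f}]")
```

Because the half-width shrinks as `1 / sqrt(num_hashes)`, a procedure of this kind can stop comparing hashes once the interval is narrow enough for the application, rather than computing similarity exactly on the original data.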

Abstract

All pairs similarity search is a problem where a set of data objects is given and the task is to find all pairs of objects that have similarity above a certain threshold for a given similarity measure of interest. When the number of points or the dimensionality is high, standard solutions fail to scale gracefully. Approximate solutions such as Locality Sensitive Hashing (LSH) and its Bayesian variants (BayesLSH and BayesLSHLite) alleviate the problem to some extent and provide substantial speedup over traditional index-based approaches. BayesLSH is used both for pruning the candidate space and for computing approximate similarity, whereas BayesLSHLite can only prune candidates, so similarity must then be computed exactly on the original data. Thus, wherever the explicit data representation is available and exact similarity computation is not too expensive, BayesLSHLite can be used to…
