Practical Near Neighbor Search via Group Testing
Joshua Engels, Benjamin Coleman, Anshumali Shrivastava

TL;DR
This paper introduces FLINNG, a novel approximate near neighbor search algorithm that combines group testing with locality-sensitive hashing, achieving faster query times and reduced memory usage in high-dimensional data tasks.
Contribution
The paper presents FLINNG, a new algorithm that reduces near neighbor search to group testing, offering practical advantages like single-pass construction and no distance computations.
Findings
FLINNG achieves up to 10x faster query speeds compared to HNSW and FAISS.
It also requires less indexing time and memory than those baselines.
The algorithm performs well on high-dimensional search tasks such as genome search and image search.
Abstract
We present a new algorithm for the approximate near neighbor problem that combines classical ideas from group testing with locality-sensitive hashing (LSH). We reduce the near neighbor search problem to a group testing problem by designating neighbors as "positives," non-neighbors as "negatives," and approximate membership queries as group tests. We instantiate this framework using distance-sensitive Bloom Filters to Identify Near-Neighbor Groups (FLINNG). We prove that FLINNG has sub-linear query time and show that our algorithm comes with a variety of practical advantages. For example, FLINNG can be constructed in a single pass through the data, consists entirely of efficient integer operations, and does not require any distance computations. We conduct large-scale experiments on high-dimensional search tasks such as genome search, URL similarity search, and embedding search over the…
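The reduction described in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation: the class name `ToyGroupTestIndex`, its parameters, and the acceptance rule are hypothetical. As an assumption, it uses signed random projections as the LSH family and plain Python sets in place of distance-sensitive Bloom filters. Each repetition randomly partitions the points into groups, a group "tests positive" when the query's hash codes appear in that group's set, and a point is kept as a candidate only if its group tests positive in every repetition.

```python
import random


class ToyGroupTestIndex:
    """Toy sketch: R random partitions of the data into B groups each."""

    def __init__(self, dim, num_groups=8, num_reps=4, num_tables=6, bits=8, seed=0):
        self.rng = random.Random(seed)
        self.B, self.R, self.m = num_groups, num_reps, num_tables
        # One set of random hyperplanes per hash table (signed random projections).
        self.planes = [
            [[self.rng.gauss(0, 1) for _ in range(dim)] for _ in range(bits)]
            for _ in range(num_tables)
        ]
        # cells[r][b]: (table, code) pairs of every point in group b of repetition r.
        self.cells = [[set() for _ in range(num_groups)] for _ in range(num_reps)]
        # members[r][b]: ids of the points assigned to that group.
        self.members = [[[] for _ in range(num_groups)] for _ in range(num_reps)]
        self.n = 0

    def _codes(self, x):
        """Return one (table, integer code) pair per hash table."""
        codes = []
        for t, planes in enumerate(self.planes):
            code = 0
            for p in planes:
                dot = sum(a * b for a, b in zip(p, x))
                code = (code << 1) | int(dot >= 0)
            codes.append((t, code))
        return codes

    def add(self, x):
        """Insert one point; the index is built in a single pass over the data."""
        pid = self.n
        self.n += 1
        codes = self._codes(x)
        for r in range(self.R):
            b = self.rng.randrange(self.B)  # random group for this repetition
            self.cells[r][b].update(codes)
            self.members[r][b].append(pid)
        return pid

    def query(self, q, min_matches=None):
        """Return ids whose group tested positive in every repetition."""
        need = self.m if min_matches is None else min_matches
        codes = self._codes(q)
        scores = [0] * self.n
        for r in range(self.R):
            for b in range(self.B):
                # Group test: how many of the query's codes does this group contain?
                hits = sum(c in self.cells[r][b] for c in codes)
                if hits >= need:
                    for pid in self.members[r][b]:
                        scores[pid] += 1
        return [pid for pid in range(self.n) if scores[pid] == self.R]


# Usage: index 200 random 16-dimensional points, then query with a stored point.
data_rng = random.Random(1)
data = [[data_rng.gauss(0, 1) for _ in range(16)] for _ in range(200)]
index = ToyGroupTestIndex(dim=16)
for x in data:
    index.add(x)
candidates = index.query(data[42])
```

In this sketch the query is hashed once and then compared against group-level sets only, so no distance between the query and any stored point is ever computed. Lowering `min_matches` below the number of tables makes the group test more tolerant of points that are near, but not identical to, the query, at the cost of more false-positive groups.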
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · HIV Research and Treatment · SARS-CoV-2 detection and testing
