Practical Near Neighbor Search via Group Testing

Joshua Engels; Benjamin Coleman; Anshumali Shrivastava

arXiv:2106.11565 · cs.DS · June 23, 2021 · 1 citation

TL;DR

This paper introduces FLINNG, an approximate near neighbor search algorithm that combines group testing with locality-sensitive hashing, achieving faster queries and lower memory use on high-dimensional search tasks.

Contribution

The paper presents FLINNG, a new algorithm that reduces near neighbor search to group testing, offering practical advantages like single-pass construction and no distance computations.

Findings

01. FLINNG answers queries up to 10x faster than HNSW and FAISS.

02. It requires less indexing time and memory than those baselines.

03. It performs well on high-dimensional search tasks such as genome and image search.

Abstract

We present a new algorithm for the approximate near neighbor problem that combines classical ideas from group testing with locality-sensitive hashing (LSH). We reduce the near neighbor search problem to a group testing problem by designating neighbors as "positives," non-neighbors as "negatives," and approximate membership queries as group tests. We instantiate this framework using distance-sensitive Bloom Filters to Identify Near-Neighbor Groups (FLINNG). We prove that FLINNG has sub-linear query time and show that our algorithm comes with a variety of practical advantages. For example, FLINNG can be constructed in a single pass through the data, consists entirely of efficient integer operations, and does not require any distance computations. We conduct large-scale experiments on high-dimensional search tasks such as genome search, URL similarity search, and embedding search over the…
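The reduction described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the names `make_simhash` and `ToyGroupTestIndex` and all parameter values are hypothetical, signed random projections are assumed as the LSH family, exact Python sets stand in for Bloom filter bit arrays, and candidates are decoded by a plain intersection across repetitions. The sketch does keep two properties the abstract highlights: the index is built in one pass over the data, and queries compare only integer hash codes, never distances.

```python
import random


def make_simhash(dim, bits, rng):
    """Return a signed-random-projection hash: vector -> `bits`-bit integer."""
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]

    def hash_vector(vec):
        code = 0
        for plane in planes:
            dot = sum(p * v for p, v in zip(plane, vec))
            code = (code << 1) | int(dot >= 0.0)
        return code

    return hash_vector


class ToyGroupTestIndex:
    """Toy group-testing index (illustrative assumption, not FLINNG itself).

    Each repetition randomly splits the items into `num_groups` groups.
    A group's "test" for a query is positive when the query's hash code
    matches some member's code in at least `min_hits` hash tables.
    """

    def __init__(self, dim, num_groups=8, repetitions=4, tables=6, bits=8, seed=0):
        self.rng = random.Random(seed)
        self.num_groups = num_groups
        self.repetitions = repetitions
        self.hashes = [make_simhash(dim, bits, self.rng) for _ in range(tables)]
        # codes[r][b][t]: hash codes from table t seen in group b of repetition r
        self.codes = [[[set() for _ in range(tables)] for _ in range(num_groups)]
                      for _ in range(repetitions)]
        # members[r][b]: ids of the items assigned to group b in repetition r
        self.members = [[[] for _ in range(num_groups)] for _ in range(repetitions)]

    def add(self, item_id, vec):
        """Insert one item; the whole index is built in a single pass."""
        item_codes = [h(vec) for h in self.hashes]
        for r in range(self.repetitions):
            b = self.rng.randrange(self.num_groups)
            self.members[r][b].append(item_id)
            for t, code in enumerate(item_codes):
                self.codes[r][b][t].add(code)

    def query(self, vec, min_hits=None):
        """Return ids that fall in a positive group in every repetition."""
        if min_hits is None:
            min_hits = len(self.hashes)
        query_codes = [h(vec) for h in self.hashes]
        candidates = None
        for r in range(self.repetitions):
            positives = set()
            for b in range(self.num_groups):
                hits = sum(code in self.codes[r][b][t]
                           for t, code in enumerate(query_codes))
                if hits >= min_hits:  # group test is positive
                    positives.update(self.members[r][b])
            candidates = positives if candidates is None else candidates & positives
        return candidates or set()


# Usage: index 200 random vectors, then query with an exact copy of item 7.
rng = random.Random(1)
data = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(200)]
index = ToyGroupTestIndex(dim=16)
for item_id, vec in enumerate(data):
    index.add(item_id, vec)
found = index.query(data[7])
print(7 in found, len(found))
```

An exact copy of a stored item collides with that item in every hash table, so its group tests positive in every repetition and the item always survives the intersection. Lowering `min_hits` makes the tests more tolerant of near (rather than exact) matches at the cost of more false positives.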

Code & Models

Repositories

joshengels/flinng

Videos

Practical Near Neighbor Search via Group Testing · SlidesLive

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · HIV Research and Treatment · SARS-CoV-2 detection and testing