Efficient Distributed Locality Sensitive Hashing
Bahman Bahmani, Ashish Goel, Rajendra Shinde

TL;DR
This paper introduces Layered LSH, an improvement over Entropy LSH that significantly reduces network traffic and improves runtime efficiency in distributed high-dimensional similarity search.
Contribution
It proposes the distributed Layered LSH scheme, which exponentially decreases network cost while maintaining load balance, building upon Entropy LSH for Euclidean space.
Findings
Reduces network traffic in distributed LSH exponentially compared with Entropy LSH.
Maintains load balance across machines during search.
Achieves significant runtime improvements in real-world applications.
Abstract
Distributed frameworks are gaining increasingly widespread use in applications that process large amounts of data. One important example application is large-scale similarity search, for which Locality Sensitive Hashing (LSH) has emerged as the method of choice, especially when the data is high-dimensional. At its core, LSH is based on hashing the data points to a number of buckets such that similar points are more likely to map to the same buckets. To guarantee high search quality, the LSH scheme needs a rather large number of hash tables. This entails a large space requirement, and in the distributed setting, where each query requires a network call per hash bucket lookup, it also entails a heavy network load. The Entropy LSH scheme proposed by Panigrahy significantly reduces the number of required hash tables by looking up a number of query offsets in addition to the query itself.…
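To make the bucket-lookup cost concrete, here is a minimal Python sketch of Euclidean LSH with Entropy-LSH-style query offsets. It is an illustration under stated assumptions, not the paper's implementation and not the Layered LSH scheme: the hash form h(v) = floor((a·v + b) / w) is the standard Gaussian-projection LSH for Euclidean space, and all function names and parameters (`make_table_key`, `query_offsets`, `radius`, `n_offsets`, `w`) are hypothetical choices made for this example.

```python
import math
import random

def make_hash(dim, w, rng):
    """One Gaussian-projection LSH function: h(v) = floor((a.v + b) / w)."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    b = rng.uniform(0.0, w)
    def h(v):
        return math.floor((sum(x * y for x, y in zip(a, v)) + b) / w)
    return h

def make_table_key(dim, k, w, rng):
    """Concatenate k hash functions into one bucket key for a single table."""
    hashes = [make_hash(dim, w, rng) for _ in range(k)]
    return lambda v: tuple(h(v) for h in hashes)

def query_offsets(q, radius, n_offsets, rng):
    """Sample points at distance `radius` from q (assumed offset scheme)."""
    offsets = []
    for _ in range(n_offsets):
        g = [rng.gauss(0.0, 1.0) for _ in q]
        norm = math.sqrt(sum(x * x for x in g))
        offsets.append([qi + radius * gi / norm for qi, gi in zip(q, g)])
    return offsets

def buckets_to_probe(q, key, radius, n_offsets, rng):
    """Bucket keys for the query plus its offsets, deduplicated."""
    keys = {key(q)}
    for offset in query_offsets(q, radius, n_offsets, rng):
        keys.add(key(offset))
    return keys

rng = random.Random(0)
key = make_table_key(dim=4, k=3, w=4.0, rng=rng)
q = [0.1, 0.2, 0.3, 0.4]
probes = buckets_to_probe(q, key, radius=1.0, n_offsets=5, rng=rng)
print(len(probes), "distinct buckets to look up")
```

In a naive distributed deployment where buckets are spread across machines, each distinct key in `probes` can require its own network call, which is the cost the abstract describes.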
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Caching and Content Delivery · Algorithms and Data Compression
