On the LSH Distortion of Ulam and Cayley Similarities
Flavio Chierichetti, Mirko Giacchini, Ravi Kumar, Erasmo Tani

TL;DR
This paper studies the LSH distortion of the Ulam and Cayley similarities on permutations of n elements. It proves a sublinear upper bound of O(n / sqrt(log n)) and a lower bound of Ω(n^{0.12}) for Ulam, and a tight Θ(n) bound for Cayley, clarifying how well each is suited to hashing-based similarity search.
Contribution
It provides the first bounds on the LSH distortion of Ulam and Cayley similarities, including a sublinear upper bound for Ulam and a tight linear bound for Cayley.
Findings
Upper bound: the Ulam similarity admits an LSH distortion of O(n / sqrt(log n)), which is sublinear in n
Lower bound: the LSH distortion of the Ulam similarity is Ω(n^{0.12})
Tight bound: the LSH distortion of the Cayley similarity is Θ(n)
Abstract
Locality-sensitive hashing (LSH) has found widespread use as a fundamental primitive, particularly to accelerate nearest neighbor search. An LSH scheme for a similarity function S: X × X → [0, 1] is a distribution over hash functions on X with the property that the probability of collision of any two elements x, y ∈ X is exactly equal to S(x, y). However, not all similarity functions admit exact LSH schemes. The notion of LSH distortion measures how multiplicatively close a similarity function is to having an LSH scheme. In this work, we study the LSH distortion of the Ulam and Cayley similarities, which are popular similarity measures on permutations of n elements. We show that the Ulam similarity admits a sublinear LSH distortion of O(n / sqrt(log n)); we also prove a lower bound of Ω(n^{0.12}) on the best LSH…
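For readers unfamiliar with the two measures, the following is a minimal Python sketch of the distances that the Ulam and Cayley similarities are based on. It is not taken from the paper. It assumes the common conventions that the Ulam distance is n minus the length of the longest common subsequence of the two permutations, and that the Cayley distance is n minus the number of cycles of the permutation mapping one to the other. The function names `ulam_distance` and `cayley_distance` are illustrative, and how the paper normalizes these distances into similarities in [0, 1] is not stated in this summary, so that step is left out.

```python
from bisect import bisect_left


def ulam_distance(p, q):
    """Assumed convention: n minus the longest common subsequence of p and q."""
    pos = {v: i for i, v in enumerate(q)}  # position of each element in q
    seq = [pos[v] for v in p]              # p rewritten in q's coordinates
    tails = []                             # patience sorting: LIS of seq = LCS of p, q
    for x in seq:
        k = bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)
        else:
            tails[k] = x
    return len(p) - len(tails)


def cayley_distance(p, q):
    """Assumed convention: n minus the number of cycles of the map from p to q."""
    pos = {v: i for i, v in enumerate(q)}
    perm = [pos[v] for v in p]             # index permutation taking p to q
    seen = [False] * len(p)
    cycles = 0
    for i in range(len(p)):
        if not seen[i]:
            cycles += 1
            j = i
            while not seen[j]:             # walk the cycle containing i
                seen[j] = True
                j = perm[j]
    return len(p) - cycles


identity = [0, 1, 2, 3, 4]
rotated = [1, 2, 3, 4, 0]
print(ulam_distance(identity, rotated))    # one element moved
print(cayley_distance(identity, rotated))  # a single 5-cycle
```

On the rotated permutation the two distances differ sharply: moving a single element suffices under the Ulam convention, while undoing a 5-cycle takes four transpositions. This gives some intuition for why the two similarities can behave differently under hashing.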
