TL;DR
This paper introduces a new randomized algorithm for the Closest Pair Problem in the Hamming metric on uniformly random input. It outperforms previous methods when the dimension is small compared to the dataset size, matches the time complexity of the best-known locality-sensitive hashing algorithms for moderate to large dimensions, and comes with a simpler, modular analysis.
Contribution
The paper presents a randomized algorithm for the Hamming closest pair problem that improves the time complexity in the small-dimension regime and offers a simplified, modular analysis that also covers non-uniform input distributions.
Findings
Outperforms previous algorithms when the dimension is small compared to the dataset size
Matches the best-known time complexity (locality-sensitive hashing) in moderate to large dimensions
Shows promising practical performance in an initial implementation
Abstract
We study the Closest Pair Problem in the Hamming metric, which asks to find the pair with the smallest Hamming distance in a collection of binary vectors. We give a new randomized algorithm for the problem on uniformly random input that outperforms previous approaches whenever the dimension of the input points is small compared to the dataset size. For moderate to large dimensions, our algorithm matches the time complexity of the previously best-known algorithms based on locality-sensitive hashing. Technically, our algorithm follows design principles similar to those of Dubiner (IEEE Trans. Inf. Theory 2010) and May-Ozerov (Eurocrypt 2015). Besides improving the time complexity in the aforementioned regimes, we significantly simplify the analysis of these previous works. We give a modular analysis, which also allows us to investigate the performance of the algorithm on non-uniform input distributions.…
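To make the problem statement concrete, here is a minimal sketch of the naive quadratic baseline: compare every pair of binary vectors and keep the one with the smallest Hamming distance. This is not the paper's algorithm, only an illustration of the task it speeds up. The function names and the choice to encode each binary vector as a Python integer are assumptions made for this example.

```python
from itertools import combinations


def hamming_distance(x: int, y: int) -> int:
    """Number of bit positions in which x and y differ."""
    return bin(x ^ y).count("1")


def closest_pair_bruteforce(vectors):
    """Return (distance, i, j) for the closest pair of vectors.

    Illustrative baseline only: it checks all n*(n-1)/2 pairs,
    whereas the paper's randomized algorithm avoids this.
    Each vector is assumed to be an int whose bits are the coordinates.
    """
    if len(vectors) < 2:
        raise ValueError("need at least two vectors")
    best = None
    for (i, x), (j, y) in combinations(enumerate(vectors), 2):
        d = hamming_distance(x, y)
        if best is None or d < best[0]:
            best = (d, i, j)
    return best


if __name__ == "__main__":
    points = [0b0000, 0b1111, 0b1110, 0b0011]
    print(closest_pair_bruteforce(points))
```

The quadratic cost in the number of vectors is what makes this baseline impractical for large datasets, which is the motivation for hashing-based and related randomized approaches.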
