Breaking the Variance: Approximating the Hamming Distance in $\tilde O(1/\epsilon)$ Time Per Alignment
Tsvi Kopelowitz, Ely Porat

TL;DR
This paper presents an algorithm that approximates the Hamming distance between a pattern and every alignment of a text of length $n$ in $\tilde O(n/\epsilon)$ total time, that is, $\tilde O(1/\epsilon)$ time per alignment, improving the dependence on $1/\epsilon$ over previous methods.
Contribution
It introduces a new $\tilde O(n/\epsilon)$ time algorithm for approximate Hamming distance computation, utilizing innovative sparse recovery and hashing techniques for pairwise character mismatches.
Findings
Achieves $\tilde O(n/\epsilon)$ runtime for approximate Hamming distance
Introduces a new sparse recovery method for pair inputs
Develops a fast hashing/projection construction for mismatch counting
Abstract
The algorithmic task of computing the Hamming distance between a given pattern of length $m$ and each location in a text of length $n$ is one of the most fundamental tasks in string algorithms. Unfortunately, there is evidence that for a text $T$ of size $n$ and a pattern $P$ of size $m$, one cannot compute the exact Hamming distance for all locations in $T$ in time which is less than $\tilde O(n\sqrt{m})$. However, Karloff~\cite{karloff} showed that if one is willing to suffer a $1\pm\epsilon$ approximation, then it is possible to solve the problem with high probability, in $\tilde O(n/\epsilon^2)$ time. Due to related lower bounds for computing the Hamming distance of two strings in the one-way communication complexity model, it is strongly believed that obtaining an algorithm for solving the approximation version cannot be done much faster as a function of…
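To make the problem statement in the abstract concrete, here is a minimal Python sketch of the exact baseline and of the approximation condition. It is not the paper's algorithm: `hamming_all_alignments` is the direct $O(nm)$ comparison, and `within_factor` only checks whether a candidate estimate satisfies a $(1\pm\epsilon)$ guarantee. Both function names are hypothetical and chosen for illustration.

```python
def hamming_all_alignments(text, pattern):
    """Exact Hamming distance between the pattern and every alignment of the text.

    Direct character-by-character comparison: O(n * m) time for a text of
    length n and a pattern of length m. Baseline only, not the paper's method.
    """
    n, m = len(text), len(pattern)
    return [
        sum(1 for j in range(m) if text[i + j] != pattern[j])
        for i in range(n - m + 1)
    ]


def within_factor(estimate, exact, eps):
    """True if estimate is a (1 +/- eps) multiplicative approximation of exact."""
    return (1 - eps) * exact <= estimate <= (1 + eps) * exact


if __name__ == "__main__":
    exact = hamming_all_alignments("abcabd", "abd")
    print(exact)
    # An estimate of 1.1 for a true distance of 1 is acceptable at eps = 0.2.
    print(within_factor(1.1, exact[0], 0.2))
```

An approximation algorithm for this problem must return, for each alignment, a value that passes a check like `within_factor` with high probability; the interest lies in doing so much faster than the baseline above.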
Taxonomy
Topics: Algorithms and Data Compression · Computational Geometry and Mesh Generation · Handwritten Text Recognition Techniques
