Approximating Text-to-Pattern Hamming Distances
Timothy M. Chan, Shay Golan, Tomasz Kociumaka, Tsvi Kopelowitz, Ely Porat

TL;DR
This paper introduces faster approximation algorithms for computing the Hamming distance between a pattern and every location of a text, including the first linear-time approximation algorithm, along with improved exact algorithms and streaming solutions.
Contribution
It presents novel approximation algorithms that are faster, simpler, and do not rely on FFT, including the first linear-time approximation and improved exact and streaming algorithms.
Findings
First linear-time approximation algorithm with $O(\frac{1}{\varepsilon^2} n)$ time.
Enhanced exact algorithms with logarithmic improvements and linear time for small thresholds.
Sublinear-time property tester and streaming algorithms for approximate Hamming distance detection.
Abstract
We revisit a fundamental problem in string matching: given a pattern of length $m$ and a text of length $n$, both over an alphabet of size $\sigma$, compute the Hamming distance between the pattern and the text at every location. Several $(1+\varepsilon)$-approximation algorithms have been proposed in the literature, with running time of the form $O(\varepsilon^{-O(1)} n \log n \log m)$, all using fast Fourier transform (FFT). We describe a simple $(1+\varepsilon)$-approximation algorithm that is faster and does not need FFT. Combining our approach with additional ideas leads to numerous new results:
- We obtain the first linear-time approximation algorithm; the running time is $O(\varepsilon^{-2} n)$.
- We obtain a faster exact algorithm computing all Hamming distances up to a given threshold $k$; its running time improves previous results by logarithmic factors and is linear if $k \le \sqrt{m}$.
- We…
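To make the problem statement concrete, here is a minimal sketch of the naive quadratic-time baseline that computes the exact Hamming distance at every location. This is not the paper's algorithm. It only illustrates the output that the approximation and exact algorithms are meant to produce or estimate. The function name `hamming_distances` is a hypothetical choice for illustration.

```python
from typing import List


def hamming_distances(text: str, pattern: str) -> List[int]:
    """Naive O(n * m) baseline (not the paper's method).

    Returns a list d where d[i] is the number of positions j at which
    pattern[j] differs from text[i + j], for every alignment i of the
    pattern against the text.
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        # No valid alignment exists.
        return []
    return [
        sum(1 for j in range(m) if text[i + j] != pattern[j])
        for i in range(n - m + 1)
    ]


if __name__ == "__main__":
    print(hamming_distances("abcabc", "abc"))
```

An approximation algorithm in the sense of the abstract would return, for each location, an estimate close to the corresponding entry of this list, without spending time proportional to $m$ per location.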
