Breaking the Variance: Approximating the Hamming Distance in $\tilde O(1/\epsilon)$ Time Per Alignment

Tsvi Kopelowitz; Ely Porat

arXiv:1512.04515·cs.DS·December 15, 2015

TL;DR

This paper presents an algorithm that approximates, within a $1 \pm \epsilon$ factor, the Hamming distance between a pattern and every alignment of a text of length $n$ in $\tilde O(n/\epsilon)$ total time, improving on the previous $\tilde O(n/\epsilon^{2})$ bound.

Contribution

It introduces a new $\tilde O(n/\epsilon)$ time algorithm for approximate Hamming distance computation, utilizing innovative sparse recovery and hashing techniques for pairwise character mismatches.

Findings

01 Achieves $\tilde O(n/\epsilon)$ runtime for approximate Hamming distance

02 Introduces a new sparse recovery method for pair inputs

03 Develops a fast hashing/projection construction for mismatch counting

Abstract

The algorithmic task of computing the Hamming distance between a given pattern of length $m$ and each location in a text of length $n$ is one of the most fundamental tasks in string algorithms. Unfortunately, there is evidence that for a text $T$ of size $n$ and a pattern $P$ of size $m$, one cannot compute the exact Hamming distance for all locations in $T$ in time which is less than $\tilde{O}(n \sqrt{m})$. However, Karloff~\cite{karloff} showed that if one is willing to suffer a $1 \pm \epsilon$ approximation, then it is possible to solve the problem, with high probability, in $\tilde{O}(\frac{n}{\epsilon^{2}})$ time. Due to related lower bounds for computing the Hamming distance of two strings in the one-way communication complexity model, it is strongly believed that obtaining an algorithm for solving the approximation version cannot be done much faster as a function of…
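To make the problem statement concrete, the following is a minimal sketch of the exact computation and of the $1 \pm \epsilon$ guarantee described above. It is the naive $O(nm)$ baseline, not the algorithm of this paper or of Karloff; the function names `hamming_all_alignments` and `is_valid_approximation` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def hamming_all_alignments(text: str, pattern: str) -> List[int]:
    """Exact Hamming distance between pattern and text[i:i+m] for each valid i.

    Naive O(n * m) baseline for illustration only (an assumption of this
    sketch, not the method described in the abstract).
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    return [
        sum(1 for a, b in zip(text[i:i + m], pattern) if a != b)
        for i in range(n - m + 1)
    ]


def is_valid_approximation(
    exact: Sequence[int], estimate: Sequence[float], eps: float
) -> bool:
    """Check that every estimate lies in [(1 - eps) * d, (1 + eps) * d]."""
    return len(exact) == len(estimate) and all(
        (1 - eps) * d <= e <= (1 + eps) * d for d, e in zip(exact, estimate)
    )


distances = hamming_all_alignments("abracadabra", "abra")
print(distances)  # [0, 4, 3, 3, 3, 3, 4, 0]
```

An approximation algorithm of the kind discussed in the abstract would return, for every alignment, an estimate that passes `is_valid_approximation` against these exact values with high probability, while running much faster than the baseline.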

Taxonomy

Topics: Algorithms and Data Compression · Computational Geometry and Mesh Generation · Handwritten Text Recognition Techniques