The streaming $k$-mismatch problem
Raphaël Clifford, Tomasz Kociumaka, Ely Porat

TL;DR
This paper introduces a streaming algorithm for the $k$-mismatch problem that reports every alignment of a pattern against a text with at most $k$ mismatches, using small working space and near-optimal time per input symbol, and resolves an open problem from FOCS'09.
Contribution
It presents a novel streaming algorithm for the $k$-mismatch problem with near-optimal space and time complexity, resolving an open problem from FOCS'09.
Findings
Uses $O(k \log n \log \frac{n}{k})$ bits of space
Processes each input symbol in nearly optimal time
Provides a deterministic encoding for alignments with Hamming distance at most $k$
Abstract
We consider the streaming complexity of a fundamental task in approximate pattern matching: the $k$-mismatch problem. It asks to compute Hamming distances between a pattern of length $n$ and all length-$n$ substrings of a text for which the Hamming distance does not exceed a given threshold $k$. In our problem formulation, we report not only the Hamming distance but also, on demand, the full mismatch information, that is, the list of mismatched pairs of symbols and their indices. The twin challenges of streaming pattern matching derive from the need both to achieve small working space and to guarantee that every arriving input symbol is processed quickly. We present a streaming algorithm for the $k$-mismatch problem which uses $O(k \log n \log \frac{n}{k})$ bits of space and spends \ourcomplexity time on each symbol of the input stream, which consists of the pattern followed…
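To make the problem statement concrete, the following is a minimal offline reference sketch in Python. It is not the streaming algorithm from the paper: it keeps the whole pattern and text in memory and compares every alignment directly, so it only illustrates what output the $k$-mismatch problem asks for. The function name `k_mismatch_naive` and the tuple layout of the results are assumptions chosen for illustration.

```python
from typing import Iterator, List, Tuple

# Hypothetical record layout: (offset within the pattern, pattern symbol, text symbol).
Mismatch = Tuple[int, str, str]


def k_mismatch_naive(
    pattern: str, text: str, k: int
) -> Iterator[Tuple[int, int, List[Mismatch]]]:
    """Yield (start, hamming_distance, mismatches) for each alignment of
    `pattern` against `text` whose Hamming distance is at most k.

    Naive offline baseline: O(len(pattern) * len(text)) time in the worst
    case, and it stores the full input, unlike a streaming algorithm.
    """
    n = len(pattern)
    for start in range(len(text) - n + 1):
        mismatches: List[Mismatch] = []
        for j in range(n):
            if pattern[j] != text[start + j]:
                mismatches.append((j, pattern[j], text[start + j]))
                if len(mismatches) > k:
                    break  # threshold exceeded, this alignment is not reported
        if len(mismatches) <= k:
            yield start, len(mismatches), mismatches


if __name__ == "__main__":
    for start, dist, info in k_mismatch_naive("abc", "abcabdxbc", k=1):
        print(start, dist, info)
```

With pattern `abc`, text `abcabdxbc`, and $k = 1$, the sketch reports the alignments starting at positions 0, 3, and 6, each with its mismatch information. A streaming solution has to produce the same answers while reading the text one symbol at a time and without storing it.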
Taxonomy
Topics: Algorithms and Data Compression · DNA and Biological Computing · Semigroups and Automata Theory
