Streaming dictionary matching with mismatches

Paweł Gawrychowski; Tatiana Starikovskaya

arXiv:1809.02517·cs.DS·June 22, 2021

TL;DR

This paper extends efficient streaming algorithms from the $k$-mismatch problem to the harder problem of dictionary matching with $k$ mismatches. It gives a randomized streaming algorithm with $O(k d \log^{k} d \,\mathrm{polylog}(n))$ space and proves a lower bound of $\Omega(k d)$ bits of space.

Contribution

It introduces a novel streaming algorithm for dictionary matching with $k$ mismatches and proves a lower bound on space complexity for this problem.

Findings

01

Developed a randomized streaming algorithm with $O(k d \log^{k} d \,\mathrm{polylog}(n))$ space.

02

Achieved $O(k \log^{k} d \,\mathrm{polylog}(n) + |\mathrm{occ}|)$ time per position, where $\mathrm{occ}$ is the set of occurrences reported at that position.

03

Proved a lower bound of $\Omega(k d)$ bits of space for any streaming algorithm.

Abstract

In the $k$-mismatch problem we are given a pattern of length $n$ and a text and must find all locations where the Hamming distance between the pattern and the text is at most $k$. A series of recent breakthroughs has resulted in an ultra-efficient streaming algorithm for this problem that requires only $O(k \log \frac{n}{k})$ space and $O(\log \frac{n}{k} (k \log k + \log^{3} n))$ time per letter [Clifford, Kociumaka, Porat, SODA 2019]. In this work, we consider a strictly harder problem called dictionary matching with $k$ mismatches. In this problem, we are given a dictionary of $d$ patterns, where the length of each pattern is at most $n$, and must find all substrings of the text that are within Hamming distance $k$ from one of the patterns. We develop a streaming algorithm for this problem with $O(k d \log^{k} d \,\mathrm{polylog}(n))$ space and $O(k \log^{k} d \,\mathrm{polylog}(n)…
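To make the problem statement concrete, here is a minimal brute-force sketch of dictionary matching with $k$ mismatches. It is not the paper's streaming algorithm: it keeps the whole text in memory and spends time proportional to the total pattern length at every position, so it only illustrates what output is expected. The names `hamming_at_most` and `naive_dictionary_matching`, and the choice to report occurrences as (end position, pattern index) pairs, are assumptions made for this illustration.

```python
from typing import List, Tuple


def hamming_at_most(a: str, b: str, k: int) -> bool:
    """Return True if equal-length strings a and b differ in at most k positions."""
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > k:
                return False  # stop early once the budget is exceeded
    return True


def naive_dictionary_matching(
    text: str, patterns: List[str], k: int
) -> List[Tuple[int, int]]:
    """Brute-force reference (hypothetical helper, not the streaming algorithm).

    Returns pairs (end, idx) such that the substring text[end - len(p):end]
    is within Hamming distance k of p = patterns[idx]. Positions are scanned
    left to right, mimicking the order in which a streaming algorithm would
    have to report occurrences as letters arrive.
    """
    occurrences = []
    for end in range(1, len(text) + 1):
        for idx, p in enumerate(patterns):
            m = len(p)
            if m <= end and hamming_at_most(text[end - m:end], p, k):
                occurrences.append((end, idx))
    return occurrences
```

For example, `naive_dictionary_matching("abcabd", ["abc", "bd"], 1)` returns `[(3, 0), (3, 1), (6, 0), (6, 1)]`: at end position 3 the substring "abc" matches the first pattern exactly and "bc" is within one mismatch of "bd", and at end position 6 the roles are swapped. With `k = 0` only the exact occurrences `[(3, 0), (6, 1)]` remain.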
