On pattern matching with k mismatches and few don't cares
Marius Nicolae, Sanguthevar Rajasekaran

TL;DR
This paper introduces a pattern matching algorithm that handles patterns with don't care characters and up to k mismatches. It is fastest when the pattern has few 'islands', that is, maximal substrings free of don't cares.
Contribution
The authors present an algorithm with runtime depending on the number of islands in the pattern, bridging the gap between existing methods for patterns with and without don't cares.
Findings
The runtime matches that of the best known algorithms when the number of islands is small.
The algorithm outperforms previous methods when the number of islands is O(k^2).
The approach handles mismatches and don't cares in a single, unified algorithm.
Abstract
We consider the problem of pattern matching with $k$ mismatches, where there can be don't care or wild card characters in the pattern. Specifically, given a pattern $P$ of length $m$ and a text $T$ of length $n$, we want to find all occurrences of $P$ in $T$ that have no more than $k$ mismatches. The pattern can have don't care characters, which match any character. Without don't cares, the best known algorithm for pattern matching with $k$ mismatches has a runtime of $O(n\sqrt{k \log k})$. With don't cares in the pattern, the best deterministic algorithm has a runtime of $O(nk\,\mathrm{polylog}\,m)$. Therefore, there is an important gap between the versions with and without don't cares. In this paper we give an algorithm whose runtime increases with the number of don't cares. We define an island to be a maximal length substring of $P$ that does not contain don't cares. Let $q$ be the…
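To make the problem statement and the island definition concrete, here is a minimal Python sketch. It is not the paper's algorithm: it is the brute-force O(nm) baseline that the faster methods improve on, plus a helper that lists the islands of a pattern. The wildcard symbol `*` and the names `islands` and `k_mismatch_naive` are assumptions made for illustration only.

```python
from typing import List

DONT_CARE = "*"  # assumed wildcard symbol; chosen here for illustration


def islands(pattern: str) -> List[str]:
    """Return the maximal substrings of the pattern that contain no don't cares."""
    # Splitting on the wildcard yields empty strings for adjacent wildcards;
    # those are not islands, so they are filtered out.
    return [piece for piece in pattern.split(DONT_CARE) if piece]


def k_mismatch_naive(text: str, pattern: str, k: int) -> List[int]:
    """Brute-force baseline: start positions where the pattern aligns
    with the text with at most k mismatches. Don't cares match anything."""
    n, m = len(text), len(pattern)
    hits = []
    for i in range(n - m + 1):
        mismatches = 0
        for j in range(m):
            if pattern[j] != DONT_CARE and pattern[j] != text[i + j]:
                mismatches += 1
                if mismatches > k:
                    break  # this alignment already exceeds the budget
        if mismatches <= k:
            hits.append(i)
    return hits
```

For example, the pattern `ab**c*de` has three islands (`ab`, `c`, `de`), and searching for `a*c` in `abcabd` reports position 0 with k = 0 and positions 0 and 3 with k = 1.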
