Longest common substrings with k mismatches
Tomas Flouri, Emanuele Giaquinta, Kassian Kobert, Esko Ukkonen

TL;DR
This paper presents a practical $O(nm)$ time, $O(1)$ space algorithm for finding the longest common substrings with up to k mismatches, and offers an asymptotically faster theoretical solution for the case k=1.
Contribution
It introduces a practical $O(nm)$ time, $O(1)$ space algorithm for the longest common substring with k mismatches, and a faster $O(n \log m)$ time solution for k=1.
Findings
Practical $O(nm)$ time, $O(1)$ space algorithm for general k.
Theoretical $O(n \log m)$ time solution for k=1.
Improves over the previous $O(nm)$ time bound of Babenko and Starikovskaya for k=1.
Abstract
The longest common substring with $k$-mismatches problem is to find, given two strings $S_1$ and $S_2$, a longest substring $A_1$ of $S_1$ and $A_2$ of $S_2$ such that the Hamming distance between $A_1$ and $A_2$ is $\le k$. We introduce a practical $O(nm)$ time and $O(1)$ space solution for this problem, where $n$ and $m$ are the lengths of $S_1$ and $S_2$, respectively. This algorithm can also be used to compute the matching statistics with $k$-mismatches of $S_1$ and $S_2$ in $O(nm)$ time and $O(m)$ space. Moreover, we also present a theoretical solution for the $k=1$ case which runs in $O(n \log m)$ time, assuming $m \le n$, and uses $O(m)$ space, improving over the existing $O(nm)$ time and $O(m)$ space bound of Babenko and Starikovskaya.
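One way to reach the $O(nm)$ time and $O(1)$ space bounds stated above is a two-pointer scan along each diagonal of the alignment between the two strings. The sketch below illustrates that idea and is not necessarily the authors' exact formulation. The function name `lcs_k_mismatches` and its return convention are assumptions made for this example.

```python
def lcs_k_mismatches(s1: str, s2: str, k: int) -> tuple:
    """Return (length, start1, start2) for a longest pair of equal-length
    substrings s1[start1:start1+length] and s2[start2:start2+length]
    whose Hamming distance is at most k.

    Illustrative sketch only: a sliding window over each diagonal.
    """
    n, m = len(s1), len(s2)
    best_len, best_i, best_j = 0, 0, 0

    # A diagonal d fixes the offset i - j between aligned positions.
    for d in range(-(m - 1), n):
        i0 = max(d, 0)    # first position of this diagonal in s1
        j0 = max(-d, 0)   # first position of this diagonal in s2
        length = min(n - i0, m - j0)  # number of aligned character pairs

        left = 0          # window start, as an offset along the diagonal
        mismatches = 0    # mismatches currently inside the window

        for right in range(length):
            if s1[i0 + right] != s2[j0 + right]:
                mismatches += 1
            # Shrink from the left until at most k mismatches remain.
            while mismatches > k:
                if s1[i0 + left] != s2[j0 + left]:
                    mismatches -= 1
                left += 1
            # Record the window only if it is strictly longer than the best.
            if right - left + 1 > best_len:
                best_len = right - left + 1
                best_i, best_j = i0 + left, j0 + left

    return best_len, best_i, best_j
```

There are $n + m - 1$ diagonals covering $nm$ aligned pairs in total, and on each diagonal both pointers only move forward, so the running time is $O(nm)$. Only a constant number of integers is kept besides the input strings. For example, `lcs_k_mismatches("abcde", "abxde", 1)` aligns the two strings on the main diagonal, where they differ in a single position.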
