A note on the longest common substring with $k$-mismatches problem
Szymon Grabowski

TL;DR
This paper introduces new algorithms for the longest common substring with k-mismatches (k-LCF) problem, a problem that arises in sequence comparison, and achieves subquadratic time for k = O(log^{1-ε} n).
Contribution
The paper presents two output-dependent algorithms and a tabulation-based method, extending subquadratic solutions to a broader range of k values in the k-LCF problem.
Findings
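At least one of the two output-dependent algorithms runs in subquadratic time for k = O(log^{1-ε} n), where ε > 0.
Both output-dependent algorithms use O(n) words of space, and the one to apply to a given input can be chosen after linear time and space preprocessing.
The tabulation-based algorithm runs in near-quadratic time within its range of applicability, with the exact bound depending on the input parameters.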
Abstract
The recently introduced longest common substring with $k$-mismatches ($k$-LCF) problem is to find, given two sequences $S_1$ and $S_2$ of length $n$ each, a longest substring $A_1$ of $S_1$ and $A_2$ of $S_2$ such that the Hamming distance between $A_1$ and $A_2$ is at most $k$. So far, the only subquadratic time result for this problem was known for $k = 1$ \cite{FGKU2014}. We first present two output-dependent algorithms solving the $k$-LCF problem and show that for $k = O(\log^{1-\varepsilon} n)$, where $\varepsilon > 0$, at least one of them works in subquadratic time, using $O(n)$ words of space. The choice of one of these two algorithms to be applied for a given input can be done after linear time and space preprocessing. Finally we present a tabulation-based algorithm working, in its range of applicability, in $O(n^2 \log\min(k+\ell_0, \sigma) / \log n)$ time, where $\ell_0$ is the…
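To make the problem statement concrete, here is a minimal sketch of a quadratic-time baseline for k-LCF. It is not one of the algorithms presented in the paper: it is the straightforward approach of scanning every alignment (diagonal) of the two sequences with a sliding window that tolerates at most k mismatches. The function name `k_lcf` and its return convention are assumptions made for illustration only.

```python
def k_lcf(s1: str, s2: str, k: int) -> tuple[int, int, int]:
    """Quadratic-time baseline (illustrative, not the paper's method).

    Returns (length, start1, start2) such that s1[start1:start1+length]
    and s2[start2:start2+length] have Hamming distance at most k and
    length is as large as possible.
    """
    n1, n2 = len(s1), len(s2)
    best = (0, 0, 0)
    # Diagonal d aligns position i of s1 with position i - d of s2.
    for d in range(-(n2 - 1), n1):
        i = max(d, 0)                 # first aligned position in s1
        j = i - d                     # first aligned position in s2
        span = min(n1 - i, n2 - j)    # number of aligned pairs on this diagonal
        left = 0                      # left edge of the current window
        mismatches = 0                # mismatches inside the window
        for right in range(span):
            if s1[i + right] != s2[j + right]:
                mismatches += 1
            # Shrink from the left until the window holds at most k mismatches.
            while mismatches > k:
                if s1[i + left] != s2[j + left]:
                    mismatches -= 1
                left += 1
            window = right - left + 1
            if window > best[0]:
                best = (window, i + left, j + left)
    return best
```

For example, `k_lcf("abcde", "xbcdy", 1)` reports a length of 4, since "abcd" and "xbcd" differ in a single position. Each diagonal is scanned once with two pointers, so the total time is proportional to the product of the two sequence lengths, with constant extra space. That quadratic cost is the bound the subquadratic results discussed here improve on.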
Taxonomy
Topics: Algorithms and Data Compression · Coding theory and cryptography · Genome Rearrangement Algorithms
