Streaming k-mismatch with error correcting and applications
Jakub Radoszewski, Tatiana Starikovskaya

TL;DR
This paper introduces a new streaming algorithm for the k-Mismatch problem with error correction, applicable to weighted strings in biological data, achieving near-optimal space complexity and comparable efficiency to prior solutions.
Contribution
The paper presents a streaming algorithm for k-Mismatch with the Error Correcting feature and uses that feature to build streaming algorithms for pattern matching on weighted strings, a representation used in biological sequence analysis, with near-optimal space usage.
Findings
The k-Mismatch algorithm with Error Correcting has complexities comparable to those of the prior solutions by Porat and Porat (FOCS 2009) and Clifford et al. (SODA 2016).
Extension to weighted strings in biological sequences.
Space complexity is near-optimal up to polylog factors.
Abstract
We present a new streaming algorithm for the k-Mismatch problem, one of the most basic problems in pattern matching. Given a pattern and a text, the task is to find all substrings of the text that are at the Hamming distance at most k from the pattern. Our algorithm is enhanced with an important new feature called Error Correcting, and its complexities for k = 1 and for a general k are comparable to those of the solutions for the k-Mismatch problem by Porat and Porat (FOCS 2009) and Clifford et al. (SODA 2016). In parallel to our research, a yet more efficient algorithm for the k-Mismatch problem with the Error Correcting feature was developed by Clifford et al. (SODA 2019). Using the new feature and recent work on streaming Multiple Pattern Matching we develop a series of streaming algorithms for pattern matching on weighted strings, which are a commonly used representation…
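To make the problem statement in the abstract concrete, here is a minimal sketch of the k-Mismatch problem itself. It is a naive, non-streaming reference that stores the whole text and compares every window against the pattern, so it is not the paper's algorithm and has none of its space guarantees. The function name `k_mismatch_occurrences` and its return format are assumptions made for illustration.

```python
from typing import List, Tuple


def k_mismatch_occurrences(pattern: str, text: str, k: int) -> List[Tuple[int, int]]:
    """Return (start, distance) for every window of `text` whose Hamming
    distance from `pattern` is at most k.

    Naive O(len(text) * len(pattern)) reference, not a streaming algorithm.
    """
    m = len(pattern)
    results = []
    # Slide a window of length m over the text; the range is empty if m > len(text).
    for start in range(len(text) - m + 1):
        mismatches = 0
        for offset in range(m):
            if text[start + offset] != pattern[offset]:
                mismatches += 1
                if mismatches > k:
                    break  # this window already exceeds the budget
        if mismatches <= k:
            results.append((start, mismatches))
    return results


if __name__ == "__main__":
    # Windows starting at 0, 3 and 6 are within Hamming distance 1 of "abc".
    print(k_mismatch_occurrences("abc", "abcabdxbc", 1))
```

A streaming algorithm must produce the same answers while reading the text one character at a time and without storing the text or the pattern in full, which is what makes the space bounds discussed in the paper meaningful.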
Taxonomy
Topics: Algorithms and Data Compression · Genome Rearrangement Algorithms · Machine Learning and Algorithms
