Faster subsequence recognition in compressed strings
Alexander Tiskin

TL;DR
This paper presents a faster algorithm for local subsequence recognition in a text compressed by a straight-line program (SLP), reducing the dependence of the running time on the pattern length n from n^2 log n to n^{1.5}. Computing directly on the compressed text is one of the key approaches to processing massive data sets.
Contribution
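The author develops a faster algorithm for local subsequence recognition on SLP-compressed strings, improving the running time from O(m̄ n^2 log n) to O(m̄ n^{1.5}), where m̄ is the length of the compressed text and n is the length of the uncompressed pattern, and extends it to compute the longest common subsequence of the text and the pattern.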
Findings
The algorithm runs in O(m̄ n^{1.5}) time for local subsequence recognition.
The same algorithm computes the longest common subsequence between a compressed text and an uncompressed pattern in O(m̄ n^{1.5}) time.
Improves efficiency for processing large compressed data sets.
Abstract
Computation on compressed strings is one of the key approaches to processing massive data sets. We consider local subsequence recognition problems on strings compressed by straight-line programs (SLP), which is closely related to Lempel–Ziv compression. For an SLP-compressed text of length m̄, and an uncompressed pattern of length n, Cégielski et al. gave an algorithm for local subsequence recognition running in time O(m̄ n^2 log n). We improve the running time to O(m̄ n^{1.5}). Our algorithm can also be used to compute the longest common subsequence between a compressed text and an uncompressed pattern in time O(m̄ n^{1.5}); the same problem with a compressed pattern is known to be NP-hard.
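To make the setting concrete, here is a minimal sketch of an SLP and of subsequence recognition performed directly on it, without decompressing the text. It is not the paper's algorithm: it handles only the simpler global question (is the pattern a subsequence of the whole text?) using a standard greedy table per rule, which takes time proportional to the number of rules times the pattern length. The dictionary representation, the rule names F1 to F6, and the function names are assumptions made for this illustration.

```python
from typing import Dict, Tuple, Union

# Assumed representation: each rule is either a single character
# or a pair (left, right) of names of rules defined earlier.
Rule = Union[str, Tuple[str, str]]


def expand(slp: Dict[str, Rule], symbol: str) -> str:
    """Decompress a symbol. Only for checking small examples:
    the expansion can be exponentially longer than the SLP."""
    rule = slp[symbol]
    if isinstance(rule, str):
        return rule
    left, right = rule
    return expand(slp, left) + expand(slp, right)


def is_subsequence_of_slp(slp: Dict[str, Rule], start: str, pattern: str) -> bool:
    """Global subsequence recognition on the compressed text.

    Assumes the rules in `slp` are listed bottom-up (children before parents).
    advance[X][i] is the number of pattern characters matched after greedily
    scanning the expansion of X, given that i characters were matched before.
    """
    n = len(pattern)
    advance: Dict[str, list] = {}
    for name, rule in slp.items():
        if isinstance(rule, str):
            # A terminal consumes pattern[i] exactly when it equals it.
            advance[name] = [
                i + 1 if i < n and pattern[i] == rule else i
                for i in range(n + 1)
            ]
        else:
            left, right = rule
            # Scan the left part first, then continue through the right part.
            advance[name] = [
                advance[right][advance[left][i]] for i in range(n + 1)
            ]
    return advance[start][0] == n


# Hypothetical example: an SLP for the Fibonacci string "abaababa".
fib_slp: Dict[str, Rule] = {
    "F1": "b",
    "F2": "a",
    "F3": ("F2", "F1"),
    "F4": ("F3", "F2"),
    "F5": ("F4", "F3"),
    "F6": ("F5", "F4"),
}

print(is_subsequence_of_slp(fib_slp, "F6", "abba"))
print(is_subsequence_of_slp(fib_slp, "F6", "bbbb"))
```

The local problems studied in the paper ask which substrings of the text contain the pattern as a subsequence. That requires more than one greedy table per rule, and is where the n^{1.5} factor arises.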
Taxonomy
Topics: Algorithms and Data Compression · Network Packet Processing and Optimization · Parallel Computing and Optimization Techniques
