Binary Jumbled String Matching for Highly Run-Length Compressible Texts
Golnaz Badkobeh, Gabriele Fici, Steve Kroon, Zsuzsanna Lipták

TL;DR
This paper introduces an index for binary jumbled string matching that is built directly from the run-length encoding of the text. For texts with high run-length compressibility it improves construction time over earlier indexes, answers queries in $O(\log\rho)$ time rather than constant time, and is simple to implement.
Contribution
It proposes a novel index construction method that works on the run-length encoding of the text, reducing construction time and potentially decreasing index size for highly compressible texts.
Findings
Index construction time is $O(n+\rho^2\log\rho)$.
Query time is $O(\log\rho)$.
Index size is often close to $\rho$, the length of the run-length encoding.
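To make the parameter $\rho$ in these bounds concrete, the following is a minimal sketch of computing a run-length encoding in $O(n)$ time. It assumes the text is a Python string; the function name `run_length_encode` and the example text are illustrative and not taken from the paper.

```python
from itertools import groupby


def run_length_encode(s):
    """Return the run-length encoding of s as (symbol, run_length) pairs."""
    # groupby yields one group per maximal run of equal adjacent symbols.
    return [(symbol, sum(1 for _ in run)) for symbol, run in groupby(s)]


text = "aaabbbbab"              # hypothetical example text, n = 9
rle = run_length_encode(text)   # one pair per run
rho = len(rle)                  # number of runs: the rho used in the bounds
```

A text is highly run-length compressible when `rho` is much smaller than `len(text)`.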
Abstract
The Binary Jumbled String Matching problem is defined as: Given a string $s$ over $\{a,b\}$ of length $n$ and a query $(x,y)$, with $x,y$ non-negative integers, decide whether $s$ has a substring $t$ with exactly $x$ $a$'s and $y$ $b$'s. Previous solutions created an index of size $O(n)$ in a pre-processing step, which was then used to answer queries in constant time. The fastest algorithms for construction of this index have running time $O(n^2/\log n)$ [Burcsi et al., FUN 2010; Moosa and Rahman, IPL 2010], or $O(n^2/\log^2 n)$ in the word-RAM model [Moosa and Rahman, JDA 2012]. We propose an index constructed directly from the run-length encoding of $s$. The construction time of our index is $O(n+\rho^2\log\rho)$, where $O(n)$ is the time for computing the run-length encoding of $s$ and $\rho$ is the length of this encoding; this is no worse than previous solutions if $\rho = O(n/\log…
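For orientation, here is a minimal sketch of the kind of linear-size, constant-query-time index the abstract attributes to previous solutions. It is not the run-length-based index proposed in the paper, and it uses a naive quadratic construction rather than the faster algorithms cited. It relies on a standard property of binary strings: sliding a window of fixed length by one position changes its number of a's by at most one, so the a-counts over all substrings of a given length form an interval. A query for a substring with exactly x a's and y b's is then answered by checking whether x lies in the interval stored for length x + y. The names `build_baseline_index` and `has_jumbled_match` are hypothetical.

```python
def build_baseline_index(s, a="a"):
    """For each length m in 0..n, store (min, max) number of a's over all
    substrings of s of length m. Naive O(n^2) construction, for illustration."""
    n = len(s)
    # prefix[i] = number of a's in s[:i]
    prefix = [0]
    for ch in s:
        prefix.append(prefix[-1] + (ch == a))
    index = [(0, 0)]  # the empty substring contains no a's
    for m in range(1, n + 1):
        counts = [prefix[i + m] - prefix[i] for i in range(n - m + 1)]
        index.append((min(counts), max(counts)))
    return index


def has_jumbled_match(index, x, y):
    """Decide whether some substring has exactly x a's and y b's."""
    m = x + y
    if m >= len(index):  # longer than the text itself
        return False
    lo, hi = index[m]
    return lo <= x <= hi  # valid because the a-counts form an interval


index = build_baseline_index("aabab")  # hypothetical example text
```

With this example text, `has_jumbled_match(index, 2, 1)` holds (for instance via the substring `aab`), while `has_jumbled_match(index, 0, 2)` does not, since the text contains no `bb`.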
