Faster run-length compressed suffix arrays
Nathaniel K. Brown, Travis Gagie, Giovanni Manzini, Gonzalo Navarro, and Marinella Sciortino

TL;DR
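This paper shows how to compute suffix-array (SA) intervals faster in a run-length compressed suffix array (RLCSA): extending the SA interval for a pattern P to the SA interval for aP takes O(log r_a) time, where r_a is the number of runs of the character a in the BWT, with no increase in the asymptotic space bound. The authors also discuss possible uses in indexing and pattern matching, such as two-level indexing.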
Contribution
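The authors modify the RLCSA so that the SA interval for aP can be found from the SA interval for P in O(log r_a) time, where r_a is the number of runs of the character a in the BWT, while keeping the same asymptotic space bound. The key idea is to apply a result by Nishimoto and Tabei (ICALP 2021) and then replace rank queries on sparse bitvectors with a constant number of select queries.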
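To make the rank/select distinction concrete, here is a minimal sketch of a sparse bitvector stored as a sorted list of the positions of its 1s. It is an illustration of the general relationship between the two queries, not the authors' data structure: the class `SparseBitvector`, the method `is_rank`, and the idea that a candidate answer `k` is supplied from elsewhere are all assumptions made for this example. In this layout select is a single array lookup, rank needs a predecessor search, and a candidate rank can be confirmed with at most two select calls.

```python
from bisect import bisect_left


class SparseBitvector:
    """Toy sparse bitvector: only the sorted positions of the 1s are stored."""

    def __init__(self, ones, length):
        self.ones = sorted(ones)
        self.length = length

    def select(self, j):
        """Position of the j-th 1 (1-based): a single array lookup."""
        return self.ones[j - 1]

    def rank(self, i):
        """Number of 1s in positions [0, i): needs a predecessor search."""
        return bisect_left(self.ones, i)

    def is_rank(self, i, k):
        """Check whether rank(i) == k using at most two select calls."""
        # The k-th 1 must lie strictly before position i (or k is 0) ...
        left_ok = k == 0 or self.select(k) < i
        # ... and the (k+1)-th 1 must lie at or after i (or not exist).
        right_ok = k == len(self.ones) or self.select(k + 1) >= i
        return left_ok and right_ok


bv = SparseBitvector([3, 10, 42], length=64)
print(bv.rank(11), bv.select(2), bv.is_rank(11, 2))
```

Where the candidate `k` comes from is outside the scope of this sketch; the point is only that once a candidate is known, no search over the bitvector is required.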
Findings
SA interval computation in the RLCSA takes O(log r_a) time per pattern character
The asymptotic space bound is unchanged by the speedup
The faster RLCSA is discussed in the context of two-level indexing and pattern matching
Abstract
We first review how we can store a run-length compressed suffix array (RLCSA) for a text T of length n over an alphabet of size σ whose Burrows-Wheeler Transform (BWT) consists of r runs in O(r log(n/r) + r log σ + σ) bits such that later, given a character a and the suffix-array (SA) interval for P, we can find the SA interval for aP in O(log r_a + log log n) time, where r_a is the number of runs of copies of a in the BWT. We then show how to modify the RLCSA such that we find the SA interval for aP in only O(log r_a) time, without increasing its asymptotic space bound. Our key idea is applying a result by Nishimoto and Tabei (ICALP 2021) and then replacing rank queries on sparse bitvectors by a constant number of select queries. We also review two-level indexing and discuss how our faster RLCSA…
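The operation the abstract refers to, going from the SA interval for a pattern P to the SA interval for aP, can be sketched on a toy index. The code below is a rough illustration under stated assumptions, not the paper's structure: it builds the suffix array naively, keeps the runs of the BWT in plain Python lists (so it makes no attempt at the compressed space bound), and answers rank for a character by binary search over only that character's runs. The names `build_index`, `rank`, `extend`, and `search` are hypothetical.

```python
from bisect import bisect_left


def build_index(text):
    """Toy run-length BWT index; text must end with a unique smallest '$'."""
    n = len(text)
    sa = sorted(range(n), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)
    # C[c] = number of characters in text that are smaller than c.
    C, total = {}, 0
    for ch in sorted(set(text)):
        C[ch] = total
        total += text.count(ch)
    # Per character: run starts, run ends (exclusive), and the number of
    # occurrences of that character before each run.
    runs, seen = {}, {}
    i = 0
    while i < n:
        j = i
        while j < n and bwt[j] == bwt[i]:
            j += 1
        starts, ends, before = runs.setdefault(bwt[i], ([], [], []))
        starts.append(i)
        ends.append(j)
        before.append(seen.get(bwt[i], 0))
        seen[bwt[i]] = seen.get(bwt[i], 0) + (j - i)
        i = j
    return sa, C, runs


def rank(runs, ch, i):
    """Occurrences of ch in bwt[0:i], by binary search over the runs of ch."""
    starts, ends, before = runs.get(ch, ([], [], []))
    k = bisect_left(starts, i) - 1  # last run of ch starting before i
    if k < 0:
        return 0
    return before[k] + min(i, ends[k]) - starts[k]


def extend(C, runs, ch, lo, hi):
    """Map the half-open SA interval [lo, hi) for P to the interval for ch + P."""
    if ch not in C:
        return 0, 0
    return C[ch] + rank(runs, ch, lo), C[ch] + rank(runs, ch, hi)


def search(index, pattern):
    """SA interval for pattern, processing its characters right to left."""
    sa, C, runs = index
    lo, hi = 0, len(sa)
    for ch in reversed(pattern):
        lo, hi = extend(C, runs, ch, lo, hi)
        if lo >= hi:
            break
    return lo, hi


index = build_index("banana$")
lo, hi = search(index, "ana")
print(sorted(index[0][lo:hi]))  # [1, 3]
```

Each call to `extend` performs two binary searches over the runs of a single character, which is where a per-character cost logarithmic in the number of runs of that character comes from in this simplified layout.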
Taxonomy
Topics: Advanced Database Systems and Queries
