Space-efficient SLP encoding for $O(\log N)$-time random access
Akito Takasaka, Tomohiro I

TL;DR
This paper introduces space-efficient encodings of SLPs that support extracting a substring $T[p..q]$ in near-optimal $O(\log N + q - p)$ time, enabling fast random access to grammar-compressed strings.
Contribution
It presents novel encoding schemes for SLPs that support fast random access with near-optimal space complexity, extending previous work on compressed string representations.
Findings
Supports substring extraction in $O(\log N + q - p)$ time.
Uses space close to the information-theoretic lower bound.
Provides multiple encoding variants with different space-time trade-offs.
Abstract
A Straight-Line Program (SLP) $G$ for a string $T$ is a context-free grammar (CFG) that derives $T$ only, which can be considered as a compressed representation of $T$. In this paper, we show how to encode $G$ compactly to support random access queries of extracting $T[p..q]$ in worst-case $O(\log N + q - p)$ time, where the space bound is expressed in terms of the length $N$ of $T$, the alphabet size $\sigma$, the number $n$ of variables in $G$, and the number of symmetric centroid paths in the DAG representation for $G$. The time complexity is almost optimal because Verbin and Yu [CPM 2013] proved that the $\log N$ term cannot be significantly improved in general with $\mathrm{poly}(n)$-space data structures. We also present alternative encodings that achieve the same random access time with $n \lceil \lg N \rceil + n \lceil…
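To make the random access query concrete, here is a minimal Python sketch of substring extraction on a plain, uncompressed-in-memory SLP. It is not the encoding proposed in the paper: it is the textbook descent that stores the expansion length of every variable and walks the derivation tree, so it runs in $O(h + q - p)$ time, where $h$ is the height of the derivation tree (which can be far larger than $\log N$ for an unbalanced grammar). The rule representation, the function names `expansion_lengths` and `extract`, and the example grammar `FIB_RULES` are all assumptions made for illustration.

```python
from typing import List, Tuple, Union

# Hypothetical representation: rule i is either a single terminal character
# or a pair (left, right) of indices of earlier rules. The last rule is the start.
Rule = Union[str, Tuple[int, int]]

# Illustrative grammar deriving the string "abaababa".
FIB_RULES: List[Rule] = ["a", "b", (0, 1), (2, 0), (3, 2), (4, 3)]


def expansion_lengths(rules: List[Rule]) -> List[int]:
    """Length of the string derived by each variable, computed bottom-up."""
    lengths: List[int] = []
    for rule in rules:
        if isinstance(rule, str):
            lengths.append(1)
        else:
            left, right = rule
            lengths.append(lengths[left] + lengths[right])
    return lengths


def extract(rules: List[Rule], lengths: List[int], p: int, q: int) -> str:
    """Return T[p..q] (0-based, inclusive) without expanding the whole string."""
    root = len(rules) - 1
    if not 0 <= p <= q < lengths[root]:
        raise IndexError("require 0 <= p <= q < N")

    # Phase 1: descend from the root to the leaf at position p, remembering
    # the right siblings that still lie ahead of us.
    pending: List[int] = []
    var, offset = root, p
    while not isinstance(rules[var], str):
        left, right = rules[var]
        if offset < lengths[left]:
            pending.append(right)
            var = left
        else:
            offset -= lengths[left]
            var = right
    out = [rules[var]]

    # Phase 2: continue the left-to-right traversal for the next q - p leaves.
    for _ in range(q - p):
        var = pending.pop()
        while not isinstance(rules[var], str):
            left, right = rules[var]
            pending.append(right)
            var = left
        out.append(rules[var])
    return "".join(out)
```

Calling `extract(FIB_RULES, expansion_lengths(FIB_RULES), 2, 5)` reads four characters starting at position 2 of the derived string. The gap between this sketch's dependence on $h$ and the $O(\log N + q - p)$ bound in the abstract is what the paper's encodings address.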
Taxonomy
Topics: Algorithms and Data Compression · DNA and Biological Computing · Error Correcting Code Techniques
