Data Structure Lower Bounds on Random Access to Grammar-Compressed Strings
Shiteng Chen, Elad Verbin, Wei Yu

TL;DR
This paper establishes fundamental lower bounds on the query time for static data structures representing grammar-compressed strings, demonstrating inherent limitations in achieving fast access with minimal space.
Contribution
It provides the first non-trivial lower bounds for random access to grammar-compressed strings, and extends them to other compression schemes (LZ and BWT) and to the cell-probe model.
Findings
Lower bound of n^{1/2-ε} for query time with poly(n) space
No lower bound significantly better than n^{1/2-ε} is possible in the cell-probe model
Lower bounds also apply to LZ and BWT compression schemes
Abstract
In this paper we investigate the problem of building a static data structure that represents a string s using space close to its compressed size, and allows fast access to individual characters of s. This type of structure was investigated in the recent paper of Bille et al. Let n be the size of a context-free grammar that derives a unique string s of length L. (Note that L might be exponential in n.) Bille et al. showed a data structure that uses space O(n) and returns the i-th character of s in time O(log L). Their data structure works on a word RAM with a word size of log L bits. Here we prove that for such data structures, if the space is poly(n), then the query time must be at least (log L)^{1-\epsilon}/log S, where S is the space used, for any constant \epsilon>0. As a function of n, our lower bound is \Omega(n^{1/2-\epsilon}). Our proof holds in the cell-probe…
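To make the query in the abstract concrete, here is a minimal Python sketch of random access to a grammar-compressed string. It is a naive baseline, not the data structure of Bille et al.: it stores one expansion length per rule (O(n) numbers) and walks down from the start symbol, so its query time is proportional to the height of the grammar rather than O(log L). The rule encoding (each rule is either a single terminal character or a pair of earlier rule names) and the names `expansion_lengths` and `access` are assumptions made for illustration.

```python
from typing import Dict, Tuple, Union

# Assumed encoding: a rule is a terminal character, or a pair (left, right)
# of rule names defined earlier in the dict (bottom-up order).
Rule = Union[str, Tuple[str, str]]


def expansion_lengths(rules: Dict[str, Rule]) -> Dict[str, int]:
    """Length of the string derived by each rule, computed bottom-up."""
    lengths: Dict[str, int] = {}
    for name, rhs in rules.items():
        if isinstance(rhs, str):
            lengths[name] = 1  # terminal rule derives one character
        else:
            left, right = rhs
            lengths[name] = lengths[left] + lengths[right]
    return lengths


def access(rules: Dict[str, Rule], lengths: Dict[str, int],
           start: str, i: int) -> str:
    """Return the i-th character (0-based) of the string derived by `start`."""
    if not 0 <= i < lengths[start]:
        raise IndexError("position out of range")
    name = start
    while True:
        rhs = rules[name]
        if isinstance(rhs, str):
            return rhs
        left, right = rhs
        if i < lengths[left]:
            name = left            # position falls in the left part
        else:
            i -= lengths[left]     # skip the left part, continue on the right
            name = right


# A small grammar deriving "abaab" (n = 5 rules, L = 5).
fib_rules: Dict[str, Rule] = {
    "X1": "b",
    "X2": "a",
    "X3": ("X2", "X1"),
    "X4": ("X3", "X2"),
    "X5": ("X4", "X3"),
}
fib_lengths = expansion_lengths(fib_rules)

# A doubling grammar: 41 rules derive a string of length 2**40,
# illustrating that L can be exponential in n.
dbl_rules: Dict[str, Rule] = {"D0": "a"}
for k in range(1, 41):
    dbl_rules[f"D{k}"] = (f"D{k - 1}", f"D{k - 1}")
dbl_lengths = expansion_lengths(dbl_rules)
```

In the doubling grammar every query walks through all 40 levels. The lower bound in the abstract concerns how far below such a cost any structure of poly(n) size can go.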
Taxonomy
Topics: Algorithms and Data Compression · Semigroups and Automata Theory · DNA and Biological Computing
