Rank, select and access in grammar-compressed strings
Djamal Belazzougui, Simon J. Puglisi, Yasuo Tabei

TL;DR
This paper develops data structures for efficient rank, select, and access operations on grammar-compressed strings, achieving near-optimal space and time bounds and establishing complexity lower bounds.
Contribution
It introduces novel data structures for rank, select, and access on grammar-compressed strings, with tight bounds and hardness results suggesting that further improvements are difficult.
Findings
Support rank and select in $O(\log N)$ time with $O(n \log N)$ bits.
Support rank and select in $O(\log N / \log\log N)$ time with $O(n \log(N/n) (\log N)^{1+\epsilon})$ bits.
Achieve access query time close to the lower bound with space proportional to the grammar size.
Abstract
Given a string $S$ of length $N$ on a fixed alphabet of $\sigma$ symbols, a grammar compressor produces a context-free grammar $G$ of size $n$ that generates $S$ and only $S$. In this paper we describe data structures to support the following operations on a grammar-compressed string: $\text{rank}_c(S, i)$ (return the number of occurrences of symbol $c$ before position $i$ in $S$); $\text{select}_c(S, i)$ (return the position of the $i$th occurrence of $c$ in $S$); and $\text{access}(S, i, j)$ (return substring $S[i, j]$). For rank and select we describe data structures of size $O(n \log N)$ bits that support the two operations in $O(\log N)$ time. We propose another structure that uses $O(n \log(N/n) (\log N)^{1+\epsilon})$ bits and that supports the two queries in $O(\log N / \log\log N)$ time, where $\epsilon > 0$ is an arbitrary constant. To our knowledge, we are the first to study the…
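To make the three operations concrete, here is a minimal Python sketch of a naive baseline on a straight-line program (a grammar whose rules are either a single character or a pair of earlier rules). It is not the paper's data structure: it stores an expansion length and a per-symbol count for every rule and walks down from the root, so each rank or select query costs time proportional to the grammar height. The class name `SLP`, the rule encoding, and the indexing conventions (0-based positions, rank counting strictly before position `i`, select taking a 1-based occurrence number, access inclusive on both ends) are assumptions made for illustration.

```python
from collections import Counter


class SLP:
    """Hypothetical straight-line program: rules[k] is either a single
    character or a pair (left, right) of indices of earlier rules."""

    def __init__(self, rules):
        self.rules = rules
        self.length = []   # expansion length of each rule
        self.counts = []   # per-symbol occurrence counts of each rule
        for rule in rules:
            if isinstance(rule, str):
                self.length.append(1)
                self.counts.append(Counter(rule))
            else:
                left, right = rule
                self.length.append(self.length[left] + self.length[right])
                self.counts.append(self.counts[left] + self.counts[right])
        self.root = len(rules) - 1  # assumption: last rule is the start symbol

    def rank(self, c, i):
        """Number of occurrences of c in S[0:i] (strictly before position i)."""
        if not 0 <= i <= self.length[self.root]:
            raise IndexError("position out of range")
        node, total = self.root, 0
        while not isinstance(self.rules[node], str):
            left, right = self.rules[node]
            if i <= self.length[left]:
                node = left
            else:
                total += self.counts[left][c]  # whole left part lies before i
                i -= self.length[left]
                node = right
        if i == 1 and self.rules[node] == c:
            total += 1
        return total

    def select(self, c, k):
        """0-based position of the k-th occurrence of c (k is 1-based)."""
        if not 1 <= k <= self.counts[self.root][c]:
            raise ValueError("no such occurrence")
        node, pos = self.root, 0
        while not isinstance(self.rules[node], str):
            left, right = self.rules[node]
            if k <= self.counts[left][c]:
                node = left
            else:
                k -= self.counts[left][c]      # skip occurrences in the left part
                pos += self.length[left]
                node = right
        return pos

    def access(self, i, j):
        """Substring S[i..j], inclusive on both ends."""
        if not 0 <= i <= j < self.length[self.root]:
            raise IndexError("range out of bounds")
        out = []

        def walk(node, lo, hi):
            # Append expansion(node)[lo:hi] to out.
            if lo >= hi:
                return
            rule = self.rules[node]
            if isinstance(rule, str):
                out.append(rule)
                return
            left, right = rule
            mid = self.length[left]
            walk(left, lo, min(hi, mid))
            walk(right, max(lo - mid, 0), hi - mid)

        walk(self.root, i, j + 1)
        return "".join(out)


# Example grammar generating "abaababa" (a Fibonacci word).
rules = ["a", "b", (0, 1), (2, 0), (3, 2), (4, 3)]
slp = SLP(rules)
text = slp.access(0, slp.length[slp.root] - 1)
```

In this sketch the per-rule counters take space proportional to the number of rules times the alphabet size, and query time depends on the height of the grammar rather than on $\log N$; it only illustrates the query semantics that the structures described in the abstract support within their stated bounds.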
Taxonomy
Topics: Algorithms and Data Compression · DNA and Biological Computing · Semigroups and Automata Theory
