Tight Lower Bounds for Central String Queries in Compressed Space
Dominik Kempa, Tomasz Kociumaka

TL;DR
This paper establishes tight lower bounds for the time complexity of fundamental string queries in compressed data structures, showing a clear dichotomy between two optimal bounds and completing the theoretical understanding of compressed indexing.
Contribution
It provides the first tight lower bounds for nearly all central string queries in compressed space, matching known upper bounds and closing a key theoretical gap.
Findings
Suffix array, LCP, and LCE queries require Ω(log n / log log n) time.
Other queries, such as BWT, require Ω(log log n) time.
Lower bounds hold even for binary alphabet texts.
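To make the queries named above concrete, here is a minimal Python sketch of what suffix array, inverse suffix array, LCP array, LCE, and BWT queries return on a small text. It uses naive, uncompressed computations purely to illustrate the query semantics; it is not the paper's method, and it does not run in compressed space. The function names and the `$` end-of-text sentinel are assumptions made for this illustration.

```python
# Naive, uncompressed illustrations of the query semantics (assumed names).

def suffix_array(text):
    """SA[r] = starting position of the r-th smallest suffix of text."""
    return sorted(range(len(text)), key=lambda i: text[i:])

def inverse_suffix_array(sa):
    """ISA[i] = lexicographic rank of the suffix starting at position i."""
    isa = [0] * len(sa)
    for rank, pos in enumerate(sa):
        isa[pos] = rank
    return isa

def lce(text, i, j):
    """Length of the longest common prefix of text[i:] and text[j:]."""
    k = 0
    while i + k < len(text) and j + k < len(text) and text[i + k] == text[j + k]:
        k += 1
    return k

def lcp_array(text, sa):
    """LCP[r] = LCE of the suffixes ranked r-1 and r; LCP[0] = 0."""
    return [0] + [lce(text, sa[r - 1], sa[r]) for r in range(1, len(sa))]

def bwt(text, sa):
    """BWT[r] = character cyclically preceding the suffix of rank r."""
    return "".join(text[i - 1] for i in sa)

text = "banana$"  # '$' is an assumed sentinel, smaller than every letter
sa = suffix_array(text)
isa = inverse_suffix_array(sa)
print(sa)                   # [6, 5, 3, 1, 0, 4, 2]
print(isa)                  # [4, 3, 6, 2, 5, 1, 0]
print(lcp_array(text, sa))  # [0, 0, 1, 3, 0, 0, 2]
print(lce(text, 1, 3))      # 3
print(bwt(text, sa))        # annb$aa
```

In a compressed index, each of these would be answered as a single query (for example, one entry SA[r]) without materializing the full array; the lower bounds above concern the time of such individual queries.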
Abstract
In this work, we study the limits of compressed data structures, i.e., structures that support various queries on an input text T using space proportional to the size of T in compressed form. Nearly all fundamental queries can currently be efficiently supported in O(δ(T) log^{O(1)} n) space, where δ(T) is the substring complexity, a strong compressibility measure that lower-bounds the optimal space to represent the text [Kociumaka, Navarro, Prezza, IEEE Trans. Inf. Theory 2023]. However, optimal query time has been characterized only for random access. We address this gap by developing tight lower bounds for nearly all other fundamental queries: (1) We prove that suffix array (SA), inverse suffix array (SA⁻¹), longest common prefix (LCP) array, and longest common extension (LCE) queries all require Ω(log n / log log n) time within…
Taxonomy
Topics: Algorithms and Data Compression · Complexity and Algorithms in Graphs · Data Management and Algorithms
