Optimal Lower and Upper Bounds for Representing Sequences

Djamal Belazzougui; Gonzalo Navarro

arXiv:1111.2621·cs.DS·August 26, 2013

TL;DR

This paper proves a strong lower bound for rank queries on sequences and gives matching upper bounds that use only compressed space. Within that space, access and select run in constant or almost-constant time, which is optimal for large alphabets.

Contribution

It proves a strong lower bound for rank that holds under rather permissive assumptions on the space used, and gives matching upper bounds that require only a compressed representation of the sequence, narrowing the gap between the known upper and lower bounds.

Findings

01

Proves a strong lower bound for rank operations.

02

Provides matching upper bounds in compressed space.

03

Supports access and select in constant or almost-constant time within the same compressed space, which is optimal for large alphabets.

Abstract

Sequence representations supporting queries $\mathsf{access}$, $\mathsf{select}$ and $\mathsf{rank}$ are at the core of many data structures. There is a considerable gap between the various upper bounds and the few lower bounds known for such representations, and how they relate to the space used. In this article we prove a strong lower bound for $\mathsf{rank}$, which holds for rather permissive assumptions on the space used, and give matching upper bounds that require only a compressed representation of the sequence. Within this compressed space, operations $\mathsf{access}$ and $\mathsf{select}$ can be solved in constant or almost-constant time, which is optimal for large alphabets. Our new upper bounds dominate all of the previous work in the time/space map.
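
For readers unfamiliar with the three queries, the following is a minimal sketch of their usual semantics. It is not the data structure from the paper: it stores the sequence and per-symbol position lists in plain form, with no compression and no attempt at the bounds discussed above. The class name `NaiveSequence`, the 0-based positions, and the inclusive prefix convention for rank are assumptions made here for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


class NaiveSequence:
    """Reference semantics for access/rank/select (hypothetical helper, uncompressed)."""

    def __init__(self, seq):
        self.seq = list(seq)
        # For each symbol, the sorted list of positions where it occurs.
        self.positions = defaultdict(list)
        for i, c in enumerate(self.seq):
            self.positions[c].append(i)

    def access(self, i):
        """Return the symbol at position i (0-based)."""
        return self.seq[i]

    def rank(self, c, i):
        """Return the number of occurrences of c in seq[0..i], inclusive."""
        # Binary search over the occurrence list: O(log n) here, not optimal.
        return bisect_right(self.positions.get(c, []), i)

    def select(self, c, j):
        """Return the position of the j-th occurrence of c (j starts at 1)."""
        occ = self.positions.get(c, [])
        if not 1 <= j <= len(occ):
            raise ValueError(f"symbol {c!r} has no occurrence number {j}")
        return occ[j - 1]


s = NaiveSequence("abracadabra")
print(s.access(4))       # c
print(s.rank("a", 5))    # 3
print(s.select("a", 3))  # 5
```

Under these conventions rank and select are inverses of each other on occurrences of a symbol: `s.rank(c, s.select(c, j)) == j` whenever the j-th occurrence of `c` exists.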

Taxonomy

Topics: Algorithms and Data Compression · DNA and Biological Computing · Error Correcting Code Techniques