Small-space encoding LCE data structure with constant-time queries

Yuka Tanimura; Takaaki Nishimoto; Hideo Bannai; Shunsuke Inenaga; Masayuki Takeda

arXiv:1702.07458 · cs.DS · February 27, 2017 · 5 cites

TL;DR

This paper introduces a space-efficient, constant-time query data structure for the LCE problem that does not require access to the original string, with applications to highly repetitive and compressible strings.

Contribution

The paper presents a novel encoding LCE data structure with optimal query time and sub-linear space, surpassing existing lower bounds in certain scenarios.

Findings

01

Answers LCE queries in O(1) time with small space

02

Applicable to highly repetitive strings with sub-linear space

03

Works for strings with limited compressibility and small alphabet size
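As a back-of-the-envelope illustration of the sub-linear space claim, the two terms of the space bound $O(z \tau^{2} + \frac{n}{\tau})$ stated in the abstract can be balanced. This calculation is an assumption-laden sketch and is not quoted from the paper: it ignores constant factors and treats $\tau$ as a real number rather than an integer.

$$
z \tau^{2} = \frac{n}{\tau}
\;\Longleftrightarrow\;
\tau = \left(\frac{n}{z}\right)^{1/3},
\qquad
z \tau^{2} + \frac{n}{\tau} = 2\, n^{2/3} z^{1/3}.
$$

Since $1 \leq z \leq n$, this choice satisfies $1 \leq \tau \leq n$, and the resulting bound $O(n^{2/3} z^{1/3})$ words is $o(n)$ whenever $z = o(n)$.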

Abstract

The \emph{longest common extension} (\emph{LCE}) problem is to preprocess a given string $w$ of length $n$ so that the length of the longest common prefix between suffixes of $w$ that start at any two given positions is answered quickly. In this paper, we present a data structure of $O(z \tau^{2} + \frac{n}{\tau})$ words of space which answers LCE queries in $O(1)$ time and can be built in $O(n \log \sigma)$ time, where $1 \leq \tau \leq n$ is a parameter, $z$ is the size of the Lempel-Ziv 77 factorization of $w$ and $\sigma$ is the alphabet size. This is an \emph{encoding} data structure, i.e., it does not access the input string $w$ when answering queries and thus $w$ can be deleted after preprocessing. On top of this main result, we obtain further results using (variants of) our LCE data structure, which include the following: - For highly repetitive strings where the…
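To make the LCE query defined in the abstract concrete, here is a minimal reference sketch in Python. It is not the paper's data structure: it scans the string directly, so a query costs time proportional to the answer and needs the original string $w$, which is exactly what the encoding structure avoids. The function name `naive_lce` and the 0-based indexing are assumptions made for this illustration.

```python
def naive_lce(w: str, i: int, j: int) -> int:
    """Return the length of the longest common prefix of w[i:] and w[j:].

    Baseline only (hypothetical helper, 0-based positions): it compares
    characters one by one and reads w on every query.
    """
    n = len(w)
    if not (0 <= i <= n and 0 <= j <= n):
        raise IndexError("positions must lie in [0, len(w)]")
    length = 0
    # Extend the match while both suffixes still have characters that agree.
    while i + length < n and j + length < n and w[i + length] == w[j + length]:
        length += 1
    return length
```

For example, `naive_lce("abracadabra", 0, 7)` compares the suffixes `abracadabra` and `abra` and returns 4. A baseline like this is mainly useful for checking the answers of a faster structure on small inputs.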

Taxonomy

Topics: Algorithms and Data Compression · Network Packet Processing and Optimization · semigroups and automata theory