Longest Common Extensions with Recompression

Tomohiro I

arXiv:1611.05359·cs.DS·November 22, 2016·5 cites

TL;DR

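This paper introduces a compressed data structure for longest common extension (LCE) queries on strings. Built with the recompression technique, it occupies $O(z \lg(N/z))$ space and answers queries in $O(\lg N)$ time, where $N$ is the string length and $z$ is the size of its Lempel-Ziv 77 factorization without self-reference. It can be constructed from either uncompressed or grammar-compressed input.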

Contribution

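It presents a compressed LCE data structure of size $O(z \lg(N/z))$ that supports queries in $O(\lg N)$ time, where $z$ is the size of the Lempel-Ziv 77 factorization without self-reference of the input. Using recompression, the structure can be built from both uncompressed and grammar-compressed strings.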
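The size bound is expressed in terms of $z$, the number of factors in the Lempel-Ziv 77 factorization without self-reference. As a rough illustration of that parameter only (this is not the paper's construction), the following naive Python sketch computes such a factorization. The function name `lz77_factors_no_self_ref` is hypothetical, and the brute-force substring search is chosen for brevity rather than efficiency. Definitions of LZ77 differ slightly between papers; this sketch assumes the variant in which each factor is either a fresh character or the longest prefix of the remaining text that occurs entirely inside the already-factorized prefix.

```python
def lz77_factors_no_self_ref(text):
    """Naive LZ77 factorization without self-reference (illustrative sketch).

    Each factor is either a single character with no earlier occurrence, or
    the longest prefix of the remaining text that occurs entirely within the
    part of the text that has already been factorized.
    """
    factors = []
    i = 0
    n = len(text)
    while i < n:
        length = 0
        # Extend the factor while the candidate still occurs wholly in text[:i].
        # Searching only text[:i] is what forbids self-referencing sources.
        while i + length < n and text[i:i + length + 1] in text[:i]:
            length += 1
        if length == 0:
            length = 1  # fresh character: it becomes a factor of length one
        factors.append(text[i:i + length])
        i += length
    return factors


factors = lz77_factors_no_self_ref("abababab")
print(factors)       # ['a', 'b', 'ab', 'abab']
print(len(factors))  # z = 4
```

For highly repetitive strings $z$ is much smaller than $N$, which is the regime where a bound of the form $O(z \lg(N/z))$ is attractive.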

Findings

01

Supports fast LCE queries in compressed space

02

Constructed in $O(N)$ time from an uncompressed string, or in $O(n \lg(N/n))$ time from a straight-line program of size $n$

03

Applicable to uncompressed and grammar-compressed strings

Abstract

Given two positions $i$ and $j$ in a string $T$ of length $N$, a longest common extension (LCE) query asks for the length of the longest common prefix between the suffixes beginning at $i$ and $j$. A compressed LCE data structure is a data structure that stores $T$ in a compressed form while supporting fast LCE queries. In this article we show that the recompression technique is a powerful tool for compressed LCE data structures. We present a new compressed LCE data structure of size $O(z \lg(N/z))$ that supports LCE queries in $O(\lg N)$ time, where $z$ is the size of the Lempel-Ziv 77 factorization without self-reference of $T$. Given $T$ in uncompressed form, we show how to build our data structure in $O(N)$ time and space. Given $T$ in grammar-compressed form, i.e., a straight-line program of size $n$ generating $T$, we show how to build our data structure in $O(n \lg(N/n))$ time and…
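To make the query definition concrete, here is a minimal uncompressed baseline in Python. It is a sketch of what an LCE query returns, not the compressed data structure described in the abstract: it scans the two suffixes character by character, so a query takes time proportional to the answer rather than $O(\lg N)$. The function name `naive_lce` and the use of 0-based positions are assumptions made for this illustration.

```python
def naive_lce(text, i, j):
    """Length of the longest common prefix of text[i:] and text[j:].

    Positions are 0-based (an assumption of this sketch). The scan compares
    one character pair at a time and stops at the first mismatch or when
    either suffix runs out.
    """
    n = len(text)
    if not (0 <= i <= n and 0 <= j <= n):
        raise IndexError("positions must lie within the string")
    length = 0
    while i + length < n and j + length < n and text[i + length] == text[j + length]:
        length += 1
    return length


print(naive_lce("abracadabra", 0, 7))  # 4, the shared prefix is "abra"
print(naive_lce("aaaa", 0, 1))         # 3
print(naive_lce("abc", 0, 1))          # 0
```

A baseline like this is useful mainly as a correctness oracle when testing a faster or compressed implementation.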

Taxonomy

Topics: Algorithms and Data Compression · Parallel Computing and Optimization Techniques · Semigroups and Automata Theory