Longest Common Extensions with Recompression
Tomohiro I

TL;DR
This paper introduces a new compressed data structure for longest common extension (LCE) queries on strings. Using the recompression technique, it achieves O(z log(N/z)) size and O(log N) query time, and it can be built from both uncompressed and grammar-compressed inputs.
Contribution
It presents a novel compressed LCE data structure of size O(z log(N/z)) supporting O(log N) query time, constructed efficiently from both uncompressed and grammar-compressed strings using recompression.
Findings
Supports fast LCE queries in compressed space
Constructed efficiently in linear or near-linear time
Applicable to uncompressed and grammar-compressed strings
Abstract
Given two positions i and j in a string T of length N, a longest common extension (LCE) query asks for the length of the longest common prefix between the suffixes beginning at i and j. A compressed LCE data structure is a data structure that stores T in a compressed form while supporting fast LCE queries. In this article we show that the recompression technique is a powerful tool for compressed LCE data structures. We present a new compressed LCE data structure of size O(z log(N/z)) that supports LCE queries in O(log N) time, where z is the size of the Lempel-Ziv 77 factorization without self-reference of T. Given T in uncompressed form, we show how to build our data structure in O(N) time and space. Given T in grammar-compressed form, i.e., a straight-line program of size n generating T, we show how to build our data structure in O(n log(N/n)) time and…
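To make the two quantities in the abstract concrete, the following is a minimal sketch of a naive LCE query and a brute-force LZ77 factorization without self-reference. It is a baseline for intuition only and is not the paper's recompression-based data structure. The function names `naive_lce` and `lz77_no_self_ref`, the 0-based positions, and the choice of one common definition of the factorization (each factor is either a fresh letter or the longest prefix of the remaining text that occurs entirely inside the already-factorized prefix) are assumptions made for this illustration.

```python
def naive_lce(t: str, i: int, j: int) -> int:
    """Length of the longest common prefix of t[i:] and t[j:].

    Hypothetical baseline: scans character by character, so a query
    costs time proportional to the answer rather than O(log N).
    Positions are 0-based (an assumption of this sketch).
    """
    n = len(t)
    length = 0
    while i + length < n and j + length < n and t[i + length] == t[j + length]:
        length += 1
    return length


def lz77_no_self_ref(t: str) -> list[str]:
    """Brute-force LZ77 factorization without self-reference.

    Each factor is a fresh letter, or the longest prefix of t[i:]
    that occurs entirely within t[:i] (no overlap with itself).
    The number of factors plays the role of z in the abstract.
    """
    factors = []
    i = 0
    while i < len(t):
        length = 0
        # Extend while the candidate still occurs in the processed prefix.
        while i + length < len(t) and t[i:i + length + 1] in t[:i]:
            length += 1
        length = max(length, 1)  # a fresh letter forms a factor of length 1
        factors.append(t[i:i + length])
        i += length
    return factors


text = "abracadabra"
print(naive_lce(text, 0, 7))     # suffixes "abracadabra" and "abra" share "abra"
print(lz77_no_self_ref("aaaa"))  # ['a', 'a', 'aa']
```

The second call shows the effect of forbidding self-reference: a factor may only copy from text that ends before the factor begins, so "aaaa" needs three factors here, whereas a self-referential variant could cover "aaa" with a single overlapping copy.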
Taxonomy
Topics: Algorithms and Data Compression · Parallel Computing and Optimization Techniques · Semigroups and Automata Theory
