In-Place Sparse Suffix Sorting
Nicola Prezza

TL;DR
This paper introduces the first in-place algorithms for constructing sparse suffix arrays (SSA) and sparse LCP arrays (SLCP), so that a chosen subset of the suffixes of a large text can be sorted, and a suffix of a given rank selected, without extra working space.
Contribution
It presents the first in-place algorithms for building sparse suffix arrays, LCP arrays, and suffix selection, improving space efficiency for large-scale text processing.
Findings
First in-place LCP array construction in O(n log n) expected time.
First Monte Carlo in-place algorithms for SSA and SLCP in O(n + b log^2 n) expected time.
First in-place solution for the suffix selection problem.
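To make the suffix selection problem concrete, here is a minimal Python sketch of what is being computed: the starting position of the suffix with a given lexicographic rank. This is a naive baseline, not the paper's algorithm. It sorts all suffixes explicitly, so it uses linear extra space and is not in-place. The function name `select_suffix` and the 0-based rank convention are assumptions made for illustration.

```python
def select_suffix(text: str, rank: int) -> int:
    """Return the start position of the suffix with the given 0-based
    lexicographic rank. Naive baseline: materialises and sorts all
    suffix start positions, so it is NOT in-place."""
    if not 0 <= rank < len(text):
        raise IndexError("rank out of range")
    # Sort start positions by the suffix that begins at each position.
    order = sorted(range(len(text)), key=lambda p: text[p:])
    return order[rank]


# Suffixes of "banana" in sorted order:
# a (5), ana (3), anana (1), banana (0), na (4), nana (2)
print(select_suffix("banana", 0))  # 5
print(select_suffix("banana", 3))  # 0
```

An in-place solution has to produce the same answer while keeping only a constant number of memory words beyond the text itself, which is what makes the problem hard.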
Abstract
Suffix arrays encode the lexicographical order of all suffixes of a text and are often combined with the Longest Common Prefix array (LCP) to simulate navigational queries on the suffix tree in reduced space. In space-critical applications such as sparse and compressed text indexing, only information regarding the lexicographical order of a size-b subset of all n text suffixes is often needed. Such information can be stored space-efficiently (in b words) in the sparse suffix array (SSA). The SSA and its relative sparse LCP array (SLCP) can be used as a space-efficient substitute of the sparse suffix tree. Very recently, Gawrychowski and Kociumaka [SODA 2017] showed that the sparse suffix tree (and therefore SSA and SLCP) can be built in asymptotically optimal O(b) space with a Monte Carlo algorithm running in O(n) time. The main reason for using the SSA and SLCP arrays in…
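The two arrays described in the abstract can be illustrated with a short Python sketch. It is a naive reference construction meant only to show what the SSA and SLCP contain, not the in-place algorithm of the paper: it copies suffixes while sorting and compares characters one by one. The function names, and the convention that the SLCP holds one value per pair of adjacent SSA entries, are assumptions made for illustration.

```python
def sparse_suffix_array(text: str, positions: list[int]) -> list[int]:
    """Sort the given suffix start positions by the lexicographic
    order of the suffixes they denote (naive, not in-place)."""
    return sorted(positions, key=lambda p: text[p:])


def common_prefix_length(text: str, i: int, j: int) -> int:
    """Length of the longest common prefix of text[i:] and text[j:]."""
    k = 0
    while i + k < len(text) and j + k < len(text) and text[i + k] == text[j + k]:
        k += 1
    return k


def sparse_lcp_array(text: str, ssa: list[int]) -> list[int]:
    """LCP of each pair of suffixes that are adjacent in the SSA
    (assumed convention: one entry per adjacent pair)."""
    return [common_prefix_length(text, ssa[r - 1], ssa[r])
            for r in range(1, len(ssa))]


text = "banana"
ssa = sparse_suffix_array(text, [0, 2, 4])  # suffixes: banana, nana, na
print(ssa)                          # [0, 4, 2]  (banana < na < nana)
print(sparse_lcp_array(text, ssa))  # [0, 2]
```

Here b = 3 suffixes are chosen out of n = 6, so the SSA fits in three words regardless of the text length.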
Taxonomy
Topics: Algorithms and Data Compression · Advanced Image and Video Retrieval Techniques · Natural Language Processing Techniques
