Sparse Suffix Tree Construction in Optimal Time and Space
Paweł Gawrychowski, Tomasz Kociumaka

TL;DR
This paper presents a linear-time Monte Carlo algorithm that constructs the sparse suffix tree of b selected suffixes using only O(b) words of space, resolving an open problem, and pairs it with a deterministic verification procedure that is faster than the previously known one.
Contribution
It introduces a linear-time Monte Carlo algorithm for sparse suffix tree construction in optimal O(b) space, resolving an open question stated by Bille et al. [TALG 2016], and speeds up deterministic verification of the result.
Findings
Achieves linear time complexity for sparse suffix tree construction.
Provides a space-efficient algorithm using only O(b) words of space.
Offers a faster deterministic verification procedure running in O(n√(log b)) time.
Abstract
Suffix trees (and the closely related suffix arrays) are fundamental structures capturing all substrings of a given text essentially by storing all its suffixes in the lexicographical order. In some applications, we work with a subset of b interesting suffixes, which are stored in the so-called sparse suffix tree. Because the size of this structure is Θ(b), it is natural to seek a construction algorithm using only O(b) words of space assuming read-only random access to the text. We design a linear-time Monte Carlo algorithm for this problem, hence resolving an open question explicitly stated by Bille et al. [TALG 2016]. The best previously known algorithm by I et al. [STACS 2014] works in O(n log b) time. Our solution proceeds in n/b rounds; in the r-th round, we consider all suffixes starting at positions congruent to r modulo n/b. By maintaining rolling hashes, we…
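To make the role of rolling hashes concrete, here is a minimal Python sketch that sorts a chosen subset of suffixes by comparing hashes instead of characters. It is not the paper's algorithm: for simplicity it stores prefix hashes for the whole text, which takes O(n) extra space rather than O(b), and it spends O(log n) time per comparison. The names `prefix_hashes`, `substring_hash`, `lcp`, and `sparse_suffix_array`, as well as the constants `MOD` and `BASE`, are hypothetical choices made for this illustration. As in any Monte Carlo approach, a hash collision could in principle produce a wrong order, with small probability.

```python
from functools import cmp_to_key

MOD = (1 << 61) - 1   # assumed modulus (a Mersenne prime)
BASE = 911382323      # assumed base; fixed here, ideally drawn at random


def prefix_hashes(text):
    """Return (h, pw): h[i] hashes text[:i], pw[i] = BASE**i mod MOD."""
    h = [0] * (len(text) + 1)
    pw = [1] * (len(text) + 1)
    for i, ch in enumerate(text):
        h[i + 1] = (h[i] * BASE + ord(ch)) % MOD
        pw[i + 1] = (pw[i] * BASE) % MOD
    return h, pw


def substring_hash(h, pw, i, j):
    """Hash of text[i:j], derived from the prefix hashes in O(1)."""
    return (h[j] - h[i] * pw[j - i]) % MOD


def lcp(text, h, pw, i, j):
    """Longest common prefix of suffixes i and j, by binary search on hashes."""
    lo, hi = 0, len(text) - max(i, j)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if substring_hash(h, pw, i, i + mid) == substring_hash(h, pw, j, j + mid):
            lo = mid          # prefixes of length mid (probably) match
        else:
            hi = mid - 1      # a mismatch occurs within the first mid characters
    return lo


def sparse_suffix_array(text, positions):
    """Sort the given suffix start positions lexicographically by suffix."""
    h, pw = prefix_hashes(text)
    n = len(text)

    def compare(i, j):
        if i == j:
            return 0
        k = lcp(text, h, pw, i, j)
        if i + k == n:        # suffix i is a proper prefix of suffix j
            return -1
        if j + k == n:        # suffix j is a proper prefix of suffix i
            return 1
        return -1 if text[i + k] < text[j + k] else 1

    return sorted(positions, key=cmp_to_key(compare))
```

For example, `sparse_suffix_array("banana", [0, 1, 3, 5])` orders the suffixes "a", "ana", "anana", "banana" by their start positions. The sorted order of the selected suffixes, together with the LCP values of neighbouring suffixes, is the information from which a sparse suffix tree can be built.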
Taxonomy
Topics: Algorithms and Data Compression · Natural Language Processing Techniques · Advanced Image and Video Retrieval Techniques
