Linear pattern matching on sparse suffix trees
Roman Kolpakov, Gregory Kucherov, Tatiana Starikovskaya

TL;DR
This paper introduces a space-efficient index for packed strings based on sparse suffix trees, enabling faster pattern matching by exploiting character packing within computer words.
Contribution
It proposes an index structure for packed strings, built on a sparse suffix tree with appropriately defined suffix links, that takes the same space as the packed string itself and supports faster pattern matching.
Findings
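The index uses O(n / log_sigma n) space, the same as the packed string itself.
Pattern matching runs in O(m + r^2 + r * occ) time, where m is the pattern length, r is the number of characters actually stored in a word, and occ is the number of pattern occurrences.
Packing several characters into one word both compresses the string and speeds up its processing.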
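To make the notion of a packed string concrete, here is a minimal Python sketch of packing r characters into each word. The function names `pack` and `unpack` and the fixed `word_bits` parameter are assumptions made for illustration; the paper instead works in the unit-cost RAM model, where a word holds up to log_sigma n characters.

```python
def pack(text, alphabet, word_bits=64):
    """Pack `text` into integers ("words"), r characters per word.

    Hypothetical helper: `word_bits` is an assumed fixed word size.
    """
    sigma = len(alphabet)
    bits = max(1, (sigma - 1).bit_length())  # bits needed per character
    r = word_bits // bits                    # characters per word
    code = {c: i for i, c in enumerate(alphabet)}
    words = []
    for start in range(0, len(text), r):
        w = 0
        # Character k of the chunk occupies bits [k*bits, (k+1)*bits).
        for k, c in enumerate(text[start:start + r]):
            w |= code[c] << (bits * k)
        words.append(w)
    return words, r, bits


def unpack(words, n, alphabet, r, bits):
    """Recover the original string of length n from its packed form."""
    mask = (1 << bits) - 1
    chars = []
    for pos in range(n):
        w = words[pos // r]
        chars.append(alphabet[(w >> (bits * (pos % r))) & mask])
    return "".join(chars)
```

With a four-letter alphabet and 8-bit words, each word holds r = 4 characters, so a string of length n occupies ceil(n / 4) words.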
Abstract
Packing several characters into one computer word is a simple and natural way to compress the representation of a string and to speed up its processing. Exploiting this idea, we propose an index for a packed string, based on a {\em sparse suffix tree} \cite{KU-96} with appropriately defined suffix links. Assuming, under the standard unit-cost RAM model, that a word can store up to $\log_\sigma n$ characters ($\sigma$ the alphabet size), our index takes $O(n/\log_\sigma n)$ space, i.e. the same space as the packed string itself. The resulting pattern matching algorithm runs in time $O(m + r^2 + r \cdot occ)$, where $m$ is the length of the pattern, $r$ is the actual number of characters stored in a word and $occ$ is the number of pattern occurrences.
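The following Python sketch illustrates the general idea of indexing only every r-th suffix. It is a naive baseline and not the paper's algorithm: it uses a sorted array of sampled suffixes instead of a sparse suffix tree with suffix links, and it does not achieve the stated time bound. The names `build_sparse_suffix_array` and `find_occurrences` are hypothetical. Every occurrence of a pattern of length m >= r contains exactly one sampled position among its first r characters, so the sketch tries each split offset j < r, looks up the part of the pattern starting at j in the sparse index, and verifies the first j characters directly.

```python
def build_sparse_suffix_array(text, r):
    """Start positions of the suffixes at multiples of r, sorted lexicographically."""
    return sorted(range(0, len(text), r), key=lambda i: text[i:])


def find_occurrences(text, ssa, r, pattern):
    """Return all start positions of `pattern` in `text` (requires len(pattern) >= r)."""
    m = len(pattern)
    if m < r:
        raise ValueError("this sketch only handles patterns of length >= r")
    hits = []
    for j in range(r):
        head, tail = pattern[:j], pattern[j:]
        length = len(tail)
        # Binary search for the first sampled suffix whose prefix is >= tail.
        lo, hi = 0, len(ssa)
        while lo < hi:
            mid = (lo + hi) // 2
            if text[ssa[mid]:ssa[mid] + length] < tail:
                lo = mid + 1
            else:
                hi = mid
        # Sampled suffixes starting with `tail` form a contiguous run from lo.
        for i in ssa[lo:]:
            if text[i:i + length] != tail:
                break
            # Verify the j characters that precede the sampled position.
            if i >= j and text[i - j:i] == head:
                hits.append(i - j)
    return sorted(hits)
```

The loop over the r split offsets is the overhead that a sparse index imposes on a straightforward search; the suffix links defined in the paper are what allow the pattern to be processed without restarting the search for each offset.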
