Deterministic Indexing for Packed Strings
Philip Bille, Inge Li Gørtz, Frederik Rye Skjoldjensen

TL;DR
This paper introduces a deterministic, space-efficient string index for packed strings that enables faster pattern matching queries by leveraging multiple characters stored in single machine words.
Contribution
It presents the first deterministic index for packed strings, with linear preprocessing time and space and query times that improve as more characters are packed into each word.
Findings
Preprocessing time and space are both O(n).
Query time improves with increased character packing in words.
The index matches or surpasses previous bounds, especially with dense packing.
Abstract
Given a string $S$ of length $n$, the classic string indexing problem is to preprocess $S$ into a compact data structure that supports efficient subsequent pattern queries. In the \emph{deterministic} variant the goal is to solve the string indexing problem without any randomization (at preprocessing time or query time). In the \emph{packed} variant the strings are stored with several characters in a single word, giving us the opportunity to read multiple characters simultaneously. Our main result is a new string index in the deterministic \emph{and} packed setting. Given a packed string $S$ of length $n$ over an alphabet of size $\sigma$, we show how to preprocess $S$ in $O(n)$ (deterministic) time and $O(n)$ space such that given a packed pattern string of length $m$ we can support queries in (deterministic) time $O(m/\alpha + \log m + \log\log\sigma)$, where $\alpha = w / \log…
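To make the packed setting concrete, here is a minimal Python sketch of the general idea of storing several characters per machine word and comparing them a word at a time. It is not the paper's index. The helper names `pack` and `lcp_packed`, the 64-bit word size, and the choice of ceil(log2(alphabet size)) bits per character are all assumptions made for illustration.

```python
from math import ceil, log2

def pack(text, alphabet, w=64):
    """Pack text into w-bit words (hypothetical helper, not from the paper).

    The first character of each word sits in the most significant bits.
    """
    bits = max(1, ceil(log2(len(alphabet))))   # assumed bits per character
    alpha = w // bits                          # characters per word
    code = {c: i for i, c in enumerate(alphabet)}
    words = []
    for start in range(0, len(text), alpha):
        chunk = text[start:start + alpha]
        word = 0
        for c in chunk:
            word = (word << bits) | code[c]
        word <<= bits * (alpha - len(chunk))   # zero-pad a short last word
        words.append(word)
    return words, bits, alpha

def lcp_packed(a_words, b_words, a_len, b_len, bits, alpha):
    """Longest common prefix length, examining alpha characters per step."""
    limit = min(a_len, b_len)
    for i, (x, y) in enumerate(zip(a_words, b_words)):
        diff = x ^ y
        if diff:
            # The highest set bit of the XOR marks the first differing character.
            first_bad = (alpha * bits - diff.bit_length()) // bits
            # Clamp, since zero padding can look like a real character.
            return min(limit, i * alpha + first_bad)
    return limit
```

With a four-letter alphabet such as "ACGT", this sketch stores 32 characters per 64-bit word, so two strings that agree on a long prefix are compared with one XOR per 32 characters instead of one comparison per character.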
Taxonomy
Topics: Algorithms and Data Compression · DNA and Biological Computing · Semigroups and Automata Theory
