Compressed Index with Construction in Compressed Space
Dmitry Kosolobov

TL;DR
This paper introduces a compressed index for strings that is built in a single left-to-right streaming pass within compressed space, matching the best known search time with nearly optimal space.
Contribution
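It presents the first index that can be constructed within compressed space without Karp--Rabin fingerprints. Construction is randomized (expected time), but the randomization can be eliminated by using deterministic dictionaries, and the index achieves near-optimal space and fast search.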
Findings
Index size is $O(\delta \log\frac{n}{\delta})$ machine words
Construction takes $O(n \log n)$ expected time in one left-to-right streaming pass
Search time is $O(m + (\mathrm{occ} + 1) \log^\epsilon n)$, matching the best known results
Abstract
Suppose that we are given a string $s$ of length $n$ over an alphabet and $\delta$ is the string complexity of $s$, a known compression measure. We describe an index on $s$ with $O(\delta \log\frac{n}{\delta})$ space, measured in $O(\log n)$-bit machine words, which can search in $s$ for any string of length $m$ in $O(m + (\mathrm{occ} + 1) \log^\epsilon n)$ time, where $\mathrm{occ}$ is the number of occurrences and $\epsilon > 0$ is any fixed constant (the big-O in the space bound hides a factor that depends on $\epsilon$). Crucially, the index can be built in $O(n \log n)$ expected time by one left-to-right pass on the string $s$ in a streaming fashion with $O(\delta \log\frac{n}{\delta})$ construction space. The index does not use the Karp--Rabin fingerprints, and the randomization in the construction time can be eliminated by using deterministic dictionaries instead of…
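The abstract refers to the string complexity $\delta$ without spelling it out. As a rough illustration, the sketch below assumes the standard definition from the compression literature, $\delta = \max_{k \ge 1} d_k / k$, where $d_k$ is the number of distinct substrings of length $k$. The function name `substring_complexity` is hypothetical, and the quadratic-space enumeration is for intuition only; it is not the paper's construction and has nothing to do with the streaming algorithm.

```python
def substring_complexity(s: str) -> float:
    """Naively compute delta(s) = max over k of d_k(s) / k.

    d_k(s) is the number of distinct length-k substrings of s.
    Assumption: this is the standard definition of string (substring)
    complexity; the abstract itself does not restate it.
    """
    n = len(s)
    best = 0.0
    for k in range(1, n + 1):
        # Collect every distinct substring of length k.
        distinct = {s[i:i + k] for i in range(n - k + 1)}
        best = max(best, len(distinct) / k)
    return best


# A highly repetitive string has small delta ...
print(substring_complexity("aaaa"))  # 1.0
# ... while a string with more distinct characters has a larger one.
print(substring_complexity("abab"))  # 2.0
```

Intuitively, repetitive inputs keep $d_k$ small for every $k$, so $\delta$ stays small and the $O(\delta \log\frac{n}{\delta})$ space bound is far below $n$.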
