Compressed Indexing with Signature Grammars
Anders Roy Christiansen, Mikko Berggren Ettienne

TL;DR
This paper introduces new compressed indexing data structures that efficiently support pattern matching queries by leveraging signature grammars and LZ77 parsing, significantly improving previous solutions in space and time complexity.
Contribution
It presents the first data structure capable of deciding pattern occurrence in O(m) time with compressed space, combining randomized grammar construction with 2D-range reporting techniques.
Findings
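Supports pattern matching in O(m + occ (lg lg n + lg^ε z)) time, where n is the text length, m is the pattern length, and occ is the number of occurrences reported
Uses O(z lg(n/z)) space, where z is the size of the LZ77 parse of the text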
Improves upon previous solutions in both space and time complexity
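To make the parameter z concrete, here is a minimal sketch of a greedy LZ77 parse. It assumes one common variant in which each phrase copies the longest earlier-starting match (the copy may overlap the phrase itself) and then appends one explicit character; the exact variant used in the paper may differ in details. The function names `lz77_parse` and `lz77_decode` are illustrative, and the quadratic-time search is chosen for clarity, not efficiency.

```python
def lz77_parse(s: str) -> list[tuple[int, int, str]]:
    """Greedy LZ77 parse; each phrase is (source, length, next_char)."""
    phrases = []
    i, n = 0, len(s)
    while i < n:
        best_len, best_src = 0, 0
        # Try every earlier start position; the copy may run past position i.
        for j in range(i):
            k = 0
            # Stop one short of the end so an explicit next_char always remains.
            while i + k < n - 1 and s[j + k] == s[i + k]:
                k += 1
            if k > best_len:
                best_len, best_src = k, j
        phrases.append((best_src, best_len, s[i + best_len]))
        i += best_len + 1
    return phrases


def lz77_decode(phrases: list[tuple[int, int, str]]) -> str:
    """Rebuild the string; copying one character at a time handles overlaps."""
    out: list[str] = []
    for src, length, ch in phrases:
        for k in range(length):
            out.append(out[src + k])
        out.append(ch)
    return "".join(out)


# z is simply the number of phrases in the parse.
z = len(lz77_parse("abababab"))
```

Highly repetitive strings yield few phrases, which is why a space bound expressed in z rather than n can be far smaller than the text itself.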
Abstract
The compressed indexing problem is to preprocess a string S of length n into a compressed representation that supports pattern matching queries. That is, given a string P of length m, report all occurrences of P in S. We present a data structure that supports pattern matching queries in O(m + occ (lg lg n + lg^ε z)) time using O(z lg(n/z)) space, where z is the size of the LZ77 parse of S and ε > 0 is an arbitrarily small constant, when the alphabet is small or z = O(n^(1-δ)) for any constant δ > 0. We also present two data structures for the general case: one where the space is increased by O(z lg lg z), and one where the query time changes from worst-case to expected. These results improve the previously best known solutions. Notably, this is the first data structure that decides if P occurs in S in O(m) time using…
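The query semantics can be illustrated with an uncompressed baseline. The sketch below is not the paper's data structure: it scans the plain text directly, and only shows what "report all occurrences" means, including overlapping matches. The function name `occurrences` and the argument names `s` (text) and `p` (pattern) are assumptions made for illustration.

```python
def occurrences(s: str, p: str) -> list[int]:
    """Return every start position of p in s, including overlapping ones."""
    positions = []
    start = s.find(p)
    while start != -1:
        positions.append(start)
        # Advance by one so overlapping occurrences are not skipped.
        start = s.find(p, start + 1)
    return positions


hits = occurrences("abracadabra", "abra")
occ = len(hits)  # the quantity "occ" in the query-time bound
```

A compressed index answers the same query without storing the text explicitly, which is where the dependence on z instead of n comes from.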
