String Indexing with Compressed Patterns
Philip Bille, Inge Li Gørtz, Teresa Anna Steiner

TL;DR
This paper introduces a linear space data structure for string indexing that answers queries directly on Lempel-Ziv compressed patterns, with near-optimal query time in terms of the compressed pattern size.
Contribution
The paper presents a novel linear space data structure that processes LZ77-compressed patterns directly, without decompressing them first, so that query time depends on the compressed size of the pattern.
Findings
Achieves near-optimal query time for compressed pattern searches.
Develops a linear space data structure for all LZ77 compressed suffixes.
Introduces a trie decomposition technique reducing search time.
Abstract
Given a string S of length n, the classic string indexing problem is to preprocess S into a compact data structure that supports efficient subsequent pattern queries. In this paper we consider the basic variant where the pattern is given in compressed form and the goal is to achieve query time that is fast in terms of the compressed size of the pattern. This captures the common client-server scenario, where a client submits a query and communicates it in compressed form to a server. Instead of the server decompressing the query before processing it, we consider how to efficiently process the compressed query directly. Our main result is a novel linear space data structure that achieves near-optimal query time for patterns compressed with the classic Lempel-Ziv compression scheme. Along the way we develop several data structural techniques of independent interest, including a novel…
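To make the client-server setting concrete, here is a minimal Python sketch of the baseline the abstract contrasts against: the client sends an LZ77-compressed pattern, and the server decompresses it before searching an index. This is not the paper's data structure. The particular LZ77 variant (literals plus overlapping `(source, length)` copies), the plain suffix-array index, and all function names (`lz77_parse`, `lz77_decode`, `build_suffix_array`, `find_occurrences`, `query_compressed`) are assumptions chosen for illustration.

```python
from typing import List, Tuple, Union

# A phrase is a literal character or a (source, length) copy from earlier text.
Phrase = Union[str, Tuple[int, int]]


def lz77_parse(text: str) -> List[Phrase]:
    """Greedy LZ77 parse (illustrative variant); copies may self-overlap."""
    phrases: List[Phrase] = []
    i = 0
    while i < len(text):
        best_src, best_len = -1, 0
        for src in range(i):  # naive search over all earlier start positions
            length = 0
            while i + length < len(text) and text[src + length] == text[i + length]:
                length += 1
            if length > best_len:
                best_src, best_len = src, length
        if best_len >= 2:  # a copy only pays off for at least two characters
            phrases.append((best_src, best_len))
            i += best_len
        else:
            phrases.append(text[i])
            i += 1
    return phrases


def lz77_decode(phrases: List[Phrase]) -> str:
    """Expand phrases left to right; overlapping copies read freshly written output."""
    out: List[str] = []
    for phrase in phrases:
        if isinstance(phrase, str):
            out.append(phrase)
        else:
            src, length = phrase
            for k in range(length):
                out.append(out[src + k])
    return "".join(out)


def build_suffix_array(s: str) -> List[int]:
    """Simple (not linear-time) suffix array: suffix start positions in sorted order."""
    return sorted(range(len(s)), key=lambda i: s[i:])


def find_occurrences(s: str, sa: List[int], pattern: str) -> List[int]:
    """Binary search the suffix array for all suffixes that start with pattern."""
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose m-character prefix is >= pattern
        mid = (lo + hi) // 2
        if s[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(sa)
    while lo < hi:  # first suffix whose m-character prefix is > pattern
        mid = (lo + hi) // 2
        if s[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


def query_compressed(s: str, sa: List[int], phrases: List[Phrase]) -> List[int]:
    """Baseline server: decompress the query first, then search the index."""
    return find_occurrences(s, sa, lz77_decode(phrases))
```

In this baseline the query cost grows with the decompressed pattern length, even when the pattern is highly repetitive and its parse has only a few phrases. The abstract's goal is instead query time that is fast in terms of the compressed size, obtained by processing the phrases without expanding them.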
Taxonomy
Topics: Algorithms and Data Compression · Network Packet Processing and Optimization · DNA and Biological Computing
