LZ78 Substring Compression in Compressed Space
Hiroki Shibata, Dominik Köppl

TL;DR
This paper studies how to index data so that the LZ78 factorization of any substring chosen at query time can be computed in compressed space, with only a logarithmic slowdown compared to the optimal time complexity.
Contribution
It introduces an algorithm for LZ78 factorization in the substring compression model that operates in compressed space with logarithmic time complexity overhead.
Findings
The data can be indexed in compressed space to answer LZ78 factorization queries.
The factorization is computed with a logarithmic slowdown compared to the optimal time complexity.
The substring to be factorized is specified at query time.
Abstract
The Lempel-Ziv 78 (LZ78) factorization is a well-studied technique for data compression. It and its derivatives are used in compression formats such as "compress" or "gif". While most research focuses on the factorization of plain data, little research has been conducted on indexing the data for fast LZ78 factorization. Here, we study the LZ78 factorization and its derivatives in the substring compression model, where we are allowed to index the data and return the factorization of a substring specified at query time. In that model, we propose an algorithm that works in compressed space, computing the factorization with a logarithmic slowdown compared to the optimal time complexity.
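To make the query in the substring compression model concrete, the following is a minimal sketch of a naive baseline: it computes the LZ78 factorization of a substring from scratch, using a plain dictionary as the LZ78 trie. This is not the paper's algorithm, which answers such queries from a precomputed index in compressed space. The function names `lz78_factorize` and `lz78_substring`, the half-open interval convention `[i, j)`, and the use of `None` for an incomplete final factor are assumptions made for illustration only.

```python
def lz78_factorize(text):
    """Return the LZ78 factors of text as (referred_factor_id, new_char) pairs.

    Factor id 0 denotes the empty factor. Factors are numbered from 1 in the
    order they are created. If the text ends in the middle of an already-known
    factor, the last pair has new_char set to None.
    """
    trie = {}      # maps (node_id, char) -> child node_id
    factors = []
    node = 0       # current trie node; 0 is the root (empty factor)
    for ch in text:
        child = trie.get((node, ch))
        if child is not None:
            # The current match can be extended by ch.
            node = child
        else:
            # Longest previous factor found; emit it extended by ch.
            factors.append((node, ch))
            trie[(node, ch)] = len(factors)  # the new factor gets the next id
            node = 0
    if node != 0:
        # Input ended inside a known factor: emit an incomplete final factor.
        factors.append((node, None))
    return factors


def lz78_substring(text, i, j):
    """Naive substring query: factorize text[i:j] from scratch (no index)."""
    return lz78_factorize(text[i:j])


if __name__ == "__main__":
    # Factors of "ababcbababaa" are a | b | ab | c | ba | bab | aa.
    print(lz78_factorize("ababcbababaa"))
    # Query for the substring "abcb" at positions [2, 6).
    print(lz78_substring("ababcbababaa", 2, 6))
```

Each call to `lz78_substring` here rescans the queried substring and needs the plain text; the setting described in the abstract instead builds an index once and answers queries from it in compressed space.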
Taxonomy
Topics: Algorithms and Data Compression · Data Management and Algorithms · Advanced Database Systems and Queries
