Faster Compact On-Line Lempel-Ziv Factorization
Jun'ichi Yamamoto, Tomohiro I, Hideo Bannai, Shunsuke Inenaga, Masayuki Takeda

TL;DR
This paper introduces an on-line algorithm that computes the Lempel-Ziv factorization of a string of length N in O(N log N) time using Directed Acyclic Word Graphs (DAWGs), and adds an opportunistic variant that works on run-length encoded strings.
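To make the term concrete, the sketch below computes a Lempel-Ziv factorization by brute force. It assumes one common variant of the definition, which this page does not spell out: each factor is the longest prefix of the remaining text that also occurs starting at an earlier position (overlap allowed), or a single character if no such occurrence exists. The function name `lz_factorize_naive` is illustrative, and the quadratic-time search is a baseline for intuition, not the paper's method.

```python
def lz_factorize_naive(text):
    """Brute-force LZ factorization (assumed variant: earlier start, overlap allowed)."""
    factors = []
    start = 0
    n = len(text)
    while start < n:
        best = 0
        # Try every earlier starting position and extend the match greedily.
        for prev in range(start):
            length = 0
            while start + length < n and text[prev + length] == text[start + length]:
                length += 1
            best = max(best, length)
        # A character with no earlier occurrence becomes a factor of length 1.
        step = max(best, 1)
        factors.append(text[start:start + step])
        start += step
    return factors
```

For example, `lz_factorize_naive("abab")` returns `["a", "b", "ab"]`, and `lz_factorize_naive("aaaa")` returns `["a", "aaa"]` because the second factor may overlap its own earlier occurrence.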
Contribution
The paper presents a new on-line Lempel-Ziv factorization algorithm built on Directed Acyclic Word Graphs (DAWGs). It runs in O(N log N) time while using the same order of working space as previous on-line algorithms, which were slower.
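The sketch below illustrates one way a DAWG can drive an on-line factorization: the automaton is extended one character at a time, and the current factor grows for as long as it is still a substring of the text read so far. This is a textbook suffix-automaton construction using Python dictionaries, not the compact representation the paper relies on, so it does not reproduce the paper's space bound. It assumes the factor definition "longest prefix that occurs starting at an earlier position, overlap allowed", and the names `Dawg` and `lz_factorize_online` are hypothetical.

```python
class Dawg:
    """Textbook on-line DAWG (suffix automaton); dict-based, not space-compact."""

    def __init__(self):
        self.next = [{}]      # outgoing transitions per state
        self.link = [-1]      # suffix links
        self.length = [0]     # length of the longest string in each state
        self.last = 0         # state of the whole text read so far

    def _new_state(self, length):
        self.next.append({})
        self.link.append(-1)
        self.length.append(length)
        return len(self.length) - 1

    def extend(self, ch):
        cur = self._new_state(self.length[self.last] + 1)
        p = self.last
        while p != -1 and ch not in self.next[p]:
            self.next[p][ch] = cur
            p = self.link[p]
        if p == -1:
            self.link[cur] = 0
        else:
            q = self.next[p][ch]
            if self.length[p] + 1 == self.length[q]:
                self.link[cur] = q
            else:
                # Split q: the clone takes over the shorter strings of q.
                clone = self._new_state(self.length[p] + 1)
                self.next[clone] = dict(self.next[q])
                self.link[clone] = self.link[q]
                while p != -1 and self.next[p].get(ch) == q:
                    self.next[p][ch] = clone
                    p = self.link[p]
                self.link[q] = clone
                self.link[cur] = clone
        self.last = cur


def lz_factorize_online(text):
    """Emit LZ factors while reading text left to right (assumed variant)."""
    dawg = Dawg()
    factors = []
    start = 0                 # start of the factor being grown
    state, matched = 0, 0     # DAWG state and length of text[start:i]
    for i, ch in enumerate(text):
        nxt = dawg.next[state].get(ch)
        if nxt is None and matched > 0:
            # text[start:i+1] has no earlier occurrence: close the factor.
            factors.append(text[start:i])
            start, state, matched = i, 0, 0
            nxt = dawg.next[0].get(ch)
        if nxt is None:
            factors.append(ch)          # first occurrence of this character
            start = i + 1
        else:
            state, matched = nxt, matched + 1
        dawg.extend(ch)
        # A clone may have taken over the matched string; follow the suffix link.
        while matched > 0 and matched <= dawg.length[dawg.link[state]]:
            state = dawg.link[state]
    if matched > 0:
        factors.append(text[start:])
    return factors
```

Under these assumptions, `lz_factorize_online("abaab")` returns `["a", "b", "a", "ab"]`. Each character triggers one transition lookup and one automaton update, which is what makes the approach on-line.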
Findings
Achieves O(N log N) time using the same order of working space as previous on-line algorithms.
Introduces an opportunistic variant that takes the run-length encoding of the string as input.
The variant is faster and more space-efficient when the string is run-length compressible.
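The opportunistic variant is stated in terms of the run-length encoding of the input. As a minimal illustration of that input format only (not of the variant itself), the sketch below encodes a string as (character, run length) pairs; the function names are illustrative.

```python
from itertools import groupby


def run_length_encode(text):
    """Collapse each maximal run of equal characters into (char, count)."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]


def run_length_decode(runs):
    """Inverse of run_length_encode."""
    return "".join(ch * count for ch, count in runs)
```

For `"aaabccdddd"` the encoding is `[("a", 3), ("b", 1), ("c", 2), ("d", 4)]`: the string has length 10 but only 4 runs, and the gap between those two quantities is what a run-length-aware algorithm can exploit.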
Abstract
We present a new on-line algorithm for computing the Lempel-Ziv factorization of a string that runs in $O(N \log N)$ time and uses only $O(N \log \sigma)$ bits of working space, where $N$ is the length of the string and $\sigma$ is the size of the alphabet. This is a notable improvement compared to the performance of previous on-line algorithms using the same order of working space but running in either $O(N \log^3 N)$ time (Okanohara & Sadakane 2009) or $O(N \log^2 N)$ time (Starikovskaya 2012). The key to our new algorithm is in the utilization of an elegant but less popular index structure called Directed Acyclic Word Graphs, or DAWGs (Blumer et al. 1985). We also present an opportunistic variant of our algorithm, which, given the run length encoding of size $m$ of a string of length $N$, computes the Lempel-Ziv factorization on-line, in $O\left(m \cdot \min \left\{\frac{(\log\log m)(\log…
Taxonomy
Topics: Algorithms and Data Compression · Advanced Data Compression Techniques · Advanced Wireless Communication Techniques
