Relations Between Greedy and Bit-Optimal LZ77 Encodings
Dmitry Kosolobov

TL;DR
This paper analyzes the relationship between the bit sizes of greedy and optimal LZ77 encodings, establishing tight bounds on their ratio for constant and non-constant alphabets.
Contribution
It proves tight bounds on the size ratio between greedy and optimal LZ77 encodings for constant and non-constant alphabets, improving both the previously known upper and lower bounds and refining the bound in terms of the number of greedy phrases.
Findings
Greedy LZ77 encoding is within a factor of $O\left(\frac{\log n}{\log\log\log n}\right)$ of optimal on constant alphabets.
The bound $O\left(\min\left\{z, \frac{\log n}{\log\log z}\right\}\right)$ is tight for binary alphabets.
For non-constant alphabets, the $O(\log n)$ bound is tight even for logarithmic alphabet sizes.
Abstract
This paper investigates the size in bits of the LZ77 encoding, which is the most popular and efficient variant of the Lempel-Ziv encodings used in data compression. We prove that, for a wide natural class of variable-length encoders for LZ77 phrases, the size of the greedily constructed LZ77 encoding on constant alphabets is within a factor $O\left(\frac{\log n}{\log\log\log n}\right)$ of the optimal LZ77 encoding, where $n$ is the length of the processed string. We describe a series of examples showing that, surprisingly, this bound is tight, thus improving both the previously known upper and lower bounds. Further, we obtain a more detailed bound $O\left(\min\left\{z, \frac{\log n}{\log\log z}\right\}\right)$, which uses the number $z$ of phrases in the greedy LZ77 encoding as a parameter, and construct a series of examples showing that this bound is tight even for a binary alphabet. We then investigate the problem on…
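To make the objects in the abstract concrete, the following is a minimal sketch of a greedy LZ77 parser together with one possible bit-cost function. It is an illustration only, not the paper's construction: the quadratic-time match search, the phrase tuples, and the cost model (a one-bit flag plus Elias gamma codes for distance and length, `sigma_bits` bits per literal) are assumptions chosen for brevity. The paper covers a whole class of variable-length encoders, of which this is just one hypothetical member.

```python
def greedy_lz77(s):
    """Greedy parse: each phrase is the longest match with an earlier start, else a literal."""
    phrases, i, n = [], 0, len(s)
    while i < n:
        best_len, best_dist = 0, 0
        for j in range(i):  # candidate source positions; overlap with s[i:] is allowed
            l = 0
            while i + l < n and s[j + l] == s[i + l]:
                l += 1
            if l > 0 and l >= best_len:  # ties go to the nearest occurrence
                best_len, best_dist = l, i - j
        if best_len == 0:
            phrases.append(("lit", s[i]))
            i += 1
        else:
            phrases.append(("copy", best_dist, best_len))
            i += best_len
    return phrases


def decode(phrases):
    """Inverse of greedy_lz77; copies character by character so overlapping copies work."""
    out = []
    for p in phrases:
        if p[0] == "lit":
            out.append(p[1])
        else:
            _, dist, length = p
            for _ in range(length):
                out.append(out[-dist])
    return "".join(out)


def gamma_bits(x):
    """Length in bits of the Elias gamma code of an integer x >= 1."""
    return 2 * (x.bit_length() - 1) + 1


def encoded_bits(phrases, sigma_bits=8):
    """Assumed cost model: 1 flag bit per phrase, plus a literal or gamma-coded (dist, len)."""
    total = 0
    for p in phrases:
        if p[0] == "lit":
            total += 1 + sigma_bits
        else:
            total += 1 + gamma_bits(p[1]) + gamma_bits(p[2])
    return total
```

The number of phrases returned by `greedy_lz77` plays the role of $z$ in the bounds above. A bit-optimal parse may choose shorter phrases or different sources than the greedy one in order to lower the value computed by `encoded_bits`, which is the gap the paper quantifies.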
Taxonomy
Topics: Algorithms and Data Compression · Semigroups and Automata Theory · Coding Theory and Cryptography
