LZD-style Compression Scheme with Truncation and Repetitions
Linus Götz, Dominik Köppl

TL;DR
This paper introduces two LZD-based compression algorithms, LZD+ and LZDR, which compress online in expected linear time and produce fewer factors than existing LZ methods on standard datasets.
Contribution
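The paper presents LZD+, which makes LZD compression run online in expected linear time by allowing truncated references, and LZDR, which adds a repetition-based factorization rule on top of LZD+ to avoid the compressibility weakness that a lower-bound example exposes in LZD.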
Findings
LZD+ achieves expected linear-time compression.
LZDR introduces repetition-based factorization for improved compression.
Benchmarks show a larger reduction in the number of factors than existing LZ methods.
Abstract
Lempel-Ziv-Double (LZD) is a variation of the LZ78 compression scheme that achieves better compression on repetitive datasets. Nevertheless, prior research has identified computational inefficiencies and a weakness in its compressibility for certain datasets. In this paper, we introduce LZD+, an enhancement of LZD, which enables expected linear-time online compression by allowing truncated references. To avoid the compressibility weakness exhibited by a lower bound example, we propose LZDR (LZD-runlength compressed), a further enhancement on top of LZD+, which introduces a repetition-based factorization rule while maintaining linear expected time complexity. Both time bounds can be de-randomized by a lookup data structure like a balanced search tree with a logarithmic dependency on the alphabet size. Additionally, we present three flexible parsing variants of LZDR that yield fewer…
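For orientation, the baseline LZD rule that the paper builds on can be sketched as follows. This is a naive illustration of plain LZD as it is commonly defined (each factor is the concatenation of the two longest prefixes that are earlier factors or single characters). It is not the paper's LZD+ or LZDR, it does not implement truncated references or the repetition rule, and it makes no attempt at the linear-time bounds. The names `lzd_factorize` and `lzd_decode` are hypothetical and chosen only for this example.

```python
def lzd_factorize(text):
    """Naive LZD factorization (illustrative only, not linear time).

    Returns a list of (left, right) pairs. Each side is either the index of
    an earlier factor (int), a single character (str), or None when the
    text ends after the left part of the last factor.
    """
    phrases = {}  # phrase string -> index of a factor that spells it
    factors = []

    def longest_match(pos):
        # Try the longest candidate first; fall back to a single character.
        for length in range(len(text) - pos, 1, -1):
            candidate = text[pos:pos + length]
            if candidate in phrases:
                return length, phrases[candidate]
        return 1, text[pos]

    pos = 0
    while pos < len(text):
        start = pos
        left_len, left = longest_match(pos)
        pos += left_len
        if pos < len(text):
            right_len, right = longest_match(pos)
            pos += right_len
        else:
            right = None  # text ended after the left part
        factors.append((left, right))
        phrases[text[start:pos]] = len(factors) - 1
    return factors


def lzd_decode(factors):
    """Rebuild the text from the pairs produced by lzd_factorize."""
    expanded = []  # expanded[i] is the string spelled by factor i

    def expand(ref):
        if ref is None:
            return ""
        if isinstance(ref, int):
            return expanded[ref]
        return ref  # a single character

    for left, right in factors:
        expanded.append(expand(left) + expand(right))
    return "".join(expanded)
```

On the input `abababab` this sketch yields three factors: the pair of characters `a` and `b`, then factor 0 twice, then factor 0 alone. Decoding the pairs in order reproduces the input.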
Taxonomy
Topics: Advanced Data Compression Techniques · Algorithms and Data Compression · Advanced Numerical Analysis Techniques
