Dv2v: A Dynamic Variable-to-Variable Compressor
Nieves R. Brisaboa, Antonio Fariña, Adrián Gómez-Brandón, Gonzalo Navarro, Tirso V. Rodeiro

TL;DR
Dv2v is a dynamic (one-pass) variable-to-variable compressor that processes text word-wise and can begin transmitting as soon as data become available. It achieves better compression ratios than v2vDC, the state-of-the-art semi-static variable-to-variable compressor, and comes close to p7zip.
Contribution
Introduces Dv2v, a novel dynamic, one-pass variable-to-variable compression algorithm that processes input in real-time and adapts to data frequency.
Findings
Outperforms v2vDC in compression ratio
Nearly matches p7zip compression performance
Maintains competitive speed in compression and decompression
Abstract
We present Dv2v, a new dynamic (one-pass) variable-to-variable compressor. Variable-to-variable compression combines a modeler that gathers variable-length input symbols with a variable-length statistical coder that assigns shorter codewords to the more frequent symbols. In Dv2v, we process the input text word-wise to gather variable-length symbols that can be either terminals (new words) or non-terminals (subsequences of words seen before in the input text). Those input symbols are stored in a vocabulary that is kept sorted by frequency, so they can be easily encoded with dense codes. Dv2v permits real-time transmission of data, i.e., compression/transmission can begin as soon as data become available. Our experiments show that Dv2v is able to improve on the compression ratios of v2vDC, the state-of-the-art semi-static variable-to-variable compressor, and to…
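To illustrate why a frequency-sorted vocabulary makes encoding easy, the following is a minimal sketch of End-Tagged Dense Code (ETDC), one well-known member of the dense-code family. It rests on assumptions: the abstract does not say which dense-code variant Dv2v uses, and the function names `etdc_encode` and `etdc_decode` and the toy vocabulary are hypothetical. The point is only that the codeword depends on nothing but the symbol's position (rank) in the sorted vocabulary.

```python
def etdc_encode(rank: int) -> bytes:
    """Return the ETDC codeword for the symbol at 0-based position `rank`
    in a vocabulary sorted by decreasing frequency (illustrative sketch)."""
    if rank < 0:
        raise ValueError("rank must be non-negative")
    # Last byte of the codeword: high bit set (values 128..255) marks the end.
    out = [128 + rank % 128]
    rank = rank // 128 - 1
    while rank >= 0:
        # Preceding bytes: high bit clear (values 0..127).
        out.append(rank % 128)
        rank = rank // 128 - 1
    return bytes(reversed(out))


def etdc_decode(code: bytes) -> int:
    """Inverse of etdc_encode: recover the rank from one complete codeword."""
    rank = 0
    for b in code[:-1]:
        rank = (rank + b + 1) * 128
    return rank + (code[-1] - 128)


# Hypothetical toy vocabulary, already sorted by decreasing frequency.
vocab = ["the", "of", "the compressor"]
codes = {symbol: etdc_encode(i) for i, symbol in enumerate(vocab)}
```

Under this scheme the 128 most frequent symbols receive one-byte codewords, the next 16,384 receive two bytes, and so on. In a dynamic (one-pass) setting, sender and receiver would have to update the vocabulary order identically after each symbol so that ranks stay in sync; how Dv2v does this is not described in the abstract.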
