Dv2v: A Dynamic Variable-to-Variable Compressor

Nieves R. Brisaboa; Antonio Fariña; Adrián Gómez-Brandón; Gonzalo Navarro; Tirso V. Rodeiro

arXiv:1911.04202 · cs.DS · November 12, 2019

TL;DR

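Dv2v is a dynamic (one-pass) variable-to-variable compressor that processes text word-wise and can start transmitting as soon as data become available. It achieves better compression ratios than v2vDC, the state-of-the-art semi-static variable-to-variable compressor, and comes close to p7zip in compression.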

Contribution

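Introduces Dv2v, a dynamic (one-pass) variable-to-variable compressor that can begin compressing and transmitting as soon as data become available. It keeps its vocabulary of variable-length symbols sorted by frequency, so that more frequent symbols receive shorter codewords.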

Findings

01. Outperforms v2vDC in compression ratio

02. Nearly matches p7zip compression performance

03. Maintains competitive speed in compression and decompression

Abstract

We present Dv2v, a new dynamic (one-pass) variable-to-variable compressor. Variable-to-variable compression combines a modeler that gathers variable-length input symbols with a variable-length statistical coder that assigns shorter codewords to the more frequent symbols. In Dv2v, we process the input text word-wise to gather variable-length symbols that can be either terminals (new words) or non-terminals (subsequences of words seen before in the input text). Those input symbols are stored in a vocabulary that is kept sorted by frequency, so they can be easily encoded with dense codes. Dv2v permits real-time transmission of data, i.e., compression/transmission can begin as soon as data become available. Our experiments show that Dv2v is able to improve on the compression ratios of v2vDC, the state-of-the-art semi-static variable-to-variable compressor, and to…
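To make the link between a frequency-sorted vocabulary and dense codes concrete, here is a minimal sketch. It assumes End-Tagged Dense Code (ETDC), one member of the dense code family; the abstract does not say which variant Dv2v uses. It also ranks plain words from a fully known list, whereas Dv2v builds its vocabulary in one pass and includes multi-word non-terminals. The function names (etdc_encode, etdc_decode, rank_by_frequency) are hypothetical and not taken from the paper.

```python
from collections import Counter


def etdc_encode(rank: int) -> bytes:
    """Encode a 0-based vocabulary rank as an ETDC codeword.

    Assumption: End-Tagged Dense Code. The last byte of each codeword is in
    128..255 (flag bit set); every earlier byte is in 0..127.
    """
    if rank < 0:
        raise ValueError("rank must be non-negative")
    out = [128 + rank % 128]      # final byte carries the end tag
    rank //= 128
    while rank > 0:
        rank -= 1                 # "dense": no codeword value is wasted
        out.append(rank % 128)
        rank //= 128
    return bytes(reversed(out))


def etdc_decode(codeword: bytes) -> int:
    """Invert etdc_encode for a single, complete codeword."""
    rank = 0
    for b in codeword:
        if b < 128:               # continuation byte
            rank = rank * 128 + b + 1
        else:                     # tagged final byte
            rank = rank * 128 + (b - 128)
    return rank


def rank_by_frequency(words):
    """Map each distinct word to its rank, most frequent first.

    Illustrative static stand-in for the dynamically maintained vocabulary.
    """
    counts = Counter(words)
    return {w: r for r, (w, _) in enumerate(counts.most_common())}


if __name__ == "__main__":
    text = "to be or not to be that is the question to be".split()
    ranks = rank_by_frequency(text)
    for word in ("to", "be", "question"):
        print(word, ranks[word], etdc_encode(ranks[word]).hex())
```

Under this scheme the 128 most frequent symbols get one-byte codewords, the next 128 × 128 get two bytes, and so on. The codeword therefore depends only on the symbol's rank, which is why keeping the vocabulary sorted by frequency is sufficient for encoding.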
