Efficient and Compact Representations of Prefix Codes

Travis Gagie; Gonzalo Navarro; Yakov Nekrich; Alberto Ordóñez

arXiv:1410.3438·cs.DS·June 30, 2015

TL;DR

This paper presents new methods for storing prefix codes in significantly less space, at the cost of slower encoding and decoding, and includes approximate techniques that trade off space, speed, and code optimality.

Contribution

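The authors introduce data structures for storing prefix codes that reduce the space for an optimal code from O(n log n) bits to O(n log log(N/n)) bits, where n is the alphabet size and N the sequence length, together with approximate variants that trade average codeword length for space.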

Findings

01

Achieves a 6- to 8-fold space reduction compared with state-of-the-art methods.

02

Encoding and decoding times increase by factors of 2.5 to 24, depending on the technique.

03

Approximate methods can recover classical speeds with moderate code length penalties.

Abstract

Most of the attention in statistical compression is given to the space used by the compressed sequence, a problem completely solved with optimal prefix codes. However, in many applications, the storage space used to represent the prefix code itself can be an issue. In this paper we introduce and compare several techniques to store prefix codes. Let $N$ be the sequence length and $n$ be the alphabet size. Then a naive storage of an optimal prefix code uses $O(n \log n)$ bits. Our first technique shows how to use $O(n \log \log (N/n))$ bits to store the optimal prefix code. Then we introduce an approximate technique that, for any $0 < \epsilon < 1/2$, takes $O(n \log \log (1/\epsilon))$ bits to store a prefix code with average codeword length within an additive $\epsilon$ of the minimum. Finally, a second approximation takes, for any constant $c > 1$, $O(n^{1/c} \log n)$ bits to store a…
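
To make concrete what "storing a prefix code" involves, the following is a minimal sketch of the standard canonical-code idea: once every symbol's codeword length is known, the codewords themselves can be regenerated, so only the lengths (and the symbol order) need to be kept. This is a textbook illustration and not the compact data structures the paper describes; the function names `huffman_lengths` and `canonical_code` are hypothetical and chosen here only for illustration.

```python
import heapq
from collections import Counter


def huffman_lengths(freqs):
    """Return {symbol: codeword length} for an optimal prefix code.

    Illustrative helper (not from the paper): runs Huffman's algorithm
    and tracks only the depth of each symbol, not the tree itself.
    """
    if len(freqs) == 1:
        return {next(iter(freqs)): 1}
    # The integer index breaks frequency ties so lists are never compared.
    heap = [(f, i, [s]) for i, (s, f) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    tiebreak = len(heap)
    while len(heap) > 1:
        f1, _, group1 = heapq.heappop(heap)
        f2, _, group2 = heapq.heappop(heap)
        # Every symbol under the merged node moves one level deeper.
        for s in group1 + group2:
            lengths[s] += 1
        heapq.heappush(heap, (f1 + f2, tiebreak, group1 + group2))
        tiebreak += 1
    return lengths


def canonical_code(lengths):
    """Rebuild codewords from lengths alone (canonical assignment).

    Symbols are processed by (length, symbol); each codeword is the
    previous one plus one, left-shifted whenever the length grows.
    """
    code = 0
    prev_len = 0
    codewords = {}
    for sym, ln in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
        code <<= ln - prev_len
        codewords[sym] = format(code, "0{}b".format(ln))
        code += 1
        prev_len = ln
    return codewords


freqs = Counter("abracadabra")
lengths = huffman_lengths(freqs)
codewords = canonical_code(lengths)
for sym in sorted(codewords):
    print(sym, lengths[sym], codewords[sym])
```

In this sketch the decoder needs only the per-symbol lengths to reconstruct the same codewords, which is why representations of prefix codes typically focus on storing lengths and symbol order compactly rather than the code tree itself.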
