Fast and Compact Prefix Codes
Travis Gagie, Gonzalo Navarro, Yakov Nekrich

TL;DR
This paper shows how to store a prefix code in far fewer than the Θ(n log n) bits that an optimal code can require in the worst case, while still encoding and decoding any character in constant time. The price is an expected codeword length slightly above the minimum: either within an additive ε of it or within a constant factor c of it.
Contribution
It presents data structures that store prefix codes in o(n log n) bits with constant-time encoding and decoding, at the cost of an expected codeword length slightly above the minimum.
Findings
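O(n log log(1/ε)) bits suffice to store a prefix code whose expected codeword length is within ε of the minimum, for any ε with 1/ε = O(polylog n).
O(n^{1/c} log n) bits suffice to store a prefix code whose expected codeword length is at most c times the minimum, for any constant c > 1.
In both cases, encoding or decoding any character takes O(1) time.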
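To get a feel for the gap between these bounds, the sketch below evaluates the three expressions with every hidden constant set to 1. That is an assumption made purely for illustration, since asymptotic bounds say nothing about the actual constants. The function names and the sample values n = 2^20, ε = 1/16 and c = 2 are hypothetical and do not come from the paper.

```python
import math


def optimal_bits(n: int) -> float:
    """n log n: worst-case order of growth for storing an optimal prefix code."""
    return n * math.log2(n)


def additive_bits(n: int, eps: float) -> float:
    """n log log(1/eps): order of growth for a code within eps of the minimum."""
    return n * math.log2(math.log2(1.0 / eps))


def multiplicative_bits(n: int, c: float) -> float:
    """n^(1/c) log n: order of growth for a code within a factor c of the minimum."""
    return n ** (1.0 / c) * math.log2(n)


if __name__ == "__main__":
    # Sample values chosen only for illustration.
    n, eps, c = 2 ** 20, 1 / 16, 2.0
    print("optimal code:        ", optimal_bits(n))
    print("within eps of min:   ", additive_bits(n, eps))
    print("within factor c:     ", multiplicative_bits(n, c))
```

With these sample values the three expressions come to roughly 21 million, 2.1 million and 20 thousand, which shows how sharply the second bound in particular drops below n log n.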
Abstract
It is well known that, given a probability distribution over \(n\) characters, in the worst case it takes \(\Theta(n \log n)\) bits to store a prefix code with minimum expected codeword length. However, in this paper we first show that, for any \(\epsilon\) with \(1/\epsilon = O(\mathrm{polylog}\, n)\), it takes \(O(n \log \log (1/\epsilon))\) bits to store a prefix code with expected codeword length within \(\epsilon\) of the minimum. We then show that, for any constant \(c > 1\), it takes \(O(n^{1/c} \log n)\) bits to store a prefix code with expected codeword length at most \(c\) times the minimum. In both cases, our data structures allow us to encode and decode any character in \(O(1)\) time.
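As background on what "storing a prefix code" can mean, the sketch below builds a canonical prefix code, a standard technique in which the codewords are fully determined by the codeword lengths, so only the lengths need to be stored. It is not the data structure from the paper: decoding here scans bit by bit rather than running in O(1) time, and no attempt is made to meet the space bounds above. The function names and the example lengths are hypothetical, and the lengths are assumed to satisfy the Kraft inequality.

```python
from typing import Dict, Tuple


def canonical_code(lengths: Dict[str, int]) -> Dict[str, Tuple[int, int]]:
    """Map each symbol to (codeword value, codeword length) using lengths only."""
    # Canonical order: shorter codewords first, ties broken by symbol.
    order = sorted(lengths, key=lambda s: (lengths[s], s))
    code: Dict[str, Tuple[int, int]] = {}
    value = 0
    prev_len = lengths[order[0]]
    for sym in order:
        cur_len = lengths[sym]
        # Moving to a longer length appends zero bits to the running value.
        value <<= cur_len - prev_len
        code[sym] = (value, cur_len)
        value += 1
        prev_len = cur_len
    return code


def encode(text: str, code: Dict[str, Tuple[int, int]]) -> str:
    """Concatenate the codewords of the characters of text as a bit string."""
    return "".join(format(code[ch][0], "0{}b".format(code[ch][1])) for ch in text)


def decode(bits: str, code: Dict[str, Tuple[int, int]]) -> str:
    """Decode by extending the current codeword one bit at a time (not O(1))."""
    inverse = {pair: sym for sym, pair in code.items()}
    out = []
    value, length = 0, 0
    for b in bits:
        value = (value << 1) | int(b)
        length += 1
        if (value, length) in inverse:  # prefix-freeness makes this unambiguous
            out.append(inverse[(value, length)])
            value, length = 0, 0
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical codeword lengths; 1/2 + 1/4 + 1/8 + 1/8 = 1 satisfies Kraft.
    code = canonical_code({"a": 1, "b": 2, "c": 3, "d": 3})
    bits = encode("abcd", code)
    print(bits, decode(bits, code))
```

Storing one length per character is what makes canonical codes compact in practice; the results summarized here concern how much further the space can be reduced when the expected codeword length is allowed to exceed the minimum slightly.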
Taxonomy
Topics: Algorithms and Data Compression · DNA and Biological Computing · Semigroups and Automata Theory
