Results on the Redundancy of Universal Compression for Finite-Length Sequences
Ahmad Beirami, Faramarz Fekri

TL;DR
This paper analyzes the limits of universal compression for finite-length sequences from parametric sources, providing bounds, characterizations, and insights into redundancy behavior for different coding schemes.
Contribution
It derives an upper bound on the probability that a sequence is compressed with low redundancy, characterizes the minimax redundancy of two-stage codes, and assesses the redundancy incurred when compressing short sequences.
Findings
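Redundancy is significant for short sequences.
The two-stage assumption incurs negligible extra redundancy, especially when the number of source parameters is large.
For sufficiently long sequences and sufficiently many parameters, the average minimax redundancy is a good estimate of the redundancy of most sources.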
Abstract
In this paper, we investigate the redundancy of universal coding schemes on smooth parametric sources in the finite-length regime. We derive an upper bound on the probability of the event that a sequence of length $n$, chosen using Jeffreys' prior from the family of parametric sources with $d$ unknown parameters, is compressed with a redundancy smaller than $(1-\epsilon)\frac{d}{2}\log n$ for any $\epsilon > 0$. Our results also confirm that for large enough $n$ and $d$, the average minimax redundancy provides a good estimate for the redundancy of most sources. Our result may be used to evaluate the performance of universal source coding schemes on finite-length sequences. Additionally, we precisely characterize the minimax redundancy for two-stage codes. We demonstrate that the two-stage assumption incurs a negligible redundancy, especially when the number of source parameters is large.…
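To give a rough sense of why redundancy matters at short lengths, the sketch below evaluates only the classical leading term of the average minimax redundancy, $\frac{d}{2}\log_2 n$ bits, for a sequence of length $n$ from a smooth family with $d$ unknown parameters. It is an illustration under stated assumptions, not the paper's finite-length bound: the constant and lower-order terms are omitted, and the function names and the example value `d=255` (a memoryless source over a 256-symbol alphabet) are hypothetical choices made here.

```python
import math


def leading_redundancy_bits(n: int, d: int) -> float:
    """Leading term (d/2) * log2(n) of the average minimax redundancy, in bits.

    Assumption: lower-order terms are ignored, so this is only a
    first-order approximation for a length-n sequence with d parameters.
    """
    if n < 1 or d < 0:
        raise ValueError("n must be >= 1 and d must be >= 0")
    return 0.5 * d * math.log2(n)


def per_symbol_redundancy(n: int, d: int) -> float:
    """Leading-term redundancy divided by the sequence length (bits per symbol)."""
    return leading_redundancy_bits(n, d) / n


if __name__ == "__main__":
    # Hypothetical example: d = 255 free parameters.
    for n in (256, 4096, 2**20):
        print(f"n={n:>8}  per-symbol overhead ~ {per_symbol_redundancy(n, 255):.4f} bits")
```

The per-symbol overhead shrinks roughly as $\log n / n$, so it is large for short sequences and fades only as $n$ grows, which is consistent with the finding that redundancy is significant for short sequences.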
Taxonomy
Topics: Algorithms and Data Compression · DNA and Biological Computing · Cellular Automata and Applications
