Tail redundancy and its characterization of compression of memoryless sources
Maryam Hosseini, Narayana Santhanam

TL;DR
This paper introduces the concept of tail redundancy to characterize the asymptotic per-symbol redundancy in universal compression of iid sequences over countably infinite alphabets, emphasizing the role of tail behavior.
Contribution
It formalizes tail redundancy and demonstrates its fundamental role in determining the asymptotic redundancy in universal compression for infinite alphabet sources.
Findings
Tail redundancy characterizes asymptotic per-symbol redundancy.
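Finite single-letter (average-case) redundancy does not guarantee that the expected redundancy of length-n iid strings grows sublinearly in n.
Universal compression performance depends on how well the tails of the distributions in the collection can be universally described.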
Abstract
We formalize the tail redundancy of a collection $\mathcal{P}$ of distributions over a countably infinite alphabet, and show that this fundamental quantity characterizes the asymptotic per-symbol redundancy of universally compressing sequences generated iid from a collection $\mathcal{P}$ of distributions over a countably infinite alphabet. Contrary to the worst-case formulations of universal compression, finite single-letter (average-case) redundancy of $\mathcal{P}$ does not automatically imply that the expected redundancy of describing length-$n$ strings sampled iid from $\mathcal{P}$ grows sublinearly with $n$. Instead, we prove that universal compression of length-$n$ iid sequences from $\mathcal{P}$ is characterized by how well the tails of distributions in $\mathcal{P}$ can be universally described, showing that the asymptotic per-symbol redundancy of iid strings is equal to the tail redundancy.
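For orientation, the quantities named in the abstract can be written out as follows. This is a sketch in assumed notation, not taken from this page: $\mathcal{P}$ denotes the collection of distributions over the alphabet $\{1, 2, \dots\}$, $q$ and $q_n$ are the universal (encoding) distributions on single symbols and on length-$n$ strings, and $D(\cdot\|\cdot)$ is the Kullback-Leibler divergence. The first two lines are the standard average-case redundancy definitions. The third line is one natural way to formalize tail redundancy, and the paper's exact definition may differ in detail. The last line restates the main claim of the abstract.

```latex
% Single-letter (average-case) redundancy of the collection P
R(\mathcal{P}) = \inf_{q} \sup_{p \in \mathcal{P}} D(p \,\|\, q),
\qquad
D(p \,\|\, q) = \sum_{x \ge 1} p(x) \log \frac{p(x)}{q(x)}

% Expected redundancy of length-n strings drawn iid from some p in P
R_n(\mathcal{P}) = \inf_{q_n} \sup_{p \in \mathcal{P}} D(p^n \,\|\, q_n)

% Tail redundancy (assumed formalization): cost of universally
% describing only the symbols at or beyond m, as m grows
\mathcal{T}(\mathcal{P}) = \lim_{m \to \infty} \inf_{q} \sup_{p \in \mathcal{P}}
    \sum_{x \ge m} p(x) \log \frac{p(x)}{q(x)}

% Main claim of the abstract: per-symbol redundancy converges to the tail redundancy
\lim_{n \to \infty} \frac{1}{n} R_n(\mathcal{P}) = \mathcal{T}(\mathcal{P})
```

Read this way, $R_n(\mathcal{P})$ grows sublinearly in $n$ exactly when the tail redundancy is zero, and $R(\mathcal{P}) < \infty$ alone does not force that.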
Taxonomy
Topics: Algorithms and Data Compression · Computability, Logic, AI Algorithms · Cellular Automata and Applications
