Exploration-Exploitation Tradeoff in Universal Lossy Compression
Nir Weinberger, Ram Zamir

TL;DR
This paper models the exploration-exploitation tradeoff in universal lossy compression as a multi-armed bandit problem, analyzing existing schemes and proposing robust algorithms effective at any block length.
Contribution
It recasts sequential lossy compression as a bandit problem and introduces robust cost-directed algorithms, addressing limitations of previous methods.
Findings
Natural type selection can be viewed as a reconstruction-directed multi-armed bandit (MAB) algorithm.
The natural type selection scheme is limited in robustness and in short-block performance.
The proposed robust cost-directed MAB algorithms work at any block length.
Abstract
Universal compression can learn the source and adapt to it either in a batch mode (forward adaptation) or in a sequential mode (backward adaptation). We recast the sequential mode as a multi-armed bandit (MAB) problem, a fundamental model in reinforcement learning, and study the trade-off between exploration and exploitation in the lossy compression case. We show that a previously proposed "natural type selection" scheme can be cast as a reconstruction-directed MAB algorithm for sequential lossy compression, and explain its limitations in terms of robustness and short-block performance. We then derive and analyze robust cost-directed MAB algorithms, which work at any block length.
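To make the bandit framing in the abstract concrete, here is a minimal sketch of a generic cost-minimizing bandit loop. It is not the paper's algorithm. It uses a standard UCB1-style confidence rule adapted to costs, and everything specific in it is an assumption for illustration: the arms stand in for candidate codebook choices, `mean_costs` is a hypothetical list of average per-block costs (for example, normalized code lengths), and the per-block cost is simulated with Gaussian noise rather than produced by a real encoder.

```python
import math
import random


def lcb_bandit(mean_costs, num_blocks, seed=0):
    """Generic UCB1-style rule adapted to cost minimization (illustrative only).

    mean_costs: hypothetical average per-block cost of each arm, in [0, 1].
    Returns the number of blocks encoded with each arm.
    """
    rng = random.Random(seed)
    k = len(mean_costs)
    pulls = [0] * k
    total_cost = [0.0] * k
    for t in range(1, num_blocks + 1):
        if t <= k:
            # Exploration: use every arm once so each has a cost estimate.
            arm = t - 1
        else:
            # Exploitation with optimism: pick the arm whose empirical mean
            # cost minus a confidence bonus is smallest.
            arm = min(
                range(k),
                key=lambda a: total_cost[a] / pulls[a]
                - math.sqrt(2.0 * math.log(t) / pulls[a]),
            )
        # Simulated cost of compressing one block with this arm (assumption).
        cost = min(1.0, max(0.0, rng.gauss(mean_costs[arm], 0.1)))
        pulls[arm] += 1
        total_cost[arm] += cost
    return pulls


pulls = lcb_bandit([0.7, 0.3, 0.9], num_blocks=2000)
print(pulls)
```

In this toy setting the decision depends only on the observed costs, which is the sense in which a rule can be called cost-directed. The confidence bonus shrinks as an arm is used more often, so rarely used arms are still revisited occasionally while the cheapest arm receives most of the blocks.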
Taxonomy
Topics: Distributed and Parallel Computing Systems · Advanced Data Storage Technologies · Simulation Techniques and Applications
