Compressing combinatorial objects

Christian Steinruecken

arXiv:1601.03689·cs.IT·January 15, 2016

TL;DR

This paper explores new compression techniques for non-sequential data types like permutations and multisets using arithmetic coding, providing probabilistic models and conditions for near-optimal compression.

Contribution

It introduces concrete compression methods for non-sequential data and derives reusable probabilistic models based on structural assumptions.

Findings

01 Near-optimal compression methods for certain types of permutations, combinations, and multisets

02 Explicit conditions for the optimality of each method

03 Reusable probabilistic data models derived from fairly generic structural assumptions

Abstract

Most of the world's digital data is currently encoded in a sequential form, and compression methods for sequences have been studied extensively. However, there are many types of non-sequential data for which good compression techniques are still largely unexplored. This paper contributes insights and concrete techniques for compressing various kinds of non-sequential data via arithmetic coding, and derives re-usable probabilistic data models from fairly generic structural assumptions. Near-optimal compression methods are described for certain types of permutations, combinations and multisets; and the conditions for optimality are made explicit for each method.
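
To illustrate why non-sequential structure matters, the following is a minimal sketch that ranks a permutation as a single integer using the factorial number system (Lehmer code). It is not the paper's arithmetic-coding method: it assumes all n! permutations are equally likely, and the function names `permutation_rank`, `permutation_unrank`, `bits_needed`, and `naive_bits` are hypothetical, introduced only for this example.

```python
def permutation_rank(perm):
    """Map a permutation of distinct items to an integer in [0, n!)."""
    remaining = sorted(perm)
    rank = 0
    for value in perm:
        index = remaining.index(value)
        # The current digit has radix len(remaining): n, n-1, ..., 1.
        rank = rank * len(remaining) + index
        remaining.pop(index)
    return rank


def permutation_unrank(rank, n):
    """Invert permutation_rank for a permutation of 0..n-1."""
    digits = []
    # Peel off digits from least significant (radix 1) to most (radix n).
    for radix in range(1, n + 1):
        rank, digit = divmod(rank, radix)
        digits.append(digit)
    digits.reverse()
    remaining = list(range(n))
    return [remaining.pop(d) for d in digits]


def bits_needed(n):
    """Whole bits sufficient to store any rank in [0, n!)."""
    total = 1
    for k in range(2, n + 1):
        total *= k
    return (total - 1).bit_length()


def naive_bits(n):
    """Bits used when each of the n elements is stored as a fixed-width index."""
    return n * max(1, (n - 1).bit_length())


if __name__ == "__main__":
    perm = [2, 0, 3, 1]
    r = permutation_rank(perm)
    print(r, permutation_unrank(r, len(perm)))
    print(bits_needed(52), naive_bits(52))
```

Under the uniform assumption, the rank costs about log2(n!) bits, which is less than writing out the n elements one by one as a sequence. An arithmetic coder, as used in the paper, can approach the same cost incrementally and can also accommodate non-uniform models.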
