A Universal Grammar-Based Code For Lossless Compression of Binary Trees
Jie Zhang, En-hui Yang, John C. Kieffer

TL;DR
This paper introduces a universal grammar-based lossless code for binary trees. Each tree is encoded into a binary codeword in two steps: the tree is first transformed into a context-free grammar, and the grammar is then encoded in binary. The code is shown to be universal on a family of probabilistic binary tree source models satisfying certain weak restrictions.
Contribution
It presents a grammar-based coding scheme for binary trees and shows that the scheme is a universal code on a family of probabilistic binary tree source models satisfying certain weak restrictions.
Findings
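The code is universal on a family of probabilistic binary tree source models satisfying certain weak restrictions.
The code is lossless: the decoder recovers the original tree from its codeword by reversing the two encoding steps.
The method aims to reduce the number of code bits needed to store or transmit binary trees.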
Abstract
We consider the problem of lossless compression of binary trees, with the aim of reducing the number of code bits needed to store or transmit such trees. A lossless grammar-based code is presented which encodes each binary tree into a binary codeword in two steps. In the first step, the tree is transformed into a context-free grammar from which the tree can be reconstructed. In the second step, the context-free grammar is encoded into a binary codeword. The decoder of the grammar-based code decodes the original tree from its codeword by reversing the two encoding steps. It is shown that the resulting grammar-based binary tree compression code is a universal code on a family of probabilistic binary tree source models satisfying certain weak restrictions.
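The first step described above, transforming a tree into a context-free grammar from which it can be reconstructed, can be illustrated with a minimal sketch. The abstract does not spell out the transformation, so the sketch assumes one natural choice: every distinct subtree gets one nonterminal, so repeated subtrees are stored once. The tree representation (a leaf is `None`, an internal node is a `(left, right)` tuple), the terminal symbol `"T"`, and the function names `tree_to_grammar` and `grammar_to_tree` are all hypothetical and chosen for illustration only. The second step (encoding the grammar into bits) is not shown.

```python
LEAF = "T"  # hypothetical terminal symbol standing for a leaf


def tree_to_grammar(tree):
    """Map a binary tree to (start_symbol, rules).

    Assumed representation: a leaf is None, an internal node is a
    (left, right) tuple. Each rule maps an integer nonterminal to a
    pair of symbols; identical subtrees share one nonterminal.
    """
    rules = {}  # nonterminal id -> (left symbol, right symbol)
    seen = {}   # (left symbol, right symbol) -> nonterminal id

    def visit(node):
        if node is None:
            return LEAF
        pair = (visit(node[0]), visit(node[1]))
        if pair not in seen:
            # First time this subtree shape appears: create a new rule.
            seen[pair] = len(rules)
            rules[seen[pair]] = pair
        return seen[pair]

    return visit(tree), rules


def grammar_to_tree(symbol, rules):
    """Rebuild the tree by expanding rules from the given symbol."""
    if symbol == LEAF:
        return None
    left, right = rules[symbol]
    return (grammar_to_tree(left, rules), grammar_to_tree(right, rules))


# A tree with three internal nodes whose two subtrees are identical.
example = ((None, None), (None, None))
start, rules = tree_to_grammar(example)
# Two rules suffice because the repeated subtree is stored once.
assert grammar_to_tree(start, rules) == example
```

In this sketch the grammar has one rule per distinct subtree, so it is never larger than the tree and is smaller whenever subtrees repeat. Both functions are recursive, so very deep trees would need an iterative traversal or a raised recursion limit.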
Taxonomy
Topics: Algorithms and Data Compression · DNA and Biological Computing · Cellular Automata and Applications
