Using statistical encoding to achieve tree succinctness never seen before

Michał Gańczorz

arXiv:1807.06359·cs.DS·July 18, 2018

TL;DR

This paper introduces a new succinct representation of labeled trees that uses |T|H_k(T) bits plus sublinear additional space (for reasonable k and label alphabet size σ). It is simpler than the previous XBWT-based representation, which needed linear additional space.

Contribution

A tree encoding built on a new, simple partitioning technique that preserves both tree shape and node degrees. It uses |T|H_k(T) bits plus sublinear additional space, whereas the XBWT-based approach required linear additional space.

Findings

01 Represents a labeled tree in |T|H_k(T) bits, plus smaller order terms.

02 Supports all navigational queries in constant time.

03 Reduces the additional space from linear (XBWT-based representation) to sublinear, for reasonable values of k and σ.

Abstract

We propose a new succinct representation of labeled trees which represents a tree T using |T|H_k(T) bits (plus some smaller order terms), where |T|H_k(T) denotes the k-th order (tree label) entropy, as defined by Ferragina et al. 2005. Our representation employs a new, simple method of partitioning the tree, which preserves both tree shape and node degrees. Previously, the only representation that used |T|H_k(T) bits was based on XBWT, a transformation that linearizes tree labels into a single string, combined with compression boosting. The proposed representation is much simpler than the one based on XBWT, which used additional linear space (bounded by 0.01n) hidden in the "smaller order terms" notion, as an artifact of using a zeroth-order entropy coder; our representation uses sublinear additional space (for reasonable values of k and size of the label alphabet σ). The…
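To make the quantity |T|H_k(T) concrete, here is a minimal Python sketch that computes it for a small tree. It assumes the usual reading of the k-th order tree label entropy: the context of a node is the sequence of (up to) k labels on the path from its parent toward the root, nodes are grouped by context, and each group contributes its size times the zeroth-order entropy of its labels. The function name, the `labels`/`parent` array encoding, and the handling of short contexts near the root are illustrative assumptions, not taken from the paper, and the sketch only measures the bound; it is not the paper's encoding.

```python
from collections import Counter, defaultdict
from math import log2


def tree_label_entropy_bits(labels, parent, k):
    """Return |T| * H_k(T) in bits for a labeled tree (illustrative helper).

    labels[v] is the label of node v; parent[v] is the parent of v,
    or None for the root. Assumption: nodes closer than k to the root
    simply get a shorter context.
    """
    groups = defaultdict(Counter)
    for v, label in enumerate(labels):
        # Collect up to k ancestor labels, starting from the parent.
        context = []
        u = parent[v]
        while u is not None and len(context) < k:
            context.append(labels[u])
            u = parent[u]
        groups[tuple(context)][label] += 1

    total = 0.0
    for counts in groups.values():
        m = sum(counts.values())
        # m * H_0 of the labels that share this context.
        total += sum(c * log2(m / c) for c in counts.values())
    return total


# Root 'a' with two children 'b'; each 'b' has one child 'c'.
labels = ["a", "b", "b", "c", "c"]
parent = [None, 0, 0, 1, 2]

print(tree_label_entropy_bits(labels, parent, 0))
print(tree_label_entropy_bits(labels, parent, 1))  # 0.0: the parent label determines each label
```

In this toy tree the labels are fully predictable from the parent label, so the first-order bound drops to zero while the zeroth-order bound does not. This is the kind of gap a representation bounded by |T|H_k(T) can exploit.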
