Using statistical encoding to achieve tree succinctness never seen before
Michał Gańczorz

TL;DR
This paper introduces a succinct representation of labeled trees that uses |T|H_k(T) bits plus sublinear additional space (for reasonable values of k and of the label alphabet size σ). It is simpler than the previous XBWT-based representation, which needed linear additional space.
Contribution
A novel tree encoding method based on a new partitioning technique that achieves |T|H_k(T) bits with sublinear additional space, improving over XBWT-based approaches.
Findings
Achieves |T|H_k(T) bits for tree representation.
Supports all navigational queries in constant time.
Reduces space redundancy compared to XBWT-based methods.
Abstract
We propose a new succinct representation of labeled trees which represents a tree T using |T|H_k(T) bits (plus some smaller order terms), where |T|H_k(T) denotes the k-th order (tree label) entropy, as defined by Ferragina et al. 2005. Our representation employs a new, simple method of partitioning the tree, which preserves both tree shape and node degrees. Previously, the only representation that used |T|H_k(T) bits was based on XBWT, a transformation that linearizes tree labels into a single string, combined with compression boosting. The proposed representation is much simpler than the one based on XBWT, which used additional linear space (bounded by 0.01n) hidden in the "smaller order terms" notion, as an artifact of using a zeroth-order entropy coder; our representation uses sublinear additional space (for reasonable values of k and of the label alphabet size σ). The…
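To make the quantity |T|H_k(T) concrete, here is a minimal Python sketch that computes it for a small labeled tree. It is not the paper's encoding. It assumes the usual reading of the k-th order tree label entropy: each node's context is the labels of its k nearest ancestors, and the zeroth-order entropies of the labels sharing a context are summed. The function name, the parent-array input format, and the handling of nodes with fewer than k ancestors (a shorter context is used) are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tree_label_entropy_bits(labels, parent, k):
    """Return |T| * H_k(T) in bits for a labeled tree.

    labels[v] is the label of node v; parent[v] is the parent of v,
    or None for the root. The context of v is the tuple of labels of
    its (up to) k nearest ancestors, nearest first. Nodes with fewer
    than k ancestors get a shorter context (an assumption of this sketch).
    """
    by_context = defaultdict(Counter)
    for v, label in enumerate(labels):
        ctx = []
        u = parent[v]
        while u is not None and len(ctx) < k:
            ctx.append(labels[u])
            u = parent[u]
        by_context[tuple(ctx)][label] += 1

    # Sum, over contexts, the zeroth-order entropy of the labels seen there.
    total = 0.0
    for counts in by_context.values():
        n = sum(counts.values())
        total -= sum(c * math.log2(c / n) for c in counts.values())
    return total


# A path a -> b -> a -> b: each label is fully determined by its parent's label.
labels = ["a", "b", "a", "b"]
parent = [None, 0, 1, 2]
bits_k0 = tree_label_entropy_bits(labels, parent, 0)
bits_k1 = tree_label_entropy_bits(labels, parent, 1)
```

On this path the zeroth-order bound is 4 bits (two equally frequent labels over four nodes), while the first-order bound drops to 0 because the parent's label predicts each node's label exactly. The sketch measures label entropy only and says nothing about the bits needed for the tree shape.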
