A Parallel Two-Pass MDL Context Tree Algorithm for Universal Source Coding
Nikhil Krishnan, Dror Baron, Mehmet Kıvanç Mıhçak

TL;DR
This paper introduces a parallel two-pass MDL context tree algorithm for lossless universal source coding, achieving high throughput with minimal loss in compression quality by estimating the source first and then encoding in parallel.
Contribution
The novel two-pass approach combines parallel processing with MDL source estimation, maintaining compression quality while significantly increasing throughput.
Findings
Work-efficient with $O(N/B)$ complexity
Redundancy of approximately $B\log(N/B)$ bits
Effective for sources with depth up to $\log(N/B)$
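As a rough sense of scale for the redundancy figure above, here is a worked example. The values $N = 2^{20}$ and $B = 16$ are illustrative assumptions, as is the choice of base-2 logarithms; none of them is taken from the paper.

$$
B \log_2\!\left(\frac{N}{B}\right) = 16 \cdot \log_2\!\left(\frac{2^{20}}{16}\right) = 16 \cdot 16 = 256 \text{ bits}
$$

Under these assumptions, the extra cost is about $256 / 2^{20} \approx 0.00024$ bits per input symbol, and the depth bound $\log_2(N/B)$ evaluates to 16.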
Abstract
We present a novel lossless universal source coding algorithm that uses parallel computational units to increase the throughput. The length-$N$ input sequence is partitioned into $B$ blocks. Processing each block independently of the other blocks can accelerate the computation by a factor of $B$, but degrades the compression quality. Instead, our approach is to first estimate the minimum description length (MDL) source underlying the entire input, and then encode each of the blocks in parallel based on the MDL source. With this two-pass approach, the compression loss incurred by using more parallel units is insignificant. Our algorithm is work-efficient, i.e., its computational complexity is $O(N/B)$. Its redundancy is approximately $B\log(N/B)$ bits above Rissanen's lower bound on universal coding performance, with respect to any tree source whose maximal depth is at most…
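The two-pass idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it swaps the MDL context tree for a fixed-depth context model over a binary alphabet, reports ideal code lengths instead of running an arithmetic coder, and zero-pads the history at the start of each block. The function names, the `depth` parameter, and the add-1/2 smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter


def count_block(block, depth):
    """Pass 1, per block: count (context, symbol) occurrences.

    Assumption: the history before each block is zero-padded, so blocks
    can be counted independently of one another.
    """
    counts = Counter()
    padded = [0] * depth + list(block)
    for i in range(depth, len(padded)):
        ctx = tuple(padded[i - depth:i])
        counts[(ctx, padded[i])] += 1
    return counts


def merge_counts(per_block_counts):
    """Combine per-block counts into one model for the whole input."""
    total = Counter()
    for c in per_block_counts:
        total.update(c)
    return total


def block_code_length(block, depth, model):
    """Pass 2, per block: ideal code length in bits under the shared model."""
    bits = 0.0
    padded = [0] * depth + list(block)
    for i in range(depth, len(padded)):
        ctx = tuple(padded[i - depth:i])
        n_ctx = model[(ctx, 0)] + model[(ctx, 1)]
        # Add-1/2 smoothing keeps every probability strictly positive.
        p = (model[(ctx, padded[i])] + 0.5) / (n_ctx + 1.0)
        bits -= math.log2(p)
    return bits


def split_blocks(x, num_blocks):
    """Split x into num_blocks pieces; the last one absorbs any remainder."""
    size = len(x) // num_blocks
    blocks = [x[b * size:(b + 1) * size] for b in range(num_blocks - 1)]
    blocks.append(x[(num_blocks - 1) * size:])
    return blocks


def two_pass_code_lengths(x, num_blocks, depth):
    """Estimate one model from all blocks, then score each block with it."""
    blocks = split_blocks(x, num_blocks)
    model = merge_counts(count_block(b, depth) for b in blocks)
    return [block_code_length(b, depth, model) for b in blocks]
```

Both `count_block` and `block_code_length` touch only their own block, so each pass could be mapped across parallel workers; only `merge_counts` needs data from every block. For example, `two_pass_code_lengths([0, 1] * 32, num_blocks=4, depth=1)` returns one code length per block, all computed from the same shared model.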
Taxonomy
Topics: Algorithms and Data Compression · Cellular Automata and Applications · Error Correcting Code Techniques
