Enumerative Data Compression with Non-Uniquely Decodable Codes
M. Oğuzhan Külekci, Yasin Öztürk, Elif Altunok, Can Altıniğne

TL;DR
This paper introduces a block-wise enumeration scheme for non-uniquely decodable codes, significantly improving compression ratios and potentially offering intrinsic security benefits over traditional prefix-free codes.
Contribution
The study proposes a novel block-wise enumeration method that enhances compression efficiency of non-uniquely decodable codes beyond previous approaches.
Findings
Successfully represents a source within its entropy
Outperforms Huffman and arithmetic coding in some cases
Provides potential intrinsic security features
Abstract
Non-uniquely decodable codes are codes that cannot be uniquely decoded without additional disambiguation information. They are mainly the class of non-prefix-free codes, in which a codeword can be a prefix of others, so codeword boundary information is essential for correct decoding. Although the codeword bit stream consumes significantly less space than that of prefix-free codes, the additional disambiguation information makes it difficult to match the overall performance of prefix-free codes. Previous studies considered compression with non-prefix-free codes by integrating rank/select dictionaries or wavelet trees to mark the codeword boundaries. In this study we focus on another dimension, a block-wise enumeration scheme that significantly improves on the compression ratios of the previous studies. Experiments conducted on a known corpus showed…
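The ambiguity described in the abstract can be illustrated with a minimal sketch. This is not the paper's block-wise enumeration scheme. It only shows why a non-prefix-free bit stream needs boundary information. The code assignment (shortest binary strings handed out by descending symbol frequency) and all function names are assumptions made for illustration.

```python
from collections import Counter

def build_npf_code(text):
    """Hypothetical non-prefix-free code: the i-th most frequent symbol gets
    the i-th shortest binary string (0, 1, 00, 01, 10, 11, 000, ...)."""
    ranked = [sym for sym, _ in Counter(text).most_common()]
    # bin(i + 2) is '0b1...'; dropping '0b1' leaves 0, 1, 00, 01, ...
    return {sym: bin(i + 2)[3:] for i, sym in enumerate(ranked)}

def encode(text, code):
    """Return the codeword bit stream plus the codeword lengths
    (the disambiguation information)."""
    bits = "".join(code[s] for s in text)
    lengths = [len(code[s]) for s in text]
    return bits, lengths

def decode(bits, lengths, code):
    """Decode using the lengths to recover codeword boundaries."""
    inverse = {w: s for s, w in code.items()}
    out, pos = [], 0
    for n in lengths:
        out.append(inverse[bits[pos:pos + n]])
        pos += n
    return "".join(out)

def count_parsings(bits, code):
    """Count how many ways the bit stream splits into valid codewords."""
    words = set(code.values())
    ways = [1] + [0] * len(bits)  # ways[k]: parsings of the first k bits
    for end in range(1, len(bits) + 1):
        for w in words:
            if end >= len(w) and bits[end - len(w):end] == w:
                ways[end] += ways[end - len(w)]
    return ways[-1]

text = "abracadabra"
code = build_npf_code(text)
bits, lengths = encode(text, code)
print(code)
print(bits, lengths)
print("parsings without boundaries:", count_parsings(bits, code))
print("decoded with boundaries:", decode(bits, lengths, code))
```

Without the lengths, the bit stream admits more than one valid parsing. With them, decoding is unambiguous. The cost of storing that boundary information compactly is what the abstract identifies as the obstacle to matching prefix-free codes.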
Taxonomy
Topics: Algorithms and Data Compression · Error Correcting Code Techniques · Coding theory and cryptography
