Efficient Integer Retrieving from Unordered Compressed Sequences
Igor O. Zavadskyi

TL;DR
This paper introduces a method for near-constant-time direct access to codewords in RMD-coded bitstreams with minimal space overhead, and evaluates it on natural language text compression.
Contribution
It presents a novel technique for direct codeword retrieval from RMD-bitstreams with almost constant time and minimal space overhead.
Findings
Achieves near constant-time codeword access
Maintains good compression ratio in natural language text
Uses minimal additional space for indexing
Abstract
Variable-length Reverse Multi-Delimiter (RMD) codes are known to represent sequences of unbounded and unordered integers. When applied to data compression, they combine a good compression ratio with fast decoding. In this paper, we investigate another property of RMD-codes: the ability to access codewords directly in the encoded bitstream. We present a method that extracts and decodes a codeword from an RMD-bitstream in almost constant time with a tiny space overhead, and we report experiments on its application to natural language text compression.
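To make the notion of direct access concrete, here is a minimal Python sketch of the general idea of retrieving the i-th codeword from a variable-length bitstream without decoding from the start. It is not the paper's method: it swaps in Elias gamma codes (positive integers only) for RMD-codes and uses a plain sampled table of bit offsets, and all function names and the `step` parameter are hypothetical. A larger `step` shrinks the index but lengthens the short scan after each jump.

```python
def gamma_encode(n):
    """Elias gamma code of a positive integer, as a '0'/'1' string."""
    b = bin(n)[2:]
    return "0" * (len(b) - 1) + b


def gamma_decode_at(bits, pos):
    """Decode one gamma codeword starting at bit offset pos.

    Returns (value, offset of the next codeword).
    """
    zeros = 0
    while bits[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    return int(bits[pos + zeros:end], 2), end


def build(values, step=4):
    """Concatenate codewords and record the bit offset of every step-th one."""
    parts, samples, offset = [], [], 0
    for i, v in enumerate(values):
        if i % step == 0:
            samples.append(offset)  # sampled index entry (assumed layout)
        code = gamma_encode(v)
        parts.append(code)
        offset += len(code)
    return "".join(parts), samples


def access(bits, samples, i, step=4):
    """Return the i-th integer: jump to the nearest sample, skip at most step-1 codewords."""
    pos = samples[i // step]
    for _ in range(i % step):
        _, pos = gamma_decode_at(bits, pos)
    return gamma_decode_at(bits, pos)[0]


values = [7, 1, 300, 12, 5, 42, 2, 9, 1000]
bits, samples = build(values)
print(access(bits, samples, 6))
```

With this layout each lookup decodes at most `step` codewords, so access time is bounded by a constant chosen at build time, at the cost of one stored offset per `step` codewords.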
Taxonomy
Topics: Algorithms and Data Compression · Error Correcting Code Techniques · DNA and Biological Computing
