Low-Complexity Vector Source Coding for Discrete Long Sequences with Unknown Distributions
Leah Woldemariam, Hang Liu, Anna Scaglione

TL;DR
This paper introduces a low-complexity source coding method for long discrete sequences from unknown distributions, leveraging spatial structure and Golomb coding to approach entropy rates efficiently.
Contribution
It presents a novel encoding scheme that efficiently compresses data from unknown distributions without high computational complexity.
Findings
Achieves near-entropy compression rates
Uses Golomb coding for run-length encoding
Maintains low computational complexity
Abstract
In this paper, we propose a source coding scheme that represents data from unknown distributions through frequency and support information. Existing encoding schemes often compress data by sacrificing computational efficiency or by assuming the data follows a known distribution. We take advantage of the structure that arises within the spatial representation and use Golomb coding to encode the run-lengths within it. Through theoretical analysis, we show that our scheme yields an overall bit rate that approaches the entropy without requiring a computationally complex encoding algorithm, and we verify these results through numerical experiments.
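The abstract does not spell out the encoder, so the following is only a minimal sketch of generic Golomb coding applied to run-lengths, not the authors' scheme. It assumes the support information can be viewed as a binary indicator sequence whose runs of zeros before each one are encoded. The function names (`zero_runs`, `golomb_encode`, `golomb_decode`) and the choice of the parameter `m` are hypothetical and chosen for illustration.

```python
def zero_runs(indicator):
    """Lengths of the zero runs that precede each 1 in a 0/1 sequence.

    Assumption for this sketch: trailing zeros after the last 1 are ignored.
    """
    runs, run = [], 0
    for bit in indicator:
        if bit:
            runs.append(run)
            run = 0
        else:
            run += 1
    return runs


def golomb_encode(n, m):
    """Golomb codeword (as a bit string) for integer n >= 0 with parameter m >= 1."""
    if n < 0 or m < 1:
        raise ValueError("need n >= 0 and m >= 1")
    q, r = divmod(n, m)
    bits = "1" * q + "0"             # quotient in unary, terminated by a 0
    if m == 1:
        return bits                  # no remainder bits are needed
    b = (m - 1).bit_length()         # ceil(log2(m))
    cutoff = (1 << b) - m            # remainders below cutoff use b - 1 bits
    if r < cutoff:
        bits += format(r, f"0{b - 1}b")
    else:
        bits += format(r + cutoff, f"0{b}b")
    return bits


def golomb_decode(bits, m):
    """Decode a concatenation of Golomb codewords back into integers."""
    out, i = [], 0
    b = (m - 1).bit_length()
    cutoff = (1 << b) - m
    while i < len(bits):
        q = 0
        while bits[i] == "1":        # read the unary quotient
            q += 1
            i += 1
        i += 1                       # skip the terminating 0
        r = 0
        if m > 1:
            if cutoff > 0:           # truncated binary remainder
                r = int(bits[i:i + b - 1], 2)
                i += b - 1
                if r >= cutoff:      # long codeword: read one more bit
                    r = ((r << 1) | int(bits[i])) - cutoff
                    i += 1
            else:                    # m is a power of two: plain b-bit remainder
                r = int(bits[i:i + b], 2)
                i += b
        out.append(q * m + r)
    return out


runs = zero_runs([0, 0, 1, 0, 0, 0, 0, 1, 1])          # [2, 4, 0]
stream = "".join(golomb_encode(n, 3) for n in runs)
decoded = golomb_decode(stream, 3)                     # equals runs
```

Encoding and decoding each run takes a handful of integer operations, which is the kind of low per-symbol cost the abstract refers to. In practice `m` is usually matched to the run-length statistics; how the paper selects it is not stated here.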
Taxonomy
Topics: Algorithms and Data Compression · Error Correcting Code Techniques · Cellular Automata and Applications
