Seq2Seq2Seq: Lossless Data Compression via Discrete Latent Transformers and Reinforcement Learning
Mahdi Khodabandeh, Ghazal Shabani, Arash Yousefi Jordehi, Seyed Abolghasem Mirroshandel

TL;DR
This paper introduces a novel lossless data compression method using discrete latent transformers and reinforcement learning, which preserves token structure and improves compression ratios over traditional techniques.
Contribution
It proposes a reinforcement learning-based approach applied to a T5 model to enable token-sequence compression, maintaining data structure and enhancing efficiency.
Findings
Achieves higher compression ratios than conventional methods.
Preserves token-based data structure for better data integrity.
Effective without external grammatical or world knowledge.
Abstract
Efficient lossless compression is essential for minimizing storage costs and transmission overhead while preserving data integrity. Traditional compression techniques, such as dictionary-based and statistical methods, often struggle to optimally exploit the structure and redundancy in complex data formats. Recent advancements in deep learning have opened new avenues for compression; however, many existing approaches depend on dense vector representations that obscure the underlying token structure. To address these limitations, we propose a novel lossless compression method that leverages Reinforcement Learning applied to a T5 language model architecture. This approach enables the compression of data into sequences of tokens rather than traditional vector representations. Unlike auto-encoders, which typically encode information into continuous latent spaces, our method preserves the…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAlgorithms and Data Compression · Advanced Data Storage Technologies · Advanced Data Compression Techniques
