TL;DR
StateSMix is a lossless compressor that combines an online-trained neural state space model with sparse n-gram context mixing. It reaches 2.123 bits per byte on enwik8, up to 8.7% better than xz -9e, with no pre-training and no external dependencies.
Contribution
It introduces a fully self-contained, online-trained neural compression method that mixes a neural model with n-gram models under entropy-adaptive scaling, and outperforms xz -9e on enwik8.
Findings
StateSMix achieves 2.123 bits per byte on enwik8, surpassing xz -9e by up to 8.7%.
The neural state space model accounts for 46.6% of size reduction over frequency-count baselines.
N-gram tables provide an additional 4.1% gain through exact context memorization.
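To put the bits-per-byte figure in more familiar terms, the short sketch below converts it to a compressed-size fraction. The helper names and the 1,000,000-byte input are hypothetical, chosen only for illustration; the summary does not state the input sizes behind the reported figures.

```python
def compressed_fraction(bits_per_byte: float) -> float:
    """Fraction of the original size that remains (each input byte is 8 bits)."""
    return bits_per_byte / 8.0


def compressed_bytes(bits_per_byte: float, input_bytes: int) -> float:
    """Approximate output size in bytes for an input of the given size."""
    return bits_per_byte * input_bytes / 8.0


# 2.123 bits per byte is the figure reported above.
fraction = compressed_fraction(2.123)

# Hypothetical input size, used only to make the arithmetic concrete.
example_size = compressed_bytes(2.123, 1_000_000)

print(f"remaining fraction: {fraction:.4f}")
print(f"approx. output for 1,000,000 bytes: {example_size:.0f} bytes")
```

At 2.123 bits per byte the output is roughly 26.5% of the input size.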
Abstract
We present StateSMix, a fully self-contained lossless compressor that couples an online-trained Mamba-style State Space Model (SSM) with sparse n-gram context mixing and arithmetic coding. The model is initialised from scratch and trained token-by-token on the file being compressed, requiring no pre-trained weights, no GPU, and no external dependencies. The SSM (DM=32, NL=2, approximately 120K active parameters per file) provides a continuously-updated probability estimate over BPE tokens, while nine sparse n-gram hash tables (bigram through 32-gram, 16M slots each) add exact local and long-range pattern memorisation via a softmax-invariant logit-bias mechanism that updates only non-zero-count tokens. An entropy-adaptive scaling mechanism modulates the n-gram contribution based on the SSM's predictive confidence, preventing over-correction when the neural model is already…
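The mixing step described in the abstract can be sketched as follows. This is one possible reading, not the authors' implementation: the bias form `log1p(count)`, the linear entropy-to-scale mapping, and the names `mix_with_ngram` and `max_scale` are all assumptions made for illustration. The sketch relies on the fact that softmax is unchanged when a constant is added to every logit, so only tokens with a non-zero count need to be touched.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy_bits(probs):
    """Shannon entropy in bits, skipping zero-probability entries."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def mix_with_ngram(logits, counts, max_scale=1.0):
    """Add a count-based bias to the model logits (illustrative only).

    logits: per-token scores from the neural model.
    counts: sparse dict {token_id: count} for the current n-gram context.
    The bias is scaled by the normalised entropy of the model's own
    prediction, so a confident model is corrected less (assumed mapping).
    """
    probs = softmax(logits)
    normalised_entropy = entropy_bits(probs) / math.log2(len(logits))
    scale = max_scale * normalised_entropy

    mixed = list(logits)
    for token_id, count in counts.items():
        if count > 0:  # zero-count tokens are left untouched
            mixed[token_id] += scale * math.log1p(count)
    return softmax(mixed)


# Uncertain model (uniform logits): the n-gram evidence moves the estimate a lot.
uncertain = mix_with_ngram([0.0, 0.0, 0.0, 0.0], {2: 3})

# Confident model: the same evidence barely changes the prediction.
confident = mix_with_ngram([10.0, 0.0, 0.0, 0.0], {2: 3})
```

In the uniform case the scale is 1, so token 2 receives a bias of ln 4 and its probability becomes 4/7; in the confident case the scale is close to zero and token 0 keeps almost all of the probability mass.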
