Float8@2bits: Entropy Coding Enables Data-Free Model Compression
Patrick Putzky, Martin Genzel, Mattes Mollenhauer, Sebastian Schulze, Thomas Wollmann, Stefan Dietzel

TL;DR
EntQuant is an entropy coding framework that combines the speed and universality of data-free model compression with the fidelity of data-dependent methods, enabling fast, extreme compression of large models with minimal inference overhead.
Contribution
EntQuant is presented as the first method to decouple numerical precision from storage cost using entropy coding, achieving state-of-the-art compression performance in a practical, data-free manner.
Findings
Compresses a 70B-parameter model in under 30 minutes.
Achieves state-of-the-art results on standard benchmarks.
Retains functional performance on complex instruction-tuned models.
Abstract
Post-training compression is currently divided into two contrasting regimes. On the one hand, fast, data-free, and model-agnostic methods (e.g., NF4 or HQQ) offer maximum accessibility but suffer from functional collapse at extreme bit-rates below 4 bits. On the other hand, techniques leveraging calibration data or extensive recovery training achieve superior fidelity but impose high computational constraints and face uncertain robustness under data distribution shifts. We introduce EntQuant, the first framework to unite the advantages of these distinct paradigms. By matching the performance of data-dependent methods with the speed and universality of data-free techniques, EntQuant enables practical utility in the extreme compression regime. Our method decouples numerical precision from storage cost via entropy coding, compressing a 70B parameter model in less than 30 minutes. We…
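To illustrate what decoupling numerical precision from storage cost can mean, the following minimal sketch quantizes synthetic Gaussian weights onto a 256-level grid and measures the empirical Shannon entropy of the resulting symbols, which is the per-weight cost an ideal entropy coder would approach. It is not EntQuant's algorithm: the uniform integer grid stands in for the Float8 format, and the helper names `quantize` and `entropy_bits`, the step sizes, and the synthetic weights are all assumptions made for illustration.

```python
import math
import random
from collections import Counter

def quantize(weights, step, levels=256):
    """Round each weight to a signed integer index on a uniform grid.

    Hypothetical stand-in for a Float8 grid: indices are clipped so that
    at most `levels` distinct symbols can occur.
    """
    lo, hi = -(levels // 2), levels // 2 - 1
    return [max(lo, min(hi, round(w / step))) for w in weights]

def entropy_bits(symbols):
    """Empirical Shannon entropy of a symbol sequence, in bits per symbol."""
    n = len(symbols)
    counts = Counter(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

random.seed(0)
# Synthetic weights; real model weights are not available in this sketch.
weights = [random.gauss(0.0, 1.0) for _ in range(50_000)]

nominal_bits = 8  # every symbol fits in 8 bits regardless of the step size
for step in (0.05, 0.25, 1.0):
    symbols = quantize(weights, step, levels=2 ** nominal_bits)
    print(f"step={step}: nominal {nominal_bits} bits, "
          f"entropy {entropy_bits(symbols):.2f} bits/weight")
```

In this toy setting the numerical format stays at 8 bits per symbol while the entropy, and hence the achievable storage cost under entropy coding, drops as the grid becomes coarser and the symbol distribution more concentrated.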
Taxonomy
Topics: Advanced Data Compression Techniques · Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies
