Lossless and Near-Lossless Compression for Foundation Models
Moshik Hershcovitch, Leshem Choshen, Andrew Wood, Ilias Enmouri, Peter Chin, Swaminathan Sundararaman, Danny Harnik

TL;DR
This paper explores lossless and near-lossless compression techniques for large models, achieving over 50% size reduction and significant network traffic savings, with minimal impact on accuracy.
Contribution
It introduces novel lossless compression methods for models, analyzes their sources of compressibility, and proposes a tunable lossy approach for further size reduction.
Findings
Lossless compression can reduce model sizes by over 50%.
Compression techniques can save over an exabyte of network traffic per month.
Near-lossless methods maintain model accuracy while reducing size.
Abstract
With the growth of model sizes and the scale of their deployment, their sheer size burdens the infrastructure, requiring more network and more storage to accommodate them. While there is a vast literature about reducing model sizes, we investigate a more traditional type of compression -- one that compresses the model to a smaller form and is coupled with a decompression algorithm that returns it to its original size -- namely lossless compression. Somewhat surprisingly, we show that such lossless compression can gain significant network and storage reduction on popular models, at times reducing over 50% of the model size. We investigate the source of model compressibility, introduce compression variants tailored for models, and categorize models into compressibility groups. We also introduce a tunable lossy compression technique that can further reduce size even on the less compressible…
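The round-trip property the abstract describes (compress to a smaller form, then decompress back to the exact original bytes) can be illustrated with a minimal sketch. This assumes Python's standard `zlib` as a stand-in compressor; it is not the paper's tailored method, and the synthetic weights and the helper names `compress_weights` and `decompress_weights` are hypothetical.

```python
import random
import struct
import zlib


def compress_weights(raw: bytes) -> bytes:
    """Losslessly compress a raw weight buffer (zlib is an assumed stand-in)."""
    return zlib.compress(raw, 9)


def decompress_weights(blob: bytes) -> bytes:
    """Recover the exact original bytes from a compressed buffer."""
    return zlib.decompress(blob)


# Hypothetical stand-in for a model tensor: 10,000 float32 values
# drawn from a narrow Gaussian, packed little-endian.
rng = random.Random(0)
values = [rng.gauss(0.0, 0.02) for _ in range(10_000)]
raw = struct.pack(f"<{len(values)}f", *values)

blob = compress_weights(raw)
restored = decompress_weights(blob)

# Lossless means bit-for-bit identical after the round trip.
assert restored == raw
print(f"original: {len(raw)} bytes, compressed: {len(blob)} bytes")
```

The achieved ratio depends heavily on the data and the compressor; a general-purpose compressor on synthetic weights says nothing about the savings reported for real models.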
Taxonomy
Topics: Computer Graphics and Visualization Techniques · Digital Filter Design and Implementation
