Hyper-Compression: Model Compression via Hyperfunction
Fenglei Fan, Juntong Fan, Dayang Wang, Jingbo Zhang, Zelin Dong, Shijun Zhang, Ge Wang, Tieyong Zeng

TL;DR
This paper introduces Hyper-Compression, a model compression method that represents a network's parameters through a hyperfunction built from low-dimensional dynamic systems. It requires no retraining and reaches performance close to int4 quantization.
Contribution
The paper proposes compressing a model by representing its parameters through hyperfunctions derived from dynamic systems, a mechanism distinct from pruning, quantization, distillation, and decomposition. It supports the approach with theoretical error bounds and empirical results.
Findings
Compresses LLaMA2-7B in an hour with less than 1% performance drop
Achieves close-to-int4-quantization performance without retraining
Offers a preferable compression ratio and efficient inference
Abstract
The rapid growth of large models' size has far outpaced that of computing resources. To bridge this gap, encouraged by the parsimonious relationship between genotype and phenotype in the brain's growth and development, we propose the so-called Hyper-Compression that turns the model compression into the issue of parameter representation via a hyperfunction. Specifically, it is known that the trajectory of some low-dimensional dynamic systems can fill the high-dimensional space eventually. Thus, Hyper-Compression, using these dynamic systems as the hyperfunctions, represents the parameters of the target network by their corresponding composition number or trajectory length. This suggests a novel mechanism for model compression, substantially different from the existing pruning, quantization, distillation, and decomposition. Along this direction, we methodologically identify a suitable…
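The idea that a low-dimensional trajectory can eventually fill a higher-dimensional space, so that a point can be stored as a step count, can be illustrated with a toy example. The sketch below is not the authors' algorithm: it assumes an irrational winding on the unit square with direction (sqrt(2), sqrt(3)), a brute-force search, and a hypothetical step budget of 65536. All function names and constants are illustrative choices, not taken from the paper.

```python
import numpy as np

# Assumed direction vector (not from the paper). Because 1, sqrt(2), sqrt(3)
# are rationally independent, the points (k * A) mod 1 are dense in the
# unit square (Kronecker's theorem).
A = np.array([np.sqrt(2.0), np.sqrt(3.0)])


def trajectory(max_steps):
    """Return the first max_steps points of the winding, shape (max_steps, 2)."""
    k = np.arange(1, max_steps + 1)
    return (k[:, None] * A[None, :]) % 1.0


def encode(point, max_steps=65536):
    """Brute-force search for the step count whose point is closest to `point`."""
    pts = trajectory(max_steps)
    err = np.max(np.abs(pts - point), axis=1)  # sup-norm distance to the target
    return int(np.argmin(err)) + 1             # steps are numbered from 1


def decode(k):
    """Recover the approximate 2-D point from the stored step count."""
    return (k * A) % 1.0


# Two hypothetical parameters, already scaled into [0, 1).
weights = np.array([0.3141, 0.5926])
k = encode(weights)
restored = decode(k)
print("step count:", k)
print("restored:", restored)
print("max abs error:", np.max(np.abs(restored - weights)))
```

In this toy setting a single integer below 65536 (16 bits) stands in for two parameters, and a larger step budget lowers the approximation error at the cost of a longer search and a wider integer. How the paper selects its dynamic system, bounds the error, and speeds up encoding is not visible in the truncated abstract above.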
Taxonomy
Topics: Embedded Systems Design Techniques · Algorithms and Data Compression
