Breaking the Compression Ceiling: Data-Free Pipeline for Ultra-Efficient Delta Compression
Xiaohui Wang, Peng Ye, Chenyu Huang, Shenghe Zheng, Bo Zhang, Lei Bai, Wanli Ouyang, Tao Chen

TL;DR
UltraDelta is a novel data-free delta compression pipeline that achieves ultra-high compression ratios while maintaining strong performance across various large models and modalities.
Contribution
UltraDelta introduces a data-free pipeline that combines variance-based sparsity allocation, distribution-aware compression, and trace-norm rescaling for ultra-efficient delta compression.
Findings
Achieves up to 50x compression on LLaMA-2 models
Outperforms existing methods in ultra-high compression scenarios
Maintains model stability and performance across diverse tasks
Abstract
With the rise of the fine-tuned-pretrained paradigm, storing numerous fine-tuned models for multi-tasking creates significant storage overhead. Delta compression alleviates this by storing only the pretrained model and the highly compressed delta weights (the differences between fine-tuned and pretrained model weights). However, existing methods fail to maintain both high compression and performance, and often rely on data. To address these challenges, we propose UltraDelta, the first data-free delta compression pipeline that achieves both ultra-high compression and strong performance. UltraDelta is designed to minimize redundancy, maximize information, and stabilize performance across inter-layer, intra-layer, and global dimensions, using three key components: (1) Variance-Based Mixed Sparsity Allocation assigns sparsity based on variance, giving lower sparsity to high-variance layers…
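To make the terms in the abstract concrete, here is a minimal NumPy sketch of delta weights and a variance-ranked sparsity assignment. It is an illustration under stated assumptions, not the authors' implementation: the function names, the linear mapping from variance rank to sparsity, the `s_min`/`s_max` bounds, and the use of magnitude pruning are all hypothetical choices made for this example. The abstract only says that high-variance layers receive lower sparsity. The distribution-aware compression and trace-norm rescaling components are not shown, because the abstract text above is truncated before it describes them.

```python
import numpy as np


def delta_weights(finetuned, pretrained):
    """Delta weights: fine-tuned minus pretrained, computed per layer."""
    return {name: finetuned[name] - pretrained[name] for name in pretrained}


def allocate_sparsity(deltas, s_min=0.90, s_max=0.99):
    """Assign a sparsity level to each layer by variance rank.

    Assumption for illustration: sparsity falls linearly with variance rank,
    so the highest-variance layer gets s_min and the lowest gets s_max.
    """
    names = sorted(deltas, key=lambda n: deltas[n].var())  # ascending variance
    levels = np.linspace(s_max, s_min, num=len(names))
    return dict(zip(names, levels.tolist()))


def prune_by_magnitude(delta, sparsity):
    """Keep the largest-magnitude entries and zero the rest.

    Magnitude pruning is a stand-in chosen for this sketch; the source does
    not state which pruning rule the paper uses.
    """
    keep = int(round(delta.size * (1.0 - sparsity)))
    if keep == 0:
        return np.zeros_like(delta)
    magnitudes = np.abs(delta).ravel()
    cutoff_index = magnitudes.size - keep
    threshold = np.partition(magnitudes, cutoff_index)[cutoff_index]
    return np.where(np.abs(delta) >= threshold, delta, 0.0)


# Toy example: two layers whose fine-tuning updates differ in scale.
rng = np.random.default_rng(0)
pretrained = {
    "layer_a": rng.normal(size=(64, 64)),
    "layer_b": rng.normal(size=(64, 64)),
}
finetuned = {
    "layer_a": pretrained["layer_a"] + rng.normal(scale=0.01, size=(64, 64)),
    "layer_b": pretrained["layer_b"] + rng.normal(scale=0.10, size=(64, 64)),
}

deltas = delta_weights(finetuned, pretrained)
sparsity = allocate_sparsity(deltas)
compressed = {n: prune_by_magnitude(d, sparsity[n]) for n, d in deltas.items()}

# Only the pretrained weights and the sparse deltas need to be stored;
# an approximate fine-tuned model is rebuilt by adding them back together.
restored = {n: pretrained[n] + compressed[n] for n in pretrained}
```

In this toy setup `layer_b` has the larger update variance, so it is assigned the lower sparsity and keeps more of its delta entries than `layer_a`.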
Taxonomy
Topics: Algorithms and Data Compression · Advanced Data Compression Techniques · Numerical Methods and Algorithms
