DeltaDQ: Ultra-High Delta Compression for Fine-Tuned LLMs via Group-wise Dropout and Separate Quantization
Yanfeng Jiang, Zelan Yang, Bohua Chen, Shen Li, Yong Li, Tao Li

TL;DR
DeltaDQ is a delta compression framework for fine-tuned LLMs that combines Group-wise Dropout and Separate Quantization to reach ultra-high compression ratios, with better accuracy than baselines at 16x compression.
Contribution
The paper proposes DeltaDQ, a distribution-driven delta compression method that raises the achievable compression ratio for fine-tuned LLMs by applying Group-wise Dropout and Separate Quantization to the delta weight.
Findings
Achieves 16x compression with better accuracy than baselines.
Demonstrates 128x and 512x compression ratios on large models.
Exhibits effective compression across different model scales.
Abstract
Large language models achieve exceptional performance on various downstream tasks through supervised fine-tuning. However, the diversity of downstream tasks and practical requirements makes deploying multiple full-parameter fine-tuned models challenging. Current methods that compress the delta weight struggle to achieve ultra-high compression, failing to minimize the deployment overhead. To address the above issue, we propose a novel distribution-driven delta compression framework DeltaDQ, which utilizes Group-wise Dropout and Separate Quantization to achieve ultra-high compression for the delta weight. We have observed that the matrix-computed intermediate results for the delta weight exhibit extremely small variance and min-max range characteristics, referred to as Balanced Intermediate Results. Exploiting this phenomenon, we introduce Group-wise Dropout to perform dropout on the…
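The abstract is cut off before it explains how Group-wise Dropout forms its groups or picks the group size, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes the delta weight is the fine-tuned weight minus the base weight, that groups are contiguous slices along each row, and that surviving entries are rescaled by group_size / keep, as in ordinary dropout. The function name `groupwise_dropout`, the matrix shapes, and the 0.25 keep ratio are all hypothetical choices for illustration. Separate Quantization is not sketched, because the visible text does not describe it.

```python
import numpy as np


def groupwise_dropout(delta, group_size, keep_ratio, rng):
    """Keep a fixed number of randomly chosen entries in each row-wise group.

    Assumption (not from the source): groups are contiguous slices of
    `group_size` columns, and survivors are rescaled by group_size / keep.
    """
    rows, cols = delta.shape
    assert cols % group_size == 0, "columns must divide evenly into groups"
    keep = max(1, int(round(group_size * keep_ratio)))

    groups = delta.reshape(rows, cols // group_size, group_size)
    # Double argsort of random noise assigns each entry a random rank
    # within its group; keeping ranks below `keep` selects `keep` entries.
    noise = rng.random(groups.shape)
    ranks = noise.argsort(axis=-1).argsort(axis=-1)
    mask = ranks < keep

    sparse = np.where(mask, groups * (group_size / keep), 0.0)
    return sparse.reshape(rows, cols)


rng = np.random.default_rng(0)

# Hypothetical base and fine-tuned weights; the delta is small by construction.
w_base = rng.standard_normal((64, 256))
w_ft = w_base + 0.01 * rng.standard_normal((64, 256))
delta = w_ft - w_base

sparse_delta = groupwise_dropout(delta, group_size=16, keep_ratio=0.25, rng=rng)

# Compare the layer output with the full delta and with the sparse delta.
x = rng.standard_normal(256)
y_full = w_ft @ x
y_approx = (w_base + sparse_delta) @ x
rel_err = np.linalg.norm(y_full - y_approx) / np.linalg.norm(y_full)
print(f"kept fraction: {np.count_nonzero(sparse_delta) / delta.size:.2f}")
print(f"relative output error: {rel_err:.4f}")
```

In this toy setting, dropping three quarters of the delta entries changes the layer output only slightly, because the delta contributes a small, evenly spread part of each output value. That is the intuition behind the Balanced Intermediate Results observation in the abstract. Real fine-tuned models may behave differently.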
Taxonomy
Topics: Advanced Data Compression Techniques · Digital Filter Design and Implementation · Medical Imaging Techniques and Applications
Methods: Dropout
