ExCP: Extreme LLM Checkpoint Compression via Weight-Momentum Joint Shrinking
Wenshuo Li, Xinghao Chen, Han Shu, Yehui Tang, Yunhe Wang

TL;DR
ExCP introduces a novel framework for extreme compression of large language model checkpoints by leveraging residuals, weight-momentum joint shrinking, and non-uniform quantization, achieving significant storage reduction with minimal performance loss.
Contribution
The paper proposes a checkpoint compression method that combines residuals between adjacent checkpoints, weight-momentum joint shrinking, and non-uniform quantization, enabling nearly lossless compression of large models.
Findings
Achieves approximately 70x compression on Pythia-410M.
Keeps downstream-task performance close to that of the original model.
Demonstrates effectiveness across models from 410M to 7B parameters.
Abstract
Large language models (LLMs) have recently attracted significant attention in the field of artificial intelligence. However, training these models poses significant challenges in terms of computational and storage capacity, so compressing checkpoints has become an urgent problem. In this paper, we propose a novel Extreme Checkpoint Compression (ExCP) framework, which significantly reduces the storage required for training checkpoints while achieving nearly lossless performance. We first calculate the residuals of adjacent checkpoints to obtain the essential but sparse information needed for a higher compression ratio. To further exploit the redundant parameters in checkpoints, we then propose a weight-momentum joint shrinking method that uses another important source of information available during model optimization, i.e., momentum. In particular, we exploit the information of both model…
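The pipeline the abstract outlines (residual of adjacent checkpoints, momentum-aware shrinking, non-uniform quantization) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the importance score `|residual| * (1 + |momentum|)`, the `keep_ratio` parameter, and the quantile-based codebook are all assumptions chosen for illustration, and the paper's actual pruning criterion and quantizer may differ.

```python
import numpy as np


def compress_residual(prev_w, curr_w, momentum, keep_ratio=0.1, n_levels=16):
    """Hypothetical sketch: residual -> momentum-aware pruning -> non-uniform quantization."""
    # Step 1: the residual between adjacent checkpoints is small and compressible.
    residual = curr_w - prev_w

    # Step 2: illustrative importance score that mixes residual size with
    # momentum size (an assumption, not the paper's criterion).
    score = np.abs(residual) * (1.0 + np.abs(momentum))
    k = max(1, int(keep_ratio * residual.size))
    threshold = np.partition(score.ravel(), -k)[-k]  # k-th largest score
    mask = score >= threshold
    kept = residual[mask]

    # Step 3: non-uniform quantization with a quantile-based codebook, so
    # dense value ranges get more levels (n_levels must fit in uint8).
    edges = np.quantile(kept, np.linspace(0.0, 1.0, n_levels + 1))
    centers = 0.5 * (edges[:-1] + edges[1:])
    codes = np.abs(kept[:, None] - centers[None, :]).argmin(axis=1).astype(np.uint8)
    return mask, codes, centers


def decompress_residual(prev_w, mask, codes, centers):
    """Rebuild an approximate checkpoint from the previous one plus the coded residual."""
    restored = prev_w.copy()
    restored[mask] += centers[codes]
    return restored
```

In this sketch only the boolean mask, the small integer codes, and the codebook would need to be stored per checkpoint; pruned positions simply keep their value from the previous checkpoint, which is where a lossy approximation enters.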
Taxonomy
Topics: Advanced Data Storage Technologies · Advancements in Photolithography Techniques · VLSI and Analog Circuit Testing
Methods: Softmax · Attention Is All You Need
