OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning
Dan Qiao, Yi Su, Pinzheng Wang, Jing Ye, Wenjing Xie, Yuechi Zhou, Yuyang Ding, Zecheng Tang, Jikai Wang, Yixin Ji, Yue Wang, Pei Guo, Zechen Sun, Zikang Zhang, Juntao Li, Pingfu Chao, Wenliang Chen, Guohong Fu, Guodong Zhou, Qiaoming Zhu, Min Zhang

TL;DR
OpenBA-V2 is a highly compressed 3.4B language model achieved through multi-stage pruning and continual pre-training, maintaining competitive performance with significantly fewer parameters, thus enabling more practical deployment.
Contribution
The paper introduces OpenBA-V2, a 3.4B LLM derived from the 15B OpenBA model through multi-stage pruning (layer, neural, and vocabulary pruning) and continual pre-training, with minimal performance loss.
Findings
Achieved 77.3% compression rate with minimal performance degradation.
OpenBA-V2 performs comparably to the original 15B model on key tasks.
Demonstrates effective compression strategies for resource-efficient LLM deployment.
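The 77.3% figure follows from the two parameter counts given on this page. A minimal Python sketch of that arithmetic, assuming the compression rate is defined as the fraction of parameters removed and using the rounded counts (15B and 3.4B); the function name `compression_rate` is illustrative, not from the paper:

```python
def compression_rate(original_params: float, compressed_params: float) -> float:
    """Return the fraction of parameters removed by compression.

    Assumes the rate is defined as 1 - compressed / original.
    """
    if original_params <= 0:
        raise ValueError("original_params must be positive")
    if not 0 < compressed_params <= original_params:
        raise ValueError("compressed_params must be in (0, original_params]")
    return 1.0 - compressed_params / original_params


# Rounded parameter counts as stated on this page (assumption: exact counts differ slightly).
rate = compression_rate(15e9, 3.4e9)
print(f"{rate:.1%}")  # prints "77.3%"
```

Because the inputs are rounded, the result only confirms that the reported rate is consistent with the stated model sizes.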
Abstract
Large Language Models (LLMs) have played an important role in many fields due to their powerful capabilities. However, their massive number of parameters leads to high deployment requirements and incurs significant inference costs, which impedes their practical applications. Training smaller models is an effective way to address this problem. Therefore, we introduce OpenBA-V2, a 3.4B model derived from multi-stage compression and continual pre-training from the original 15B OpenBA model. OpenBA-V2 utilizes more data, more flexible training objectives, and techniques such as layer pruning, neural pruning, and vocabulary pruning to achieve a compression rate of 77.3% with minimal performance loss. OpenBA-V2 demonstrates competitive performance compared to other open-source models of similar size, achieving results close to or on par with the 15B OpenBA model in downstream tasks such as…
Taxonomy
Topics: Advanced Surface Polishing Techniques · Metal Forming Simulation Techniques
Methods: Pruning
