Single-path Bit Sharing for Automatic Loss-aware Model Compression
Jing Liu, Bohan Zhuang, Peng Chen, Chunhua Shen, Jianfei Cai, Mingkui Tan

TL;DR
This paper introduces Single-path Bit Sharing (SBS), a unified and efficient method for joint network pruning and quantization that automatically determines optimal compression configurations, significantly reducing computational cost while maintaining accuracy.
Contribution
SBS unifies pruning and quantization into a single model with learnable binary gates, enabling automatic configuration search with reduced complexity and cost.
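The idea of folding pruning and quantization into one gated model can be illustrated with a small sketch. This is not the authors' implementation: the quantizer, the candidate bitwidths (2, 4, 8), and the function names `quantize` and `gated_weight` are assumptions made for illustration, and the gates are fixed 0/1 values here rather than learned. The sketch writes a high-bitwidth weight as the lowest-bitwidth quantized value plus residuals, each switched by a binary gate; turning off the first gate zeroes the weight, which corresponds to pruning.

```python
import numpy as np

def quantize(w, bits):
    """Uniform quantizer on [-1, 1] with 2**bits levels (an assumed choice)."""
    steps = 2 ** bits - 1
    w = np.clip(np.asarray(w, dtype=float), -1.0, 1.0)
    # Map to [0, 1], snap to the grid, map back to [-1, 1].
    return 2.0 * np.round((w + 1.0) / 2.0 * steps) / steps - 1.0

def gated_weight(w, gates, bitwidths=(2, 4, 8)):
    """Compose a weight from a low-bit base and gated residuals.

    gates[0] = 0 removes the weight entirely (pruning).
    gates[i] = 1 for i >= 1 adds the residual that lifts the value from
    bitwidths[i-1] to bitwidths[i]; a closed gate also disables all
    higher-bitwidth residuals nested inside it.
    """
    w = np.asarray(w, dtype=float)
    nested = np.zeros_like(w)
    # Build the nested sum from the highest bitwidth inward.
    for i in reversed(range(1, len(bitwidths))):
        residual = quantize(w, bitwidths[i]) - quantize(w, bitwidths[i - 1])
        nested = gates[i] * (residual + nested)
    return gates[0] * (quantize(w, bitwidths[0]) + nested)

w = np.array([-0.7, 0.1, 0.9])
print(gated_weight(w, (1, 1, 1)))  # equals the 8-bit quantized weight
print(gated_weight(w, (1, 1, 0)))  # equals the 4-bit quantized weight
print(gated_weight(w, (0, 1, 1)))  # all zeros: the weight is pruned
```

Because every configuration reuses the same base value and residuals, a single set of weights covers all candidate bitwidths, which is one plausible reading of "single-path" sharing.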
Findings
Achieves 22.6x BOP reduction on MobileNetV2 with 0.1% accuracy drop
Outperforms existing methods in efficiency and compression ratio
Effective on CIFAR-100 and ImageNet datasets
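For readers unfamiliar with the BOP metric, the following sketch shows how a bit-operation count is commonly computed for one convolutional layer, assuming the widely used convention BOPs = MACs × weight bits × activation bits. The layer shape and the helper name `conv_bops` are hypothetical, and the numbers are not taken from the paper.

```python
def conv_bops(c_in, c_out, kernel, h_out, w_out, w_bits, a_bits):
    """Bit operations of a conv layer: MACs times weight and activation bitwidths."""
    macs = c_in * c_out * kernel * kernel * h_out * w_out
    return macs * w_bits * a_bits

# Hypothetical layer: 32 -> 64 channels, 3x3 kernel, 16x16 output map.
full = conv_bops(32, 64, 3, 16, 16, w_bits=32, a_bits=32)

# Quantization only: 4-bit weights, 8-bit activations.
quantized = conv_bops(32, 64, 3, 16, 16, w_bits=4, a_bits=8)

# Quantization plus pruning half of the output channels.
pruned_quantized = conv_bops(32, 32, 3, 16, 16, w_bits=4, a_bits=8)

print(full / quantized)         # reduction from quantization alone
print(full / pruned_quantized)  # reduction from pruning and quantization together
```

The two ratios show why pruning and quantization multiply: removing channels cuts the MAC count, while lower bitwidths cut the cost per MAC.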
Abstract
Network pruning and quantization are proven to be effective approaches to deep model compression. To obtain a highly compact model, most methods first perform network pruning and then quantize the pruned model. However, this strategy ignores the fact that pruning and quantization affect each other, so performing them separately may lead to sub-optimal performance. To address this, performing pruning and quantization jointly is essential. Nevertheless, making a trade-off between pruning and quantization is non-trivial. Moreover, existing compression methods often rely on pre-defined compression configurations. Some attempts have been made to search for optimal configurations, but these may incur prohibitive optimization cost. To address the above issues, we devise a simple yet effective method named Single-path Bit Sharing (SBS). Specifically, we first consider…
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Medical Imaging Techniques and Applications · Advanced Neural Network Applications
Methods: Pruning
