Quantization without Tears
Minghao Fu, Hao Yu, Jie Shao, Junjie Zhou, Ke Zhu, Jianxin Wu

TL;DR
QwT is a simple, fast, and general network quantization method. It adds a lightweight structure to the quantized network to recover accuracy, needs minimal hyperparameter tuning, and applies to diverse tasks.
Contribution
The paper presents QwT, a quantization approach that adds a small linear structure to the quantized network to improve accuracy while keeping the method simple, fast, and applicable across models and tasks.
Findings
Effective across vision, language, and multimodal tasks
Achieves high accuracy with minimal hyperparameter tuning
Provides a closed-form solution for quick improvements
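To make the "closed-form solution" finding concrete, here is a minimal sketch of one way a small linear structure could be fit without gradient descent. It assumes the structure is a linear map applied to a block's input and solved by ordinary least squares against the gap between full-precision and quantized outputs. That formulation, the function name `fit_linear_compensation`, and the toy data are illustrative assumptions; the listing above only states that a linear structure and a closed-form solution exist.

```python
import numpy as np

def fit_linear_compensation(x_q, y_fp, y_q):
    """Least-squares fit of W, b so that y_q + x_q @ W + b approximates y_fp.

    Hypothetical helper: the residual-fitting formulation is an assumption.
    """
    residual = y_fp - y_q
    # Append a column of ones so the bias is solved together with W.
    ones = np.ones((x_q.shape[0], 1), dtype=x_q.dtype)
    x_aug = np.hstack([x_q, ones])
    solution, *_ = np.linalg.lstsq(x_aug, residual, rcond=None)
    return solution[:-1], solution[-1]

# Toy calibration data: one nonlinear block with crudely rounded weights
# standing in for a quantized block.
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 8))
w_fp = rng.normal(size=(8, 8))
w_q = np.round(w_fp * 4) / 4
y_fp = np.tanh(x @ w_fp)
y_q = np.tanh(x @ w_q)

W, b = fit_linear_compensation(x, y_fp, y_q)
error_before = float(np.mean((y_fp - y_q) ** 2))
error_after = float(np.mean((y_fp - (y_q + x @ W + b)) ** 2))
```

On the calibration data itself, `error_after` cannot exceed `error_before`, because setting `W` and `b` to zero is one of the candidates the least-squares solve considers. How well such a fit transfers to unseen inputs is a separate question this toy example does not address.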
Abstract
Deep neural networks, while achieving remarkable success across diverse tasks, demand significant resources, including computation, GPU memory, bandwidth, storage, and energy. Network quantization, as a standard compression and acceleration technique, reduces storage costs and enables potential inference acceleration by discretizing network weights and activations into a finite set of integer values. However, current quantization methods are often complex and sensitive, requiring extensive task-specific hyperparameters, where even a single misconfiguration can impair model performance, limiting generality across different models and tasks. In this paper, we propose Quantization without Tears (QwT), a method that simultaneously achieves quantization speed, accuracy, simplicity, and generality. The key insight of QwT is to incorporate a lightweight additional structure into the quantized…
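As a minimal illustration of the discretization the abstract describes, the sketch below maps floating-point weights to a finite set of integers using symmetric, per-tensor uniform quantization. This is a generic textbook scheme chosen for illustration, not necessarily the quantizer used with QwT; the function names and the 8-bit setting are assumptions.

```python
import numpy as np

def quantize_symmetric(x, num_bits=8):
    """Map a float array to signed integers using one per-tensor scale."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = float(np.max(np.abs(x)))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int32)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float array from integers and the scale."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 4)).astype(np.float32)
q, scale = quantize_symmetric(weights, num_bits=8)
recovered = dequantize(q, scale)
max_error = float(np.max(np.abs(weights - recovered)))
```

Because every value is rounded to the nearest multiple of `scale`, the per-element reconstruction error is bounded by about half of `scale`. That rounding error is the accuracy loss a quantization method has to manage.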
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Data Compression Techniques
Methods: Sparse Evolutionary Training
