CSQ: Growing Mixed-Precision Quantization Scheme with Bi-level Continuous Sparsification

Lirui Xiao; Huanrui Yang; Zhen Dong; Kurt Keutzer; Li Du; Shanghang Zhang

arXiv:2212.02770·cs.CV·March 1, 2023

Open Access

TL;DR

CSQ introduces a stable, fully differentiable bi-level continuous sparsification method for mixed-precision quantization of DNNs, enabling an efficient search for per-layer precision schemes with improved accuracy-efficiency tradeoffs.

Contribution

The paper proposes CSQ, a novel bi-level continuous sparsification approach for stable, differentiable mixed-precision quantization scheme search in neural networks.

Findings

01. CSQ achieves a better efficiency-accuracy tradeoff than previous methods.

02. CSQ enables dynamic growth and pruning of layer precisions.

03. Experiments show improved training stability and performance across models and datasets.

Abstract

Mixed-precision quantization has been widely applied on deep neural networks (DNNs) as it leads to significantly better efficiency-accuracy tradeoffs compared to uniform quantization. Meanwhile, determining the exact precision of each layer remains challenging. Previous attempts on bit-level regularization and pruning-based dynamic precision adjustment during training suffer from noisy gradients and unstable convergence. In this work, we propose Continuous Sparsification Quantization (CSQ), a bit-level training method to search for mixed-precision quantization schemes with improved stability. CSQ stabilizes the bit-level mixed-precision training process with a bi-level gradual continuous sparsification on both the bit values of the quantized weights and the bit selection in determining the quantization precision of each layer. The continuous sparsification scheme enables…
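The bi-level idea in the abstract (a relaxation on the bit values of the weights, plus a relaxation on which bits each layer keeps) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the temperature-scaled sigmoid gate, the names `soft_gate`, `relaxed_weights`, and `layer_precision`, the LSB-first bit order, and the normalization by 2^n - 1 are all assumptions made for illustration.

```python
import numpy as np


def soft_gate(logits, beta):
    """Temperature-scaled sigmoid (assumed relaxation).

    As beta grows, the output approaches a hard 0/1 step on the sign of logits.
    """
    return 1.0 / (1.0 + np.exp(-beta * np.asarray(logits, dtype=float)))


def relaxed_weights(bit_logits, select_logits, beta):
    """Reconstruct relaxed weights from two levels of soft gates.

    bit_logits:    shape (n_bits, n_weights), one logit per bit of each weight
    select_logits: shape (n_bits,), one logit per bit position of the layer
    Bit order is LSB first; output is scaled to [0, 1] (hypothetical choice).
    """
    n_bits = bit_logits.shape[0]
    bits = soft_gate(bit_logits, beta)      # level 1: relaxed bit values
    keep = soft_gate(select_logits, beta)   # level 2: relaxed bit selection
    scales = 2.0 ** np.arange(n_bits)       # 1, 2, 4, ...
    return (scales * keep) @ bits / (2.0 ** n_bits - 1.0)


def layer_precision(select_logits):
    """Number of bit positions kept once the selection gates are hardened."""
    return int(np.sum(np.asarray(select_logits) > 0))


# Three bit positions, two weights; the most significant bit is deselected.
bit_logits = np.array([[1.0, -1.0],
                       [-1.0, 1.0],
                       [1.0, 1.0]])
select_logits = np.array([1.0, 1.0, -1.0])

soft = relaxed_weights(bit_logits, select_logits, beta=1.0)    # smooth, trainable
hard = relaxed_weights(bit_logits, select_logits, beta=100.0)  # nearly discrete
bits_kept = layer_precision(select_logits)
```

At a low temperature the gates are smooth, so gradients can flow to both the bit values and the bit selection. Raising `beta` gradually pushes both levels toward binary values, at which point the layer in this toy example behaves like a 2-bit layer.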

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications

Methods: Pruning