Oscillation-free Quantization for Low-bit Vision Transformers

Shih-Yang Liu; Zechun Liu; Kwang-Ting Cheng

arXiv:2302.02210 · cs.CV · September 20, 2023 · 6 citations

TL;DR

This paper introduces techniques that suppress weight oscillation during quantization-aware training of low-bit vision transformers, improving both training stability and final accuracy.

Contribution

The paper proposes three methods (StatsQ, short for statistical weight quantization, plus CGA and QKR) to reduce weight oscillation and improve quantization robustness in vision transformers.

Findings

01. Achieved up to 9.8% accuracy improvement on ImageNet.

02. Successfully mitigated weight oscillation in low-bit quantization.

03. Outperformed previous state-of-the-art methods by substantial margins.

Abstract

Weight oscillation is an undesirable side effect of quantization-aware training, in which quantized weights frequently jump between two quantized levels, resulting in training instability and a sub-optimal final model. We discover that the learnable scaling factor, a widely used de facto setting in quantization, aggravates weight oscillation. In this study, we investigate the connection between the learnable scaling factor and quantized weight oscillation and use ViT as a case driver to illustrate the findings and remedies. We also find that the interdependence between quantized weights in the query and key of a self-attention layer makes ViT vulnerable to oscillation. We therefore propose three techniques accordingly: statistical weight quantization (StatsQ) to improve quantization robustness compared to the prevalent…
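
The oscillation described in the abstract can be reproduced in a toy setting. The sketch below is not the paper's method or code. It assumes a single latent weight, a fixed (not learned) scale, a squared-error loss, and a straight-through estimator; the names `quantize`, `simulate`, and `count_flips` are hypothetical and chosen only for illustration. The target value sits between two quantization levels, so the latent weight hovers around the rounding threshold and the quantized value keeps switching levels.

```python
import math


def quantize(w: float, scale: float) -> float:
    """Round-to-nearest uniform quantizer: snap w to a multiple of scale."""
    return math.floor(w / scale + 0.5) * scale


def simulate(w0=0.45, target=0.5, scale=1.0, lr=0.1, steps=20):
    """Train one latent weight with a straight-through estimator (STE).

    Returns the sequence of quantized values seen at each step.
    All defaults are illustrative assumptions, not values from the paper.
    """
    w = w0
    history = []
    for _ in range(steps):
        q = quantize(w, scale)
        history.append(q)
        # Loss is (q - target)^2; the STE treats dq/dw as 1,
        # so the gradient w.r.t. the latent weight is 2 * (q - target).
        grad = 2.0 * (q - target)
        w -= lr * grad
    return history


def count_flips(history):
    """Number of steps on which the quantized value changed level."""
    return sum(1 for a, b in zip(history, history[1:]) if a != b)


if __name__ == "__main__":
    levels = simulate()
    print(levels)
    print("flips:", count_flips(levels))
```

With these defaults the quantized weight never settles: every update pushes the latent weight across the threshold at 0.5, so the quantized value alternates between the two neighboring levels. Real quantization-aware training involves many weights, learned scales, and stochastic gradients, so this only conveys the basic mechanism.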

Code & Models

Repositories

nbasyl/OFQ (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques