Towards Accurate Post-Training Quantization of Vision Transformers via Error Reduction
Yunshan Zhong, You Huang, Jiawei Hu, Yuxin Zhang, Rongrong Ji

TL;DR
This paper introduces ERQ, a two-step post-training quantization method for vision transformers that significantly reduces quantization errors and improves accuracy by addressing activation and weight quantization errors sequentially.
Contribution
ERQ is a novel two-step PTQ approach that combines reparameterization, ridge regression, and iterative rounding refinement to effectively minimize quantization errors in ViTs.
Findings
ERQ outperforms state-of-the-art GPTQ by 36.81% in accuracy on W3A4 ViT-S.
ERQ achieves superior performance across various ViT variants and tasks.
The method effectively reduces quantization errors, leading to higher model accuracy.
Abstract
Post-training quantization (PTQ) for vision transformers (ViTs) has received increasing attention from both academic and industrial communities due to its minimal data needs and high time efficiency. However, many current methods fail to account for the complex interactions between quantized weights and activations, resulting in significant quantization errors and suboptimal performance. This paper presents ERQ, a two-step PTQ method designed to reduce the quantization errors arising from activation quantization and weight quantization sequentially. The first step, Activation quantization error reduction (Aqer), begins with Reparameterization Initialization, which aims to mitigate initial quantization errors in high-variance activations. It then further mitigates the errors by formulating a Ridge Regression problem, which updates the weights maintained at full precision using a…
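To make the ridge-regression idea in the abstract concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the objective, the min-max quantizer `fake_quantize`, the helper `ridge_compensate`, and the regularization strength `lam` are all assumptions chosen for illustration. The sketch keeps the weights `W` at full precision and solves in closed form for an update `dW` that minimizes `||(W + dW) X_q - W X||^2 + lam * ||dW||^2`, where `X_q` is the quantized activation matrix.

```python
import numpy as np

def fake_quantize(x, n_bits):
    """Uniform min-max quantize-dequantize (illustrative quantizer, not ERQ's)."""
    levels = 2 ** n_bits - 1
    lo, hi = x.min(), x.max()
    scale = (hi - lo) / levels
    return np.round((x - lo) / scale) * scale + lo

def ridge_compensate(W, X, X_q, lam=1e-2):
    """Return W + dW, with dW minimizing ||(W + dW) X_q - W X||^2 + lam ||dW||^2.

    Assumed shapes: W is (out, in); X and X_q are (in, samples).
    """
    delta = X_q - X                                   # activation quantization error
    gram = X_q @ X_q.T + lam * np.eye(X_q.shape[0])   # regularized Gram matrix
    rhs = -(W @ delta) @ X_q.T                        # shape (out, in)
    # Closed form: dW = rhs @ inv(gram); gram is symmetric, so solve on the transpose.
    dW = np.linalg.solve(gram, rhs.T).T
    return W + dW

# Synthetic data, purely for demonstration.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))
X = rng.normal(size=(16, 256))
X_q = fake_quantize(X, n_bits=4)

W_new = ridge_compensate(W, X, X_q)
err_before = np.linalg.norm(W @ X_q - W @ X)
err_after = np.linalg.norm(W_new @ X_q - W @ X)
print(f"output error before: {err_before:.4f}, after: {err_after:.4f}")
```

Because `dW = 0` is always a feasible choice, the compensated output error cannot exceed the uncompensated one under this objective. The penalty `lam` keeps the update small and the linear system well conditioned.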
Taxonomy
Topics: Infrared Target Detection Methodologies · CCD and CMOS Imaging Sensors
Methods: Softmax · Attention Is All You Need
