Towards Accurate Post-Training Quantization of Vision Transformers via Error Reduction

Yunshan Zhong; You Huang; Jiawei Hu; Yuxin Zhang; Rongrong Ji

arXiv:2407.06794 · cs.CV · February 5, 2025 · 2 cites

TL;DR

This paper introduces ERQ, a two-step post-training quantization method for vision transformers that significantly reduces quantization errors and improves accuracy by addressing activation and weight quantization errors sequentially.

Contribution

ERQ is a novel two-step PTQ approach that combines reparameterization, ridge regression, and iterative rounding refinement to effectively minimize quantization errors in ViTs.

Findings

01. ERQ outperforms state-of-the-art GPTQ by 36.81% in accuracy on W3A4 ViT-S.

02. ERQ achieves superior performance across various ViT variants and tasks.

03. The method effectively reduces quantization errors, leading to higher model accuracy.

Abstract

Post-training quantization (PTQ) for vision transformers (ViTs) has received increasing attention from both academic and industrial communities due to its minimal data needs and high time efficiency. However, many current methods fail to account for the complex interactions between quantized weights and activations, resulting in significant quantization errors and suboptimal performance. This paper presents ERQ, a two-step PTQ method designed to sequentially reduce the quantization errors arising from activation quantization and weight quantization. The first step, Activation quantization error reduction (Aqer), begins with Reparameterization Initialization, which mitigates initial quantization errors in high-variance activations. It then further reduces the errors by formulating a Ridge Regression problem, which updates the weights maintained at full-precision using a…
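
To make the ridge-regression idea in the abstract concrete, the following is a minimal NumPy sketch of how full-precision weights of a single linear layer could be adjusted in closed form to absorb activation quantization error. It is an illustration under stated assumptions, not the paper's implementation: the function names `uniform_quantize` and `ridge_compensate`, the regularization strength `lam`, the per-tensor min-max quantizer, and the random calibration data are all hypothetical, and Reparameterization Initialization and the weight quantization step are not shown.

```python
import numpy as np


def uniform_quantize(x, bits):
    """Quantize then dequantize with a per-tensor min-max uniform quantizer.

    Assumption: a plain asymmetric quantizer, chosen only for illustration.
    """
    levels = 2 ** bits - 1
    lo, hi = x.min(), x.max()
    scale = (hi - lo) / levels
    if scale == 0:
        return x.copy()
    q = np.clip(np.round((x - lo) / scale), 0, levels)
    return q * scale + lo


def ridge_compensate(W, X, Xq, lam=1e-2):
    """Closed-form ridge update of W given quantized activations Xq.

    Solves  min_dW ||(W + dW) Xq - W X||_F^2 + lam * ||dW||_F^2
    with W of shape (out, in) and X, Xq of shape (in, n_samples).
    """
    # Output error caused by quantizing the activations.
    R = W @ (X - Xq)
    # Regularized Gram matrix of the quantized activations (symmetric).
    G = Xq @ Xq.T + lam * np.eye(Xq.shape[0])
    # Normal equations: dW G = R Xq^T, solved via the transposed system.
    dW = np.linalg.solve(G, Xq @ R.T).T
    return W + dW


# Hypothetical calibration data standing in for real layer inputs.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 16))
X = rng.normal(size=(16, 256))
Xq = uniform_quantize(X, bits=4)

W_new = ridge_compensate(W, X, Xq)
err_before = np.linalg.norm(W @ Xq - W @ X)
err_after = np.linalg.norm(W_new @ Xq - W @ X)
print(f"output error before: {err_before:.4f}, after: {err_after:.4f}")
```

Because `dW = 0` is a feasible point of the regularized objective, the compensated output error can never exceed the uncompensated one on the calibration data; `lam` trades the size of that reduction against how far the weights move from their original values.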

Taxonomy

Topics: Infrared Target Detection Methodologies · CCD and CMOS Imaging Sensors

Methods: Softmax · Attention Is All You Need