ADFQ-ViT: Activation-Distribution-Friendly Post-Training Quantization for Vision Transformers

Yanfeng Jiang; Ning Sun; Xueshuo Xie; Fei Yang; Tao Li

arXiv:2407.02763 · cs.CV · October 15, 2024

TL;DR

This paper introduces ADFQ-ViT, a post-training quantization framework for Vision Transformers. It targets the distinctive post-LayerNorm and post-GELU activation distributions that make conventional quantizers lose accuracy at low bit-widths, and it improves low-bit quantization accuracy across multiple vision tasks.

Contribution

The paper proposes quantization methods tailored to ViT activation distributions, including the Per-Patch Outlier-aware Quantizer and the Shift-Log2 Quantizer, together with module-wise optimization to further improve accuracy.

Findings

01. Achieves 10.23% higher Top-1 accuracy at 4-bit quantization of ViT-B on ImageNet.

02. Significant improvements in image classification, object detection, and segmentation tasks.

03. Outperforms various baseline quantization methods in accuracy at low-bit settings.

Abstract

Vision Transformers (ViTs) have exhibited exceptional performance across diverse computer vision tasks, while their substantial parameter size incurs significantly increased memory and computational demands, impeding effective inference on resource-constrained devices. Quantization has emerged as a promising solution to mitigate these challenges, yet existing methods still suffer from significant accuracy loss at low-bit. We attribute this issue to the distinctive distributions of post-LayerNorm and post-GELU activations within ViTs, rendering conventional hardware-friendly quantizers ineffective, particularly in low-bit scenarios. To address this issue, we propose a novel framework called Activation-Distribution-Friendly post-training Quantization for Vision Transformers, ADFQ-ViT. Concretely, we introduce the Per-Patch Outlier-aware Quantizer to tackle irregular outliers in…
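The abstract's point about post-GELU activations can be illustrated with a toy experiment. The sketch below is not the paper's Shift-Log2 Quantizer, whose definition is not given on this page; it only assumes a generic "shift to positive, then quantize in the log2 domain" scheme. The function names, the `eps` offset, and the input grid are all hypothetical choices made for illustration. It counts how many distinct 4-bit levels each quantizer spends on the narrow negative lobe of GELU outputs.

```python
import math

def gelu(x):
    # Exact GELU using the Gaussian error function.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def uniform_quantize(values, bits):
    # Asymmetric uniform quantizer over the data range [min, max].
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (2 ** bits - 1)
    return [lo + round((v - lo) / scale) * scale for v in values]

def shifted_log2_quantize(values, bits, eps=1e-5):
    # Hypothetical sketch (an assumption, not the paper's definition):
    # shift inputs so they are strictly positive, quantize the exponent
    # in the log2 domain, then undo the shift.
    lo, hi = min(values), max(values)
    shift = -lo + eps
    top = hi + shift
    max_code = 2 ** bits - 1
    out = []
    for v in values:
        s = v + shift  # strictly positive because of eps
        code = min(max_code, max(0, round(-math.log2(s / top))))
        out.append(top * 2.0 ** (-code) - shift)
    return out

def levels_on_negative_lobe(acts, quantized):
    # Distinct quantized values assigned to inputs whose GELU output is negative.
    return len({round(q, 9) for a, q in zip(acts, quantized) if a < 0.0})

# Deterministic pre-activation grid on [-4, 4] (an illustrative choice).
pre_acts = [-4.0 + 8.0 * i / 2000 for i in range(2001)]
acts = [gelu(x) for x in pre_acts]

uniform_levels = levels_on_negative_lobe(acts, uniform_quantize(acts, 4))
log2_levels = levels_on_negative_lobe(acts, shifted_log2_quantize(acts, 4))
print("uniform levels on negative lobe:", uniform_levels)
print("shifted-log2 levels on negative lobe:", log2_levels)
```

Under these assumptions, the uniform quantizer's step is wider than the entire negative lobe, so it can place only a couple of levels there, while the log-domain quantizer concentrates most of its levels near the minimum. The trade-off is coarser resolution for large positive activations, so this sketch illustrates the distribution mismatch rather than showing that either quantizer has lower overall error.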

Taxonomy

Topics: CCD and CMOS Imaging Sensors · Infrared Target Detection Methodologies · Advanced Memory and Neural Computing