P$^2$-ViT: Power-of-Two Post-Training Quantization and Acceleration for Fully Quantized Vision Transformer

Huihong Shi; Xin Cheng; Wendong Mao; and Zhongfeng Wang

arXiv:2405.19915 · cs.AI · May 31, 2024

TL;DR

P$^2$-ViT introduces a power-of-two post-training quantization framework and specialized hardware acceleration for Vision Transformers, significantly improving efficiency and speed while maintaining accuracy.

Contribution

P$^2$-ViT is the first framework to combine power-of-two scaling factors for ViT post-training quantization with a dedicated hardware accelerator, which reduces re-quantization overhead.

Findings

01. Achieves comparable or better accuracy with PoT scaling factors than with floating-point scaling factors.

02. Delivers up to 10.1× speedup and 36.8× energy savings over a GPU.

03. Reaches higher utilization efficiency than state-of-the-art accelerators.

Abstract

Vision Transformers (ViTs) have excelled in computer vision tasks but are memory-consuming and computation-intensive, challenging their deployment on resource-constrained devices. To tackle this limitation, prior works have explored ViT-tailored quantization algorithms but retained floating-point scaling factors, which yield non-negligible re-quantization overhead, limiting ViTs' hardware efficiency and motivating more hardware-friendly solutions. To this end, we propose P$^2$-ViT, the first Power-of-Two (PoT) post-training quantization and acceleration framework to accelerate fully quantized ViTs. Specifically, as for quantization, we explore a dedicated quantization scheme to effectively quantize ViTs with PoT scaling factors, thus minimizing the re-quantization overhead. Furthermore, we propose coarse-to-fine automatic mixed-precision quantization to…
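
To illustrate why PoT scaling factors reduce re-quantization overhead, here is a minimal sketch in plain Python. It is not the paper's quantization scheme or the code in the official repository; the function names (`pot_exponent`, `requantize_float`, `requantize_pot`) and the 8-bit signed output range are assumptions made for illustration. The idea shown is the standard one: an integer accumulator must be rescaled by `s_x * s_w / s_y` before the next layer, and when every scale is a power of two, that multiplication collapses into a bit shift.

```python
import math


def pot_exponent(scale: float) -> int:
    """Exponent k of the power of two 2**k nearest to `scale` (in the log domain)."""
    return round(math.log2(scale))


def clamp(q: int, bits: int = 8) -> int:
    """Saturate to the signed `bits`-bit integer range (assumed output format)."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, q))


def requantize_float(acc: int, s_x: float, s_w: float, s_y: float, bits: int = 8) -> int:
    """Baseline: rescale the accumulator with a floating-point multiplier."""
    # Round half up so the result is comparable with the shift-based version.
    return clamp(math.floor(acc * (s_x * s_w / s_y) + 0.5), bits)


def requantize_pot(acc: int, k_x: int, k_w: int, k_y: int, bits: int = 8) -> int:
    """PoT case: scales are 2**k_x, 2**k_w, 2**k_y, so rescaling is a shift."""
    shift = k_y - (k_x + k_w)  # multiplier equals 2**(-shift)
    if shift > 0:
        # Arithmetic right shift with round-half-up via an added bias.
        q = (acc + (1 << (shift - 1))) >> shift
    else:
        q = acc << -shift
    return clamp(q, bits)


# Hypothetical scales: activations 2**-4, weights 2**-6, outputs 2**-3.
k_x, k_w, k_y = -4, -6, -3
print(requantize_pot(1000, k_x, k_w, k_y))               # 8
print(requantize_float(1000, 2.0**k_x, 2.0**k_w, 2.0**k_y))  # 8
```

With these assumed scales the shift-based path reproduces the floating-point path exactly while using only an integer add and a shift, which is the kind of saving that motivates PoT scaling factors in hardware.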

Code & Models

Repositories

shihuihong214/p2-vit (PyTorch, official)

Taxonomy

Topics: CCD and CMOS Imaging Sensors · Image Processing Techniques and Applications · Advanced Vision and Imaging