P$^2$-ViT: Power-of-Two Post-Training Quantization and Acceleration for Fully Quantized Vision Transformer
Huihong Shi, Xin Cheng, Wendong Mao, and Zhongfeng Wang

TL;DR
P$^2$-ViT introduces a power-of-two post-training quantization framework and specialized hardware acceleration for Vision Transformers, significantly improving efficiency and speed while maintaining accuracy.
Contribution
P$^2$-ViT is the first framework to pair power-of-two scaling factors for post-training ViT quantization with a dedicated hardware accelerator, which reduces re-quantization overhead.
Findings
Achieves accuracy comparable to or better than floating-point scaling factors when using PoT scaling factors.
Up to 10.1x speedup and 36.8x energy savings over GPU.
Higher computation utilization efficiency than state-of-the-art accelerators.
Abstract
Vision Transformers (ViTs) have excelled in computer vision tasks but are memory-consuming and computation-intensive, challenging their deployment on resource-constrained devices. To tackle this limitation, prior works have explored ViT-tailored quantization algorithms but retained floating-point scaling factors, which yield non-negligible re-quantization overhead, limiting ViTs' hardware efficiency and motivating more hardware-friendly solutions. To this end, we propose P$^2$-ViT, the first Power-of-Two (PoT) post-training quantization and acceleration framework to accelerate fully quantized ViTs. Specifically, as for quantization, we explore a dedicated quantization scheme to effectively quantize ViTs with PoT scaling factors, thus minimizing the re-quantization overhead. Furthermore, we propose coarse-to-fine automatic mixed-precision quantization to…
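To illustrate why PoT scaling factors reduce re-quantization overhead, here is a minimal Python sketch. It is a generic illustration, not the paper's quantization scheme: the function names, the 8-bit output width, and the round-half-up convention are all assumptions made for this example. When the combined re-quantization scale is exactly 2^-k, rescaling an integer accumulator needs only an add and an arithmetic right shift instead of a floating-point multiply.

```python
import math

def clamp(value, bits=8):
    """Saturate an integer to the signed range of the given bit width."""
    qmin = -(1 << (bits - 1))
    qmax = (1 << (bits - 1)) - 1
    return max(qmin, min(qmax, value))

def requantize_float(acc, scale, bits=8):
    """Baseline: rescale integer accumulators with a floating-point multiply."""
    return [clamp(math.floor(a * scale + 0.5), bits) for a in acc]

def nearest_pot_exponent(scale):
    """Hypothetical helper: exponent k such that 2**-k is the PoT value
    closest to `scale` in the log2 domain."""
    return round(-math.log2(scale))

def requantize_shift(acc, k, bits=8):
    """PoT variant: the scale is 2**-k, so rescaling is a rounded shift."""
    if k <= 0:
        # A scale of 2**(-k) >= 1 is a left shift; no rounding is needed.
        return [clamp(a << -k, bits) for a in acc]
    half = 1 << (k - 1)  # rounding offset, equal to 0.5 after the shift
    return [clamp((a + half) >> k, bits) for a in acc]

# Example accumulator values (made up for illustration).
acc = [1000, -1000, 37, -5, 70000]
k = nearest_pot_exponent(0.0123)   # -log2(0.0123) is about 6.35, so k = 6
shifted = requantize_shift(acc, k)
reference = requantize_float(acc, 2.0 ** -k)
print(k, shifted, shifted == reference)
```

With a scale of exactly 2^-k, both paths agree on every input, so the integer-only path loses no accuracy relative to the floating-point multiply. What does cost accuracy is snapping an arbitrary scale such as 0.0123 to the nearest power of two, and that is the gap a dedicated quantization scheme has to close.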
Taxonomy
Topics: CCD and CMOS Imaging Sensors · Image Processing Techniques and Applications · Advanced Vision and Imaging
