Trio-ViT: Post-Training Quantization and Acceleration for Softmax-Free Efficient Vision Transformer
Huihong Shi, Haikuo Shao, Wendong Mao, and Zhongfeng Wang

TL;DR
Trio-ViT is a post-training quantization and hardware acceleration framework tailored to Softmax-free efficient Vision Transformers, aimed at faster and more resource-efficient deployment on embedded devices.
Contribution
It pairs a specialized post-training quantization method with a custom hardware accelerator for Softmax-free efficient ViTs, addressing the accuracy drops and hardware costs that Softmax causes when standard ViTs are fully quantized.
Findings
Achieves up to 3.6x FPS improvement over state-of-the-art ViT accelerators.
Enhances DSP efficiency by up to 2.1x.
Demonstrates effective quantization accuracy with the proposed engine.
Abstract
Motivated by the huge success of Transformers in the field of natural language processing (NLP), Vision Transformers (ViTs) have been rapidly developed and achieved remarkable performance in various computer vision tasks. However, their huge model sizes and intensive computations hinder ViTs' deployment on embedded devices, calling for effective model compression methods, such as quantization. Unfortunately, due to the existence of hardware-unfriendly and quantization-sensitive non-linear operations, particularly Softmax, it is non-trivial to completely quantize all operations in ViTs, yielding either significant accuracy drops or non-negligible hardware costs. In response to challenges associated with standard ViTs, we focus our attention towards the quantization and acceleration for efficient ViTs, which not only eliminate the troublesome Softmax but also integrate…
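As a rough illustration of what quantization as a compression method involves, the sketch below applies generic per-tensor symmetric uniform quantization to a small list of weights. It is not the Trio-ViT quantization scheme, which the truncated abstract does not describe; the function names, the 8-bit width, and the sample values are all assumptions chosen for illustration.

```python
def quantize_symmetric(values, num_bits=8):
    """Map floats to signed integers with one shared scale (assumed scheme)."""
    qmax = 2 ** (num_bits - 1) - 1               # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the integer levels."""
    return [qi * scale for qi in q]


# Hypothetical weights, not taken from the paper.
weights = [0.5, -1.27, 0.03, 1.0]
q, scale = quantize_symmetric(weights)
recovered = dequantize(q, scale)

# Rounding error per element is bounded by half a quantization step.
max_err = max(abs(w - r) for w, r in zip(weights, recovered))
```

Because no retraining is involved, the scale is derived directly from the observed values, which is the defining trait of post-training quantization. Operations with a wide or skewed output range are harder to cover with a single scale like this, which is consistent with the abstract's remark that Softmax is quantization-sensitive.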
Taxonomy
Topics: CCD and CMOS Imaging Sensors · Infrared Target Detection Methodologies · Image Processing Techniques and Applications
Methods: Softmax · Focus
