VAQF: Fully Automatic Software-Hardware Co-Design Framework for Low-Bit Vision Transformer
Mengshu Sun, Haoyu Ma, Guoliang Kang, Yifan Jiang, Tianlong Chen, Xiaolong Ma, Zhangyang Wang, Yanzhi Wang

TL;DR
This paper introduces VAQF, an automatic FPGA-based framework that optimizes quantized Vision Transformers for real-time inference, balancing accuracy and hardware constraints.
Contribution
VAQF is the first fully automatic framework integrating quantization strategy and hardware accelerator design for ViTs on FPGAs.
Findings
Achieves 24 FPS with 8-bit activation quantization.
Meets 30 FPS target with 6-bit activation quantization.
Demonstrates real-time ViT inference on FPGA with minimal compilation time.
Abstract
Transformer architectures with attention mechanisms have achieved success in Natural Language Processing (NLP), and Vision Transformers (ViTs) have recently extended the application domains to various vision tasks. While achieving high performance, ViTs suffer from large model size and high computational complexity, which hinder their deployment on edge devices. To achieve high throughput on hardware while preserving model accuracy, we propose VAQF, a framework that builds inference accelerators on FPGA platforms for quantized ViTs with binary weights and low-precision activations. Given the model structure and the desired frame rate, VAQF automatically outputs the required quantization precision for activations as well as the optimized parameter settings of the accelerator that fulfill the hardware requirements. The implementations are developed with Vivado…
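The abstract describes a search that takes a desired frame rate and returns an activation precision. The sketch below illustrates that idea only and is not the authors' implementation. The function names `choose_activation_bits` and `reported_fps` are hypothetical, and the frame-rate estimator is a plain lookup of the two data points listed under Findings (8-bit at 24 FPS, 6-bit at 30 FPS), where the real framework would presumably use a hardware performance model. It assumes that higher precision is preferred for accuracy whenever the frame-rate target is met.

```python
from typing import Callable, Iterable, Optional

# Hypothetical estimator: the two frame rates listed under Findings.
# A real framework would derive these from an accelerator performance model.
REPORTED_FPS = {8: 24.0, 6: 30.0}


def reported_fps(bits: int) -> float:
    """Return the reported frame rate for a given activation precision."""
    return REPORTED_FPS[bits]


def choose_activation_bits(
    target_fps: float,
    estimate_fps: Callable[[int], float],
    candidate_bits: Iterable[int],
) -> Optional[int]:
    """Pick the highest candidate precision whose estimated FPS meets the target.

    Returns None when no candidate precision reaches the target frame rate.
    """
    # Try the most precise setting first, since it should cost the least accuracy.
    for bits in sorted(candidate_bits, reverse=True):
        if estimate_fps(bits) >= target_fps:
            return bits
    return None


if __name__ == "__main__":
    for target in (24, 30, 60):
        bits = choose_activation_bits(target, reported_fps, REPORTED_FPS.keys())
        print(f"target {target} FPS -> activation bits: {bits}")
```

With these two data points, a 24 FPS target selects 8-bit activations, a 30 FPS target falls back to 6-bit, and a target neither setting reaches returns `None`.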
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · CCD and CMOS Imaging Sensors · Advanced Neural Network Applications
