CHOSEN: Compilation to Hardware Optimization Stack for Efficient Vision Transformer Inference
Mohammad Erfan Sadeghi, Arash Fayyazi, Suhas Somashekar, Armin Abdollahi, Massoud Pedram

TL;DR
CHOSEN is a co-design framework that optimizes hardware and software for efficient Vision Transformer inference on FPGAs, significantly improving throughput while addressing computational and memory challenges.
Contribution
It introduces a multi-kernel design, approximate non-linear functions, and an efficient compiler with a novel design space exploration algorithm for FPGA-based ViT deployment.
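As a rough illustration of what design space exploration means in this setting, the sketch below exhaustively searches a small grid of kernel configurations and keeps the fastest one that fits a resource budget. Everything in it is an assumption made for illustration: the names `DSP_BUDGET`, `MEMORY_BANKS`, `WORDS_PER_CYCLE`, `estimate_cycles`, `feasible`, and `explore`, the candidate tile sizes, the matrix shape, and the toy cost model. It is a brute-force stand-in, not CHOSEN's algorithm, which this summary does not describe.

```python
from itertools import product

# All constants are hypothetical and chosen only for illustration.
DSP_BUDGET = 2048      # multiply-accumulate units available on the device
MEMORY_BANKS = 4       # independent memory banks; at most one kernel per bank
WORDS_PER_CYCLE = 64   # words one bank can deliver per cycle


def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


def estimate_cycles(m, k, n, tile_m, tile_n, kernels):
    """Toy cost model for an (m x k) @ (k x n) matrix multiply."""
    tiles = ceil_div(m, tile_m) * ceil_div(n, tile_n)
    tiles_per_kernel = ceil_div(tiles, kernels)
    compute = k  # tile_m * tile_n MACs run in parallel, one k-step per cycle
    load = ceil_div(k * (tile_m + tile_n), WORDS_PER_CYCLE)
    # Each tile is bounded by the slower of compute and memory traffic.
    return tiles_per_kernel * max(compute, load)


def feasible(tile_m, tile_n, kernels):
    """A configuration must fit the MAC budget and the number of banks."""
    return tile_m * tile_n * kernels <= DSP_BUDGET and kernels <= MEMORY_BANKS


def explore(m, k, n, tile_sizes=(8, 16, 32, 64), kernel_counts=(1, 2, 4)):
    """Return (cycles, tile_m, tile_n, kernels) for the best feasible point."""
    best = None
    for tile_m, tile_n, kernels in product(tile_sizes, tile_sizes, kernel_counts):
        if not feasible(tile_m, tile_n, kernels):
            continue
        cost = estimate_cycles(m, k, n, tile_m, tile_n, kernels)
        if best is None or cost < best[0]:
            best = (cost, tile_m, tile_n, kernels)
    return best


# Example matrix shape, also an assumption.
print(explore(197, 384, 384))
```

Exhaustive search is only workable because this grid is tiny. A practical explorer would need a more faithful cost model and a smarter search strategy.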
Findings
Achieves 1.5x and 1.42x throughput improvements on the DeiT-S and DeiT-B models, respectively.
Effectively balances performance and memory efficiency in FPGA implementations.
Keeps accuracy degradation small when using approximate non-linear functions.
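To make the idea of an approximate non-linear function concrete, the sketch below shows one common hardware-friendly approximation of softmax: the exponential is rewritten in base 2, and the fractional part of the exponent is replaced by a linear term, so each exponential reduces to an add and a shift. This is only an illustration and is not claimed to be the approximation CHOSEN uses, which this summary does not specify. The function names `exp2_approx`, `softmax_approx`, and `softmax_exact` are hypothetical.

```python
import math

LOG2_E = 1.4426950408889634  # log2(e), used to turn e**x into 2**(x * log2(e))


def exp2_approx(x):
    """Approximate 2**x as 2**floor(x) * (1 + frac(x)).

    In hardware the power of two is a shift and (1 + frac) is an add.
    """
    n = math.floor(x)
    frac = x - n                       # fractional part, in [0, 1)
    return math.ldexp(1.0 + frac, n)   # (1 + frac) * 2**n


def softmax_approx(scores):
    """Softmax built on the approximate base-2 exponential."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [exp2_approx((s - m) * LOG2_E) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_exact(scores):
    """Reference softmax using math.exp."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


scores = [0.3, -1.2, 2.5, 0.0]
approx = softmax_approx(scores)
exact = softmax_exact(scores)
print(max(abs(a - e) for a, e in zip(approx, exact)))
```

The linear term overestimates 2**frac by at most roughly 6%, and normalization cancels part of that error, so the approximate probabilities stay close to the exact ones.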
Abstract
Vision Transformers (ViTs) represent a groundbreaking shift in machine learning approaches to computer vision. Unlike traditional approaches, ViTs apply the self-attention mechanism, widely used in natural language processing, to image patches. Despite their advantages in modeling visual tasks, deploying ViTs on hardware platforms, notably Field-Programmable Gate Arrays (FPGAs), introduces considerable challenges, which stem primarily from the non-linear calculations and the high computational and memory demands of ViTs. This paper introduces CHOSEN, a software-hardware co-design framework that addresses these challenges and automates ViT deployment on FPGAs to maximize performance. Our framework is built upon three fundamental contributions: multi-kernel design to maximize the bandwidth, mainly targeting benefits of multi…
Taxonomy
Topics: CCD and CMOS Imaging Sensors · Infrared Target Detection Methodologies · Image Processing Techniques and Applications
