HG-PIPE: Vision Transformer Acceleration with Hybrid-Grained Pipeline

Qingyu Guo; Jiayong Wan; Songqiang Xu; Meng Li; Yuan Wang

arXiv:2407.17879·cs.AR·August 2, 2024

TL;DR

HG-PIPE is a novel FPGA-based Vision Transformer accelerator that employs a hybrid pipeline architecture and approximation techniques to significantly improve throughput and resource efficiency over prior methods.

Contribution

Introduces HG-PIPE, a hybrid-grained pipelined FPGA accelerator for ViT that reduces buffer costs and pipeline bubbles, achieving high throughput and resource efficiency.

Findings

1. 2.78x throughput improvement over prior FPGA accelerators
2. 7118 images/sec end-to-end ViT processing on VCK190 FPGA
3. 2.81x faster than V100 GPU for ViT inference

Abstract

Vision Transformer (ViT) acceleration with field programmable gate array (FPGA) is promising but challenging. Existing FPGA-based ViT accelerators mainly rely on temporal architectures, which process different operators by reusing the same hardware blocks and suffer from extensive memory access overhead. Pipelined architectures, either coarse-grained or fine-grained, unroll the ViT computation spatially for memory access efficiency. However, they usually suffer from significant hardware resource constraints and pipeline bubbles induced by the global computation dependency of ViT. In this paper, we introduce HG-PIPE, a pipelined FPGA accelerator for high-throughput and low-latency ViT processing. HG-PIPE features a hybrid-grained pipeline architecture to reduce on-chip buffer cost and couples the computation dataflow and parallelism design to eliminate the pipeline bubbles. HG-PIPE…
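
The contrast the abstract draws between temporal and pipelined architectures, and the cost of pipeline bubbles, can be illustrated with a textbook timing model. The sketch below is a toy, not HG-PIPE's design: the function names and the stage latencies are hypothetical, latencies are in arbitrary cycle units, and memory access overhead is ignored entirely.

```python
def temporal_time(stage_latencies, num_images):
    """Cycles needed when one shared hardware block runs every stage in turn."""
    return num_images * sum(stage_latencies)


def pipelined_time(stage_latencies, num_images):
    """Cycles needed when each stage has dedicated hardware and images overlap.

    The first image takes the full pipeline depth; every later image
    completes one bottleneck interval after the previous one.
    """
    fill = sum(stage_latencies)
    return fill + (num_images - 1) * max(stage_latencies)


def bubble_fraction(stage_latencies):
    """Fraction of stage-cycles spent idle in steady state.

    Every stage must wait for the slowest one, so unbalanced stages
    leave faster stages stalled (pipeline bubbles).
    """
    bottleneck = max(stage_latencies)
    busy = sum(stage_latencies)
    return 1 - busy / (bottleneck * len(stage_latencies))


if __name__ == "__main__":
    unbalanced = [2, 3, 5]  # hypothetical per-stage latencies
    balanced = [4, 4, 4]    # hypothetical balanced alternative
    for name, stages in (("unbalanced", unbalanced), ("balanced", balanced)):
        print(name,
              temporal_time(stages, 10),
              pipelined_time(stages, 10),
              round(bubble_fraction(stages), 3))
```

In this model a balanced pipeline has no idle stage-cycles, which is the intuition behind matching the parallelism of each stage to its workload. How HG-PIPE actually achieves this is described in the paper, not here.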


Taxonomy

Topics: Image Processing Techniques and Applications · Optical Systems and Laser Technology · CCD and CMOS Imaging Sensors

Methods: Attention Is All You Need · Byte Pair Encoding · Layer Normalization · Label Smoothing · Linear Layer · Softmax · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Multi-Head Attention · Dense Connections