CAT: Customized Transformer Accelerator Framework on Versal ACAP
Wenbo Zhang, Yiqi Liu, Zhenshan Bao

TL;DR
This paper introduces the CAT framework for customizing Transformer accelerators on Versal ACAP, achieving significant throughput and energy efficiency improvements over GPUs and FPGAs.
Contribution
It presents a novel framework for designing customizable Transformer accelerators on Versal ACAP, bridging hardware flexibility and model requirements.
Findings
Achieves up to 2.41x throughput gain over the NVIDIA A10G GPU.
Attains up to 7.80x energy efficiency improvement.
Demonstrates effective hardware-model co-optimization for Transformer acceleration.
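The summary does not say how the two ratios above are defined. A minimal sketch of the usual convention, assuming throughput is measured in operations per second and energy efficiency as throughput per watt, is shown here. The `Measurement` class, the function names, and all numeric values are hypothetical placeholders for illustration; none of them come from the paper.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    """One benchmark run on a device (hypothetical structure, not from the paper)."""
    throughput_gops: float  # sustained throughput in giga-operations per second
    power_watts: float      # average power drawn during the run

    @property
    def efficiency(self) -> float:
        """Energy efficiency in GOPS per watt."""
        return self.throughput_gops / self.power_watts


def throughput_gain(accel: Measurement, baseline: Measurement) -> float:
    """Ratio of accelerator throughput to baseline throughput."""
    return accel.throughput_gops / baseline.throughput_gops


def efficiency_gain(accel: Measurement, baseline: Measurement) -> float:
    """Ratio of accelerator GOPS/W to baseline GOPS/W."""
    return accel.efficiency / baseline.efficiency


# Placeholder numbers chosen only to make the arithmetic easy to follow;
# they are NOT measurements reported for CAT or for any GPU.
baseline = Measurement(throughput_gops=1000.0, power_watts=200.0)
accel = Measurement(throughput_gops=2000.0, power_watts=100.0)

print(throughput_gain(accel, baseline))
print(efficiency_gain(accel, baseline))
```

Under this convention the two ratios are independent: an accelerator can have a modest throughput gain and still a large efficiency gain if it draws much less power than the baseline.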
Abstract
Transformers were originally designed with the GPU as the target platform, but a GPU allows only limited hardware customization. An FPGA offers strong customization, but its design space is huge and designing for it is difficult. Versal ACAP is a heterogeneous computing architecture with the AI Engine at its core. It is far more flexible than a GPU in hardware customization and has a better and smaller design space than a traditional FPGA. This paper therefore proposes the Customized Transformer Accelerator Framework (CAT), from which a family of customized Transformer accelerators can be derived on Versal ACAP. The CAT framework is built on an abstract accelerator architecture that deconstructs the Transformer, maps it efficiently onto the hardware, and exposes a variety of customizable properties. Through the customization and optimization strategy of the CAT framework,…
Taxonomy
Topics: Particle accelerators and beam dynamics · Non-Destructive Testing Techniques · Magnetic Properties and Applications
