The Feasibility of Implementing Large-Scale Transformers on Multi-FPGA Platforms

Yu Gao; Juan Camilo Vega; Paul Chow

arXiv:2404.16158·cs.AR·April 26, 2024

Open Access

TL;DR

This paper investigates whether multiple FPGAs can be used to implement large transformer models. It develops a scalable multi-FPGA platform and supporting tools, and demonstrates feasibility with a multi-FPGA I-BERT prototype.

Contribution

It introduces a scalable multi-FPGA platform and tools for large ML applications, and validates the approach with a multi-FPGA I-BERT implementation.

Findings

01 Multi-FPGA implementation of I-BERT is feasible.

02 FPGAs can be competitive with GPUs for large ML models.

03 The platform shows promising performance potential.

Abstract

FPGAs are rarely mentioned when discussing the implementation of large machine learning applications, such as Large Language Models (LLMs), in the data center. There has been much evidence showing that single FPGAs can be competitive with GPUs in performance for some computations, especially for low latency, and often much more efficient when power is considered. This suggests that there is merit to exploring the use of multiple FPGAs for large machine learning applications. The challenge with using multiple FPGAs is that there is no commonly-accepted flow for developing and deploying multi-FPGA applications, i.e., there are no tools to describe a large application, map it to multiple FPGAs and then deploy the application on a multi-FPGA platform. In this paper, we explore the feasibility of implementing large transformers using multiple FPGAs by developing a scalable multi-FPGA…

Taxonomy

Topics: Power Transformer Diagnostics and Insulation · Neural Networks and Applications · Electromagnetic Compatibility and Noise Suppression

Methods: I-BERT