The BRAM is the Limit: Shattering Myths, Shaping Standards, and Building Scalable PIM Accelerators

MD Arafat Kabir; Tendayi Kamucheka; Nathaniel Fredricks; Joel Mandebi; Jason Bakos; Miaoqing Huang; David Andrews

arXiv:2410.07546·cs.AR·October 11, 2024

TL;DR

This paper challenges existing beliefs about FPGA-based PIM accelerators by introducing IMAGine, a design that achieves maximum BRAM clock frequency and scalability, setting new performance standards for GEMV operations.

Contribution

The paper defines a Gold Standard for PIM FPGA designs and demonstrates IMAGine as a practical implementation that surpasses prior PIM accelerators in speed and scalability.

Findings

01

IMAGine clocks at maximum BRAM frequency and scales to 100% of BRAMs.

02

IMAGine achieves a 2.65x to 3.2x faster clock than existing PIM GEMV engines.

03

IMAGine outperforms TPU v1-v2 and Alibaba Hanguang 800 in clock speed.

Abstract

Many recent FPGA-based Processor-in-Memory (PIM) architectures have appeared with promises of impressive levels of parallelism but with performance that falls short of expectations due to reduced maximum clock frequencies, an inability to scale processing elements up to the maximum BRAM capacity, and minimal hardware support for large reduction operations. In this paper, we first establish what we believe should be a "Gold Standard" set of design objectives for PIM-based FPGA designs. This Gold Standard was established to serve as an absolute metric for comparing PIMs developed on different technology nodes and vendor families as well as an aspirational goal for designers. We then present IMAGine, an In-Memory Accelerated GEMV engine used as a case study to show the Gold Standard can be realized in practice. IMAGine serves as an existence proof that dispels several myths surrounding…
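
For readers unfamiliar with the terms, GEMV is a matrix-vector product, and the "large reduction operations" mentioned above are the sums that combine partial results computed in parallel. The following is a minimal software sketch of that idea, not IMAGine's implementation: the function names, the column-wise partitioning, and the `num_blocks` parameter are all hypothetical choices made for illustration.

```python
def gemv(matrix, vector):
    """Reference GEMV: y[i] = sum over j of matrix[i][j] * vector[j]."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def gemv_blocked(matrix, vector, num_blocks):
    """GEMV with columns split across hypothetical processing blocks.

    Each block computes partial dot products over its own column slice
    (the part that could run in parallel), and the partial sums are then
    accumulated per output row (the reduction step).
    """
    if num_blocks < 1:
        raise ValueError("num_blocks must be at least 1")
    n = len(vector)
    # Ceiling division so every column belongs to exactly one block;
    # max(1, ...) keeps the range step valid for an empty vector.
    width = max(1, -(-n // num_blocks))
    result = [0] * len(matrix)
    for start in range(0, n, width):
        stop = min(start + width, n)
        for i, row in enumerate(matrix):
            partial = sum(row[j] * vector[j] for j in range(start, stop))
            result[i] += partial  # reduction across blocks
    return result
```

Both functions return the same vector for any valid block count; the blocked version only makes the partial-sum and reduction structure explicit.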

Code & Models

Repositories

arafat-kabir/imagine