FSHMEM: Supporting Partitioned Global Address Space on FPGAs for Large-Scale Hardware Acceleration Infrastructure

Yashael Faith Arthanto; David Ojika; Joo-Young Kim

arXiv:2207.04625·cs.DC·July 12, 2022

Open Access

TL;DR

This paper introduces FSHMEM, a software/hardware framework that enables PGAS programming on FPGAs. It reaches a peak bandwidth of 3813 MB/s with sub-microsecond remote write/read latency, and nearly doubles matrix multiplication and convolution performance on FPGA-based accelerators.

Contribution

It presents the first FPGA implementation of the core PGAS functions defined by the GASNet specification, combining high performance with a programming interface that is compatible with legacy HPC software.

Findings

01. Peak bandwidth of 3813 MB/s, more than 95% of the theoretical maximum.

02. Remote write and read latencies of 0.35 µs and 0.59 µs, respectively.

03. Speedup of nearly 2x in matrix multiplication and convolution.
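
The bandwidth finding can be read as a bound on the link itself. The short Python sketch below works through that arithmetic using only the two figures quoted above; the function name `implied_max_upper_bound` is made up for illustration, and the page does not state the theoretical maximum directly, so the result is an upper bound rather than a reported number.

```python
# Figures quoted in the findings above.
PEAK_MB_S = 3813       # measured peak bandwidth, MB/s
MIN_FRACTION = 0.95    # "more than 95% of the theoretical maximum"


def implied_max_upper_bound(peak: float, fraction: float) -> float:
    """Largest theoretical maximum consistent with peak > fraction * maximum."""
    return peak / fraction


bound = implied_max_upper_bound(PEAK_MB_S, MIN_FRACTION)
print(f"theoretical maximum is below about {bound:.0f} MB/s")
# prints: theoretical maximum is below about 4014 MB/s
```

In other words, if 3813 MB/s is more than 95% of the maximum, the maximum must be below roughly 4014 MB/s.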

Abstract

By providing highly efficient one-sided communication over a globally shared memory space, Partitioned Global Address Space (PGAS) has become one of the most promising parallel computing models in high-performance computing (HPC). Meanwhile, FPGAs are gaining attention as an alternative compute platform for HPC systems, offering custom computing and design flexibility. However, unlike the traditional message passing interface, PGAS has not yet been explored on FPGAs. This paper proposes FSHMEM, a software/hardware framework that enables the PGAS programming model on FPGAs. We implement the core functions of the GASNet specification on FPGA for native PGAS integration in hardware, while its programming interface is designed to be highly compatible with legacy software. Our experiments show that FSHMEM achieves a peak bandwidth of 3813 MB/s, which is more than 95% of…
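
To make "one-sided communication over a globally shared memory space" concrete, here is a minimal Python sketch of the idea. It is a toy software model only: the class `ToyPGAS` and its `put`/`get` methods are hypothetical names chosen for illustration and are not the FSHMEM or GASNet API, and nothing here reflects how the hardware implements these operations.

```python
class ToyPGAS:
    """Toy model of a partitioned global address space (hypothetical, not FSHMEM)."""

    def __init__(self, num_nodes: int, segment_size: int) -> None:
        # Each node owns one fixed-size partition of the global space.
        self.segment_size = segment_size
        self.segments = [bytearray(segment_size) for _ in range(num_nodes)]

    def _check(self, node: int, offset: int, length: int) -> None:
        # Reject accesses that name a missing node or leave its partition.
        if not 0 <= node < len(self.segments):
            raise IndexError(f"no such node: {node}")
        if offset < 0 or length < 0 or offset + length > self.segment_size:
            raise IndexError("access falls outside the target segment")

    def put(self, node: int, offset: int, data: bytes) -> None:
        """One-sided remote write: only the initiator makes a call."""
        self._check(node, offset, len(data))
        self.segments[node][offset:offset + len(data)] = data

    def get(self, node: int, offset: int, length: int) -> bytes:
        """One-sided remote read: the target runs no matching send."""
        self._check(node, offset, length)
        return bytes(self.segments[node][offset:offset + length])


space = ToyPGAS(num_nodes=2, segment_size=16)
space.put(node=1, offset=4, data=b"fpga")      # write into node 1's partition
print(space.get(node=1, offset=4, length=4))   # b'fpga'
```

The point of the sketch is the calling pattern: the initiator names a target node and an offset, and the target never posts a matching receive, which is what distinguishes this model from two-sided message passing.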

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Interconnection Networks and Systems · Distributed and Parallel Computing Systems