Exploring the acceleration of Nekbone on reconfigurable architectures

Nick Brown

arXiv:2011.04981·cs.DC·November 11, 2020

TL;DR

This paper shows that redesigning the AX kernel of the Nekbone HPC mini-app for FPGAs using high-level synthesis yields an implementation that outperforms a CPU by 4x and reaches nearly 75% of GPU performance, with significant power-efficiency gains, highlighting the potential of reconfigurable architectures.

Contribution

It presents a novel FPGA implementation of Nekbone's AX kernel with optimization strategies, achieving high performance and power efficiency, and compares it against CPU and GPU benchmarks.

Findings

01. FPGA outperforms CPU by 4x in performance
02. FPGA achieves nearly 75% of GPU performance
03. Significant power efficiency gains on FPGA

Abstract

Hardware technological advances are struggling to match scientific ambition, and a key question is how we can use the transistors that we already have more effectively. This is especially true for HPC, where the tendency is often to throw computation at a problem, whereas codes themselves are commonly bound, at least to some extent, by other factors. By redesigning an algorithm and moving from a von Neumann to a dataflow style, there is potentially more opportunity to address these bottlenecks on reconfigurable architectures than on more general-purpose architectures. In this paper we explore the porting of the AX kernel of Nekbone, a widely popular HPC mini-app, to FPGAs using High Level Synthesis via Vitis. Whilst computation is an important part of this code, it is also memory bound on CPUs, and a key question is whether one can ameliorate this by leveraging FPGAs. We first…
