Enabling OpenMP Task Parallelism on Multi-FPGAs
R. Nepomuceno, R. Sterle, G. Valarini, M. Pereira, H. Yviquel, G. Araujo

TL;DR
This paper extends the OpenMP model to enable task parallelism across multiple interconnected FPGAs, demonstrating near-linear speedups for stencil applications on a 6-FPGA platform.
Contribution
It extends the OpenMP task-based computation offloading model so that several FPGAs can work together as a single Multi-FPGA architecture, allowing one application to scale beyond the resources of a single board.
Findings
Near-linear speedups achieved with multiple FPGAs
Effective inter-FPGA communication via fiber-optic links
Scalability demonstrated on a 6-FPGA system
Abstract
FPGA-based hardware accelerators have received increasing attention mainly due to their ability to accelerate deep pipelined applications, thus resulting in higher computational performance and energy efficiency. Nevertheless, the amount of resources available on even the most powerful FPGA is still not enough to speed up very large modern workloads. To achieve that, FPGAs need to be interconnected in a Multi-FPGA architecture capable of accelerating a single application. However, programming such architecture is a challenging endeavor that still requires additional research. This paper extends the OpenMP task-based computation offloading model to enable a number of FPGAs to work together as a single Multi-FPGA architecture. Experimental results for a set of OpenMP stencil applications running on a Multi-FPGA platform consisting of 6 Xilinx VC709 boards interconnected through…
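To make the task-based offloading idea concrete, here is a minimal sketch in C using only standard OpenMP constructs: each stencil sweep is a `task` whose `depend` clauses chain it to the previous sweep, and the sweep body is offloaded with `target`. This is an illustration under stated assumptions, not the paper's implementation: the function name `run_stencil_pipeline`, the 3-point averaging stencil, and the ping-pong buffers are hypothetical, and the paper's OpenMP extension, its mapping of tasks to FPGAs, and its inter-FPGA communication are not reproduced here.

```c
#include <assert.h>

/*
 * Hypothetical example: run `steps` sweeps of a 1D 3-point averaging
 * stencil, ping-ponging between buffers `a` and `b` (each of length n).
 * Returns a pointer to whichever buffer holds the final result.
 */
float *run_stencil_pipeline(float *a, float *b, int n, int steps)
{
    assert(n >= 3 && steps >= 0);
    float *src = a;
    float *dst = b;

    #pragma omp parallel
    #pragma omp single
    {
        for (int s = 0; s < steps; ++s) {
            /* Per-iteration copies, captured by value when the task is created. */
            float *in = src;
            float *out = dst;

            /* The depend clauses order each sweep after the one that produced its input. */
            #pragma omp task firstprivate(in, out, n) \
                    depend(in: in[0:n]) depend(out: out[0:n])
            {
                /* Offload the sweep; falls back to the host if no device is available. */
                #pragma omp target map(to: in[0:n]) map(from: out[0:n])
                {
                    out[0] = in[0];                      /* boundaries are copied */
                    for (int i = 1; i < n - 1; ++i)
                        out[i] = (in[i - 1] + in[i] + in[i + 1]) / 3.0f;
                    out[n - 1] = in[n - 1];
                }
            }

            /* Swap roles: this sweep's output is the next sweep's input. */
            float *tmp = src;
            src = dst;
            dst = tmp;
        }
    } /* implicit barrier: all tasks have finished here */

    return src;
}
```

Compiled without OpenMP support the pragmas are ignored and the sweeps simply run in order, which gives the same result. With OpenMP enabled, the dependence graph is what a runtime could use to place successive sweeps on different accelerators, which is the kind of scheduling decision the paper's extension targets.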
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Interconnection Networks and Systems · Embedded Systems Design Techniques
