Exploring memory synchronization and performance considerations for FPGA platform using the high-abstracted OpenCL framework: Benchmarks development and analysis
Abedalmuhdi Almomany, Amin Jarrah, Muhammed Sutcu

TL;DR
This paper explores how memory access affects performance in FPGA computing using OpenCL benchmarks and proposes efficient implementation strategies.
Contribution
The study introduces eight OpenCL benchmarks for FPGAs and proposes a task-parallel model to reduce synchronization costs.
Findings
Memory access behaviors significantly impact performance in FPGA computing.
A task-parallel model effectively reduces the need for costly synchronization mechanisms.
Tailored implementations of primitives improve performance on FPGA platforms.
Abstract
A key benefit of the Open Computing Language (OpenCL) software framework is its capability to operate across diverse architectures. Field programmable gate arrays (FPGAs) are a high-speed computing architecture used for computation acceleration. This study investigates the impact of memory access time on overall performance in general FPGA computing environments through the creation of eight benchmarks within the OpenCL framework. The developed benchmarks capture a range of memory access behaviors, and they play a crucial role in assessing the performance of spinning and sleeping on FPGA-based architectures. The results obtained guide the formulation of new implementations and contribute to defining an abstraction of FPGAs. This abstraction is then utilized to create tailored implementations of primitives that are well-suited for this platform. While other research endeavors concentrate…
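The abstract contrasts spinning and sleeping as ways of waiting on shared data. As a rough host-side illustration only, the sketch below shows the two styles in Python threads: a consumer that busy-polls a shared flag (spinning) and one that blocks on an event until it is signalled (sleeping). This is not the paper's benchmark code and not OpenCL; the names `wait_spinning`, `wait_sleeping`, and `run_demo`, the 10 ms producer delay, and the payload value are all assumptions made for the example.

```python
import threading
import time


def wait_spinning(flag, payload):
    """Spinning: poll the shared flag in a tight loop until it becomes True."""
    polls = 0
    while not flag[0]:
        polls += 1  # each iteration is a wasted read of shared memory
    return payload[0], polls


def wait_sleeping(event, payload):
    """Sleeping: block until the producer signals, using no CPU while waiting."""
    event.wait()
    return payload[0]


def run_demo(delay_s=0.01):
    """Run one producer and wait for its result with both strategies."""
    payload = [None]           # shared data written by the producer
    flag = [False]             # shared flag polled by the spinning waiter
    event = threading.Event()  # signal used by the sleeping waiter

    def producer():
        time.sleep(delay_s)    # hypothetical stand-in for work before publishing
        payload[0] = 42        # write the data first ...
        flag[0] = True         # ... then publish it to the spinning waiter
        event.set()            # ... and wake the sleeping waiter

    worker = threading.Thread(target=producer)
    worker.start()
    spin_value, polls = wait_spinning(flag, payload)
    sleep_value = wait_sleeping(event, payload)
    worker.join()
    return spin_value, sleep_value, polls
```

Calling `run_demo()` returns the value seen by each waiter plus the number of polls the spinning waiter made. The poll count is a crude proxy for the repeated memory accesses that spinning generates, which is the kind of cost the benchmarks described above are meant to expose on an FPGA.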
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Distributed and Parallel Computing Systems
