Accelerating Fortran Codes: A Method for Integrating Coarray Fortran with CUDA Fortran and OpenMP
James McKevitt, Eduard I. Vorobyov, Igor Kulikov

TL;DR
This paper presents a novel method that combines Coarray Fortran, CUDA Fortran, and OpenMP to enhance parallelism and GPU acceleration in Fortran codes, simplifying the transition to high-performance computing.
Contribution
The paper introduces an integrated parallel programming approach that fuses Coarray Fortran (CAF), CUDA Fortran, and OpenMP, improving performance and ease of use for Fortran HPC applications.
Findings
CAF achieves speeds similar to MPI while being easier to implement.
The method eases the transition of legacy codes to parallel computing.
Application to a Poisson solver demonstrates scalability.
Abstract
Fortran's prominence in scientific computing requires strategies to ensure both that legacy codes are efficient on high-performance computing systems, and that the language remains attractive for the development of new high-performance codes. Coarray Fortran (CAF), part of the Fortran 2008 standard introduced for parallel programming, facilitates distributed memory parallelism with a syntax familiar to Fortran programmers, simplifying the transition from single-processor to multi-processor coding. This research focuses on innovating and refining a parallel programming methodology that fuses the strengths of Intel Coarray Fortran, Nvidia CUDA Fortran, and OpenMP for distributed memory parallelism, high-speed GPU acceleration and shared memory parallelism respectively. We consider the management of pageable and pinned memory, CPU-GPU affinity in NUMA multiprocessors, and robust compiler…
