Optimizing the Weather Research and Forecasting Model with OpenMP Offload and Codee

Chayanon (Namo) Wichitrnithed; Woo-Sun Yang; Yun (Helen) He; Brad Richardson; Koichi Sakaguchi; Manuel Arenaz; William I. Gustafson Jr.; Jacob Shpund; Ulises Costi Blanco; Alvaro Goldar Dieste

arXiv:2409.07232·cs.DC·September 12, 2024

Open Access

TL;DR

This paper ports parts of the Fast Spectral Bin Microphysics (FSBM) scheme in the Weather Research and Forecasting model (WRF) to NVIDIA GPUs using OpenMP device offloading, targeting the Perlmutter supercomputer at NERSC, and reports a 2.08x overall speedup on the CONUS-12km thunderstorm test case.

Contribution

It explores a workflow that combines runtime profilers with the static code inspection tool Codee to guide the refactoring and GPU porting of the computationally expensive FSBM microphysics routines in WRF.

Findings

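01 2.08x overall speedup on the CONUS-12km thunderstorm test case

02 OpenMP device offloading directives port parts of the FSBM routines in WRF to NVIDIA GPUs

03 A workflow combining runtime profilers with Codee static inspection guides the refactoring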

Abstract

Currently, the Weather Research and Forecasting model (WRF) uses shared-memory (OpenMP) and distributed-memory (MPI) parallelism. To take advantage of GPU resources on the Perlmutter supercomputer at NERSC, we port parts of the computationally expensive routines of the Fast Spectral Bin Microphysics (FSBM) microphysical scheme to NVIDIA GPUs using OpenMP device offloading directives. To facilitate this process, we explore an optimization workflow that uses both runtime profilers and the static code inspection tool Codee to refactor the subroutine. We observe a 2.08x overall speedup for the CONUS-12km thunderstorm test case.
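
For readers unfamiliar with OpenMP device offloading, the following is a minimal sketch of the general technique, not code from the paper. WRF and FSBM are written in Fortran; C is used here purely for illustration, and the function `update_cells`, its parameters, and the relaxation formula are hypothetical stand-ins for an independent per-grid-cell update. The `target` directive requests execution on a device, and the `map` clauses describe which arrays are copied to and from it.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical per-cell update (not taken from WRF or FSBM):
 * nudges each field value toward a target value by a fraction `rate`.
 * Each iteration is independent, which makes the loop a candidate
 * for offloading.
 */
void update_cells(double *field, const double *target, int n, double rate)
{
    assert(field != NULL && target != NULL && n >= 0);

    /*
     * Request execution on the default device. `field` is copied to the
     * device and back; `target` is only copied to the device.
     * If the code is built without OpenMP support, the pragma is ignored
     * and the loop runs serially on the host. With OpenMP enabled but no
     * device available, implementations typically fall back to the host.
     */
    #pragma omp target teams distribute parallel for \
        map(tofrom: field[0:n]) map(to: target[0:n])
    for (int i = 0; i < n; ++i) {
        field[i] = field[i] + rate * (target[i] - field[i]);
    }
}
```

A caller would pass ordinary host arrays, for example `update_cells(field, target, n, 0.5)`. Whether such a loop actually speeds up on a GPU depends on data-transfer cost and on how much work each iteration does, which is the kind of question a profiling-driven workflow is meant to answer.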

Taxonomy

Topics: Advanced Computational Techniques and Applications