Performance portable ice-sheet modeling with MALI
Jerry Watkins, Max Carlson, Kyle Shan, Irina Tezaur, Mauro Perego, Luca Bertagna, Carolyn Kao, Matthew J. Hoffman, Stephen F. Price

TL;DR
This paper evaluates the performance portability of the MALI ice-sheet modeling code across CPU and GPU supercomputers, demonstrating significant speedups and scalability improvements, and introduces an automated performance testing framework.
Contribution
It presents an analysis of MALI's performance portability features using high-level abstractions and introduces a framework for automated performance testing and optimization.
Findings
Speedups of 1.26-1.82x in key components across architectures
GPU simulations are 1.24-1.92x faster in weak scaling
Finite element assembly achieves up to 8.65x speedup with GPUs
Abstract
High resolution simulations of polar ice-sheets play a crucial role in the ongoing effort to develop more accurate and reliable Earth-system models for probabilistic sea-level projections. These simulations often require a massive amount of memory and computation from large supercomputing clusters to provide sufficient accuracy and resolution. The latest exascale machines poised to come online contain a diverse set of computing architectures. In an effort to avoid architecture specific programming and maintain productivity across platforms, the ice-sheet modeling code known as MALI uses high level abstractions to integrate Trilinos libraries and the Kokkos programming model for performance portable code across a variety of different architectures. In this paper, we analyze the performance portable features of MALI via a performance analysis on current CPU-based and GPU-based…
Taxonomy
Topics: Distributed and Parallel Computing Systems · Cryospheric studies and observations · Parallel Computing and Optimization Techniques
