# Projections of achievable performance for Weather & Climate Dwarfs, and for entire NWP applications, on hybrid architectures

**Authors:** Michał Kulczewski, Marek Błażewicz, Sebastian Ciesielski

arXiv: 1908.06098 · 2019-08-20

## TL;DR

This paper presents performance and energy models for Weather & Climate computational patterns, enabling projections of their achievable performance on hybrid CPU-GPU architectures at large scales.

## Contribution

The deliverable extends the performance models from ESCAPE Deliverable 3.2 to multi-node, GPU-accelerated architectures, enabling performance and energy predictions for Weather & Climate dwarfs at system scale.

## Key findings

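- The extended models predict computational performance on single- and multi-node systems equipped with CPUs and GPU accelerators
- The same models estimate energy consumption, which supports energy-aware planning
- Incorporated in the DCworms simulator, the models support performance projections at system scale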

## Abstract

This document is one of the deliverable reports created for the ESCAPE project. ESCAPE stands for Energy-efficient Scalable Algorithms for Weather Prediction at Exascale. The project develops world-class, extreme-scale computing capabilities for European operational numerical weather prediction and future climate models. This is done by identifying Weather & Climate dwarfs, which are key patterns in terms of computation and communication (in the spirit of the Berkeley dwarfs). These dwarfs are then optimised for different hardware architectures (single and multi-node), and alternative algorithms are explored. Performance portability is addressed through the use of domain-specific languages.

This deliverable describes the performance and energy models for the selected Weather & Climate dwarfs on different hardware architectures, in particular multi-node systems with GPU accelerators. The presented performance models extend the model provided in Deliverable 3.2. With some further enhancements, they are incorporated in the DCworms simulator. In particular, the extended models make it possible to predict computational and energy performance on different architectures: single- and multi-node systems equipped with CPUs and GPU accelerators. This in turn makes it possible to provide feasible performance projections at system scale.

---
Source: https://tomesphere.com/paper/1908.06098