Exploiting long vectors with a CFD code: a co-design show case

Marc Blancafort; Roger Ferrer; Guillaume Houzeaux; Marta Garcia-Gasulla; Filippo Mantovani

arXiv:2411.00815 · cs.DC · November 5, 2024

TL;DR

This paper presents an iterative methodology to optimize vectorization in CFD codes using autovectorization, achieving significant speedups on RISC-V hardware while maintaining portability across architectures.

Contribution

It introduces a detailed, iterative approach to enhance autovectorization efficiency in CFD applications, demonstrating substantial performance gains on RISC-V and portability to other architectures.

Findings

01. Single-core speedup of 7.6× on RISC-V

02. Methodology improves autovectorization efficiency

03. Performance benefits maintained across architectures

Abstract

A current trend in HPC systems is the use of architectures with SIMD or vector extensions to exploit data parallelism. There are several ways to take advantage of such modern vector architectures, each with a different impact on the code and its portability: intrinsics, guided vectorization via pragmas, and compiler autovectorization. Our objectives are to maximize vectorization efficiency and minimize code specialization. To achieve these objectives, we rely on compiler autovectorization. We leverage a set of hardware and software tools that allow us to analyze in detail where autovectorization is suboptimal. We then apply an iterative methodology that incrementally improves how efficiently the underlying hardware is used. In this paper, we apply this methodology to a CFD production code. We evaluate the performance on an innovative configurable…
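To make the difference between the approaches named in the abstract concrete, here is a minimal sketch in C. It is not taken from the paper or from the CFD code it studies: the function names `axpy_auto`, `axpy_guided`, and `axpy_self_check` are hypothetical, and the kernel is a generic AXPY loop chosen only for illustration. The first function relies purely on compiler autovectorization, the second uses pragma guidance. An intrinsics version is omitted because it would tie the code to a single instruction set.

```c
#include <assert.h>
#include <stddef.h>

/* Autovectorization: a plain loop. The restrict qualifiers promise the
 * compiler that x and y do not overlap, which removes one common reason
 * for a compiler to refuse to vectorize. */
void axpy_auto(size_t n, double a,
               const double *restrict x, double *restrict y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Guided vectorization: the OpenMP simd pragma asks the compiler to
 * vectorize this loop. It takes effect only when OpenMP SIMD support is
 * enabled at compile time (for example -fopenmp-simd in GCC and Clang);
 * otherwise the pragma is ignored and the loop stays ordinary C. */
void axpy_guided(size_t n, double a, const double *x, double *y)
{
    #pragma omp simd
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Hypothetical helper: checks that both variants compute the same result.
 * Inputs are small integers, so every result is exactly representable. */
void axpy_self_check(void)
{
    enum { N = 1000 };
    double x[N], y1[N], y2[N];

    for (size_t i = 0; i < N; ++i) {
        x[i] = (double)i;
        y1[i] = 1.0;
        y2[i] = 1.0;
    }

    axpy_auto(N, 2.0, x, y1);
    axpy_guided(N, 2.0, x, y2);

    for (size_t i = 0; i < N; ++i)
        assert(y1[i] == y2[i]);
}
```

Both functions are portable C. Whether either loop is actually vectorized depends on the compiler, its flags, and the target, which is why vectorization reports and hardware counters are the usual way to confirm the outcome.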
