A64FX -- Your Compiler You Must Decide!

Jens Domke

arXiv:2107.07157·cs.DC·August 3, 2021

TL;DR

This paper evaluates the performance of different compiler suites on the A64FX CPU used in supercomputers, revealing significant performance gains when deviating from standard usage models.

Contribution

It provides a comparative analysis of compiler performance on A64FX and demonstrates potential optimizations for better HPC performance.

Findings

01. Significant performance improvements achieved through alternative compiler configurations.

02. Compiler choice greatly impacts HPC application efficiency on A64FX.

03. Deviating from recommended usage models can unlock higher performance.

Abstract

The current number one of the TOP500 list, Supercomputer Fugaku, has demonstrated that CPU-only HPC systems aren't dead and that CPUs can be used for more than just being the host controller for discrete accelerators. While the specifications of the chip, the overall system architecture, and the benchmarks submitted to various lists, such as TOP500 and Green500, clearly highlight the potential, the proliferation of Arm into the HPC business is rather recent, and hence the software stack might not yet be fully matured and tuned. We test three state-of-the-art compiler suites against a broad set of benchmarks. Our measurements show that orders of magnitude in performance can be gained by deviating from the recommended usage model of the A64FX compute nodes.

Code & Models

Repositories

https://gitlab.com/domke/a64fxCvC (official)
