Investigating Applications on the A64FX
Adrian Jackson, Michèle Weiland, Nick Brown, Andrew Turner, Mark Parsons

TL;DR
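This paper benchmarks the Fujitsu A64FX processor, which is designed for computational simulation and machine learning applications, against a range of production HPC platforms. The A64FX often significantly outperforms the other platforms, though not in every benchmark, and application configuration affects the observed runtime.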
Contribution
It provides a comprehensive benchmarking analysis of the A64FX processor across multiple applications and compares it with existing HPC platforms, highlighting its strengths and limitations.
Findings
A64FX often outperforms other HPC platforms in benchmarks
Performance varies depending on application configuration
High performance is often achieved without application optimisations specific to the processor's instruction set or hardware
Abstract
The A64FX processor from Fujitsu, being designed for computational simulation and machine learning applications, has the potential for unprecedented performance in HPC systems. In this paper, we evaluate the A64FX by benchmarking against a range of production HPC platforms that cover a number of processor technologies. We investigate the performance of complex scientific applications across multiple nodes, as well as single node and mini-kernel benchmarks. This paper finds that the performance of the A64FX processor across our chosen benchmarks often significantly exceeds other platforms, even without specific application optimisations for the processor instruction set or hardware. However, this is not true for all the benchmarks we have undertaken. Furthermore, the specific configuration of applications can have an impact on the runtime and performance experienced.
