Apple vs. Oranges: Evaluating the Apple Silicon M-Series SoCs for HPC Performance and Efficiency
Paul Hübner, Andong Hu, Ivy Peng, Stefano Markidis

TL;DR
This paper evaluates the Apple Silicon M-Series SoCs for high-performance computing, analyzing their architecture, benchmarking their computational and memory performance, and assessing their energy efficiency compared to traditional HPC solutions.
Contribution
It provides a comprehensive performance and efficiency analysis of M-Series chips, highlighting their potential as energy-efficient HPC alternatives and detailing architectural features.
Findings
Up to 100 GB/s memory bandwidth
Up to 2.9 FP32 TFLOPS on M4
Over 200 GFLOPS per Watt efficiency
Abstract
This paper investigates the architectural features and performance potential of the Apple Silicon M-Series SoCs (M1, M2, M3, and M4) for HPC. We provide a detailed review of the CPU and GPU designs, the unified memory architecture, and coprocessors such as Advanced Matrix Extensions (AMX). We design and develop benchmarks in the Metal Shading Language and Objective-C++ to assess FP32 computational and memory performance. We also measure power consumption and efficiency using Apple's powermetrics tool. Our results show that the M-Series chips offer up to 100 GB/s memory bandwidth and significant generational improvements in computational performance, with up to 2.9 FP32 TFLOPS on the M4. Power consumption varies from a few Watts to 10-20 Watts, and the GPU and accelerator on all four chips reach an efficiency of more than 200 GFLOPS per Watt. Despite limitations in FP64 support on the GPU,…
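The two headline metrics in the abstract, memory bandwidth and GFLOPS per Watt, are simple ratios of work done to time and power. The following is a minimal sketch of that arithmetic, not the authors' benchmark code. The function names and all input values are hypothetical, chosen only to land in the ranges the abstract reports.

```python
def gflops_per_watt(flop_count: float, seconds: float, avg_watts: float) -> float:
    """Sustained GFLOPS divided by average power draw in Watts."""
    gflops = flop_count / seconds / 1e9
    return gflops / avg_watts


def bandwidth_gb_per_s(bytes_moved: float, seconds: float) -> float:
    """Achieved memory bandwidth in GB/s, using 1 GB = 1e9 bytes."""
    return bytes_moved / seconds / 1e9


# Hypothetical compute run (not a measurement from the paper):
# 2.9e12 FP32 operations completed in 1 s at an average draw of 14.5 W.
efficiency = gflops_per_watt(2.9e12, 1.0, 14.5)

# Hypothetical copy kernel (not a measurement from the paper):
# reads and writes 1e9 FP32 values (4 bytes each) in 0.08 s.
bandwidth = bandwidth_gb_per_s(2 * 1e9 * 4, 0.08)

print(f"{efficiency:.1f} GFLOPS/W, {bandwidth:.1f} GB/s")
```

With these assumed inputs, a chip sustaining 2.9 TFLOPS at 14.5 W sits at 200 GFLOPS per Watt, which is consistent with the 10-20 Watt range quoted in the abstract. In practice the average power figure would come from a tool such as powermetrics, sampled over the duration of the kernel.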
Taxonomy
Topics: Advanced Data Storage Technologies · 3D IC and TSV technologies · VLSI and Analog Circuit Testing
