The AMD Rome Memory Barrier
Phillip Allen Lane, Jessica Lobrano

TL;DR
This paper analyzes the AMD Rome CPU architecture and shows that application performance degrades once memory bandwidth passes a workload-dependent threshold, with increased memory load correlating with higher CPI and longer benchmark run times.
Contribution
It offers new empirical data on AMD Rome's performance bottlenecks related to memory bandwidth and correlates these with application performance metrics.
Findings
Performance suffers once an application's memory bandwidth exceeds 37.5 GiB/s for integer-heavy workloads.
Performance suffers once an application's memory bandwidth exceeds 100 GiB/s for floating-point-heavy workloads.
Memory bandwidth correlates strongly and positively with CPI, and increased memory load correlates strongly and positively with time-to-completion on SPEC CPU2017 benchmarks.
Abstract
With the rapid growth of AMD as a competitor in the CPU industry, it is imperative that high-performance and architectural engineers analyze new AMD CPUs. By understanding new and unfamiliar architectures, engineers are able to adapt their algorithms to fully utilize new hardware. Furthermore, engineers are able to anticipate the limitations of an architecture and determine when an alternate platform is desirable for a particular workload. This paper presents results which show that the AMD "Rome" architecture performance suffers once an application's memory bandwidth exceeds 37.5 GiB/s for integer-heavy applications, or 100 GiB/s for floating-point-heavy workloads. Strong positive correlations between memory bandwidth and CPI are presented, as well as strong positive correlations between increased memory load and time-to-completion of benchmarks from the SPEC CPU2017 benchmark suites.
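The quantities named in the abstract can be made concrete with a small sketch. The snippet below is an illustration only, not the authors' measurement code: it assumes CPI is computed as cycles divided by retired instructions, that bandwidth is bytes transferred per second expressed in GiB/s, and that "correlation" means the Pearson coefficient. All function names are hypothetical, and only the two thresholds come from the abstract.

```python
from math import sqrt

GIB = 2 ** 30  # bytes per GiB

# Thresholds reported in the abstract, in GiB/s, keyed by workload type.
THRESHOLD_GIBS = {"int": 37.5, "fp": 100.0}


def cpi(cycles, instructions):
    """Cycles per instruction from raw counter values (assumed definition)."""
    return cycles / instructions


def bandwidth_gibs(bytes_transferred, seconds):
    """Average memory bandwidth in GiB/s over a measurement interval."""
    return bytes_transferred / GIB / seconds


def exceeds_threshold(bw_gibs, workload):
    """True if the bandwidth is above the abstract's threshold for the workload."""
    return bw_gibs > THRESHOLD_GIBS[workload]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)
```

Given per-run bandwidth and CPI samples collected from hardware counters, `pearson(bandwidths, cpis)` would yield the kind of coefficient the abstract describes, and `exceeds_threshold(bw, "int")` would flag runs above the 37.5 GiB/s mark.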
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems · Interconnection Networks and Systems
