Evaluating the Cost of Atomic Operations on Modern Architectures
Hermann Schweizer, Maciej Besta, Torsten Hoefler

TL;DR
This paper systematically evaluates the latency and bandwidth of atomic operations on several modern x86 architectures (Intel Haswell, Xeon Phi, Ivy Bridge, and AMD Bulldozer), revealing surprising performance characteristics and offering insights for optimizing parallel programs.
Contribution
It introduces a new evaluation methodology, develops a performance model, and provides detailed benchmarks for atomics on multiple architectures, highlighting key performance insights.
Findings
All tested atomics have similar latency and bandwidth despite different consensus numbers.
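The hardware implementation of atomics prevents instruction-level parallelism, even when the issued operations have no dependencies on one another.
Architectural properties, such as the coherence state of the accessed cache line, significantly influence the performance of atomic operations.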
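For readers unfamiliar with the two operations named in the findings, the following is a minimal C++ sketch of how Fetch-and-Add and Compare-and-Swap differ when used for the same task (incrementing a shared counter). It is an illustration only, not code from the paper; the function names faa_increment, cas_increment, and check_both_increments are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Fetch-and-Add: a single unconditional read-modify-write.
// Returns the value the counter held before the addition.
std::uint64_t faa_increment(std::atomic<std::uint64_t>& counter) {
    return counter.fetch_add(1, std::memory_order_seq_cst);
}

// Compare-and-Swap: the write happens only if the counter still holds
// `expected`, so an increment needs a retry loop.
// Returns the value the counter held before the successful swap.
std::uint64_t cas_increment(std::atomic<std::uint64_t>& counter) {
    std::uint64_t expected = counter.load(std::memory_order_relaxed);
    while (!counter.compare_exchange_weak(expected, expected + 1,
                                          std::memory_order_seq_cst,
                                          std::memory_order_relaxed)) {
        // On failure, `expected` has been refreshed with the current value.
    }
    return expected;
}

// Usage sketch: both helpers advance the counter by exactly one.
void check_both_increments() {
    std::atomic<std::uint64_t> counter{0};
    assert(faa_increment(counter) == 0);
    assert(cas_increment(counter) == 1);
    assert(counter.load() == 2);
}
```

CAS has a higher consensus number than FAA, which is why the first finding (comparable latency and bandwidth) is described as surprising.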
Abstract
Atomic operations (atomics) such as Compare-and-Swap (CAS) or Fetch-and-Add (FAA) are ubiquitous in parallel programming. Yet, performance tradeoffs between these operations and various characteristics of such systems, such as the structure of caches, are unclear and have not been thoroughly analyzed. In this paper we establish an evaluation methodology, develop a performance model, and present a set of detailed benchmarks for latency and bandwidth of different atomics. We consider various state-of-the-art x86 architectures: Intel Haswell, Xeon Phi, Ivy Bridge, and AMD Bulldozer. The results unveil surprising performance relationships between the considered atomics and architectural properties such as the coherence state of the accessed cache lines. One key finding is that all the tested atomics have comparable latency and bandwidth even if they are characterized by different consensus…
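To give a rough idea of what a latency measurement for an atomic can look like, here is a small single-threaded C++ sketch that times back-to-back Fetch-and-Add operations. It is not the authors' methodology: it does not control the coherence state or cache level of the accessed line, which the abstract identifies as an important factor, and the function name mean_faa_latency_ns is hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>

// Returns the mean time per fetch_add in nanoseconds over `iterations`
// operations issued by the calling thread on a single counter.
double mean_faa_latency_ns(std::atomic<std::uint64_t>& counter,
                           std::size_t iterations) {
    assert(iterations > 0);
    using clock = std::chrono::steady_clock;

    const auto start = clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        counter.fetch_add(1, std::memory_order_seq_cst);
    }
    const auto stop = clock::now();

    // Convert the elapsed time to nanoseconds as a floating-point value.
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

Because the counter stays in the calling core's cache, this sketch only approximates the best case; a fuller benchmark would also vary where the cache line resides and which coherence state it is in before each measurement.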
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Distributed systems and fault tolerance · Radiation Effects in Electronics
