Efficient Synchronization Primitives for GPUs
Jeff A. Stuart, John D. Owens

TL;DR
This paper develops and analyzes new synchronization primitives for GPUs, optimizing performance by reducing atomic operations and providing guidelines for their use based on GPU memory behavior.
Contribution
It introduces GPU-specific synchronization primitives, benchmarks their performance, and proposes an abstraction model to guide implementation choices and predict future GPU performance.
Findings
Limiting atomic accesses improves synchronization performance.
The effectiveness of a waiting strategy (spinning vs. sleeping) varies with the GPU's memory-system behavior.
The proposed primitives outperform existing algorithms on Tesla and Fermi GPUs.
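The first finding, limiting atomic accesses, can be illustrated with a minimal host-side C++ sketch. This is not the paper's CUDA code: it is a standard ticket mutex, used here as an analog under the assumption that the principle carries over to generic many-core processors, as the abstract suggests. The names `TicketMutex` and `run_counter_demo` are hypothetical and chosen only for this illustration. Each `lock()` performs exactly one atomic read-modify-write; all waiting is done with plain loads.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical ticket mutex (illustration only, not the paper's implementation).
// Acquire path: one atomic read-modify-write, then load-only waiting.
struct TicketMutex {
    std::atomic<unsigned> next_ticket{0};  // incremented once per lock()
    std::atomic<unsigned> now_serving{0};  // advanced once per unlock()

    void lock() {
        // The single atomic read-modify-write of the acquire path.
        unsigned my_ticket = next_ticket.fetch_add(1, std::memory_order_relaxed);
        // Wait using loads only; no further read-modify-write traffic while blocked.
        while (now_serving.load(std::memory_order_acquire) != my_ticket) {
            std::this_thread::yield();
        }
    }

    void unlock() {
        // Only the current holder writes now_serving, so a load plus a store suffices.
        unsigned cur = now_serving.load(std::memory_order_relaxed);
        now_serving.store(cur + 1, std::memory_order_release);
    }
};

// Runs `threads` workers that each increment a shared counter `iters` times
// under the mutex, and returns the final counter value.
long run_counter_demo(int threads, int iters) {
    TicketMutex m;
    long counter = 0;  // protected by m
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                m.lock();
                ++counter;
                m.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

If mutual exclusion holds, `run_counter_demo(4, 1000)` returns 4000. A spin lock that retries an atomic exchange in its wait loop would give the same result but issue an unbounded number of atomic operations per acquire, which is the cost the finding warns against.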
Abstract
In this paper, we revisit the design of synchronization primitives (specifically barriers, mutexes, and semaphores) and how they apply to the GPU. Previous implementations are insufficient due to the discrepancies in hardware and programming model of the GPU and CPU. We create new implementations in CUDA and analyze the performance of spinning on the GPU, as well as a method of sleeping on the GPU, by running a set of memory-system benchmarks on two of the most common GPUs in use, the Tesla- and Fermi-class GPUs from NVIDIA. From our results we define higher-level principles that are valid for generic many-core processors, the most important of which is to limit the number of atomic accesses required for a synchronization operation because atomic accesses are slower than regular memory accesses. We use the results of the benchmarks to critique existing synchronization algorithms and…
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Interconnection Networks and Systems · Advanced Data Storage Technologies
