Specifying and Testing GPU Workgroup Progress Models

Tyler Sorensen; Lucas F. Salvador; Harmit Raval; Hugues Evrard; John; Wickerson; Margaret Martonosi; and Alastair F. Donaldson

arXiv:2109.06132·cs.PL·September 14, 2021

Specifying and Testing GPU Workgroup Progress Models

Tyler Sorensen, Lucas F. Salvador, Harmit Raval, Hugues Evrard, John, Wickerson, Margaret Martonosi, and Alastair F. Donaldson

PDF

TL;DR

This paper develops formal tools and an extensive testing framework to analyze and compare GPU workgroup progress models, revealing significant differences in termination guarantees across various GPU vendors.

Contribution

It introduces a formal language, progress specification, and termination oracle to evaluate GPU progress models through a large set of litmus tests across multiple vendors.

Findings

01

Apple and ARM GPUs do not support the linear occupancy-bound model.

02

Significant variation in termination behavior across GPU vendors.

03

Most GPUs do not conform to previously hypothesized progress models.

Abstract

As GPU availability has increased and programming support has matured, a wider variety of applications are being ported to these platforms. Many parallel applications contain fine-grained synchronization idioms; as such, their correct execution depends on a degree of relative forward progress between threads (or thread groups). Unfortunately, many GPU programming specifications say almost nothing about relative forward progress guarantees between workgroups. Although prior work has proposed a spectrum of plausible progress models for GPUs, cross-vendor specifications have yet to commit to any model. This work is a collection of tools experimental data to aid specification designers when considering forward progress guarantees in programming frameworks. As a foundation, we formalize a small parallel programming language that captures the essence of fine-grained synchronization. We then…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.