GTaP: A GPU-Resident Fork-Join Task-Parallel Runtime with a Pragma-Based Interface

Yuki Maeda; Kenjiro Taura

arXiv:2604.05982 · cs.DC · April 8, 2026


TL;DR

GTaP is a GPU-resident runtime that supports fork-join task parallelism through a pragma-based interface. It lets irregular applications run efficiently on GPUs and outperforms CPU-based OpenMP in many cases.

Contribution

The paper presents GTaP, a GPU runtime with a pragma-based programming model that supports fork-join parallelism and load-balancing techniques such as work stealing and EPAQ.

Findings

1. GTaP outperforms OpenMP on a 72-core CPU for many irregular workloads.

2. GTaP's design choices outperform naive GPU alternatives.

3. EPAQ improves performance for certain benchmarks, achieving up to 1.8× speedup.

Abstract

Graphics Processing Units (GPUs) excel at regular data-parallel workloads where massive hardware parallelism can be readily exploited. In contrast, many important irregular applications are naturally expressed as task parallelism with a fork-join control structure. While CPU runtimes for fork-join task parallelism are mature, it remains challenging to efficiently support it on GPUs. We propose GTaP, a GPU-resident runtime that supports fork-join task parallelism. GTaP is based on the persistent kernel model, and supports two worker granularities: thread blocks and individual threads. To realize fork-join on GPUs, GTaP represents joins as continuations and executes each task as a state machine that can be split into multiple execution segments. We also extend Clang's frontend with a pragma-based programming model that enables programmers to express fork-join without exposing low-level…
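The idea of running each task as a state machine whose join is a continuation can be illustrated with a small host-side sketch. This is not GTaP's implementation: it is a single-threaded C++ model in which a plain loop stands in for a persistent worker, and the names `Task`, `run_fib`, `state`, and `pending` are hypothetical. The abstract does not say how tasks are stored or scheduled, so the task pool and ready queue below are assumptions made only for illustration.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Hypothetical task record (illustrative only, not GTaP's data layout).
struct Task {
    int n;                      // argument: compute fib(n)
    int state;                  // index of the next execution segment
    int pending;                // children that have not yet finished
    long long child_result[2];  // results delivered by the two children
    int parent;                 // index of the parent task, -1 for the root
    int slot;                   // which child_result slot of the parent to fill
};

long long run_fib(int n) {
    std::vector<Task> pool;  // task storage, addressed by index
    std::deque<int> ready;   // indices of tasks that can run a segment now
    long long root_result = 0;

    pool.push_back(Task{n, 0, 0, {0, 0}, -1, 0});
    ready.push_back(0);

    // Stand-in for a persistent worker: keep pulling tasks until none remain.
    while (!ready.empty()) {
        int id = ready.back();
        ready.pop_back();

        long long result = 0;
        bool finished = false;

        if (pool[id].state == 0) {
            // Segment 0: everything before the join.
            int k = pool[id].n;
            if (k < 2) {
                result = k;
                finished = true;
            } else {
                // Fork two children, then suspend. The task does not block:
                // it records that segment 1 is its continuation and returns.
                pool[id].state = 1;
                pool[id].pending = 2;
                for (int slot = 0; slot < 2; ++slot) {
                    pool.push_back(Task{k - 1 - slot, 0, 0, {0, 0}, id, slot});
                    ready.push_back(static_cast<int>(pool.size()) - 1);
                }
            }
        } else {
            // Segment 1: the continuation that runs after the join.
            result = pool[id].child_result[0] + pool[id].child_result[1];
            finished = true;
        }

        if (finished) {
            int parent = pool[id].parent;
            if (parent < 0) {
                root_result = result;
            } else {
                // Deliver the result; the last child to finish makes the
                // parent's continuation runnable again.
                pool[parent].child_result[pool[id].slot] = result;
                assert(pool[parent].pending > 0);
                if (--pool[parent].pending == 0) ready.push_back(parent);
            }
        }
    }
    return root_result;
}
```

Calling `run_fib(10)` returns 55. The point of the split into segments is that a task waiting at a join never occupies a worker: it simply leaves the ready queue and is re-enqueued by its last finishing child. A GPU runtime would additionally need atomic updates to the join counter and concurrent queues, which this sequential sketch omits.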
