TREES: A CPU/GPU Task-Parallel Runtime with Explicit Epoch Synchronization
Blake A. Hechtman, Andrew D. Hilton, and Daniel J. Sorin

TL;DR
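TREES is a task-parallel runtime system for CPU/GPU platforms. It introduces the "work-together" principle, which extends Cilk's work-first principle so that critical-path overhead is paid by the entire system at once and work overheads are paid cooperatively. The runtime is implemented in OpenCL and evaluated experimentally on a CPU/GPU platform.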
Contribution
The paper presents the TREES runtime system and the "work-together" principle, an extension of Cilk's work-first principle that addresses the specific strengths and weaknesses of GPUs.
Findings
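Work-first, which underlies task-parallel performance on multi-CPU platforms, is a poor fit for GPUs
The work-together principle extends work-first: critical-path overhead is paid by the entire system at once, and work overheads are paid cooperatively
TREES is implemented in OpenCL, and TREES applications are evaluated experimentally on a CPU/GPU platform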
Abstract
We have developed a task-parallel runtime system, called TREES, that is designed for high performance on CPU/GPU platforms. On platforms with multiple CPUs, Cilk's "work-first" principle underlies how task-parallel applications can achieve performance, but work-first is a poor fit for GPUs. We build upon work-first to create the "work-together" principle that addresses the specific strengths and weaknesses of GPUs. The work-together principle extends work-first by stating that (a) the overhead on the critical path should be paid by the entire system at once and (b) work overheads should be paid co-operatively. We have implemented the TREES runtime in OpenCL, and we experimentally evaluate TREES applications on a CPU/GPU platform.
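The two parts of the work-together principle can be illustrated with a toy model. The sketch below is written in Python for readability and is not the TREES implementation or its OpenCL code. It assumes a simple design in which all ready tasks run in one bulk epoch, child tasks are placed into the next epoch's task list using an exclusive prefix sum over spawn counts (overhead paid cooperatively), and there is a single synchronization point per epoch (critical-path overhead paid by the whole system at once). The names `run_epochs`, `expand`, and `fib_children` are hypothetical.

```python
from itertools import accumulate


def run_epochs(roots, expand):
    """Run tasks in bulk-synchronous epochs (toy model, sequential simulation).

    roots:  list of initial task payloads.
    expand: hypothetical function mapping a payload to a list of child payloads.
    Returns (number_of_epochs, total_tasks_executed).
    """
    frontier = list(roots)
    epochs = 0
    executed = 0
    while frontier:
        # Phase 1: every task in the current epoch runs "together".
        # On a GPU this would be one kernel launch over the whole task list.
        children = [expand(task) for task in frontier]
        executed += len(frontier)

        # Phase 2: cooperative allocation. An exclusive prefix sum over the
        # spawn counts gives each task a private slot range in the next
        # epoch's task list, so no task needs its own lock or queue.
        counts = [len(kids) for kids in children]
        offsets = [0] + list(accumulate(counts))[:-1]
        next_frontier = [None] * sum(counts)
        for offset, kids in zip(offsets, children):
            next_frontier[offset:offset + len(kids)] = kids

        # Phase 3: a single epoch boundary acts as the only synchronization.
        frontier = next_frontier
        epochs += 1
    return epochs, executed


def fib_children(n):
    """Spawn pattern of a naive recursive Fibonacci task (assumed workload)."""
    return [n - 1, n - 2] if n >= 2 else []


if __name__ == "__main__":
    print(run_epochs([5], fib_children))
```

For the assumed Fibonacci workload starting at 5, the call returns `(5, 15)`: the 15 tasks of the call tree complete in 5 epochs, so the synchronization cost scales with the depth of the task tree rather than with the number of tasks.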
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Advanced Data Storage Technologies · Distributed and Parallel Computing Systems
