DAG-based Scheduling with Resource Sharing for Multi-task Applications in a Polyglot GPU Runtime

Alberto Parravicini; Arnaud Delamare; Marco Arnaboldi; Marco D. Santambrogio

arXiv:2012.09646·cs.DC·January 20, 2021

TL;DR

This paper introduces a GPU runtime scheduler that executes multi-task computations asynchronously, with space-sharing and transfer-computation overlap, and exposes it to multiple high-level languages through GrCUDA.

Contribution

A new GPU scheduler that provides transparent asynchronous execution and resource sharing without prior knowledge of program dependencies, integrated via the GrCUDA polyglot API.

Findings

01. 44% average speedup over synchronous execution

02. No slowdown compared to hand-optimized CUDA Graphs code

03. Validated on 6 custom benchmarks for task-parallelism

Abstract

GPUs are readily available in cloud computing and personal devices, but their use for data processing acceleration has been slowed by their limited integration with common programming languages such as Python or Java. Moreover, using GPUs to their full capabilities requires expert knowledge of asynchronous programming. In this work, we present a novel GPU runtime scheduler for multi-task GPU computations that transparently provides asynchronous execution, space-sharing, and transfer-computation overlap without requiring any advance information about the program dependency structure. We leverage the GrCUDA polyglot API to integrate our scheduler with multiple high-level languages and provide a platform for fast prototyping and easy GPU acceleration. We validate our work on 6 benchmarks created to evaluate task-parallelism and show an average of 44% speedup against synchronous…
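To make the idea of scheduling without advance dependency information more concrete, here is a minimal Python sketch of one plausible approach: inferring a task DAG at submission time from the arrays each kernel reads and writes, then placing independent tasks on separate streams. This is a toy model, not GrCUDA's API or the authors' implementation; the names `Task`, `DagScheduler`, and `submit`, and the stream-assignment rule, are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    reads: frozenset    # arrays the kernel only reads
    writes: frozenset   # arrays the kernel modifies
    parents: set = field(default_factory=set)  # names of tasks it must wait for
    stream: int = -1    # logical stream id assigned by the scheduler


class DagScheduler:
    """Hypothetical scheduler: builds the DAG one submitted task at a time."""

    def __init__(self):
        self.tasks = {}         # task name -> Task, in submission order
        self.last_writer = {}   # array -> name of the task that last wrote it
        self.readers = {}       # array -> tasks that read it since that write
        self.claimed = set()    # tasks whose stream a child already inherited
        self.num_streams = 0

    def submit(self, name, reads=(), writes=()):
        task = Task(name, frozenset(reads), frozenset(writes))
        # Read-after-write and write-after-write: wait for the last writer.
        for array in task.reads | task.writes:
            writer = self.last_writer.get(array)
            if writer is not None:
                task.parents.add(writer)
        # Write-after-read: wait for every reader since the last write.
        for array in task.writes:
            task.parents.update(self.readers.get(array, ()))
        # Assumed policy: reuse the stream of the earliest parent that no
        # other child has inherited from; otherwise open a new stream.
        for parent in self.tasks:
            if parent in task.parents and parent not in self.claimed:
                self.claimed.add(parent)
                task.stream = self.tasks[parent].stream
                break
        else:
            task.stream = self.num_streams
            self.num_streams += 1
        # Update bookkeeping for tasks submitted later.
        for array in task.writes:
            self.last_writer[array] = name
            self.readers[array] = []
        for array in task.reads - task.writes:
            self.readers.setdefault(array, []).append(name)
        self.tasks[name] = task
        return task


scheduler = DagScheduler()
k1 = scheduler.submit("k1", writes=["x"])
k2 = scheduler.submit("k2", writes=["y"])                   # independent of k1
k3 = scheduler.submit("k3", reads=["x", "y"], writes=["z"])  # joins k1 and k2
k4 = scheduler.submit("k4", reads=["x"], writes=["w"])       # second child of k1
for t in (k1, k2, k3, k4):
    print(t.name, sorted(t.parents), "stream", t.stream)
```

In this sketch `k1` and `k2` land on different streams and could share the GPU, while `k3` waits for both. A real runtime would map the recorded parents to stream and event synchronization; that part is omitted here.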

Code & Models

Repositories

AlbertoParravicini/grcuda (official)
