Arax: A Runtime Framework for Decoupling Applications from Heterogeneous Accelerators
Manos Pavlidakis, Stelios Mavridis, Antony Chazapis, Giorgos Vasiliadis, and Angelos Bilas

TL;DR
Arax is a runtime system that simplifies the use of heterogeneous accelerators by managing resources dynamically. It enables accelerator sharing and elastic resource allocation and reduces programming effort, at a reported overhead of about 12%.
Contribution
Arax introduces a dynamic runtime framework that decouples applications from hardware accelerators. It supports resource sharing and elasticity, and it reduces programming effort through automatic stub generation.
Findings
Applications run with about 12% overhead using Arax.
Arax improves accelerator sharing, achieving up to 20% faster execution than NVIDIA MPS.
Elasticity support reduces total application turnaround time by up to 2x.
Abstract
Today, using multiple heterogeneous accelerators efficiently from applications and high-level frameworks, such as TensorFlow and Caffe, poses significant challenges in three respects: (a) sharing accelerators, (b) allocating available resources elastically during application execution, and (c) reducing the required programming effort. In this paper, we present Arax, a runtime system that decouples applications from heterogeneous accelerators within a server. First, Arax maps application tasks dynamically to available resources, managing all required task state, memory allocations, and task dependencies. As a result, Arax can share accelerators across applications in a server and adjust the resources used by each application as load fluctuates over time. Additionally, Arax offers a simple API and includes Autotalk, a stub generator that automatically generates stub libraries for…
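The idea of mapping tasks to accelerators at issue time, rather than binding an application to a fixed device, can be illustrated with a toy sketch. This is not Arax's API or scheduling policy: the names `Accelerator`, `ToyRuntime`, `add_accelerator`, and `submit` are hypothetical, and the least-loaded placement rule is an assumption chosen only for simplicity.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Accelerator:
    """Hypothetical stand-in for a physical device (GPU, FPGA, ...)."""
    name: str
    pending: int = 0  # number of tasks assigned so far


@dataclass
class ToyRuntime:
    """Toy runtime: picks a device per task instead of per application."""
    accelerators: List[Accelerator] = field(default_factory=list)

    def add_accelerator(self, acc: Accelerator) -> None:
        # Elasticity in miniature: the device pool can grow while tasks flow.
        self.accelerators.append(acc)

    def submit(self, task_name: str) -> str:
        # The caller names a task, never a device; placement happens here.
        if not self.accelerators:
            raise RuntimeError("no accelerators available")
        # Assumed policy: least-loaded device, ties go to the earliest added.
        target = min(self.accelerators, key=lambda a: a.pending)
        target.pending += 1
        return target.name


# Two tasks arrive while only one device exists, then a second device is added.
runtime = ToyRuntime([Accelerator("gpu0")])
placements = [runtime.submit(f"task{i}") for i in range(2)]
runtime.add_accelerator(Accelerator("gpu1"))
placements += [runtime.submit(f"task{i}") for i in range(2, 5)]
```

Because the application code only calls `submit`, new tasks start landing on `gpu1` as soon as it joins the pool, without any change on the application side. A real runtime would also have to track task state, memory allocations, and dependencies, which this sketch omits.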
Taxonomy
Topics: Cloud Computing and Resource Management · Parallel Computing and Optimization Techniques · Distributed and Parallel Computing Systems
