JITA4DS: Disaggregated execution of Data Science Pipelines between the Edge and the Data Centre
Genoveva Vargas-Solar, Ali Akoglu, Md Sahil Hassan

TL;DR
This paper presents JITA-4DS, a cross-layer management system that dynamically configures virtual data centers for data science pipelines by intelligently allocating resources across edge and data center environments.
Contribution
The paper introduces a novel composable architecture and resource management techniques for flexible, just-in-time execution of data science pipelines across heterogeneous infrastructures.
Findings
Simulation identifies effective resource scheduling strategies.
JITA-4DS improves performance and energy efficiency.
JITA-4DS enables customizable virtual data centers for data science workloads.
Abstract
This paper targets the execution of data science (DS) pipelines supported by data processing, transmission and sharing across several resources executing greedy processes. Current data science pipeline environments provide various infrastructure services with computing resources such as general-purpose processors (GPPs), Graphics Processing Units (GPUs), Field Programmable Gate Arrays (FPGAs) and Tensor Processing Units (TPUs), coupled with platform and software services to design, run and maintain DS pipelines. These one-size-fits-all solutions impose the complete externalization of data pipeline tasks. However, some tasks can be executed on the edge, and the backend can provide just-in-time resources to ensure ad-hoc and elastic execution environments. This paper introduces an innovative composable "Just in Time Architecture" for configuring DCs for Data Science Pipelines (JITA-4DS) and…
