Putting Data Science Pipelines on the Edge
Ali Akoglu, Genoveva Vargas-Solar

TL;DR
This paper introduces JITA-4DS, a flexible architecture for data science pipelines that dynamically configures disaggregated data centers to meet changing workload requirements and Service Level Objectives (SLOs).
Contribution
It presents a novel composable architecture and resource management techniques for disaggregated data centers tailored for data science pipelines.
Findings
Demonstrates dynamic assembly of pipeline components based on workload needs.
Studies how to model and validate the performance of disaggregated data centers in large-scale settings.
Shows improved SLO adherence through application-aware resource management.
Abstract
This paper proposes a composable "Just in Time Architecture" for Data Science (DS) Pipelines named JITA-4DS and associated resource management techniques for configuring disaggregated data centers (DCs). DCs under our approach are composable based on vertical integration of the application, middleware/operating system, and hardware layers, which are customized dynamically to meet application Service Level Objectives (SLOs), that is, application-aware management. Pipelines thereby use a set of flexible building blocks that can be dynamically and automatically assembled and re-assembled to meet dynamic changes in the workload's SLOs. To assess disaggregated DCs, we study how to model and validate their performance in large-scale settings.
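The idea of assembling and re-assembling building blocks to meet an SLO can be illustrated with a small sketch. This is not the JITA-4DS resource manager: the `Pool` and `Stage` types, the `compose` function, and the linear-scaling runtime model are all hypothetical, chosen only to show how an allocation drawn from disaggregated pools might change when a stage's deadline changes.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    """Free units in a disaggregated resource pool (hypothetical model)."""
    cpus: int
    gpus: int


@dataclass
class Stage:
    """One pipeline stage with a deadline-style SLO (hypothetical model)."""
    name: str
    work: float         # abstract work units to process
    deadline_s: float   # SLO: maximum allowed runtime in seconds
    gpu_speedup: float  # throughput of one GPU relative to one CPU


def estimated_runtime(stage: Stage, cpus: int, gpus: int) -> float:
    # Assumed toy model: each CPU processes 1 work unit per second,
    # each GPU processes gpu_speedup units per second, scaling linearly.
    throughput = cpus + gpus * stage.gpu_speedup
    return float("inf") if throughput == 0 else stage.work / throughput


def compose(stage: Stage, pool: Pool):
    """Return the (cpus, gpus) allocation with the fewest units that meets
    the deadline, or None if the pool cannot satisfy the SLO."""
    best = None
    for gpus in range(pool.gpus + 1):
        for cpus in range(pool.cpus + 1):
            if estimated_runtime(stage, cpus, gpus) <= stage.deadline_s:
                if best is None or cpus + gpus < sum(best):
                    best = (cpus, gpus)
                break  # adding CPUs at this GPU count only uses more units
    return best


pool = Pool(cpus=8, gpus=2)
train = Stage(name="train", work=100.0, deadline_s=10.0, gpu_speedup=8.0)

initial = compose(train, pool)      # allocation under the original SLO
train.deadline_s = 5.0              # the workload's SLO tightens
reassembled = compose(train, pool)  # re-assemble for the new SLO
print(initial, reassembled)
```

Under this toy model, tightening the deadline forces the stage to draw additional CPU units alongside the GPUs. A real application-aware manager would also have to account for contention between stages, data movement, and the cost of reconfiguration.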
