DaphneSched: A Scheduler for Integrated Data Analysis Pipelines
Ahmed Eleliemy, Florina M. Ciorba

TL;DR
DaphneSched is a versatile, task-based scheduler for integrated data analysis (IDA) pipelines. On multicore platforms, its multiple scheduling strategies outperform commonly used scheduling techniques by up to 13%.
Contribution
Introduces DaphneSched, a versatile task-based scheduler that combines eleven task partitioning and three task assignment techniques tailored to IDA pipelines, and that outperforms commonly used scheduling techniques.
Findings
Outperforms common scheduling techniques by up to 13%
Supports diverse task partitioning and assignment strategies
Effective on multicore platforms with 20 and 56 cores
Abstract
DAPHNE is a new open-source software infrastructure designed to address the increasing demands of integrated data analysis (IDA) pipelines, comprising data management (DM), high performance computing (HPC), and machine learning (ML) systems. Efficiently executing IDA pipelines is challenging due to their diverse computing characteristics and demands. Therefore, IDA pipelines executed with the DAPHNE infrastructure require an efficient and versatile scheduler to support these demands. This work introduces DaphneSched, the task-based scheduler at the core of DAPHNE. DaphneSched is versatile by incorporating eleven task partitioning and three task assignment techniques, bringing the state-of-the-art closer to the state-of-the-practice task scheduling. To showcase DaphneSched's effectiveness in scheduling IDA pipelines, we evaluate its performance on two applications: a product…
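To make the two terms in the abstract concrete, the sketch below separates task partitioning (how many tasks go into each chunk) from task assignment (how chunks reach workers). It is an illustration only, not DaphneSched's code: the abstract does not list its eleven partitioning or three assignment techniques, so static partitioning, guided self-scheduling (GSS), and a single shared work queue are assumed here as textbook representatives, and every function name is hypothetical.

```python
import math
import queue
import threading


def static_chunks(n_tasks, n_workers):
    """Static partitioning: one chunk per worker, sizes differ by at most one."""
    base, extra = divmod(n_tasks, n_workers)
    sizes = [base + (1 if i < extra else 0) for i in range(n_workers)]
    return [s for s in sizes if s > 0]


def gss_chunks(n_tasks, n_workers, min_chunk=1):
    """Guided self-scheduling: each chunk is ceil(remaining / n_workers)."""
    chunks = []
    remaining = n_tasks
    while remaining > 0:
        size = max(min_chunk, math.ceil(remaining / n_workers))
        size = min(size, remaining)  # never hand out more than is left
        chunks.append(size)
        remaining -= size
    return chunks


def run_centralized(chunks, n_workers, work):
    """Assignment via one shared queue: idle workers pull the next chunk."""
    q = queue.Queue()
    start = 0
    for size in chunks:
        q.put((start, start + size))  # half-open task range [start, end)
        start += size

    partials = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                lo, hi = q.get_nowait()
            except queue.Empty:
                return  # no chunks left, worker terminates
            partial = sum(work(i) for i in range(lo, hi))
            with lock:
                partials.append(partial)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partials)


if __name__ == "__main__":
    print(static_chunks(100, 4))
    print(gss_chunks(100, 4))
    print(run_centralized(gss_chunks(100, 4), 4, lambda i: i))
```

Static partitioning minimizes scheduling overhead but cannot rebalance when task costs vary. GSS starts with large chunks and shrinks them, which trades a few more queue accesses for better load balance near the end of the run. Decoupling the two functions is what lets any partitioning scheme be paired with any assignment scheme.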
Taxonomy
Topics: Distributed and Parallel Computing Systems · Parallel Computing and Optimization Techniques · Cloud Computing and Resource Management
