Data-aware Dynamic Execution of Irregular Workloads on Heterogeneous Systems
Zhenyu Bai, Dan Wu, Pranav Dangi, Dhananjaya Wijerathne, Venkata Pavan Kumar Miriyala, Tulika Mitra

TL;DR
DyPe is a dynamic scheduling framework that automatically optimizes workload distribution on heterogeneous systems with accelerators, significantly improving performance and energy efficiency over static methods.
Contribution
The paper introduces DyPe, a data-aware, multi-objective scheduling approach that dynamically adapts to workload and system characteristics on heterogeneous systems with accelerators.
Findings
DyPe finds optimal schedules in 89.5% of cases, outperforming static scheduling.
On average, DyPe improves throughput by 1.53x and energy efficiency by 1.09x.
Conventional static scheduling is optimal in only 15% of cases.
Abstract
Current approaches to scheduling workloads on heterogeneous systems with specialized accelerators often rely on manual partitioning, offloading tasks with specific compute patterns to accelerators. This method requires extensive experimentation and human effort to identify the tasks suitable for the accelerator. To solve this problem, we introduce DyPe, a scheduling framework tailored for heterogeneous systems with specialized accelerators. Our method automatically partitions, deploys, and reschedules execution when necessary by dynamically analyzing the characteristics of the input data and leveraging the inter-operator parallelism among heterogeneous devices. DyPe navigates a multi-objective, multi-constraint design space that considers both system constraints and application requirements, which allows it to discover Pareto-optimal mapping configurations, improving the system's…
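The abstract refers to Pareto-optimal mapping configurations. As a rough illustration of that term only, the sketch below filters a handful of candidate mappings down to the non-dominated ones under two objectives: throughput (higher is better) and energy (lower is better). The `Mapping` type, the candidate names, and all numbers are hypothetical and are not taken from the paper; this is a generic brute-force Pareto filter, not DyPe's search procedure.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    """A hypothetical candidate schedule with two measured objectives."""
    name: str
    throughput: float  # higher is better (made-up units)
    energy: float      # lower is better (made-up units)


def dominates(a: Mapping, b: Mapping) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.throughput >= b.throughput and a.energy <= b.energy
    strictly_better = a.throughput > b.throughput or a.energy < b.energy
    return no_worse and strictly_better


def pareto_front(candidates: list[Mapping]) -> list[Mapping]:
    """Keep every candidate that no other candidate dominates (O(n^2) scan)."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Illustrative values only; none of these come from the paper.
candidates = [
    Mapping("cpu_only", 1.0, 5.0),
    Mapping("accel_only", 1.4, 6.0),
    Mapping("split_a", 1.5, 4.5),
    Mapping("split_b", 1.2, 4.0),
]
front = pareto_front(candidates)
print([m.name for m in front])
```

With these made-up numbers, `split_a` and `split_b` survive: neither beats the other on both objectives, while each of the remaining two candidates is beaten on both objectives by `split_a`. A scheduler would then pick among the surviving points according to the application's requirements.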
Taxonomy
Topics: Cloud Computing and Resource Management · Distributed and Parallel Computing Systems · Software System Performance and Reliability
