Poster: Flexible Scheduling of Network and Computing Resources for Distributed AI Tasks
Ruikun Wang, Jiawei Zhang, Qiaolun Zhang, Bojun Zhang, Zhiqun Gu, Aryanaz Attarpour, Yuefeng Ji, Massimo Tornatore

TL;DR
This paper explores flexible scheduling strategies for distributed AI tasks that aim to optimize the use of network and computing resources, tests them on a programmable testbed, and discusses open challenges and research directions.
Contribution
It introduces new scheduling strategies for distributed AI workloads that enhance communication efficiency and demonstrates their effectiveness on a programmable testbed.
Findings
Improved communication efficiency in distributed AI tasks
Effective scheduling strategies tested on a programmable testbed
Identified key challenges and future research directions
Abstract
Many emerging Artificial Intelligence (AI) applications require on-demand provisioning of large-scale computing, which can only be enabled by leveraging distributed computing services interconnected through networking. To address such increasing demand for networking to serve AI tasks, we investigate new scheduling strategies to improve communication efficiency and test them on a programmable testbed. We also show relevant challenges and research directions.
Taxonomy
Topics: Distributed and Parallel Computing Systems · IoT and Edge/Fog Computing
