Efficient All-to-All Collective Communication Schedules for Direct-Connect Topologies
Prithwish Basu, Liangyu Zhao, Jason Fantl, Siddharth Pal, Arvind Krishnamurthy, Joud Khoury

TL;DR
This paper develops optimized all-to-all communication schedules for direct-connect supercomputer topologies, addressing algorithmic challenges and proposing a new topology for near-optimal performance in ML and HPC workloads.
Contribution
The paper takes a holistic approach to optimizing all-to-all communication schedules on direct-connect topologies and proposes a novel topology that achieves near-optimal all-to-all performance.
Findings
Developed bandwidth-efficient all-to-all schedules for diverse topologies.
Lowered schedules to multiple runtimes and interconnect technologies.
Proposed a new topology with near-optimal all-to-all performance.
Abstract
The all-to-all collective communications primitive is widely used in machine learning (ML) and high performance computing (HPC) workloads, and optimizing its performance is of interest to both ML and HPC communities. All-to-all is a particularly challenging workload that can severely strain the underlying interconnect bandwidth at scale. This paper takes a holistic approach to optimize the performance of all-to-all collective communications on supercomputer-scale direct-connect interconnects. We address several algorithmic and practical challenges in developing efficient and bandwidth-optimal all-to-all schedules for any topology and lowering the schedules to various runtimes and interconnect technologies. We also propose a novel topology that delivers near-optimal all-to-all performance.
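To make the primitive concrete, here is a minimal sketch of what all-to-all computes and why it strains link bandwidth. It is an illustration only, not the paper's scheduling algorithm. The function names (`all_to_all`, `hop_distances`, `link_load_lower_bound`, `bidirectional_ring`) are hypothetical. The sketch assumes unit-size chunks, directed links of equal unit bandwidth, and a connected topology. The bound it computes is the standard counting argument: a chunk from node s to node t must cross at least d(s, t) links, so the busiest link carries at least the total hop count divided by the number of links.

```python
from collections import deque


def all_to_all(send):
    """send[i][j] is the chunk node i holds for node j.

    Returns recv with recv[j][i] == send[i][j], i.e. a transpose.
    """
    n = len(send)
    return [[send[i][j] for i in range(n)] for j in range(n)]


def hop_distances(adj, src):
    """Breadth-first hop counts from src over directed adjacency lists."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def link_load_lower_bound(adj):
    """Lower bound on chunks crossing the busiest link (assumes unit chunks).

    Sums shortest-path hop counts over all ordered node pairs and divides
    by the number of directed links.
    """
    total_hops = sum(sum(hop_distances(adj, s).values()) for s in adj)
    num_links = sum(len(nbrs) for nbrs in adj.values())
    return total_hops / num_links


def bidirectional_ring(n):
    """Hypothetical example topology: each node links to both ring neighbors."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


if __name__ == "__main__":
    print(all_to_all([["a0", "a1"], ["b0", "b1"]]))
    print(link_load_lower_bound(bidirectional_ring(8)))
```

Under these assumptions, a topology with a smaller average hop distance per link gives a smaller bound, which is one intuition for why topology choice matters for all-to-all performance.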
Taxonomy
Topics: Interconnection Networks and Systems · Advanced Memory and Neural Computing · Ferroelectric and Negative Capacitance Devices
