Leaf-centric Logical Topology Design for OCS-based GPU Clusters
Xinchi Han, Weihao Jiang, Yingming Mao, Yike Liu, Zhuoran Liu, Yongxi Lv, Peirui Cao, Zhuotao Liu, Ximeng Liu, Xinbing Wang, Changbo Wu, Zihan Zhu, Wu Dongchao, Yang Jian, Zhang Zhanbang, Yuansen Chen, and Shizhen Zhao

TL;DR
This paper proposes a leaf-centric logical topology design for OCS-based GPU clusters to improve ML training throughput by balancing traffic loads and avoiding routing polarization.
Contribution
It introduces a novel leaf-centric paradigm, provides a theoretical condition to prevent routing polarization, and offers an efficient topology design algorithm.
Findings
Achieves up to 19.27% throughput improvement in simulations.
Reduces logical topology computation overhead by 99.16%.
Validates the approach through large-scale simulations.
Abstract
Recent years have witnessed the growing deployment of optical circuit switches (OCS) in commercial GPU clusters (e.g., Google A3 GPU cluster) optimized for machine learning (ML) workloads. Such clusters adopt a three-tier leaf-spine-OCS topology: servers attach to leaf-layer electronic packet switches (EPSes); these leaf switches aggregate into spine-layer EPSes to form a Pod; and multiple Pods are interconnected via core-layer OCSes. Unlike EPSes, OCSes only support circuit-based paths between directly connected spine switches. This can induce a phenomenon termed routing polarization, in which the bandwidth requirements between specific pairs of Pods are unevenly fulfilled through links among different spine switches. The resulting imbalance induces traffic contention and bottlenecks on specific leaf-to-spine links, ultimately reducing ML training…
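To make the notion of routing polarization concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm. It assumes a single source Pod in which each leaf has one uplink to every spine, and it assumes that traffic toward a destination Pod is split across spines in proportion to the number of OCS circuits each spine has toward that Pod. The function names `leaf_uplink_loads` and `polarization_ratio` are hypothetical and chosen only for illustration.

```python
from typing import List


def leaf_uplink_loads(circuits_per_spine: List[int], demand: float) -> List[float]:
    """Load placed on each leaf-to-spine uplink for one leaf's inter-Pod demand.

    Assumption (not from the paper): traffic is split across spines in
    proportion to the number of OCS circuits each spine has toward the
    destination Pod, because only spines with a direct circuit can carry it.
    """
    total = sum(circuits_per_spine)
    if total == 0:
        raise ValueError("no circuits toward the destination Pod")
    return [demand * c / total for c in circuits_per_spine]


def polarization_ratio(circuits_per_spine: List[int], demand: float = 1.0) -> float:
    """Ratio of the most loaded uplink to the average uplink load.

    A value of 1.0 means the demand is spread evenly over all spines;
    larger values mean the demand concentrates on fewer uplinks.
    """
    loads = leaf_uplink_loads(circuits_per_spine, demand)
    mean_load = sum(loads) / len(loads)
    return max(loads) / mean_load


if __name__ == "__main__":
    # Same total of 8 circuits toward the destination Pod in both cases.
    balanced = [2, 2, 2, 2]   # circuits spread evenly over four spines
    polarized = [8, 0, 0, 0]  # all circuits terminate on one spine
    print("balanced :", polarization_ratio(balanced))
    print("polarized:", polarization_ratio(polarized))
```

In this toy model both layouts provide the same aggregate Pod-to-Pod bandwidth, yet the second one funnels all of a leaf's inter-Pod traffic onto a single leaf-to-spine uplink, which is the kind of bottleneck the abstract describes.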
