OpTree: An Efficient Algorithm for All-gather Operation in Optical Interconnect Systems
Fei Dai, Yawen Chen, Zhiyi Huang, Haibo Zhang

TL;DR
OpTree is a novel algorithm designed for optical interconnect systems that significantly reduces communication time for All-gather operations by optimizing the communication tree structure.
Contribution
The paper introduces OpTree, an efficient All-gather algorithm tailored for optical interconnects, achieving fewer communication steps and faster data transfer than existing methods.
Findings
OpTree reduces communication steps compared to existing algorithms.
Simulations show up to a 94.30% reduction in communication time relative to existing All-gather schemes.
OpTree derives the optimal number of communication stages for optical interconnect systems.
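To make "communication steps" concrete, the following is a minimal sketch of what an All-gather computes, using a simulated ring schedule as a familiar baseline. It is not OpTree and is not taken from the paper; the function name `ring_all_gather` and its return shape are assumptions made for illustration only. With N nodes, this ring schedule needs N - 1 steps for every node to hold every block.

```python
def ring_all_gather(blocks):
    """Simulate a ring All-gather (illustrative baseline, NOT OpTree).

    blocks[i] is the data block initially held by node i.
    Returns (gathered, steps): gathered[i] lists the blocks held by
    node i at the end, ordered by source node; steps is the number of
    communication steps the ring schedule used.
    """
    n = len(blocks)
    # Each node starts with only its own block, keyed by source node.
    buffers = [{i: blocks[i]} for i in range(n)]
    steps = 0
    for s in range(n - 1):
        # In step s, node i forwards the block that originated at
        # node (i - s) mod n to its right neighbour (i + 1) mod n.
        sends = [
            ((i + 1) % n, (i - s) % n, buffers[i][(i - s) % n])
            for i in range(n)
        ]
        # Apply all transfers only after computing them, so the step
        # behaves as if every node sent simultaneously.
        for dst, src, data in sends:
            buffers[dst][src] = data
        steps += 1
    gathered = [[buf[j] for j in range(n)] for buf in buffers]
    return gathered, steps


if __name__ == "__main__":
    result, step_count = ring_all_gather(["a", "b", "c", "d"])
    print(step_count)
    print(result[0])
```

The step count of such a baseline grows linearly with the number of nodes, which is the kind of cost the findings above say OpTree reduces.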
Abstract
All-gather collective communication is one of the most important communication primitives in parallel and distributed computation, and it plays an essential role in many HPC applications such as distributed Deep Learning (DL) with model and hybrid parallelism. To relieve the communication bottleneck of All-gather, optical interconnection networks can provide unprecedented high bandwidth and reliability for data transfer among distributed nodes. However, most traditional All-gather algorithms are designed for electrical interconnection and do not fit optical interconnect systems well, resulting in poor performance. This paper proposes an efficient scheme, called OpTree, for the All-gather operation on optical interconnect systems. OpTree derives an optimal m-ary tree corresponding to the optimal number of communication stages, achieving minimum communication time. We further analyze…
Taxonomy
Topics: Interconnection Networks and Systems · Advanced Optical Network Technologies · Cloud Computing and Resource Management
