Towards an Efficient Combination of Adaptive Routing and Queuing Schemes in Fat-Tree Topologies
Jose Rocher-Gonzalez, Jesus Escudero-Sahuquillo, Pedro J. Garcia, Francisco J. Quiles, Gaspar Mora

TL;DR
This paper proposes restrictions on multi-path routing in Fat-Tree networks to effectively combine adaptive routing with queue-based congestion management, enhancing network performance in HPC and datacenter systems.
Contribution
It introduces a set of restrictions that improve route selection, enabling better integration of adaptive routing and queuing schemes in Fat-Tree topologies.
Findings
Improved route selection reduces congestion.
Enhanced combination of routing and queuing schemes.
Potential for better network performance in HPC systems.
Abstract
The interconnection network is a key element in High-Performance Computing (HPC) and Datacenter (DC) systems, and its performance depends on several design parameters, such as the topology, the switch architecture, and the routing algorithm. Among the most common topologies in HPC systems, the Fat-Tree offers several shortest-path routes between any pair of end-nodes, which allows multi-path routing schemes to balance traffic flows among the available links, thus reducing the probability of congestion. However, traffic balancing alone cannot resolve some congestion situations, which may still degrade network performance. Another approach to reducing congestion is queue-based flow separation, but our previous work shows that multi-path routing may spread congested flows across several queues, which is counterproductive. In this paper, we propose a set of restrictions to improve alternative routes…
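The interaction described in the abstract, where multi-path routing spreads a congested flow across several queues, can be illustrated with a toy model. The sketch below is not the paper's method or its proposed restrictions. It assumes a single switch with four uplinks, one queue per uplink, a deterministic choice of `dst % num_uplinks`, and a plain round-robin standing in for an adaptive decision. The names `uplinks_touched` and `HOT` are hypothetical.

```python
from collections import defaultdict

def uplinks_touched(packets, num_uplinks, adaptive):
    """Return {uplink: set of destinations whose packets were queued there}."""
    queues = defaultdict(set)
    rr = 0  # round-robin pointer, an assumed stand-in for an adaptive choice
    for dst in packets:
        if adaptive:
            port = rr % num_uplinks
            rr += 1
        else:
            # Assumed deterministic, destination-based uplink selection.
            port = dst % num_uplinks
        queues[port].add(dst)
    return dict(queues)

HOT = 5  # hypothetical congested destination
packets = [HOT] * 8 + [0, 1, 2, 3]  # a burst to the hot destination, then others

det = uplinks_touched(packets, 4, adaptive=False)
ada = uplinks_touched(packets, 4, adaptive=True)

# Uplink queues that hold at least one packet of the congested flow.
hot_det = sorted(p for p, dsts in det.items() if HOT in dsts)
hot_ada = sorted(p for p, dsts in ada.items() if HOT in dsts)
print("deterministic:", hot_det)
print("adaptive:", hot_ada)
```

In this toy setup the deterministic choice keeps the congested flow in one uplink queue, while the round-robin choice places it in all four, where it shares queues with every other destination. This is the effect that route restrictions would aim to limit.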
