Node-Type-Based Load-Balancing Routing for Parallel Generalized Fat-Trees
John Gliksberg (UCLM, LI-PaRAD), Jean-Noel Quintin, Pedro Javier Garcia (UCLM)

TL;DR
This paper proposes a node-type-based load-balancing routing algorithm for Parallel Generalized Fat-Trees in HPC clusters, improving performance by addressing node heterogeneity and communication pattern variations.
Contribution
It introduces an extension to existing routing algorithms that balances load among node types, enhancing performance in heterogeneous HPC environments.
Findings
Improved load distribution among node types.
Reduction in performance degradation due to node heterogeneity.
Enhanced routing efficiency in PGFT topologies.
Abstract
High-Performance Computing (HPC) clusters are made up of a variety of node types (usually compute, I/O, service, and GPGPU nodes), and applications do not use nodes of different types in the same way. The resulting communication patterns reflect how nodes are organized into groups, and current routing algorithms that are optimal for all-to-all patterns will not always maximize performance for group-specific communications. Since application communication patterns are rarely available beforehand, we choose to rely on node types as a good guess for node usage. We describe node type heterogeneity and analyse the performance degradation caused by an unlucky distribution of nodes of the same type. We provide an extension to routing algorithms for Parallel Generalized Fat-Tree topologies (PGFTs) which balances load amongst groups of nodes of the same type. We show how it removes these performance issues…
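The idea of balancing load within groups of same-type nodes can be illustrated with a toy example. The sketch below is not the paper's algorithm. It assumes a single leaf switch with `k` uplinks and compares a plain destination-index-modulo-`k` assignment with a hypothetical variant that assigns uplinks from each node's rank within its own type. All function names and the node layout are illustrative assumptions.

```python
from collections import Counter, defaultdict

def dmodk_uplinks(node_types, k):
    """Assign each destination an uplink as (node index mod k), ignoring type."""
    return [i % k for i in range(len(node_types))]

def type_balanced_uplinks(node_types, k):
    """Hypothetical variant: assign uplinks from the node's rank within its type."""
    next_rank = defaultdict(int)
    ports = []
    for t in node_types:
        ports.append(next_rank[t] % k)
        next_rank[t] += 1
    return ports

def max_load_per_type(node_types, ports):
    """Per type, the largest number of same-type destinations sharing one uplink."""
    load = defaultdict(Counter)
    for t, p in zip(node_types, ports):
        load[t][p] += 1
    return {t: max(c.values()) for t, c in load.items()}

# Assumed "unlucky" layout: every 4th node is an I/O node, with k = 4 uplinks.
node_types = ["io" if i % 4 == 0 else "compute" for i in range(16)]
k = 4

print(max_load_per_type(node_types, dmodk_uplinks(node_types, k)))
# prints {'io': 4, 'compute': 4}
print(max_load_per_type(node_types, type_balanced_uplinks(node_types, k)))
# prints {'io': 1, 'compute': 3}
```

In this assumed layout, index-based assignment sends all four I/O destinations through the same uplink, while ranking nodes within their type spreads them over all four. A real PGFT routing algorithm must also handle multiple levels and parallel links, which this sketch leaves out.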
Taxonomy
Topics: Distributed and Parallel Computing Systems · Interconnection Networks and Systems · Distributed systems and fault tolerance
