Network Partitioning and Avoidable Contention
Yishai Oltchik (ETH Zurich), Oded Schwartz (Hebrew University of Jerusalem)

TL;DR
This paper analyzes how processor allocation policies affect link contention inside a network partition, and shows that choosing a better partition geometry can as much as double performance for contention-bound workloads.
Contribution
It applies edge-isoperimetric analysis of network graphs to determine whether a partition has optimal internal bisection, and uses the result to evaluate and improve partition geometries.
Findings
Partition geometry impacts internal bisection bandwidth.
Adjusting partition geometry can double performance for contention-bound workloads.
Analysis validated by benchmarking experiments.
Abstract
Network contention frequently dominates the run time of parallel algorithms and limits scaling performance. Most previous studies mitigate or eliminate contention by utilizing one of several approaches: communication-minimizing algorithms; hotspot-avoiding routing schemes; topology-aware task mapping; or improving global network properties, such as bisection bandwidth, edge-expansion, partitioning, and network diameter. In practice, parallel jobs often use only a fraction of a host system. How do processor allocation policies affect contention within a partition? We utilize edge-isoperimetric analysis of network graphs to determine whether a network partition has optimal internal bisection. Increasing the bisection allows a more efficient use of the network resources, decreasing or completely eliminating the link contention. We first study torus networks and characterize partition…
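To make the link between partition geometry and internal bisection concrete, here is a minimal Python sketch. It is not the paper's method: it does not perform edge-isoperimetric analysis, and it assumes that the minimum bisection of a torus-shaped partition is reached by an axis-aligned cut perpendicular to its longest dimension, which is the usual estimate when that dimension has even length. The function names `bisection_links` and `same_size_geometries`, and the list of allowed side lengths, are hypothetical and chosen only for illustration.

```python
from itertools import combinations_with_replacement
from math import prod


def bisection_links(dims, wraparound=True):
    """Estimate the number of links crossing a balanced cut of a partition.

    Assumption (not taken from the paper): the cut is axis-aligned and
    perpendicular to the longest dimension, whose length is even.
    """
    longest = max(dims)
    if longest < 2:
        raise ValueError("a single-node partition has no bisection")
    # Each "line" of nodes along the longest dimension is cut once.
    lines = prod(dims) // longest
    # With wraparound links, every line of length > 2 forms a ring,
    # so the cut crosses it twice; otherwise it crosses it once.
    links_per_line = 2 if (wraparound and longest > 2) else 1
    return lines * links_per_line


def same_size_geometries(n_nodes, ndims, sides=(1, 2, 4, 8, 16, 32)):
    """List sorted side-length tuples whose product equals n_nodes."""
    return [g for g in combinations_with_replacement(sides, ndims)
            if prod(g) == n_nodes]


if __name__ == "__main__":
    # Compare every 3-dimensional geometry that holds 512 nodes.
    for geom in same_size_geometries(512, 3):
        print(geom, bisection_links(geom))
```

Under these assumptions, a 512-node partition shaped 8 x 8 x 8 has an estimated 128 links across its bisection, while a 4 x 8 x 16 partition with the same node count has 64. That factor of two is the kind of gap a change in partition geometry can close for a contention-bound workload.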
