Efficient Intra-Rack Resource Disaggregation for HPC Using Co-Packaged DWDM Photonics
George Michelogiannakis, Yehia Arafa, Brandon Cook, Liang Yuan Dai, Abdel Hameed Badawy, Madeleine Glick, Yuyang Wang, Keren Bergman, John Shalf

TL;DR
This paper presents a photonics-based intra-rack resource disaggregation approach for HPC systems, significantly improving application speed and resource efficiency compared to electronic switch-based systems.
Contribution
It introduces a co-designed photonic disaggregation architecture that meets HPC bandwidth and latency needs, demonstrating substantial performance and resource savings.
Findings
Average application speedup of 11% (up to 46%) across 25 CPU benchmarks.
Average application speedup of 61% across 24 GPU benchmarks.
Estimated 4x fewer memory modules and 2x fewer NICs needed.
Abstract
The diversity of workload requirements and increasing hardware heterogeneity in emerging high performance computing (HPC) systems motivate resource disaggregation. Resource disaggregation allows compute and memory resources to be allocated individually as required to each workload. However, it is unclear how to efficiently realize this capability and cost-effectively meet the stringent bandwidth and latency requirements of HPC applications. To that end, we describe how modern photonics can be co-designed with modern HPC racks to implement flexible intra-rack resource disaggregation and fully meet the bit error rate (BER) and high escape bandwidth of all chip types in modern HPC racks. Our photonic-based disaggregated rack provides an average application speedup of 11% (46% maximum) for 25 CPU and 61% for 24 GPU benchmarks compared to a similar system that instead uses modern electronic…
Taxonomy
Topics: Optical Network Technologies · Neural Networks and Reservoir Computing · Photonic and Optical Devices
