APEnet+: a 3D toroidal network enabling Petaflops scale Lattice QCD simulations on commodity clusters
Roberto Ammendola, Andrea Biagioni, Ottorino Frezza, Francesca Lo Cicero, Alessandro Lonardo, Pier Paolucci, Roberto Petronzio, Davide Rossetti, Andrea Salamon, Gaetano Salina, Francesco Simula, Nazario Tantalo, Laura Tosoratto, Piero Vicini

TL;DR
APEnet+ is a scalable 3D toroidal interconnect designed for large GPU clusters, enabling efficient Petaflops-scale Lattice QCD simulations with low latency and high bandwidth.
Contribution
It introduces APEnet+, a new interconnect with hardware support for RDMA and GPU acceleration, scalable to tens of thousands of nodes with linear cost.
Findings
Supports Petaflops-scale Lattice QCD simulations
Provides low latency and high bandwidth networking
Enables scalable GPU cluster performance
Abstract
Many scientific computations need multi-node parallelism to meet ever-increasing requirements in both space (memory) and time (speed). The use of GPUs as accelerators introduces yet another level of complexity for the programmer and may result in large overheads due to the complex memory hierarchy. Additionally, the most demanding problems may easily require more than a Petaflops of sustained computing power, calling for thousands of GPUs orchestrated with some parallel programming model. Here we describe APEnet+, the new generation of our interconnect, which scales up to tens of thousands of nodes with linear cost, thus improving the price/performance ratio on large clusters. The project target is the development of the Apelink+ host adapter, featuring a low-latency, high-bandwidth direct network, state-of-the-art wire speeds on the links, and a PCIe x8 Gen2 host interface. It…
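
The "linear cost" claim follows from the 3D torus topology named in the title: every node connects directly to six nearest neighbors, so the number of links grows in proportion to the number of nodes. The sketch below is illustrative only and is not taken from the APEnet+ software or hardware. The function names `torus_neighbors` and `total_links` are hypothetical, and the link count assumes each dimension has at least three nodes so that all six neighbors are distinct.

```python
from typing import List, Tuple

Coord = Tuple[int, int, int]


def torus_neighbors(node: Coord, dims: Coord) -> List[Coord]:
    """Return the six nearest neighbors of `node` on a 3D torus of size `dims`.

    Hypothetical helper for illustration; coordinates wrap around on each axis.
    """
    neighbors = []
    for axis in range(3):
        for step in (+1, -1):
            coord = list(node)
            # Modular arithmetic implements the wraparound link of the torus.
            coord[axis] = (coord[axis] + step) % dims[axis]
            neighbors.append(tuple(coord))
    return neighbors


def total_links(dims: Coord) -> int:
    """Count bidirectional links, assuming every dimension is >= 3.

    Each node has 6 links and each link is shared by 2 nodes: 6 * N / 2 = 3 * N.
    """
    x, y, z = dims
    return 3 * x * y * z


if __name__ == "__main__":
    print(torus_neighbors((0, 0, 0), (4, 4, 4)))
    print(total_links((4, 4, 4)))
```

Doubling the node count doubles the link count, with no central switch fabric to outgrow, which is the sense in which a torus of this kind scales at linear cost.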
