JUWELS Booster -- A Supercomputer for Large-Scale AI Research
Stefan Kesselheim, Andreas Herten, Kai Krajsek, Jan Ebert, Jenia Jitsev, Mehdi Cherti, Michael Langguth, Bing Gong, Scarlet Stadtler, Amirpasha Mozaffari, Gabriele Cavallaro, Rocco Sedona, Alexander Schug, Alexandre Strube, Roshni Kamath, Martin G. Schultz, Morris Riedel

TL;DR
JUWELS Booster is a high-performance computing system with a large number of GPUs and a fast InfiniBand interconnect, suited to large-scale AI research and applications across scientific disciplines.
Contribution
The paper describes the JUWELS Booster system architecture and its capabilities for large-scale, distributed AI training, and demonstrates its performance through benchmarks and research highlights.
Findings
Outstanding performance in AI benchmarks
Successful large-scale AI model training
Versatile application across scientific fields
Abstract
In this article, we present JUWELS Booster, a recently commissioned high-performance computing system at the Jülich Supercomputing Center. With its system architecture, most importantly its large number of powerful Graphics Processing Units (GPUs) and its fast interconnect via InfiniBand, it is an ideal machine for large-scale Artificial Intelligence (AI) research and applications. We detail its system architecture, parallel, distributed model training, and benchmarks indicating its outstanding performance. We exemplify its potential for research application by presenting large-scale AI research highlights from various scientific fields that require such a facility.
Taxonomy
Topics: Advanced Neural Network Applications · Scientific Computing and Data Management · Domain Adaptation and Few-Shot Learning
