HPC Containers for EBRAINS: Towards Portable Cross-Domain Software Environment
Krishna Kant Singh, Eric Müller, Eleni Mathioulaki, Wouter Klijn, Lena Oden

TL;DR
This paper presents a method for creating portable HPC container images that enable complex scientific workflows to run efficiently across different HPC sites without site-specific dependencies, ensuring performance parity with bare-metal systems.
Contribution
It introduces a hybrid, PMIx-based containerization strategy using Apptainer that dynamically leverages host hardware, enabling portable MPI- and CUDA-enabled software with maintained performance.
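As a rough illustration of what a hybrid launch can look like, the sketch below assembles a command line in which the host launcher starts the ranks and wires them up over PMIx, while each rank runs inside the Apptainer image. It assumes Slurm's `srun` as the host launcher (the summary does not name one); the image name `esd.sif`, the application arguments, and the helper `build_hybrid_launch` are hypothetical. `--mpi=pmix` is an `srun` option and `--nv` is the Apptainer flag that exposes the host NVIDIA driver stack.

```python
import shlex


def build_hybrid_launch(image, app_args, ntasks, use_gpu=True):
    """Return an argv list for a host-launched, PMIx-wired containerized MPI job.

    Assumption: the host scheduler is Slurm and was built with PMIx support.
    """
    cmd = ["srun", "--mpi=pmix", f"--ntasks={ntasks}", "apptainer", "exec"]
    if use_gpu:
        # Bind the host NVIDIA driver libraries into the container.
        cmd.append("--nv")
    cmd.append(image)
    cmd.extend(app_args)
    return cmd


# Hypothetical image and application, for illustration only.
cmd = build_hybrid_launch("esd.sif", ["./simulate", "--steps", "100"], ntasks=4)
print(shlex.join(cmd))
```

Because the launcher lives on the host, the image itself does not need to be rebuilt against each site's scheduler; only the MPI library inside the container has to speak PMIx.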
Findings
Container images correctly leverage host hardware and drivers.
Performance of containerized applications matches bare-metal deployments.
Active log analysis helps detect misconfigurations and optimize performance.
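The log-analysis finding can be pictured with a minimal sketch that scans job output for lines suggesting the container did not pick up the host hardware. This is not the authors' tooling: the pattern strings, labels, and the `scan_log` helper are illustrative assumptions, and the real messages depend on the MPI library, network fabric, and driver stack in use.

```python
import re

# Hypothetical patterns for illustration; adapt them to the messages that
# the MPI library and GPU runtime at a given site actually emit.
SUSPECT_PATTERNS = {
    "network_fallback": re.compile(r"falling back to tcp", re.IGNORECASE),
    "gpu_unavailable": re.compile(r"no cuda-capable device", re.IGNORECASE),
}


def scan_log(lines):
    """Return (line_number, label) pairs for lines matching a suspect pattern."""
    findings = []
    for lineno, line in enumerate(lines, start=1):
        for label, pattern in SUSPECT_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label))
    return findings


# Made-up log lines, not output from a real run.
sample = [
    "rank 0: initialised",
    "rank 1: transport unavailable, falling back to TCP",
    "rank 1: no CUDA-capable device is detected",
]
for lineno, label in scan_log(sample):
    print(f"line {lineno}: {label}")
```

A check of this kind is cheap to run after every job and can flag a silent fallback to a slower transport before it shows up as a performance regression.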
Abstract
Deploying complex, distributed scientific workflows across diverse HPC sites is often hindered by site-specific dependencies and complex build environments. This paper investigates the design and performance of portable HPC container images capable of encapsulating MPI- and CUDA-enabled software stacks without sacrificing bare-metal performance. It is part of recent work within the EBRAINS Research Infrastructure to evaluate the implementation of portable, Apptainer-based HPC container images targeting the EBRAINS Software Distribution (ESD), a Spack-based software ecosystem comprising approximately 80 top-level packages (and 800 dependencies). We evaluate a hybrid, PMIx-based containerization strategy using Apptainer that bypasses the need for site-specific builds by dynamically leveraging host-level specialized hardware, such as network interfaces and…
Taxonomy
Topics: Scientific Computing and Data Management · Parallel Computing and Optimization Techniques · Cloud Computing and Resource Management
