Mapping Out the HPC Dependency Chaos
Farid Zakaria, Thomas R. W. Scogland, Todd Gamblin, Carlos Maltzahn

TL;DR
This paper analyzes the complexity of HPC software dependencies, explores packaging mechanisms, and introduces Shrinkwrap, a solution that improves dependency loading efficiency and simplifies binary management in HPC environments.
Contribution
It introduces Shrinkwrap, a novel approach for producing binaries that load dependencies efficiently and accurately, addressing challenges in HPC software stack management.
Findings
Shrinkwrap speeds up dependency loading by up to 7x.
Analysis of packaging mechanisms reveals their benefits and pitfalls.
The approach simplifies HPC software deployment and management.
Abstract
High Performance Computing (HPC) software stacks have become complex, with the dependencies of some applications numbering in the hundreds. Packaging, distributing, and administering software stacks of that scale is a complex undertaking anywhere. HPC systems deal with esoteric compilers, hardware, and a panoply of uncommon combinations. In this paper, we explore the mechanisms available for packaging software to find its own dependencies in the context of a taxonomy of software distribution, and discuss their benefits and pitfalls. We discuss workarounds for some common problems caused by using these composed stacks and introduce Shrinkwrap: A solution to producing binaries that directly load their dependencies from precise locations and in a precise order. Beyond simplifying the use of the binaries, this approach also speeds up loading as much as 7x for a large dynamically-linked MPI…
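The idea of loading dependencies "from precise locations and in a precise order" can be illustrated with a toy model. The sketch below is not Shrinkwrap's implementation: it assumes a loader that walks dependencies breadth-first and probes each search directory in turn, and every name in it (the `NEEDED` graph, the `FILESYSTEM` layout, the `SEARCH_PATH` list, and the `resolve` and `freeze` helpers) is hypothetical. It resolves each library name once, records the absolute paths in load order, and counts how many directory probes the search-based lookup needed.

```python
from collections import deque
from typing import Dict, List, Set, Tuple

# Hypothetical dependency graph: name -> directly needed library names.
NEEDED: Dict[str, List[str]] = {
    "app": ["libmpi.so", "libhdf5.so"],
    "libmpi.so": ["libhwloc.so"],
    "libhdf5.so": ["libz.so", "libmpi.so"],
    "libhwloc.so": [],
    "libz.so": [],
}

# Hypothetical filesystem: directory -> library files it contains.
FILESYSTEM: Dict[str, Set[str]] = {
    "/opt/stack/mpi/lib": {"libmpi.so", "libhwloc.so"},
    "/opt/stack/hdf5/lib": {"libhdf5.so"},
    "/usr/lib": {"libz.so", "libmpi.so"},
}

# Hypothetical search order, standing in for an RPATH-style directory list.
SEARCH_PATH: List[str] = ["/opt/stack/mpi/lib", "/opt/stack/hdf5/lib", "/usr/lib"]


def resolve(name: str, search_path: List[str],
            filesystem: Dict[str, Set[str]]) -> Tuple[str, int]:
    """Return (absolute path, number of directories probed) for one library."""
    probes = 0
    for directory in search_path:
        probes += 1
        if name in filesystem.get(directory, set()):
            return f"{directory}/{name}", probes
    raise FileNotFoundError(name)


def freeze(root: str, needed: Dict[str, List[str]], search_path: List[str],
           filesystem: Dict[str, Set[str]]) -> Tuple[List[str], int]:
    """Resolve the transitive dependencies of `root` breadth-first.

    Returns the absolute paths in load order plus the total number of
    directory probes the search-based resolution required.
    """
    ordered_paths: List[str] = []
    total_probes = 0
    seen: Set[str] = set()
    queue = deque(needed[root])
    while queue:
        name = queue.popleft()
        if name in seen:
            continue  # each library is loaded only once
        seen.add(name)
        path, probes = resolve(name, search_path, filesystem)
        ordered_paths.append(path)
        total_probes += probes
        queue.extend(needed.get(name, []))
    return ordered_paths, total_probes


if __name__ == "__main__":
    paths, probes = freeze("app", NEEDED, SEARCH_PATH, FILESYSTEM)
    for path in paths:
        print(path)
    print(f"search-based probes: {probes}, direct loads: {len(paths)}")
```

In this model, a binary that stores the returned list of absolute paths needs one lookup per library instead of a directory search per library, and it can no longer pick up the second `libmpi.so` in `/usr/lib` if the search order changes. The gap between probes and direct loads grows with the number of search directories and dependencies, which is consistent with the abstract's claim of faster loading, though the toy numbers say nothing about the reported 7x figure.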
Taxonomy
Topics: Advanced Data Storage Technologies · Distributed and Parallel Computing Systems · Cloud Computing and Resource Management
