A Contention-Free Model for Converged Kubernetes on HPC
Vanessa Sochat, David Fox, Daniel Milroy

TL;DR
This paper proposes a novel, contention-free model combining HPC workload management with Kubernetes to enable converged computing, enhancing performance and portability for complex scientific workloads.
Contribution
It introduces a new paradigm that integrates Flux Framework with Usernetes to converge HPC and cloud on on-premises clusters.
Findings
HPC application performance remains high with the new model
Network performance between environments is optimized
The setup is reproducible for community use
Abstract
High performance computing (HPC) and cloud have traditionally been separate, and presented in an adversarial light. The conflict arises from disparate beginnings that led to two drastically different cultures, incentive structures, and communities that are now in direct competition with one another for resources, talent, and speed of innovation. With the emergence of converged computing, a new paradigm of computing has entered the space that advocates for bringing together the best of both worlds from a technological and cultural standpoint. This movement has emerged due to economic and practical needs. Emerging heterogeneous, complex scientific workloads that require an orchestration of services, simulation, and reaction to state can no longer be served by traditional HPC paradigms. However, while cloud offers automation, portability, and orchestration, as it stands now it cannot…
Taxonomy
Topics: Computability, Logic, AI Algorithms · Distributed and Parallel Computing Systems
