A Unifying Framework to Enable Artificial Intelligence in High Performance Computing Workflows
Jens Domke, Mohamed Wahib, Anshu Dubey, Tal Ben-Nun, Erik W. Draeger

TL;DR
This paper proposes a unifying framework designed to seamlessly integrate AI and HPC workflows, addressing the need for scalable, adaptable solutions in scientific computing that can evolve with hardware and software advancements.
Contribution
The paper proposes a framework that unifies HPC and AI/ML workflows so that the two can work together efficiently and adapt to new hardware and vendor libraries.
Findings
Framework enables seamless HPC/AI integration
Supports adaptation to new hardware and vendor libraries
Facilitates scalable and efficient scientific workflows
Abstract
Current trends point to a future where large-scale scientific applications are tightly coupled HPC/AI hybrids. Hence, we urgently need to invest in creating a seamless, scalable framework where HPC and AI/ML can efficiently work together and adapt to novel hardware and vendor libraries without starting from scratch every few years. The current ecosystem and sparsely connected community are not sufficient to tackle these challenges, and we require a breakthrough catalyst for science similar to what PyTorch enabled for AI.
