A Workflow-Oriented Framework for Asynchronous Human-AI Collaboration in Hybrid and Compute-Intensive HPC Environments
Sergio Mendoza, Cedric Bhihe, Natalia Zamora, David Modesto, Jose Martin Bugallo Batalla, Jesus Gomez Canovas, Rafel Palomo Avellaneda, and Miguel Perez Espinosa

TL;DR
This paper introduces a workflow framework enabling asynchronous human-AI collaboration across hybrid HPC, cloud, and local systems, improving efficiency and oversight in resource-intensive AI tasks.
Contribution
The paper presents a novel asynchronous collaboration framework that supports checkpoint-based human input in HPC environments and adapts to various infrastructures and scheduling systems.
Findings
Framework enables non-blocking human input during HPC workflows
Demonstrated on MareNostrum 5, showing improved efficiency and oversight
Supports SLURM, containerized, and native tasks in hybrid setups
Abstract
Human involvement is critical in training and deploying AI systems in high-stakes defence and security contexts. However, real-time interaction is impractical in HPC environments due to compute intensity and resource constraints. We present a workflow framework that enables asynchronous human-AI collaboration across hybrid infrastructures, including HPC clusters, local machines, and cloud platforms. Workflows can pause at defined checkpoints for human input without halting underlying compute jobs, preventing idle resources and enabling non-blocking supervision. The framework interacts with SLURM-based scheduling, supports containerized and native tasks, and is tailored to scenarios requiring human judgment and adaptability. We demonstrate its application in model training on systems such as MareNostrum 5, highlighting benefits in portability, efficiency, and oversight in operational…
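The checkpoint idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Checkpoint`, `run_workflow`, and `demo` are hypothetical, the reviewer is simulated with a timer thread, and no real scheduler is involved. It only shows the general pattern in which work that does not depend on the human decision keeps running while the question is open, and only the dependent step waits.

```python
import threading


class Checkpoint:
    """Hypothetical checkpoint: a question that a human answers later."""

    def __init__(self, question):
        self.question = question
        self.answer = None
        self._answered = threading.Event()

    def submit(self, answer):
        # Called from the reviewer's side (CLI, web form, etc.).
        self.answer = answer
        self._answered.set()

    def wait(self, timeout=None):
        # Blocks only the caller, not other running tasks.
        if not self._answered.wait(timeout):
            raise TimeoutError(f"no answer to: {self.question}")
        return self.answer


def run_workflow(checkpoint, independent_steps, gated_step):
    """Run steps that do not need the answer, then the one that does."""
    log = []
    for step in independent_steps:
        # Compute continues while the question is still open.
        log.append(step())
    # Only the step that depends on the human decision waits for it.
    log.append(gated_step(checkpoint.wait(timeout=5.0)))
    return log


def demo():
    cp = Checkpoint("Keep training with the current learning rate?")
    # Simulate a reviewer who responds a little later, from another thread.
    reviewer = threading.Timer(0.05, cp.submit, args=("yes",))
    reviewer.start()
    log = run_workflow(
        cp,
        independent_steps=[lambda: "epoch 1 done", lambda: "epoch 2 done"],
        gated_step=lambda answer: f"decision applied: {answer}",
    )
    reviewer.join()
    return log
```

Calling `demo()` runs the two independent steps first and applies the simulated decision last. In a real deployment the independent steps would be scheduler jobs rather than in-process functions, and the answer would arrive through whatever channel the framework provides; both are outside the scope of this sketch.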
