Enabling Scientific Workflow Scheduling Research in Non-Uniform Memory Access Architectures
Aurelio Vivas, Harold Castro

TL;DR
This paper introduces nFlows, a system that models, simulates, and executes data-intensive scientific workflows on NUMA-based HPC systems, addressing the challenge of data locality and access latency in complex memory architectures.
Contribution
It presents nFlows, a novel NUMA-aware workflow runtime system that supports modeling, simulation, and execution for optimizing scientific workflows on modern HPC architectures.
Findings
nFlows enables detailed simulation of NUMA effects on workflow scheduling.
The system facilitates the design of NUMA-aware scheduling algorithms.
It helps identify performance bottlenecks related to data locality.
Abstract
Data-intensive scientific workflows increasingly rely on high-performance computing (HPC) systems, complementing traditional Grid and Cloud platforms. However, workflow scheduling on HPC infrastructures remains challenging due to the prevalence of non-uniform memory access (NUMA) architectures. These systems require schedulers to account for data locality not only across distributed environments but also within each node. Modern HPC nodes integrate multiple NUMA domains and heterogeneous memory regions, such as high-bandwidth memory (HBM) and DRAM, and frequently attach accelerators (GPUs or FPGAs) and network interface cards (NICs) to specific NUMA nodes. This design increases the variability of data-access latency and complicates the placement of both tasks and data. Despite these constraints, most workflow scheduling strategies were originally developed for Grid or Cloud environments…
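To make the placement problem in the abstract concrete, here is a minimal sketch of a locality-aware choice of NUMA node for a single task. It is an illustration only, not the nFlows API or the authors' scheduling algorithm. The names `DISTANCE`, `placement_cost`, and `pick_node` are hypothetical, and the distance values are invented for a two-node machine. They follow the convention of the ACPI SLIT table, where 10 denotes local access.

```python
# Hypothetical illustration; not the nFlows API or its scheduling algorithm.

# Relative access cost between NUMA nodes, in the style of the ACPI SLIT
# table (10 = local access). Values are made up for a two-node machine.
DISTANCE = [
    [10, 21],
    [21, 10],
]


def placement_cost(node, inputs):
    """Estimate the cost of running a task on `node`.

    `inputs` is a list of (size_bytes, home_node) pairs, one per input
    buffer, where home_node is the NUMA node whose memory holds the buffer.
    """
    return sum(size * DISTANCE[node][home] for size, home in inputs)


def pick_node(inputs, free_cores):
    """Return the NUMA node with a free core and the lowest estimated cost."""
    candidates = [n for n, cores in enumerate(free_cores) if cores > 0]
    if not candidates:
        raise RuntimeError("no free cores on any NUMA node")
    return min(candidates, key=lambda n: placement_cost(n, inputs))


# A task reading 4 MB that resides on node 0 and 1 MB that resides on node 1.
inputs = [(4_000_000, 0), (1_000_000, 1)]

chosen = pick_node(inputs, free_cores=[2, 2])    # both nodes have free cores
fallback = pick_node(inputs, free_cores=[0, 2])  # node 0 is fully busy
```

With both nodes available, the sketch prefers node 0 because most of the input already resides there. When node 0 has no free core, it falls back to node 1 and accepts the higher remote-access cost. A realistic scheduler would also have to weigh heterogeneous memory (HBM versus DRAM) and the NUMA affinity of accelerators and NICs, which this sketch ignores.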
Taxonomy
Topics: Distributed and Parallel Computing Systems · Scientific Computing and Data Management · Parallel Computing and Optimization Techniques
