Streaming Data in HPC Workflows Using ADIOS
Greg Eisenhauer, Norbert Podhorszki, Ana Gainaru, Scott Klasky, Philip E. Davis, Manish Parashar, Matthew Wolf, Eric Suchtya, Erick Fredj, Vicente Bolea, Franz Pöschel, Klaus Steiniger, Michael Bussmann, Richard Pausch, Sunita Chandrasekaran

TL;DR
This paper introduces the Sustainable Staging Transport (SST), an ADIOS engine that streams data directly between applications in HPC workflows to sidestep the "IO Wall" problem, improving performance and usability while requiring few application changes.
Contribution
The paper presents SST, an ADIOS engine that streams data directly between applications, so that many existing file-oriented ADIOS workflows can avoid intermediate storage with few application changes.
Findings
SST achieves bandwidth above the filesystem's limit in model training workflows.
SST enables strong coupling of applications for multiphysics simulations.
SST supports in situ analysis and visualization with improved data transfer performance.
Abstract
The "IO Wall" problem, in which the gap between computation rate and data access rate grows continuously, poses significant problems for scientific workflows that have traditionally relied on the filesystem for intermediate storage between workflow stages. One way to avoid this problem is to stream data directly from producers to consumers, avoiding storage entirely. However, the manner in which this is accomplished is key to both performance and usability. This paper presents the Sustainable Staging Transport (SST), an approach that allows direct streaming between traditional file writers and readers with few application changes. SST is an ADIOS "engine", accessible via standard ADIOS APIs, and because ADIOS allows engines to be chosen at run-time, many existing file-oriented ADIOS workflows can utilize SST for direct application-to-application…
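The run-time engine choice described in the abstract can be illustrated with a small sketch. It assumes the ADIOS2 runtime XML configuration layout (an `adios-config` root containing `io` elements, each with an `engine` element carrying a `type` attribute) and uses only the Python standard library to generate such a file. The IO name `SimulationOutput` and the helper `build_adios_config` are hypothetical, chosen for illustration; `BP5` and `SST` are ADIOS2 engine names. This is not code from the paper.

```python
import xml.etree.ElementTree as ET


def build_adios_config(io_name: str, engine_type: str) -> str:
    """Return an ADIOS2-style runtime XML config that selects one engine for one IO.

    Assumption: the layout mirrors the ADIOS2 runtime configuration format
    (adios-config > io > engine). The helper itself is hypothetical.
    """
    root = ET.Element("adios-config")
    io = ET.SubElement(root, "io", name=io_name)
    ET.SubElement(io, "engine", type=engine_type)
    return ET.tostring(root, encoding="unicode")


# Hypothetical IO name. The application that declares this IO is unchanged;
# only the engine named in the config file differs between the two runs.
file_config = build_adios_config("SimulationOutput", "BP5")    # file-based output
stream_config = build_adios_config("SimulationOutput", "SST")  # direct streaming

print(file_config)
print(stream_config)
```

Under these assumptions, switching a workflow stage from file output to streaming amounts to pointing the application at the second config rather than the first, which is the sense in which the engine is selected at run-time instead of in source code.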
Taxonomy
Topics: Scientific Computing and Data Management · Advanced Data Storage Technologies · Distributed and Parallel Computing Systems
