Accelerating the Operation of Complex Workflows through Standard Data Interfaces
Taylor Paul, William Regli

TL;DR
This paper advocates for standardizing data sharing in scientific workflows at the network level to enhance reusability and portability, proposing a reference model, architecture, and open tools for implementation.
Contribution
It introduces a preliminary reference model, architecture, and open tools to evolve workflows from point-to-point connections to shared channels via network services.
Findings
Proposes a network-level data sharing approach for workflows.
Provides initial architecture and open tools for implementation.
Seeks community input for further development.
Abstract
In this position paper, we argue for standardizing how data is shared and processed in scientific workflows at the network level, to maximize step reuse and workflow portability across platforms and networks in pursuit of a foundational workflow stack. We look to evolve workflows from steps connected point-to-point in a directed acyclic graph (DAG) to steps connected via shared channels in a message system implemented as a network service. To start this evolution, we contribute a preliminary reference model, an architecture, and open tools to implement the architecture today. Our goal is to improve the deployment and operation of complex workflows by decoupling data sharing from data processing in workflow steps. We seek the workflow community's input on the merit of this approach, related research to explore, and initial requirements to inform future research.
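The shift from point-to-point edges to shared channels can be illustrated with a minimal sketch. This is not the authors' reference model or tooling: `ChannelBus`, `make_step`, `run_demo`, and the channel names are hypothetical, and an in-process dictionary stands in for what the paper envisions as a network service. The sketch only shows the decoupling idea: a step names the channels it reads and writes, not the steps it connects to.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Tuple

Handler = Callable[[Any], None]


class ChannelBus:
    """Hypothetical in-process stand-in for a networked message service."""

    def __init__(self) -> None:
        # Maps a channel name to the handlers subscribed to it.
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, channel: str, handler: Handler) -> None:
        self._subscribers[channel].append(handler)

    def publish(self, channel: str, message: Any) -> None:
        # Deliver to every subscriber; the publisher does not know who they are.
        for handler in list(self._subscribers[channel]):
            handler(message)


def make_step(bus: ChannelBus, in_channel: str, out_channel: str,
              process: Callable[[Any], Any]) -> None:
    """Register a step that only does data processing.

    Data sharing is handled by the bus, so the step never references
    its upstream or downstream neighbours directly.
    """
    def on_message(message: Any) -> None:
        bus.publish(out_channel, process(message))

    bus.subscribe(in_channel, on_message)


def run_demo() -> Tuple[List[str], List[int]]:
    bus = ChannelBus()
    results: List[str] = []
    lengths: List[int] = []

    # Two processing steps chained through named channels.
    make_step(bus, "raw", "cleaned", lambda text: text.strip())
    make_step(bus, "cleaned", "upper", lambda text: text.upper())
    bus.subscribe("upper", results.append)

    # A second consumer attaches to "cleaned" without changing its producer.
    bus.subscribe("cleaned", lambda text: lengths.append(len(text)))

    bus.publish("raw", "  sample  ")
    return results, lengths


if __name__ == "__main__":
    print(run_demo())
```

In a point-to-point DAG, adding the second consumer would mean editing the graph around the producing step; here it is a single subscription on an existing channel, which is the kind of step reuse the abstract argues for.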
Taxonomy
Topics: Scientific Computing and Data Management
