Recording provenance of workflow runs with RO-Crate
Simone Leo, Michael R. Crusoe, Laura Rodríguez-Navas, Raül Sirvent, Alexander Kanitz, Paul De Geest, Rudolf Wittner, Luca Pireddu, Daniel Garijo, José M. Fernández, Iacopo Colonnelli, Matej Gallo, Tazro Ohta, Hirotaka Suetake, Salvador Capella-Gutierrez, Renske de Wit

TL;DR
This paper introduces Workflow Run RO-Crate, an extension of RO-Crate and Schema.org, to standardize and facilitate interoperable recording of computational workflow provenance across diverse systems, enhancing reproducibility and data sharing.
Contribution
It presents a new model for capturing workflow execution provenance that is interoperable, standards-aligned, and implemented across multiple workflow management systems.
Findings
Supported by an active open community.
Implemented in six workflow systems.
Applied in machine learning image analysis use cases.
Abstract
Recording the provenance of scientific computation results is key to the support of traceability, reproducibility and quality assessment of data products. Several data models have been explored to address this need, providing representations of workflow plans and their executions as well as means of packaging the resulting information for archiving and sharing. However, existing approaches tend to lack interoperable adoption across workflow management systems. In this work we present Workflow Run RO-Crate, an extension of RO-Crate (Research Object Crate) and Schema.org to capture the provenance of the execution of computational workflows at different levels of granularity and bundle together all their associated objects (inputs, outputs, code, etc.). The model is supported by a diverse, open community that runs regular meetings, discussing development, maintenance and adoption aspects.…
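To make the idea of bundling a run with its inputs, outputs and code more concrete, here is a minimal sketch of what an RO-Crate metadata graph describing one workflow execution could look like. It assumes the run is modelled as a Schema.org CreateAction whose instrument is the workflow, whose object lists the inputs and whose result lists the outputs. The function name build_run_crate, the "#run-1" identifier and the file names are hypothetical, and the actual Workflow Run RO-Crate profiles specify further requirements that are omitted here.

```python
import json


def build_run_crate(workflow_path, input_paths, output_paths):
    """Build a minimal RO-Crate-style JSON-LD document for one workflow run.

    Illustrative only: the helper and the "#run-1" identifier are
    hypothetical, and profile-specific properties are left out.
    """
    action_id = "#run-1"  # hypothetical local identifier for the run
    data_paths = [*input_paths, *output_paths]
    graph = [
        {
            # Metadata descriptor: points at the root dataset.
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {
            # Root dataset: lists every packaged file and mentions the run.
            "@id": "./",
            "@type": "Dataset",
            "hasPart": [{"@id": p} for p in [workflow_path, *data_paths]],
            "mainEntity": {"@id": workflow_path},
            "mentions": [{"@id": action_id}],
        },
        {
            # The workflow definition that was executed.
            "@id": workflow_path,
            "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
        },
        {
            # The execution itself, linking workflow, inputs and outputs.
            "@id": action_id,
            "@type": "CreateAction",
            "instrument": {"@id": workflow_path},
            "object": [{"@id": p} for p in input_paths],
            "result": [{"@id": p} for p in output_paths],
        },
    ]
    # One File entity per input and output.
    graph += [{"@id": p, "@type": "File"} for p in data_paths]
    return {"@context": "https://w3id.org/ro/crate/1.1/context", "@graph": graph}


# Example usage with hypothetical file names.
crate = build_run_crate("workflow.cwl", ["input.txt"], ["output.txt"])
print(json.dumps(crate, indent=2))
```

Written to a file named ro-crate-metadata.json next to the referenced files, a document of this shape would describe a single run; finer-grained provenance, such as individual steps, would need additional entities that this sketch does not attempt.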
Taxonomy
Topics: Scientific Computing and Data Management · Cell Image Analysis Techniques · Research Data Management Practices
