Pipeline-Centric Provenance Model
Paul Groth, Ewa Deelman, Gideon Juve, Gaurang Mehta, Bruce Berriman

TL;DR
This paper introduces a pipeline-centric provenance model tailored to a class of workflow-based applications. The model is motivated by astronomy use cases, and its benefit in terms of the storage needed for provenance is evaluated on an astronomy application.
Contribution
The paper presents a new provenance model designed for pipeline workflows, generalizes the class of applications it applies to beyond the motivating astronomy use cases, and evaluates the storage it requires on an astronomy application.
Findings
Reduced storage requirements for provenance data
Effective for astronomy workflow applications
Applicable to a broader class of workflow-based systems
Abstract
In this paper we propose a new provenance model which is tailored to a class of workflow-based applications. We motivate the approach with use cases from the astronomy community. We generalize the class of applications the approach is relevant to and propose a pipeline-centric provenance model. Finally, we evaluate the benefits in terms of storage needed by the approach when applied to an astronomy application.
Taxonomy
Topics: Scientific Computing and Data Management · Distributed and Parallel Computing Systems · Research Data Management Practices
