Utilizing Provenance in Reusable Research Objects
Zhihao Yuan, Dai Hai Ton That, Siddhant Kothari, Gabriel Fils, Tanu Malik

TL;DR
This paper explores how provenance information can enhance the sharing and reuse of research objects by enabling accurate reproduction, partial reuse, and modification, supported by new summarization methods.
Contribution
It introduces two provenance summarization techniques that improve understanding and reuse of research objects, facilitating computational reproducibility and partial reuse.
Findings
Provenance-based methods enable accurate reproduction of experiments.
Summarization techniques improve understanding of research object contents.
Algorithms are shown to be effective and efficient through experiments.
Abstract
Science is conducted collaboratively, often requiring the sharing of knowledge about computational experiments. When experiments include only datasets, they can be shared using Uniform Resource Identifiers (URIs) or Digital Object Identifiers (DOIs). An experiment, however, seldom includes only datasets, but more often includes software, its past execution, provenance, and associated documentation. The Research Object has recently emerged as a comprehensive and systematic method for aggregation and identification of diverse elements of computational experiments. While a necessary method, mere aggregation is not sufficient for the sharing of computational experiments. Other users must be able to easily recompute on these shared research objects. Computational provenance is often the key to enable such reuse. In this paper, we show how reusable research objects can utilize provenance to…
