Utilizing Provenance in Reusable Research Objects

Zhihao Yuan; Dai Hai Ton That; Siddhant Kothari; Gabriel Fils; Tanu Malik

arXiv:1806.06452·cs.DB·June 19, 2018

TL;DR

This paper explores how provenance information can enhance the sharing and reuse of research objects by enabling accurate reproduction, partial reuse, and modification, supported by new summarization methods.

Contribution

It introduces two provenance summarization techniques that improve understanding and reuse of research objects, facilitating computational reproducibility and partial reuse.

Findings

1. Provenance-based methods enable accurate reproduction of experiments.
2. Summarization techniques improve understanding of research object contents.
3. Algorithms are shown to be effective and efficient through experiments.

Abstract

Science is conducted collaboratively, often requiring the sharing of knowledge about computational experiments. When experiments include only datasets, they can be shared using Uniform Resource Identifiers (URIs) or Digital Object Identifiers (DOIs). An experiment, however, seldom includes only datasets, but more often includes software, its past execution, provenance, and associated documentation. The Research Object has recently emerged as a comprehensive and systematic method for aggregation and identification of diverse elements of computational experiments. While a necessary method, mere aggregation is not sufficient for the sharing of computational experiments. Other users must be able to easily recompute on these shared research objects. Computational provenance is often the key to enable such reuse. In this paper, we show how reusable research objects can utilize provenance to…
