Metadata and provenance management
Ewa Deelman, Bruce Berriman, Ann Chervenak, Oscar Corcho, Paul Groth, Luc Moreau

TL;DR
This paper discusses the importance of metadata and provenance in scientific data management, emphasizing their roles in data sharing, interpretation, and lifecycle tracking across collaborative research environments.
Contribution
It provides an overview of approaches to metadata and provenance management and illustrates their application in scientific workflows.
Findings
Metadata and provenance are crucial for data sharing and interpretation.
Various approaches exist for managing metadata and provenance.
The chapter gives examples of how applications use metadata and provenance in their scientific processes.
Abstract
Scientists today collect, analyze, and generate terabytes and petabytes of data. These data are often shared and further processed and analyzed among collaborators. To facilitate sharing and interpretation, data need to carry metadata about how they were collected or generated, and provenance information about how they were processed. This chapter describes metadata and provenance in the context of the data lifecycle. It also gives an overview of approaches to metadata and provenance management, followed by examples of how applications use metadata and provenance in their scientific processes.
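As a rough illustration of the distinction drawn in the abstract, the sketch below shows one way a data product might carry both metadata (how it was collected) and provenance (how it was processed). It is a minimal sketch and not the chapter's own model: the class names `Dataset` and `ProvenanceStep`, the `derive` method, and all identifiers and values are hypothetical, chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProvenanceStep:
    """One processing step applied to a dataset (hypothetical structure)."""
    activity: str        # what was done, e.g. "calibrate"
    agent: str           # who or what performed the step
    inputs: List[str]    # identifiers of the datasets consumed


@dataclass
class Dataset:
    """A data product that carries its own metadata and provenance."""
    identifier: str
    metadata: Dict[str, str]  # descriptive attributes: how the data were collected
    provenance: List[ProvenanceStep] = field(default_factory=list)

    def derive(self, new_id: str, activity: str, agent: str) -> "Dataset":
        """Return a derived dataset whose provenance extends this one's."""
        step = ProvenanceStep(activity=activity, agent=agent,
                              inputs=[self.identifier])
        # Copy the metadata and append the new step; the parent is not mutated.
        return Dataset(new_id, dict(self.metadata), self.provenance + [step])


# Illustrative values only; none of these come from the chapter.
raw = Dataset("raw-001", {"instrument": "example-telescope"})
calibrated = raw.derive("cal-001", activity="calibrate", agent="pipeline-v1")
mosaic = calibrated.derive("mosaic-001", activity="mosaic", agent="pipeline-v1")

# The ordered list of activities that produced the final product.
lineage = [step.activity for step in mosaic.provenance]
```

Here the metadata travels unchanged with each derived product, while the provenance list grows by one step per derivation, so a collaborator receiving `mosaic` could see both where the data came from and what was done to them.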
