Towards Provenance-Aware Earth Observation Workflows: the openEO Case Study
H. Omidi, L. Sacco, V. Hutter, G. Irsiegler, M. Claus, M. Schobben, A. Jacob, M. Schramm, S. Fiore

TL;DR
This paper introduces an approach to enhance Earth Observation workflows by integrating data provenance tracking within openEO, enabling better understanding of data lineage, dependencies, and transformations in EO processing chains.
Contribution
It presents the integration of yProv4WFs into openEO to improve provenance capture and understanding in EO workflows, which is a novel application of provenance concepts in this context.
Findings
Improved tracking of data lineage in EO workflows.
Enhanced understanding of workflow dependencies and transformations.
Facilitated better reproducibility and transparency in EO data processing.
Abstract
Capturing the history of operations and activities during a computational workflow is important for Earth Observation (EO). Data provenance is the metadata that records the lineage of data products: how data are generated, transferred, and manipulated, by whom these operations are performed, and through which processes, parameters, and datasets. This paper presents an approach to improve the capture of this information by integrating the data provenance library yProv4WFs within openEO, a platform that lets users connect to Earth Observation cloud back-ends in a simple and unified way. It also demonstrates how integrating data provenance concepts across EO processing chains enables researchers and stakeholders to better understand the flow, the dependencies, and the transformations involved in analytical workflows.
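To make the idea of lineage in a processing chain concrete, the following is a minimal sketch, not the paper's implementation and not the yProv4WFs API. It assumes an openEO-style process graph (nodes linked through `from_node` references) and derives a PROV-JSON-like record from it using only the Python standard library. The graph contents, the `ex:` identifiers, the user name, and the helper functions `find_inputs`, `build_provenance`, and `lineage` are all hypothetical names introduced for illustration.

```python
import json

# Hypothetical openEO-style process graph: load data, compute NDVI, save it.
PROCESS_GRAPH = {
    "load": {"process_id": "load_collection",
             "arguments": {"id": "SENTINEL2_L2A", "bands": ["B04", "B08"]}},
    "ndvi": {"process_id": "ndvi",
             "arguments": {"data": {"from_node": "load"}}},
    "save": {"process_id": "save_result",
             "arguments": {"data": {"from_node": "ndvi"}, "format": "GTiff"},
             "result": True},
}


def find_inputs(value):
    """Collect ids of upstream nodes referenced via {"from_node": ...}."""
    if isinstance(value, dict):
        if "from_node" in value:
            return [value["from_node"]]
        return [ref for v in value.values() for ref in find_inputs(v)]
    if isinstance(value, list):
        return [ref for v in value for ref in find_inputs(v)]
    return []


def build_provenance(graph, user):
    """Map each node to one activity and one output entity (assumed mapping)."""
    agent = f"ex:{user}"
    doc = {"agent": {agent: {}}, "activity": {}, "entity": {},
           "used": {}, "wasGeneratedBy": {}, "wasAssociatedWith": {}}
    for node_id, node in graph.items():
        activity = f"ex:run_{node_id}"
        output = f"ex:output_{node_id}"
        # Keep only literal parameters; data inputs become "used" relations.
        params = {k: v for k, v in node["arguments"].items()
                  if not find_inputs(v)}
        doc["activity"][activity] = {
            "ex:process_id": node["process_id"],
            "ex:parameters": json.dumps(params, sort_keys=True),
        }
        doc["entity"][output] = {}
        doc["wasGeneratedBy"][f"_:gen_{node_id}"] = {
            "prov:entity": output, "prov:activity": activity}
        doc["wasAssociatedWith"][f"_:assoc_{node_id}"] = {
            "prov:activity": activity, "prov:agent": agent}
        for upstream in find_inputs(node["arguments"]):
            doc["used"][f"_:use_{node_id}_{upstream}"] = {
                "prov:activity": activity,
                "prov:entity": f"ex:output_{upstream}"}
    return doc


def lineage(doc, entity):
    """Return the entities that `entity` depends on, nearest first."""
    generated_by = {rel["prov:entity"]: rel["prov:activity"]
                    for rel in doc["wasGeneratedBy"].values()}
    found, frontier = [], [entity]
    while frontier:
        activity = generated_by.get(frontier.pop(0))
        for rel in doc["used"].values():
            source = rel["prov:entity"]
            if rel["prov:activity"] == activity and source not in found:
                found.append(source)
                frontier.append(source)
    return found
```

Calling `lineage(build_provenance(PROCESS_GRAPH, "alice"), "ex:output_save")` walks back from the saved product through the NDVI step to the loaded collection, which is the kind of dependency question a provenance record is meant to answer. A real integration would also record timestamps, back-end identifiers, and software versions, which this sketch omits.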
Taxonomy
Topics: Scientific Computing and Data Management · Research Data Management Practices · Geographic Information Systems Studies
