Use Cases of Computational Reproducibility for Scientific Workflows at Exascale
Line Pouchard, Sterling Baldwin, Todd Elsethagen, Carlos Gamboa, Shantenu Jha, Bibi Raju, Eric Stephan, Li Tang, Kerstin Kleese Van Dam

TL;DR
This paper introduces the ProvEn system to enhance reproducibility in scientific workflows by capturing provenance and performance data, demonstrated through climate simulations and molecular dynamics on HPC platforms.
Contribution
It presents a hybrid queriable system for capturing provenance and performance metrics, improving reproducibility in scientific workflows at exascale.
Findings
ProvEn effectively captures provenance and performance data.
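The system's capabilities are illustrated on two use cases: scientific reproducibility of results in ACME climate simulations, and performance reproducibility in molecular dynamics workflows.
The system supports querying provenance and performance metrics and relating them to each other.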
Abstract
We propose an approach for improved reproducibility that includes capturing and relating provenance characteristics and performance metrics, in a hybrid queriable system, the ProvEn server. The system capabilities are illustrated on two use cases: scientific reproducibility of results in the ACME climate simulations and performance reproducibility in molecular dynamics workflows on HPC computing platforms.
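As a rough illustration of what "capturing and relating provenance characteristics and performance metrics" in one queryable store could look like, here is a minimal Python sketch. It is not the ProvEn server or its API: the `HybridStore` class, its method names, the run identifiers, and the `step_seconds` metric are all hypothetical, chosen only to show provenance attributes and metric samples being joined on a shared run ID.

```python
from dataclasses import dataclass, field


@dataclass
class HybridStore:
    """Toy in-memory stand-in for a hybrid store (hypothetical, not ProvEn)."""

    # run_id -> provenance attributes (code version, machine, ...)
    provenance: dict = field(default_factory=dict)
    # run_id -> list of (metric_name, value) samples
    metrics: dict = field(default_factory=dict)

    def record_provenance(self, run_id, **attrs):
        """Attach or update provenance attributes for a run."""
        self.provenance.setdefault(run_id, {}).update(attrs)

    def record_metric(self, run_id, name, value):
        """Append one performance sample for a run."""
        self.metrics.setdefault(run_id, []).append((name, value))

    def runs_matching(self, **attrs):
        """Return sorted run IDs whose provenance matches every given attribute."""
        return sorted(
            run_id
            for run_id, prov in self.provenance.items()
            if all(prov.get(key) == value for key, value in attrs.items())
        )

    def metric_by_run(self, name, **attrs):
        """Relate the two sides: mean of one metric per provenance-matched run."""
        result = {}
        for run_id in self.runs_matching(**attrs):
            values = [v for n, v in self.metrics.get(run_id, []) if n == name]
            if values:
                result[run_id] = sum(values) / len(values)
        return result


# Hypothetical runs and measurements, for illustration only.
store = HybridStore()
store.record_provenance("run-1", code_version="v1", machine="cluster-a")
store.record_provenance("run-2", code_version="v1", machine="cluster-b")
store.record_provenance("run-3", code_version="v2", machine="cluster-a")
store.record_metric("run-1", "step_seconds", 2.0)
store.record_metric("run-1", "step_seconds", 4.0)
store.record_metric("run-2", "step_seconds", 5.0)
store.record_metric("run-3", "step_seconds", 1.0)

# Compare performance across runs that share the same code version.
result = store.metric_by_run("step_seconds", code_version="v1")
print(result)
```

Querying by a provenance attribute and getting back a per-run metric is the kind of question a combined store makes easy, for example checking whether the same code version performs differently on two machines.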
Taxonomy
Topics: Scientific Computing and Data Management · Distributed and Parallel Computing Systems · Advanced Data Storage Technologies
