Managing Software Provenance to Enhance Reproducibility in Computational Research
Akash Dhruv, Anshu Dubey

TL;DR
This paper argues that explicitly managing software provenance improves reproducibility in computational research, especially in high-performance computing (HPC) environments, and reviews documentation practices and tools developed around Flash-X.
Contribution
It introduces a framework for managing software provenance and presents tools and practices to enhance reproducibility in complex HPC scientific experiments.
Findings
Enhanced reproducibility through explicit provenance management.
Development of tools for documenting HPC experiments.
Improved traceability in scientific software workflows.
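As a rough illustration of what an explicit provenance record for a single run might contain, the sketch below gathers the command, parameters, input checksums, and basic environment details into a JSON file. It is not the tooling described in the paper: the function names (`build_provenance_record`, `write_provenance_record`) and the record layout are hypothetical and use only the Python standard library.

```python
import hashlib
import json
import platform
import sys
from datetime import datetime, timezone
from pathlib import Path


def sha256_of_file(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_provenance_record(command, input_paths, parameters):
    """Collect what was run, on which inputs, and in what environment.

    Hypothetical record layout, chosen for illustration only.
    """
    return {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "command": list(command),
        "parameters": dict(parameters),
        # Checksums let a later reader confirm the inputs are unchanged.
        "inputs": {str(p): sha256_of_file(p) for p in input_paths},
        "environment": {
            "python": sys.version.split()[0],
            "platform": platform.platform(),
            "hostname": platform.node(),
        },
    }


def write_provenance_record(record, destination):
    """Store the record as sorted, indented JSON next to the run outputs."""
    Path(destination).write_text(json.dumps(record, indent=2, sort_keys=True))
```

Writing one such file per run, alongside the outputs, gives each result a traceable link back to the inputs and environment that produced it. A real HPC workflow would likely also record the source revision, compiler, and loaded modules.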
Abstract
Scientific processes rely on software as an important tool for data acquisition, analysis, and discovery. Over the years sustainable software development practices have made progress in being considered as an integral component of research. However, management of computation-based scientific studies is often left to individual researchers who design their computational experiments based on personal preferences and the nature of the study. We believe that the quality, efficiency, and reproducibility of computation-based scientific research can be improved by explicitly creating an execution environment that allows researchers to provide a clear record of traceability. This is particularly relevant to complex computational studies in high-performance computing (HPC) environments. In this article, we review the documentation required to maintain a comprehensive record of HPC computational…
Taxonomy
Topics: Scientific Computing and Data Management · Advanced Data Storage Technologies · Cloud Computing and Resource Management
