Reinterpretation and Long-Term Preservation of Data and Code
Stephen Bailey, K.S. Cranmer, Matthew Feickert, Rob Fine, Sabine Kraml, Clemens Lange

TL;DR
This paper emphasizes the importance of preserving experimental data and code for future scientific reinterpretation, highlighting current challenges and recommending increased dedicated funding and infrastructure development.
Contribution
It provides a comprehensive overview of the state of data and code preservation efforts and offers strategic recommendations for future support and infrastructure.
Findings
Preservation maximizes scientific return and enables future analyses.
Current infrastructure and funding are insufficient for long-term preservation.
Recommendations include increased funding and development of dedicated preservation programs.
Abstract
Careful preservation of experimental data, simulations, analysis products, and theoretical work maximizes their long-term scientific return on investment by enabling new analyses and reinterpretation of the results in the future. Key infrastructure and technical developments needed for some high-value science targets are not in scope for the operations program of the large experiments and are often not effectively funded. Increasingly, the science goals of our projects require contributions that span the boundaries between individual experiments and surveys, and between the theoretical and experimental communities. Furthermore, the computational requirements and technical sophistication of this work are increasing. As a result, it is imperative that the funding agencies create programs that can devote significant resources to these efforts outside of the context of the operations of…
Taxonomy
Topics: Scientific Computing and Data Management · Research Data Management Practices · Machine Learning in Materials Science
