Reference environments: A universal tool for reproducibility in computational biology
Daniel G. Hurley, Joseph Cursons, Matthew Faria, David M. Budden, Vijay Rajagopal, Edmund J. Crampin

TL;DR
This paper introduces a universal set of reference environments for computational biology that enable reproducibility across diverse technologies and platforms, promoting open science and decoupling methods from their implementations.
Contribution
It presents a platform-independent approach and tools that facilitate reproducibility across multiple programming languages and technologies in computational biology.
Findings
Demonstrated reproducibility across Python, R, MATLAB, Fortran, C, and Java.
Provided examples from published computational biology papers.
Enabled decoupling of methods from their specific implementations.
Abstract
The drive for reproducibility in the computational sciences has provoked discussion and effort across a broad range of perspectives: technological, legislative/policy, education, and publishing. Discussion on these topics is not new, but the need to adopt standards for reproducibility of claims made based on computational results is now clear to researchers, publishers and policymakers alike. Many technologies exist to support and promote reproduction of computational results: containerisation tools like Docker, literate programming approaches such as Sweave, knitr, iPython or cloud environments like Amazon Web Services. But these technologies are tied to specific programming languages (e.g. Sweave/knitr to R; iPython to Python) or to platforms (e.g. Docker for 64-bit Linux environments only). To date, no single approach is able to span the broad range of technologies and platforms…
Taxonomy
Topics: Gene expression and cancer classification · Single-cell and spatial transcriptomics · Bioinformatics and Genomic Networks
