A Review of Containerization for Interactive and Reproducible Analysis
Gregory J. Hunt, Johann A. Gagnon-Bartsch

TL;DR
This paper reviews how containerization and code notebooks together improve the sharing, interactivity, and reproducibility of computational analyses in scientific research.
Contribution
It highlights the combined use of containerization and code notebooks as a novel approach to enhance reproducibility and accessibility of computational analyses.
Findings
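Containerization encapsulates an analysis environment, including its code, configuration, and dependencies, so that it runs the same way on another computer.
Code notebooks make it easier to understand and interact with an analysis.
Used together, the two technologies make analyses shareable, interactive, and reproducible.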
Abstract
In recent decades the analysis of data has become increasingly computational. Correspondingly, this has changed how scientific and statistical work is shared. For example, it is now commonplace for underlying analysis code and data to be proffered alongside journal publications and conference talks. Unfortunately, sharing code faces several challenges. First, it is often difficult to take code from one computer and run it on another. Code configuration, version, and dependency issues often make this challenging. Secondly, even if the code runs, it is often hard to understand or interact with the analysis. This makes it difficult to assess the code and its findings, for example, in a peer review process. In this review we describe the combination of two computing technologies that help make analyses shareable, interactive, and completely reproducible. These technologies are (1) analysis…
Taxonomy
Topics: Scientific Computing and Data Management · Advanced Data Storage Technologies · Distributed and Parallel Computing Systems
