Docker Does Not Guarantee Reproducibility
Julien Malka, Stefano Zacchiroli, Théo Zimmermann

TL;DR
This paper investigates whether Docker truly guarantees reproducibility by analyzing scientific literature and empirically testing thousands of Docker builds from GitHub to assess their consistency and reliability.
Contribution
It provides a systematic review of Docker's role in reproducibility and an empirical study evaluating the actual reproducibility of Docker images in practice.
Findings
Docker does not guarantee perfect reproducibility in practice.
Many Docker images show inconsistencies when rebuilt from the same Dockerfiles.
Best practices improve reproducibility but do not eliminate all issues.
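One way to make "inconsistencies when rebuilt" concrete is a bit-for-bit comparison of the files produced by two builds. The sketch below is illustrative only and is not taken from the paper's methodology: it assumes the filesystems of two builds have already been unpacked into two local directories (how they are extracted is left out), and the helper names `tree_digest` and `same_contents` are hypothetical.

```python
import hashlib
import os


def tree_digest(root: str) -> str:
    """Hash relative file paths and file contents under root (hypothetical helper).

    File metadata (timestamps, owners, modes) is deliberately ignored, and
    symlinks that point at directories are not recorded, so this is a
    simplified notion of "same contents".
    """
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # fix traversal order so the digest is deterministic
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            digest.update(rel.encode("utf-8") + b"\0")
            if os.path.islink(full):
                # Record the link target instead of following the link.
                digest.update(b"link:" + os.readlink(full).encode("utf-8"))
            else:
                with open(full, "rb") as fh:
                    for chunk in iter(lambda: fh.read(65536), b""):
                        digest.update(chunk)
            digest.update(b"\0")
    return digest.hexdigest()


def same_contents(build_a: str, build_b: str) -> bool:
    """Return True when both unpacked builds contain identical paths and bytes."""
    return tree_digest(build_a) == tree_digest(build_b)
```

Under this simplified definition, a single differing byte in any file, or an extra or missing file, is enough for two builds of the same Dockerfile to count as different.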
Abstract
The reproducibility of software environments is a critical concern in modern software engineering, with ramifications ranging from the effectiveness of collaboration workflows to software supply chain security and scientific reproducibility. Containerization technologies like Docker address this problem by encapsulating software environments into shareable filesystem snapshots known as images. While Docker is frequently cited in the literature as a tool that enables reproducibility in theory, the extent of its guarantees and limitations in practice remains under-explored. In this work, we address this gap through two complementary approaches. First, we conduct a systematic literature review to examine how Docker is framed in scientific discourse on reproducibility and to identify documented best practices for writing Dockerfiles enabling reproducible image building. Then, we perform a…
Taxonomy
Topics: Scientific Computing and Data Management · Security and Verification in Computing · Software System Performance and Reliability
