Studying the Practices of Deploying Machine Learning Projects on Docker
Moses Openja, Forough Majidi, Foutse Khomh, Bhagya Chembakottu, Heng Li

TL;DR
This paper explores how Docker is used to deploy machine learning projects, categorizing project types, purposes, and characteristics of Docker images to understand current practices and challenges.
Contribution
It provides a taxonomy of 21 Docker usage categories for ML projects and analyzes deployment practices, highlighting portability benefits and resource challenges.
Findings
Six categories of ML projects use Docker for deployment: ML Applications, MLOps/AIOps, Toolkits, DL Frameworks, Models, and Documentation.
Docker images mainly aid platform portability across operating systems, GPU runtimes, and programming languages.
Large image sizes and nested files pose resource challenges.
Abstract
Docker is a containerization service that allows for convenient deployment of websites, databases, applications' APIs, and machine learning (ML) models with a few lines of code. Studies have recently explored the use of Docker for deploying general software projects with no specific focus on how Docker is used to deploy ML-based projects. In this study, we conducted an exploratory study to understand how Docker is being used to deploy ML-based projects. As the initial step, we examined the categories of ML-based projects that use Docker. We then examined why and how these projects use Docker, and the characteristics of the resulting Docker images. Our results indicate that six categories of ML-based projects use Docker for deployment, including ML Applications, MLOps/AIOps, Toolkits, DL Frameworks, Models, and Documentation. We derived the taxonomy of 21 major categories representing…
