A Preliminary Investigation of MLOps Practices in GitHub

Fabio Calefato; Filippo Lanubile; Luigi Quaranta

arXiv:2209.11453 · cs.SE · September 26, 2022

TL;DR

This paper investigates the current state of MLOps practices in open-source GitHub projects, highlighting limited adoption and identifying issues to guide future research in automating ML workflows.

Contribution

It provides an initial analysis of MLOps adoption in GitHub projects, focusing on GitHub Actions and CML, and identifies challenges and areas for improvement.

Findings

01 Limited adoption of MLOps workflows in open-source projects

02 Identification of issues hindering MLOps implementation

03 Guidance for future research directions

Abstract

Background. The rapid and growing popularity of machine learning (ML) applications has led to an increasing interest in MLOps, that is, the practice of continuous integration and deployment (CI/CD) of ML-enabled systems. Aims. Since changes may affect not only the code but also the ML model parameters and the data themselves, the automation of traditional CI/CD needs to be extended to manage model retraining in production. Method. In this paper, we present an initial investigation of the MLOps practices implemented in a set of ML-enabled systems retrieved from GitHub, focusing on GitHub Actions and CML, two solutions to automate the development workflow. Results. Our preliminary results suggest that the adoption of MLOps workflows in open-source GitHub projects is currently rather limited. Conclusions. Issues are also identified, which can guide future research work.
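The abstract does not describe how projects were checked for GitHub Actions or CML usage. As an illustration only, and not the authors' method, the following minimal sketch shows one way such a check could be done on a local repository checkout. It assumes that GitHub Actions workflows live under `.github/workflows` and that CML usage can be approximated by looking for the `iterative/setup-cml` action or a `cml` command in the workflow text. The function names, the regular expression, and the returned dictionary are all hypothetical.

```python
import re
from pathlib import Path

# Heuristic markers of CML usage (an assumption, not taken from the paper):
# the iterative/setup-cml action, or a CLI call such as `cml comment` or
# `cml-send-comment`.
CML_PATTERN = re.compile(r"iterative/setup-cml|\bcml[ -][a-z]")


def find_workflows(repo_root):
    """Return the YAML workflow files found under .github/workflows."""
    workflows_dir = Path(repo_root) / ".github" / "workflows"
    if not workflows_dir.is_dir():
        return []
    return sorted(
        p for p in workflows_dir.iterdir() if p.suffix in {".yml", ".yaml"}
    )


def classify_repo(repo_root):
    """Report whether a checkout has workflows and whether any mention CML."""
    workflows = find_workflows(repo_root)
    cml_workflows = [
        p
        for p in workflows
        if CML_PATTERN.search(p.read_text(encoding="utf-8", errors="ignore"))
    ]
    return {
        "uses_github_actions": bool(workflows),
        "uses_cml": bool(cml_workflows),
        "workflow_count": len(workflows),
    }
```

Calling `classify_repo("path/to/checkout")` on each cloned project would yield simple adoption counts. A text match like this is only a coarse filter: it can miss CML invoked indirectly and says nothing about what the workflows actually automate, so any real study would need manual inspection on top of it.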
