An Empirical Evaluation of Modern MLOps Frameworks

Jon Marcos-Mercadé; Unai Lopez-Novoa; Mikel Egaña Aranguren

arXiv:2601.20415 · cs.SE · January 29, 2026

TL;DR

This paper empirically compares four popular MLOps frameworks (MLflow, Metaflow, Apache Airflow, and Kubeflow Pipelines) across six criteria to help developers select suitable tools for different ML lifecycle tasks.

Contribution

The paper provides a systematic evaluation of MLOps tools against practical criteria, applied to two common ML scenarios, and offers insights into each tool's suitability for different use cases.

Findings

1. MLflow and Kubeflow Pipelines excel in ease of installation.

2. Metaflow offers high configuration flexibility.

3. Apache Airflow provides strong interoperability.

Abstract

Given the increasing adoption of AI solutions in professional environments, developers need to be able to make informed decisions about the current tool landscape. This work empirically evaluates four MLOps (Machine Learning Operations) tools that facilitate the management of the ML model lifecycle: MLflow, Metaflow, Apache Airflow, and Kubeflow Pipelines. The tools are assessed against six criteria (Ease of installation, Configuration flexibility, Interoperability, Code instrumentation complexity, Result interpretability, and Documentation) while implementing two common ML scenarios: a digit classifier with MNIST and a sentiment classifier with IMDB and BERT. The evaluation concludes with weighted results that lead to practical conclusions about which tools are best suited to different scenarios.
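The abstract mentions weighted results but does not state the weighting scheme. As a rough illustration only, the sketch below assumes a normalized weighted average over the six criteria named above. The function names, the numeric scale, and every weight and score value are hypothetical placeholders, not figures from the paper.

```python
from typing import Dict, List, Tuple

# Criteria names follow the abstract; the identifiers are our own.
CRITERIA = [
    "ease_of_installation",
    "configuration_flexibility",
    "interoperability",
    "code_instrumentation_complexity",
    "result_interpretability",
    "documentation",
]


def weighted_score(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Return the weight-normalized average of per-criterion scores.

    Assumption: higher scores are better for every criterion and
    weights are non-negative. The paper's actual scheme may differ.
    """
    missing = [c for c in CRITERIA if c not in scores or c not in weights]
    if missing:
        raise ValueError(f"missing criteria: {missing}")
    total_weight = sum(weights[c] for c in CRITERIA)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[c] * weights[c] for c in CRITERIA) / total_weight


def rank_tools(
    tool_scores: Dict[str, Dict[str, float]], weights: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Sort tools from highest to lowest weighted score."""
    ranked = [(name, weighted_score(s, weights)) for name, s in tool_scores.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical weights: a team that cares most about installation effort.
    weights = {c: 1.0 for c in CRITERIA}
    weights["ease_of_installation"] = 2.0

    # Placeholder scores for two made-up tools, NOT results from the paper.
    tools = {
        "tool_a": {c: 3.0 for c in CRITERIA},
        "tool_b": {c: 4.0 for c in CRITERIA},
    }
    for name, score in rank_tools(tools, weights):
        print(f"{name}: {score:.2f}")
```

Changing the weights to reflect a team's priorities (for example, weighting Documentation more heavily for a team new to MLOps) can change the ranking, which is one way weighted results of this kind can point to different tools for different scenarios.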

Taxonomy

Topics: Machine Learning and Data Classification · Scientific Computing and Data Management · Software Engineering Research