Automated Modernization of Machine Learning Engineering Notebooks for Reproducibility

Bihui Jin; Kaiyuan Wang; Pengyu Nie

arXiv:2602.07195·cs.SE·February 10, 2026


TL;DR

This paper introduces MLEModernizer, an LLM-based framework that modernizes machine learning notebooks to restore their reproducibility despite evolving hardware and software environments, thereby facilitating reuse and scientific progress.

Contribution

The paper presents MLEModernizer, an LLM-driven system that automatically updates notebooks to restore their reproducibility in contemporary environments, counteracting environment erosion.

Findings

01 MLEModernizer restored reproducibility in 74.2% of non-reproducible notebooks

02 Environment backporting does not improve reproducibility and introduces additional failure modes

03 Only 35.4% of the 12,720 Kaggle notebooks studied remain reproducible today

Abstract

Interactive computational notebooks (e.g., Jupyter notebooks) are widely used in machine learning engineering (MLE) to program and share end-to-end pipelines, from data preparation to model training and evaluation. However, environment erosion (the rapid evolution of hardware and software ecosystems for machine learning) has rendered many published MLE notebooks non-reproducible in contemporary environments, hindering code reuse and scientific progress. To quantify this gap, we study 12,720 notebooks mined from 79 popular Kaggle competitions: only 35.4% remain reproducible today. Crucially, we find that environment backporting, i.e., downgrading dependencies to match the submission time, does not improve reproducibility but rather introduces additional failure modes. To address environment erosion, we design and implement MLEModernizer, an LLM-driven agentic framework that treats the…
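To make the notion of environment backporting concrete, the following is a minimal sketch of what "downgrading dependencies to match the submission time" could look like. It is not the authors' procedure: the release table, the package names (`libA`, `libB`), and the helper `backport_pins` are all hypothetical and exist only for illustration. The abstract reports that this strategy does not improve reproducibility in practice.

```python
from datetime import date

# Hypothetical release history: package -> list of (version, release date).
# These entries are illustrative placeholders, not real release data.
RELEASES = {
    "libA": [
        ("1.0", date(2019, 1, 10)),
        ("1.1", date(2020, 6, 1)),
        ("2.0", date(2023, 3, 15)),
    ],
    "libB": [
        ("0.9", date(2018, 5, 5)),
        ("1.0", date(2021, 2, 20)),
    ],
}


def backport_pins(releases, submitted_on):
    """Pick, per package, the newest version released on or before submitted_on."""
    pins = {}
    for package, history in releases.items():
        eligible = [
            (released, version)
            for version, released in history
            if released <= submitted_on
        ]
        if eligible:
            # max() over (date, version) tuples selects the latest eligible release.
            pins[package] = max(eligible)[1]
    return pins


# Assumed submission date for a hypothetical notebook.
pins = backport_pins(RELEASES, date(2020, 12, 31))
for package, version in sorted(pins.items()):
    print(f"{package}=={version}")
```

A package with no release before the submission date is simply left unpinned in this sketch; a real backporting setup would also have to match the interpreter, system libraries, and hardware drivers, which is one plausible source of the extra failure modes the abstract mentions.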


Taxonomy

Topics: Scientific Computing and Data Management · Software Engineering Research · Adversarial Robustness in Machine Learning