When to retrain a machine learning model

Florence Regol, Leo Schwinn, Kyle Sprague, Mark Coates, Thomas Markovich

arXiv:2505.14903 · cs.LG · May 22, 2025

Open Access

TL;DR

This paper introduces a principled, uncertainty-based approach for deciding when to retrain a machine learning model as its data evolves, and reports that it outperforms existing baselines on seven datasets.

Contribution

It proposes a comprehensive formulation of the retraining decision problem and an uncertainty-driven method that forecasts model performance evolution for better timing decisions.

Findings

01. Outperforms existing baselines on 7 datasets

02. Effectively detects when to retrain models under distribution shift

03. Provides a practical solution for real-world model maintenance

Abstract

A significant challenge in maintaining real-world machine learning models is responding to the continuous and unpredictable evolution of data. Most practitioners are faced with the difficult question: when should I retrain or update my machine learning model? This seemingly straightforward problem is particularly challenging for three reasons: 1) decisions must be made based on very limited information - we usually have access to only a few examples, 2) the nature, extent, and impact of the distribution shift are unknown, and 3) it involves specifying a cost ratio between retraining and poor performance, which can be hard to characterize. Existing works address certain aspects of this problem, but none offer a comprehensive solution. Distribution shift detection falls short as it cannot account for the cost trade-off; the scarcity of the data, paired with its unusual structure, makes it…
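
The cost trade-off named in point 3 of the abstract can be illustrated with a toy decision rule. This is only a minimal sketch under stated assumptions, not the paper's method: it assumes we already hold per-step loss forecasts for the current model and for a freshly retrained one, and that the retraining cost is expressed in the same units as the loss. The function name `should_retrain` and all of its parameters are hypothetical, and the rule ignores the forecast uncertainty that the paper's approach is built around.

```python
from typing import Sequence


def should_retrain(
    loss_if_kept: Sequence[float],
    loss_if_retrained: Sequence[float],
    retrain_cost: float,
) -> bool:
    """Toy rule: retrain when the forecast loss saved exceeds the retraining cost.

    All names are hypothetical. Both sequences are assumed to hold per-step
    loss forecasts over the same future horizon, in the same units as
    retrain_cost.
    """
    if len(loss_if_kept) != len(loss_if_retrained):
        raise ValueError("forecast horizons must match")
    # Total loss avoided over the horizon by switching to the retrained model.
    saved = sum(k - r for k, r in zip(loss_if_kept, loss_if_retrained))
    return saved > retrain_cost


# Illustrative numbers only: the stale model degrades, the retrained one does not.
stale = [0.30, 0.35, 0.40]
fresh = [0.10, 0.10, 0.10]
cheap_decision = should_retrain(stale, fresh, retrain_cost=0.5)
costly_decision = should_retrain(stale, fresh, retrain_cost=1.0)
```

With these made-up forecasts the loss saved is about 0.75, so the rule favors retraining when the cost is 0.5 and not when it is 1.0. This shows why the decision depends on the cost ratio and not on detecting a shift alone.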

Taxonomy

Topics: Data Stream Mining Techniques · Reinforcement Learning in Robotics · Machine Learning and Data Classification