The Challenger: When Do New Data Sources Justify Switching Machine Learning Models?

Vassilis Digalakis Jr; Christophe Pérignon; Sébastien Saurin; Flore Sentenac

arXiv:2512.18390·cs.LG·December 23, 2025

TL;DR

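This paper develops a framework for deciding whether and when to replace a trained incumbent machine learning model with a challenger that uses newly available data sources, weighing data-acquisition and retraining costs against discounted future gains, and proposes practical algorithms evaluated on a real-world credit-scoring dataset.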

Contribution

The paper introduces a unified economic and statistical framework for model-switching decisions, derives closed-form expressions for the optimal switching time in stylized settings, and proposes three practical algorithms with theoretical guarantees.

Findings

01. Optimal switching times depend on cost and learning-curve parameters.

02. The look-ahead sequential method outperforms simpler approaches.

03. Finite-sample guarantees show the method's effectiveness.

Abstract

We study the problem of deciding whether, and when, an organization should replace a trained incumbent model with a challenger relying on newly available features. We develop a unified economic and statistical framework that links learning-curve dynamics, data-acquisition and retraining costs, and discounting of future gains. First, we characterize the optimal switching time in stylized settings and derive closed-form expressions that quantify how horizon length, learning-curve curvature, and cost differentials shape the optimal decision. Second, we propose three practical algorithms: a one-shot baseline, a greedy sequential method, and a look-ahead sequential method. Using a real-world credit-scoring dataset with gradually arriving alternative data, we show that (i) optimal switching times vary systematically with cost parameters and learning-curve behavior, and (ii) the look-ahead…
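To make the trade-off described in the abstract concrete, here is a minimal sketch of a stylized switching problem. It is not the paper's model or any of its three algorithms. Everything in it is an assumption chosen for illustration: the power-law learning curve, the per-period data cost, the one-time switching cost, the discount factor, and the hypothetical function names challenger_error, total_cost, and best_switch_time. The challenger is assumed to be trained once, on the data collected up to the switch, and the best switching period is found by brute-force enumeration rather than by a closed-form expression.

```python
def challenger_error(n_samples, floor=0.10, scale=0.5, rate=0.5):
    """Assumed power-law learning curve: challenger error after n_samples
    periods of new-feature data. The functional form is illustrative only."""
    return floor + scale * (n_samples + 1) ** (-rate)


def total_cost(switch_t, horizon, incumbent_error=0.20, switch_cost=0.3,
               data_cost=0.005, discount=0.99):
    """Discounted cost over `horizon` periods when switching at `switch_t`.

    switch_t=None means the incumbent is kept forever and no new data
    is acquired. All parameter values are hypothetical.
    """
    if switch_t is None:
        return sum(discount ** t * incumbent_error for t in range(horizon))

    # The challenger is trained once, on data gathered before the switch.
    frozen_error = challenger_error(switch_t)
    cost = 0.0
    for t in range(horizon):
        if t < switch_t:
            # Incumbent still deployed; pay to keep acquiring new data.
            cost += discount ** t * (incumbent_error + data_cost)
        else:
            # Challenger deployed with its error fixed at switch time.
            cost += discount ** t * frozen_error
    # One-time retraining/deployment cost, discounted to the switch period.
    cost += discount ** switch_t * switch_cost
    return cost


def best_switch_time(horizon, **params):
    """Enumerate 'never switch' (None) and every period; return the cheapest."""
    candidates = [None] + list(range(horizon))
    return min(candidates, key=lambda s: total_cost(s, horizon, **params))


if __name__ == "__main__":
    print(best_switch_time(200))
    print(best_switch_time(200, switch_cost=5.0))
```

Varying switch_cost, data_cost, discount, or the learning-curve parameters in this toy setup shows how the cheapest switching period moves, or disappears entirely when never switching is best, which is the qualitative dependence the abstract refers to.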

Taxonomy

Topics: Advanced Bandit Algorithms Research · Explainable Artificial Intelligence (XAI) · Stochastic Gradient Optimization Techniques