Sequential Harmful Shift Detection Without Labels

Salim I. Amoukou; Tom Bewley; Saumitra Mishra; Freddy Lecue; Daniele Magazzeni; Manuela Veloso

arXiv:2412.12910·stat.ML·December 18, 2024

Open Access

TL;DR

This paper presents a new label-free method for detecting harmful distribution shifts in machine learning models during deployment, using a proxy error estimator to identify shifts without ground truth labels.

Contribution

It extends previous label-dependent shift detection methods to operate without labels by employing a trained error estimator as a proxy, enabling practical deployment in real-world scenarios.

Findings

01 High power in detecting various distribution shifts

02 Effective false alarm control across scenarios

03 Works without access to true labels during deployment

Abstract

We introduce a novel approach for detecting distribution shifts that negatively impact the performance of machine learning models in continuous production environments, which requires no access to ground truth data labels. It builds upon the work of Podkopaev and Ramdas [2022], who address scenarios where labels are available for tracking model errors over time. Our solution extends this framework to work in the absence of labels, by employing a proxy for the true error. This proxy is derived using the predictions of a trained error estimator. Experiments show that our method has high power and false alarm control under various distribution shifts, including covariate and label shifts and natural shifts over geography and time.
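
To make the idea of monitoring a proxy error concrete, here is a minimal sketch in Python. It is not the authors' algorithm: it assumes the error estimator's output has already been turned into per-sample proxy values in [0, 1] (for example, 1 when the estimator predicts a large error), that observations are independent, and it swaps in a plain Hoeffding bound with a union bound over time in place of the confidence sequences used in the sequential testing literature. The function names, the `tolerance` parameter, and the `delta` split are all hypothetical choices made for illustration.

```python
import math
import numpy as np


def hoeffding_radius(n, delta):
    """One-sided Hoeffding radius for the mean of n independent values in [0, 1]."""
    return math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def detect_harmful_shift(source_proxy, stream_proxy, tolerance=0.05, delta=0.05):
    """Return the first 1-based step at which an alarm is raised, or None.

    source_proxy: proxy error values in [0, 1] from held-out source data.
    stream_proxy: proxy error values in [0, 1] observed during deployment.
    tolerance:    increase in proxy error that is still considered harmless.
    delta:        total false alarm budget (hypothetical split: half for the
                  source bound, half spread over deployment steps).
    """
    source_proxy = np.asarray(source_proxy, dtype=float)
    # Upper bound on the source proxy error, valid with probability >= 1 - delta/2.
    upper_source = source_proxy.mean() + hoeffding_radius(len(source_proxy), delta / 2)

    running_sum = 0.0
    for t, value in enumerate(stream_proxy, start=1):
        running_sum += float(value)
        # Spend (delta/2) * 6 / (pi^2 t^2) at step t; these sum to delta/2 over all t.
        delta_t = (delta / 2) * 6.0 / (math.pi ** 2 * t ** 2)
        lower_target = running_sum / t - hoeffding_radius(t, delta_t)
        # Alarm only when the deployment lower bound clears the source upper bound.
        if lower_target > upper_source + tolerance:
            return t
    return None
```

Calling `detect_harmful_shift(source_values, deployment_values)` returns `None` while the running proxy error stays within the tolerance of the source level, and the step index of the first alarm otherwise. Note that no ground truth labels enter the deployment loop; the quality of the alarm therefore depends entirely on how well the proxy tracks the true error.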

Taxonomy

Topics: Advanced Control Systems Optimization · Advanced Database Systems and Queries