MLDemon: Deployment Monitoring for Machine Learning Systems

Antonio Ginart; Martin Zhang; James Zou

arXiv:2104.13621 · cs.LG · February 25, 2022 · 5 cites

TL;DR

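MLDemon is a real-time deployment monitoring system for machine learning models. It combines unlabeled data with a small number of on-demand labels and decides, under a label budget, when to request additional expert labels, so that its estimate of model performance stays reliable under distribution drift. On temporal datasets with diverse drifts and models, it outperforms existing monitoring approaches.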

Contribution

The paper introduces MLDemon, a novel approach combining unlabeled data and selective labeling for effective real-time model performance monitoring.

Findings

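01 Outperforms existing monitoring approaches on temporal datasets with diverse distribution drifts and models.

02 Comes with a theoretical analysis showing minimax rate optimality for a broad class of distribution drifts.

03 Decides when to acquire costly expert labels subject to a label budget during deployment.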

Abstract

Post-deployment monitoring of ML systems is critical for ensuring reliability, especially as new user inputs can differ from the training distribution. Here we propose a novel approach, MLDemon, for ML DEployment MONitoring. MLDemon integrates both unlabeled data and a small amount of on-demand labels to produce a real-time estimate of the ML model's current performance on a given data stream. Subject to budget constraints, MLDemon decides when to acquire additional, potentially costly, expert supervised labels to verify the model. On temporal datasets with diverse distribution drifts and models, MLDemon outperforms existing approaches. Moreover, we provide theoretical analysis to show that MLDemon is minimax rate optimal for a broad class of distribution drifts.
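The general idea in the abstract, using cheap unlabeled signals to decide when a costly label is worth requesting, can be sketched with a toy monitor. This is only an illustration under stated assumptions and is not the MLDemon algorithm: the class name `BudgetedMonitor`, the use of mean model confidence as the unlabeled drift signal, and the parameters `base_period`, `min_period`, `window`, and `drift_threshold` are all hypothetical choices made for this example.

```python
from collections import deque


class BudgetedMonitor:
    """Toy label-requesting monitor (illustrative only, NOT the MLDemon algorithm)."""

    def __init__(self, base_period=20, min_period=2, window=50, drift_threshold=0.1):
        self.base_period = base_period          # steps between label requests when stable
        self.min_period = min_period            # steps between label requests under drift
        self.drift_threshold = drift_threshold  # hypothetical cutoff on the drift signal
        self.recent_conf = deque(maxlen=window)  # unlabeled signal: recent confidences
        self.labeled = deque(maxlen=window)      # 1 if a checked prediction was correct
        self.conf_ref = None                     # reference mean confidence
        self.since_query = 0
        self.labels_used = 0

    def drift_signal(self):
        # Absolute change in mean confidence relative to the reference window.
        if self.conf_ref is None:
            return 0.0
        mean_conf = sum(self.recent_conf) / len(self.recent_conf)
        return abs(mean_conf - self.conf_ref)

    def step(self, confidence, prediction, get_label):
        """Process one stream item; get_label() is only called when a label is requested."""
        self.recent_conf.append(confidence)
        # Freeze the reference once the first full window has been observed.
        if self.conf_ref is None and len(self.recent_conf) == self.recent_conf.maxlen:
            self.conf_ref = sum(self.recent_conf) / len(self.recent_conf)
        drifting = self.drift_signal() > self.drift_threshold
        period = self.min_period if drifting else self.base_period
        self.since_query += 1
        if self.since_query >= period:
            self.labeled.append(int(prediction == get_label()))
            self.labels_used += 1
            self.since_query = 0
        return self.estimate()

    def estimate(self):
        # Accuracy estimate from the most recent labeled points, or None if no labels yet.
        if not self.labeled:
            return None
        return sum(self.labeled) / len(self.labeled)


def run_demo():
    # Synthetic stream: 100 stable steps (model correct), then 100 drifted steps (model wrong).
    monitor = BudgetedMonitor()
    for _ in range(100):
        est_stable = monitor.step(0.9, 1, lambda: 1)
    stable_labels = monitor.labels_used
    for _ in range(100):
        est_drift = monitor.step(0.5, 1, lambda: 0)
    drift_labels = monitor.labels_used - stable_labels
    return stable_labels, drift_labels, est_stable, est_drift
```

In this synthetic run the monitor spends few labels while the confidence signal is stable and requests them more often once it shifts, so the accuracy estimate tracks the drop. The sketch has no hard cap on total labels and makes no optimality claim; the policy and guarantees described in the paper are different and more refined.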

Taxonomy

Topics: Data Stream Mining Techniques · Machine Learning and Data Classification · Advanced Bandit Algorithms Research