ML Health: Fitness Tracking for Production Models
Sindhu Ghanta, Sriram Subramanian, Lior Khermosh, Swaminathan Sundararaman, Harshil Shah, Yakov Goldberg, Drew Roselli, Nisha Talagala

TL;DR
ML Health is a framework designed to monitor and detect performance drops in production machine learning models without relying on labeled data, using diagnostic alerts and distribution mismatch detection to ensure reliability.
Contribution
The paper introduces ML Health, a novel framework for unsupervised monitoring of production ML models, including a new method for detecting data distribution mismatches.
Findings
The proposed mismatch-detection method outperforms standard distance metrics at detecting mismatches between production data and training data.
ML Health effectively integrates into a full production ML lifecycle.
Automated alerts help prevent catastrophic prediction errors.
Abstract
Deployment of machine learning (ML) algorithms in production for extended periods of time has uncovered new challenges such as monitoring and management of real-time prediction quality of a model in the absence of labels. However, such tracking is imperative to prevent catastrophic business outcomes resulting from incorrect predictions. The scale of these deployments makes manual monitoring prohibitive, making automated techniques to track and raise alerts imperative. We present a framework, ML Health, for tracking potential drops in the predictive performance of ML models in the absence of labels. The framework employs diagnostic methods to generate alerts for further investigation. We develop one such method to monitor potential problems when production data patterns do not match training data distributions. We demonstrate that our method performs better than standard "distance…
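To make the idea of label-free mismatch detection concrete, here is a minimal sketch of a generic histogram-overlap check between a training sample and a production sample of one feature. It is not the method developed in the paper; it only illustrates the kind of signal an alerting framework could compute without labels. The function name `histogram_overlap`, the `bins` parameter, and the 0.8 alert threshold are all assumptions made for this illustration.

```python
import numpy as np

def histogram_overlap(train, prod, bins=20):
    """Overlap score in [0, 1] between two 1-D samples.

    Hypothetical helper, not taken from the paper. Bin edges are fit on
    the training sample; production values outside the training range
    are counted in the first or last bin.
    """
    train = np.asarray(train, dtype=float)
    prod = np.asarray(prod, dtype=float)
    edges = np.histogram_bin_edges(train, bins=bins)
    inner = edges[1:-1]  # interior edges only, so outliers land in the end bins
    p = np.bincount(np.searchsorted(inner, train, side="right"), minlength=bins) / train.size
    q = np.bincount(np.searchsorted(inner, prod, side="right"), minlength=bins) / prod.size
    # Sum of bin-wise minima: 1.0 for identical histograms, near 0 for disjoint ones
    return float(np.minimum(p, q).sum())

# Illustrative usage with synthetic data (the threshold is an assumed value)
rng = np.random.default_rng(0)
train_sample = rng.normal(0.0, 1.0, 5000)
prod_sample = rng.normal(3.0, 1.0, 5000)  # simulated shift in production
ALERT_THRESHOLD = 0.8
if histogram_overlap(train_sample, prod_sample) < ALERT_THRESHOLD:
    print("alert: production data may not match the training distribution")
```

A score like this needs no labels, only access to the training sample (or its stored histogram) and a window of production inputs, which is why checks of this family are a natural fit for automated alerting.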
Taxonomy
Topics: Context-Aware Activity Recognition Systems
